Age | Commit message | Author |
|
does not block the signal. If all threads block the signal, we
delivered it to the main thread. This does not conform to POSIX.
If any thread unblocks the signal, it should be delivered immediately
to that thread.
Mark such signals as pending at the process instead of at a single thread.
Then any thread can handle it later.
OK kettenis@ guenther@
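A user-space toy model of that delivery decision (the structures and
names below are invented for illustration and only model the decision
itself, not how a thread later picks up process-pending signals):

#include <stdio.h>

#define NTHR 2

struct tmodel { unsigned int blocked, pending; };
struct pmodel { struct tmodel t[NTHR]; unsigned int pending; };

static void
deliver(struct pmodel *p, int sig)
{
	unsigned int bit = 1U << sig;
	int i;

	for (i = 0; i < NTHR; i++) {
		if ((p->t[i].blocked & bit) == 0) {
			p->t[i].pending |= bit;
			printf("signal %d delivered to thread %d\n", sig, i);
			return;
		}
	}
	/* every thread blocks it: record it on the process */
	p->pending |= bit;
	printf("signal %d left pending on the process\n", sig);
}

int
main(void)
{
	struct pmodel p = { .t = {
		{ .blocked = 1U << 5 }, { .blocked = 1U << 5 } } };

	deliver(&p, 5);			/* blocked by every thread */
	p.t[1].blocked &= ~(1U << 5);	/* thread 1 unblocks it ... */
	deliver(&p, 5);			/* ... and can now take it */
	return 0;
}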
|
|
A malicious rpc.bootparamd could corrupt memory, but the kernel has
to trust the local network anyway in a diskless environment. Now
in case of an RPC error, the kernel will stop booting with a specific
panic.
OK claudio@ beck@
|
|
structure allows for better tracking of pending lock operations, which is
essential to prevent a use-after-free once the underlying vnode is
gone.
Inspired by the lockf implementation in FreeBSD.
ok visa@
Reported-by: syzbot+d5540a236382f50f1dac@syzkaller.appspotmail.com
|
|
To protect the timehands we first need to protect the basis for all UTC
time in the kernel: the boottime.
Because the boottime can be changed at any time it needs to be versioned
along with the other members of the timehands to enable safe lockless reads
when using it for anything. So the global boottime timespec goes away and
the static boottimebin becomes a member of the timehands. Instead of reading
the global boottime you use one of two interfaces: binboottime(9) or
microboottime(9). nanoboottime(9) can trivially be added later, though there
are no consumers for it at the moment.
This introduces one small change in behavior. We used to advance the
reported boottime just before launching kernel threads from main().
This makes it look to userland like we "booted" moments before those
threads were launched. Because there is no longer a boottime global we
can no longer trivially do this from main(), so the boottime we report
to userspace via e.g. kern.boottime will now reflect whatever the time
was when we bootstrapped the timehands via inittodr(9). This is usually
no more than a minute before the kernel threads are launched from main().
The prior behavior can be restored by adding a new interface to the
timecounter layer in a future commit.
Based on FreeBSD r303387.
Discussed with mpi@ and visa@.
ok visa@
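A minimal sketch of the versioned (generation-counter) read this relies
on. The names are simplified stand-ins rather than the kernel's actual
timehands code, and timespec is used here where the real interfaces
return bintime/timeval:

#include <sys/types.h>
#include <sys/time.h>
#include <sys/atomic.h>

struct th_sketch {
	volatile u_int	th_generation;	/* 0 while an update is in progress */
	struct timespec	th_boottime;
};

struct th_sketch th_sketch = { .th_generation = 1 };

/* lockless read: retry until a stable, non-zero generation is observed */
void
binboottime_sketch(struct timespec *ts)
{
	u_int gen;

	do {
		gen = th_sketch.th_generation;
		membar_consumer();
		*ts = th_sketch.th_boottime;
		membar_consumer();
	} while (gen == 0 || gen != th_sketch.th_generation);
}

/* write side: drop the generation to 0, update, then publish a new one */
void
tc_setboottime_sketch(const struct timespec *ts)
{
	u_int ogen = th_sketch.th_generation + 1;

	if (ogen == 0)
		ogen = 1;		/* 0 is reserved for "in progress" */
	th_sketch.th_generation = 0;
	membar_producer();
	th_sketch.th_boottime = *ts;
	membar_producer();
	th_sketch.th_generation = ogen;
}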
|
|
client and server.
OK beck@
|
|
client could crash the server.
OK tedu@
|
|
server could confuse the client file system code.
OK beck@
|
|
OK bluhm@
|
|
m_leadingspace() and m_trailingspace(). Convert all callers to call
the functions directly and remove the defines.
OK krw@, mpi@
|
|
put the algorithm into a new function m_calchdrlen(). Also set an
uninitialized m_len to 0 in NFS code.
OK claudio@
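The helper's job is just to recompute the packet header length by
walking the chain; a rough sketch, which may differ in detail from the
real m_calchdrlen():

#include <sys/param.h>
#include <sys/mbuf.h>

int
m_calchdrlen_sketch(struct mbuf *m)
{
	struct mbuf *n;
	int plen = 0;

	/* sum the lengths of all mbufs in the chain */
	for (n = m; n != NULL; n = n->m_next)
		plen += n->m_len;
	if (m->m_flags & M_PKTHDR)
		m->m_pkthdr.len = plen;
	return plen;
}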
|
|
for sockets is non-blocking.
This allows us to G/C SS_NBIO. Having to keep the two flags in sync
in an MP-safe way is complicated.
This introduces a behavior change in sosplice(): it can now
always block. However, this should not matter much because the socket
lock is taken beforehand.
ok bluhm@, benno@, visa@
|
|
ok beck@ deraadt@ guenther@ mpi@
|
|
OK mpi@
|
|
implementations. Rely on the VFS layer to do the checking.
OK mpi@, helg@
|
|
of mounted on directories.
OK guenther@, mpi@
|
|
unlocking the directory vnode.
OK mpi@, helg@
|
|
locking.
ok visa@, bluhm@
|
|
OK mpi@
|
|
Tested in bulks by many. ok visa@, beck@
|
|
nfsmount. Delay the free(9) of the nfs mount point data until
pending or sleeping timeouts have finished by running it on the
softclock thread.
OK visa@
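The deferral follows the usual timeout(9) pattern; a hedged sketch with
hypothetical names (nm_free_tmo and nfs_mount_free() are not the
identifiers from the commit):

#include <sys/param.h>
#include <sys/malloc.h>
#include <sys/timeout.h>

struct nfsmnt_sketch {
	struct timeout	nm_free_tmo;
	/* ... rest of the mount point data ... */
};

static void
nfs_mount_free(void *arg)
{
	/* runs on the softclock thread, after pending timeouts settled */
	free(arg, M_TEMP, sizeof(struct nfsmnt_sketch));
}

static void
nfs_unmount_sketch(struct nfsmnt_sketch *nmp)
{
	/* schedule the free instead of calling free(9) directly */
	timeout_set(&nmp->nm_free_tmo, nfs_mount_free, nmp);
	timeout_add(&nmp->nm_free_tmo, 0);
}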
|
|
unnecessary because curproc always does the locking.
OK mpi@
|
|
curproc that does the locking or unlocking, so the proc parameter
is pointless and can be dropped.
OK mpi@, deraadt@
|
|
count after unlocking. To improve consistency, use vput() instead of
VOP_UNLOCK() + vrele().
OK guenther@, mpi@, tedu@
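The two sequences being contrasted, with argument lists shown in their
simplest form (consult vput(9), vrele(9) and VOP_UNLOCK(9) for the exact
prototypes of a given release):

#include <sys/param.h>
#include <sys/vnode.h>

void
release_split(struct vnode *vp)
{
	VOP_UNLOCK(vp);		/* lock dropped first ... */
	vrele(vp);		/* ... use count released afterwards */
}

void
release_combined(struct vnode *vp)
{
	vput(vp);		/* unlock and release in one call */
}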
|
|
- Use vput(9) instead of vrele(9) when a "locked" node is returned
by nfs_nget().
- Make sure VN_KNOTE() is always called with a valid reference.
- Add a missing PDIRUNLOCK in nfs_lookup().
These changes are mostly noops as long as nfs_lock()/unlock() do
nothing.
Tested by bluhm@, visa@ and myself.
ok visa@
|
|
nodes.
nfs_root() now returns a "locked" vnode, so vput(9) must be called to
release it. Note that this currently has no effect as nfs_lock/unlock
are still stubs.
This will prevent some lock ordering problems with upcoming NFSnode
locking.
Tested by landry@, sthen@, visa@, naddy@ and myself.
From NetBSD with some tweaks, ok visa@
|
|
protect insertions in `nm_ntree'.
This will prevent a future lock ordering problem with NFSnode's lock.
ok tedu@, visa@
|
|
The account flag `ASU' will no longer be set, but that makes suser()
MP-safe since it no longer messes with a per-process field.
No objection from millert@, ok tedu@, bluhm@
|
|
are pushed to disk. Dangling vnodes (unlinked files still in use) and
vnodes undergoing change by long-running syscalls are identified -- and
such filesystems are marked dirty on-disk while we are suspended (in case
power is lost, a fsck will be required). Filesystems without dangling or
busy vnodes are marked clean, resulting in faster boots following
"battery died" circumstances.
Tested by numerous developers, thanks for the feedback.
|
|
ok deraadt@, bluhm@
|
|
ok millert@ krw@
|
|
for blocks re-fetchable from the filesystem. However at reboot time,
filesystems are unmounted, and since processes lack backing store they
are killed. Since the scheduler is still running, in some cases init is
killed... which drops us to ddb [noted by bluhm]. Solution is to convert
filesystems to read-only [proposed by kettenis]. The tale follows:
sys_reboot() should pass proc * to MD boot() to vfs_shutdown() which
completes current IO with vfs_busy VB_WRITE|VB_WAIT, then calls VFS_MOUNT()
with MNT_UPDATE | MNT_RDONLY, soon teaching us that *fs_mount() calls a
copyin() late... so store the sizes in vfsconflist[] and move the copyin()
to sys_mount()... and notice nfs_mount copyin() is size-variant, so kill
legacy struct nfs_args3. Next we learn ffs_mount()'s MNT_UPDATE code is
sharp and rusty especially wrt softdep, so fix some bugs and add
~MNT_SOFTDEP to the downgrade. Some vnodes need a little more help,
so tie them to &dead_vnops.
ffs_mount calling DIOCCACHESYNC is causing a bit of grief still but
this issue is separate and will be dealt with in time.
couple hundred reboots by bluhm and myself, advice from guenther and
others at the hut
|
|
In particular, this allows SIOCGIF* requests to run in parallel.
lots of help & ok mpi, ok visa, sashan
|
|
invalid. But the compiler cannot know whether it has changed in
the meantime, so in the else case a bunch of variables would not
be initialized. Add a panic() there to change the compiler's
assumptions; the code should not be reached anyway.
found by clang -Wuninitialized; OK deraadt@
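The pattern, shown as a stand-alone user-space example with abort()
standing in for panic() and invented variable names:

#include <stdio.h>
#include <stdlib.h>

static int
compute(int valid, int input)
{
	int scale, offset;

	if (valid) {
		scale = 2;
		offset = input / 2;
	} else {
		/* without this noreturn call, the compiler can warn
		 * that scale and offset may be used uninitialized */
		abort();
	}

	return input * scale + offset;
}

int
main(void)
{
	printf("%d\n", compute(1, 10));
	return 0;
}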
|
|
Finally protect the last `so_rcv' and `so_snd' accesses with the socket
lock.
ok visa@, bluhm@
|
|
all the callers to call m_freem(9).
Support from deraadt@ and tedu@, ok visa@, bluhm@
|
|
being brewed.
ok beck
|
|
an if vs the condition itself. weird contortions because of course the
lines want to be like 900 columns wide, but i think it's better now.
|
|
also move checks up sooner to prevent a (root) panic.
ok bluhm
|
|
Tested by Hrvoje Popovski, ok bluhm@
|
|
ok phessler@, visa@, bluhm@
|
|
Protect the fields modified by sosetopt() and simplify the dance
with the stars.
ok bluhm@
|
|
As a side effect, soconnect() and soconnect2() now expect a locked socket,
so update all the callers.
ok bluhm@
|
|
this mbuf was allocated by the first call. Fixes possible memory leak.
Found by Ilja Van Sprundel
OK bluhm@ deraadt@
|
|
While here document an abuse of parent socket's lock.
Problem reported by krw@, analysis and ok bluhm@
|
|
buffers.
This is one step towards unlocking the TCP input path. Note that the
functions asserting for the socket lock are not necessarily MP-safe;
not all the fields of 'struct socket' are protected.
Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures. Logic
and name taken from NetBSD.
Tested by Hrvoje Popovski.
ok claudio@, bluhm@, mikeb@
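Roughly, the hint lets one filter routine serve both call paths: the
protocol layer (lock already held, NOTE_SUBMIT set) and the kqueue scan
(no lock held). The lock helpers and socket structure below are
placeholders, not the real filt_soread():

#include <sys/param.h>
#include <sys/event.h>
#include <sys/rwlock.h>

struct sock_sketch {
	struct rwlock	s_lock;		/* placeholder for the socket lock */
	long		s_rcv_bytes;	/* placeholder for the receive buffer count */
};

int
filt_sketch_read(struct knote *kn, long hint)
{
	struct sock_sketch *so = kn->kn_hook;
	int rv;

	/* NOTE_SUBMIT set: the caller already holds the lock */
	if ((hint & NOTE_SUBMIT) == 0)
		rw_enter_write(&so->s_lock);

	kn->kn_data = so->s_rcv_bytes;
	rv = (kn->kn_data > 0);

	if ((hint & NOTE_SUBMIT) == 0)
		rw_exit_write(&so->s_lock);

	return rv;
}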
|
|
bugs could easily result in use-after-free or double free. Introduce
m_freemp(), which automatically resets the pointer before freeing
it, so there are fewer dangling pointers in the kernel.
OK krw@ mpi@ claudio@
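The helper presumably works along these lines (see sys/kern/uipc_mbuf.c
for the authoritative version):

#include <sys/param.h>
#include <sys/mbuf.h>

void
m_freemp_sketch(struct mbuf **mp)
{
	struct mbuf *m = *mp;

	*mp = NULL;		/* reset the caller's pointer first */
	m_freem(m);		/* then free the chain */
}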
|
|
Outside of USB, no code is executed in a softnet interrupt context. So
what's protecting NFS data structures is the KERNEL_LOCK().
But more importantly, since r1.114 of nfs_socket.c, the 'softnet' thread
is no longer executing NFS code.
ok visa@
|
|
ok bluhm@
|
|
Always defer soreceive() to an nfsd(8) process instead of doing it in
the 'softnet' thread. Avoiding this recursion ensures that we do not
introduce a new sleeping point by releasing and grabbing the netlock.
Tested by many; committing now in order to find possible performance
regressions.
|
|
"good work" deraadt@, ok visa@
|