Age | Commit message (Collapse) | Author |
|
(both kernel and userland bits)
GENERIC + VFSLCKDEBUG is broken with it.
|
|
This flag is currently used to mark or unmark a vnode to actively
check vnode locking semantic (when compiled with VFSLCKDEBUG).
Currently, VLOCKSWORK flag isn't properly set for several FS
implementation which have full locking support. This commit enable
proper checking for them too (cd9660, udf, fuse, msdosfs, tmpfs).
Instead of using a particular flag, it directly check if
v_op->vop_islocked is nullop or not to activate or not the vnode
locking checks.
ok mpi@
|
|
use VOP_LOCK / VOP_UNLOCK wrappers.
VOP_LOCK() is prefered over vn_lock() here in order to keep equivalent code.
ok mpi@ visa@ (as part of larger diff)
|
|
It replaces spec_badop, fifo_badop, dead_badop and mfs_badop, which
are only calls to panic(9), to one unique function vop_generic_badop().
No intented behaviour changes (outside the panic message which isn't
the same).
ok mpi@
|
|
|
|
ok dlg@
|
|
We can simulate the current behavior without lbolt by sleeping for 1
second on the &nowake channel.
ok mpi@
|
|
Rename klist_{insert,remove}() to klist_{insert,remove}_locked().
These functions assume that the caller has locked the klist. The current
state of locking remains intact because the kernel lock is still used
with all klists.
Add new functions klist_insert() and klist_remove() that lock the klist
internally. This allows some code simplification.
OK mpi@
|
|
valid valuse of tv_sec but an invalid value for tv_nsec.
Noticed by guenther@. ok beck@ deraadt@
|
|
timestamps.
This issue was iscovered after rsync 3.2 changed behaviour on an NFS
mounted partition.. Change lifted from NetBSD (r 1.204).
ok beck@, kn@, deraadt@
|
|
we expected. Remove the `else' path from nfs_boot_init(). If
`nfsbootdevname' is not set something goes wrong and this is the panic
condition. Also we exclude the case where we can get `ifp' which we don't
expect.
OK mpi@
|
|
time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.
This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).
There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.
There is no performance cost on 64-bit (__LP64__) platforms.
With input from visa@, dlg@, and tedu@.
Several bugs squashed by visa@.
ok kettenis@
|
|
While here prefix kernel-only EV flags with two underbars.
Suggested by kettenis@, ok visa@
|
|
Adapt FS kqfilters to always return true when the flag is set and bypass
the polling mechanism of the NFS thread.
While here implement a write filter for NFS.
ok visa@
|
|
for example, with locking assertions.
OK mpi@, anton@
|
|
adding more filter properties without cluttering the struct.
OK mpi@, anton@
|
|
|
|
into read-only data segment.
OK deraadt@ tedu@
|
|
Introduce and use TIMEVAL_TO_NSEC() to convert SO_RCVTIMEO/SO_SNDTIMEO
specified values into nanoseconds. As a side effect it is now possible
to specify a timeout larger that (USHRT_MAX / 100) seconds.
To keep code simple `so_linger' now represents a number of seconds with
0 meaning no timeout or 'infinity'.
Yes, the 0 -> INFSLP API change makes conversions complicated as many
timeout holders are still memset()'d.
Inputs from cheloha@ and bluhm@, ok bluhm@
|
|
do not delete anything. So the safe variant of foreach is not
necessary.
OK mpi@ millert@ tedu@
|
|
unmount this list is traversed and the dirty vnodes are flushed to
disk. Forced unmount expects that the list is empty after flushing,
otherwise the kernel panics with "dangling vnode". As the write
to disk can sleep, new vnodes may be inserted. If softdep is
enabled, resolving the dependencies creates new dirty vnodes and
inserts them to the list. To fix the panic, let insmntque() insert
new vnodes at the tail of the list. Then vflush() will still catch
them while traversing the list in forward direction.
OK tedu@ millert@ visa@
|
|
ok bluhm@
|
|
make the structs const so that the data are put in .rodata.
OK mpi@, deraadt@, anton@, bluhm@
|
|
OK visa@
|
|
OK millert@ visa@ benno@
|
|
ok jca@
|
|
serializing both read/write operations using the existing file mutex.
The vnode lock still grants exclusive write access to the offset; the
mutex is only used to make the actual write atomic and prevent any
concurrent reader from observing intermediate values.
ok mpi@ visa@
|
|
|
|
|
|
|
|
https://marc.info/?l=openbsd-cvs&m=156277704122293&w=2
ok anton@
|
|
as part of the effort to unlock the kernel. Instead of relying on the
vnode lock, introduce a dedicated lock per file. Exclusive write access
is granted using the new foffset_enter and foffset_leave API. A
convenience function foffset_get is also available for threads that only
need to read the current offset.
The lock acquisition order in vn_write has been changed to match the one
in vn_read in order to avoid a potential deadlock. This change also gets
rid of a documented race in vn_read().
Inspired by the FreeBSD implementation.
With help and ok mpi@ visa@
|
|
does not block the signal. If all threads block the signal, we
delivered it to the main thread. This does not conform to POSIX.
If any thread unblocks the signal, it should be delivered immediately
to this thread.
Mark such signals pending at the process instead of a single thread.
Then any thread can handle it later.
OK kettenis@ guenther@
|
|
A malicious rpc.bootparamd could corrupt memory, but the kernel has
to trust the local network anyway in a diskless environment. Now
in case of an RPC error, the kernel will stop booting with a specific
panic.
OK claudio@ beck@
|
|
structure allows for better tracking of pending lock operations which is
essential in order to prevent a use-after-free once the underlying vnode is
gone.
Inspired by the lockf implementation in FreeBSD.
ok visa@
Reported-by: syzbot+d5540a236382f50f1dac@syzkaller.appspotmail.com
|
|
To protect the timehands we first need to protect the basis for all UTC
time in the kernel: the boottime.
Because the boottime can be changed at any time it needs to be versioned
along with the other members of the timehands to enable safe lockless reads
when using it for anything. So the global boottime timespec goes away and
the static boottimebin becomes a member of the timehands. Instead of reading
the global boottime you use one of two interfaces: binboottime(9) or
microboottime(9). nanoboottime(9) can trivially be added later, though there
are no consumers for it at the moment.
This introduces one small change in behavior. We used to advance the
reported boottime just before launching kernel threads from main().
This makes it look to userland like we "booted" moments before those
threads were launched. Because there is no longer a boottime global we
can no longer trivially do this from main(), so the boottime we report
to userspace via e.g. kern.boottime will now reflect whatever the time
was when we bootstrapped the timehands via inittodr(9). This is usually
no more than a minute before the kernel threads are launched from main().
The prior behavior can be restored by adding a new interface to the
timecounter layer in a future commit.
Based on FreeBSD r303387.
Discussed with mpi@ and visa@.
ok visa@
|
|
client and server.
OK beck@
|
|
client could crash the server.
OK tedu@
|
|
server could confuse the client file system code.
OK beck@
|
|
OK bluhm@
|
|
m_leadingspace() and m_trailingspace(). Convert all callers to call
directly the functions and remove the defines.
OK krw@, mpi@
|
|
put the algorithm into a new function m_calchdrlen(). Also set an
uninitialized m_len to 0 in NFS code.
OK claudio@
|
|
for sockets is non-blocking.
This allows us to G/C SS_NBIO. Having to keep the two flags in sync
in a mp-safe way is complicated.
This change introduce a behavior change in sosplice(), it can now
always block. However this should not matter much due to the socket
lock being taken beforhand.
ok bluhm@, benno@, visa@
|
|
ok beck@ deraadt@ guenther@ mpi@
|
|
OK mpi@
|
|
implementations. Rely on the VFS layer to do the checking.
OK mpi@, helg@
|
|
of mounted on directories.
OK guenther@, mpi@
|
|
unlocking the directory vnode.
OK mpi@, helg@
|
|
locking.
ok visa@, bluhm@
|
|
OK mpi@
|