path: root/sys/nfs
Age | Commit message | Author
2021-10-20revert vnode: remove VLOCKSWORK and check locking when vop_islocked != nullopSebastien Marie
(both kernel and userland bits) GENERIC + VFSLCKDEBUG is broken with it.
2021-10-19vnode: remove VLOCKSWORK and check locking when vop_islocked != nullopSebastien Marie
This flag is currently used to mark or unmark a vnode for active checking of vnode locking semantics (when compiled with VFSLCKDEBUG). Currently, the VLOCKSWORK flag isn't properly set for several FS implementations that have full locking support. This commit enables proper checking for them too (cd9660, udf, fuse, msdosfs, tmpfs). Instead of using a particular flag, it directly checks whether v_op->vop_islocked is nullop to decide whether to activate the vnode locking checks. ok mpi@
2021-10-19vnode: do not manipulate vnode lock directlySebastien Marie
use VOP_LOCK / VOP_UNLOCK wrappers. VOP_LOCK() is preferred over vn_lock() here in order to keep the code equivalent. ok mpi@ visa@ (as part of larger diff)
2021-10-02vfs: merge *_badop to vop_generic_badopSebastien Marie
It replaces spec_badop, fifo_badop, dead_badop and mfs_badop, which are only calls to panic(9), with a single function, vop_generic_badop(). No intended behaviour changes (apart from the panic message, which isn't the same). ok mpi@
2021-03-11spellingJonathan Gray
2021-01-19nfs/nfs_boot.c: convert ifunit() to if_unit(9)mvs
ok dlg@
2021-01-02nfs: don't sleep on lboltcheloha
We can simulate the current behavior without lbolt by sleeping for 1 second on the &nowake channel. ok mpi@
2020-12-25Refactor klist insertion and removalVisa Hankala
Rename klist_{insert,remove}() to klist_{insert,remove}_locked(). These functions assume that the caller has locked the klist. The current state of locking remains intact because the kernel lock is still used with all klists. Add new functions klist_insert() and klist_remove() that lock the klist internally. This allows some code simplification. OK mpi@
2020-09-27In the previous commit, check tv_nsec, not tv_sec as VNOVAL is aMatthieu Herrb
valid value of tv_sec but an invalid value for tv_nsec. Noticed by guenther@. ok beck@ deraadt@
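The reasoning in this commit can be illustrated with a small userland sketch. It assumes the "no value" sentinel is -1 (named VNOVAL_SKETCH here so it is not confused with the kernel's definition) and uses a hypothetical helper, timestamp_is_set(). A tv_sec of -1 is a legitimate time, one second before the epoch, whereas tv_nsec must lie in [0, 999999999], so only tv_nsec can carry the sentinel unambiguously.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Assumption: the "no value" sentinel is -1; the name is illustrative. */
#define VNOVAL_SKETCH	(-1)

/*
 * Hypothetical helper: report whether a timestamp carries a real value.
 * Only tv_nsec is tested, because -1 is never a valid nanosecond count
 * but is a valid number of seconds relative to the epoch.
 */
int
timestamp_is_set(const struct timespec *ts)
{
	assert(ts != NULL);
	return ts->tv_nsec != VNOVAL_SKETCH;
}
```

Testing tv_sec instead would wrongly treat the valid time "1969-12-31 23:59:59 UTC" as unset.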
2020-09-27nfs_create: after an exclusive create rpc, make sure to updateMatthieu Herrb
timestamps. This issue was discovered after rsync 3.2 changed behaviour on an NFS mounted partition. Change lifted from NetBSD (r 1.204). ok beck@, kn@, deraadt@
2020-08-24According to the code, `nfsbootdevname' is always set to the network device namemvs
we expected. Remove the `else' path from nfs_boot_init(). If `nfsbootdevname' is not set, something went wrong and this is the panic condition. Also we exclude the case where we can get an `ifp' which we don't expect. OK mpi@
2020-06-24kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)cheloha
time_second(9) and time_uptime(9) are widely used in the kernel to quickly get the system UTC or system uptime as a time_t. However, time_t is 64-bit everywhere, so it is not generally safe to use them on 32-bit platforms: you have a split-read problem if your hardware cannot perform atomic 64-bit reads. This patch replaces time_second(9) with gettime(9), a safer successor interface, throughout the kernel. Similarly, time_uptime(9) is replaced with getuptime(9). There is a performance cost on 32-bit platforms in exchange for eliminating the split-read problem: instead of two register reads you now have a lockless read loop to pull the values from the timehands. This is really not *too* bad in the grand scheme of things, but compared to what we were doing before it is several times slower. There is no performance cost on 64-bit (__LP64__) platforms. With input from visa@, dlg@, and tedu@. Several bugs squashed by visa@. ok kettenis@
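The "lockless read loop" mentioned above can be sketched in userland with C11 atomics. This is not the kernel's timehands code: struct clock_snapshot and the snapshot_*() functions are hypothetical names, a single writer is assumed, and the 64-bit value is deliberately stored as two 32-bit halves to mimic a platform without atomic 64-bit reads. A reader retries whenever the generation is 0 (update in progress) or changed while it was reading, so it never returns a split value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a generation-versioned time record. */
struct clock_snapshot {
	atomic_uint gen;		/* 0 means "update in progress" */
	atomic_uint_least32_t lo;	/* low 32 bits of the seconds value */
	atomic_uint_least32_t hi;	/* high 32 bits of the seconds value */
};

void
snapshot_init(struct clock_snapshot *cs)
{
	assert(cs != NULL);
	atomic_init(&cs->lo, 0);
	atomic_init(&cs->hi, 0);
	atomic_init(&cs->gen, 1);	/* non-zero so readers can proceed */
}

/* Single writer assumed: invalidate, update both halves, publish. */
void
snapshot_write(struct clock_snapshot *cs, uint64_t secs)
{
	unsigned int gen = atomic_load(&cs->gen);

	atomic_store(&cs->gen, 0);
	atomic_store(&cs->lo, (uint_least32_t)(secs & 0xffffffffu));
	atomic_store(&cs->hi, (uint_least32_t)(secs >> 32));
	if (++gen == 0)			/* skip 0 when the counter wraps */
		gen = 1;
	atomic_store(&cs->gen, gen);
}

/* Retry until both halves were read under one stable generation. */
uint64_t
snapshot_read(struct clock_snapshot *cs)
{
	unsigned int gen;
	uint_least32_t lo, hi;

	do {
		gen = atomic_load(&cs->gen);
		lo = atomic_load(&cs->lo);
		hi = atomic_load(&cs->hi);
	} while (gen == 0 || gen != atomic_load(&cs->gen));

	return ((uint64_t)hi << 32) | lo;
}
```

The extra loads and the possible retry are where the cost described in the commit message comes from, compared with a plain read of a global.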
2020-06-11Rename poll-compatibility flag to better reflect what it is.Martin Pieuchot
While here prefix kernel-only EV flags with two underbars. Suggested by kettenis@, ok visa@
2020-06-08Use a new EV_OLDAPI flag to match the behavior of poll(2) and select(2).Martin Pieuchot
Adapt FS kqfilters to always return true when the flag is set and bypass the polling mechanism of the NFS thread. While here implement a write filter for NFS. ok visa@
2020-04-07Abstract the head of knote lists. This allows extending the lists,Visa Hankala
for example, with locking assertions. OK mpi@, anton@
2020-02-20Replace field f_isfd with field f_flags in struct filterops to allowVisa Hankala
adding more filter properties without cluttering the struct. OK mpi@, anton@
2020-01-21sys/nfs: misc. tsleep(9) -> tsleep_nsec(9); ok mpi@cheloha
2020-01-20struct vops is not modified during runtime so use const which moves eachClaudio Jeker
into the read-only data segment. OK deraadt@ tedu@
2020-01-15Keep socket timeout intervals in nsecs and use them with tsleep_nsec(9).Martin Pieuchot
Introduce and use TIMEVAL_TO_NSEC() to convert SO_RCVTIMEO/SO_SNDTIMEO specified values into nanoseconds. As a side effect it is now possible to specify a timeout larger than (USHRT_MAX / 100) seconds. To keep code simple `so_linger' now represents a number of seconds with 0 meaning no timeout or 'infinity'. Yes, the 0 -> INFSLP API change makes conversions complicated as many timeout holders are still memset()'d. Inputs from cheloha@ and bluhm@, ok bluhm@
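A conversion of this kind might look like the following userland sketch. The function name timeval_to_nsec_sketch() and the choice to saturate at UINT64_MAX on overflow are assumptions made for illustration, not a copy of the kernel's TIMEVAL_TO_NSEC().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

#define NSEC_PER_SEC_SKETCH	1000000000ULL

/*
 * Hypothetical helper: convert a non-negative struct timeval into
 * nanoseconds, saturating instead of wrapping on overflow.
 */
uint64_t
timeval_to_nsec_sketch(const struct timeval *tv)
{
	uint64_t secs;

	assert(tv != NULL);
	assert(tv->tv_sec >= 0);
	assert(tv->tv_usec >= 0 && tv->tv_usec < 1000000);

	secs = (uint64_t)tv->tv_sec;
	/* Leave room for the sub-second part, which is below one second. */
	if (secs > (UINT64_MAX - NSEC_PER_SEC_SKETCH) / NSEC_PER_SEC_SKETCH)
		return UINT64_MAX;

	return secs * NSEC_PER_SEC_SKETCH + (uint64_t)tv->tv_usec * 1000ULL;
}
```

For example, a 700.5 second timeout, which is above the old (USHRT_MAX / 100) second ceiling, converts to 700500000000 nanoseconds.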
2020-01-14In nfs_clearcommit() the loops over mnt_vnodelist and v_dirtyblkhdAlexander Bluhm
do not delete anything. So the safe variant of foreach is not necessary. OK mpi@ millert@ tedu@
2020-01-10Convert the vnode list at the mount point into a tailq. DuringAlexander Bluhm
unmount this list is traversed and the dirty vnodes are flushed to disk. Forced unmount expects that the list is empty after flushing, otherwise the kernel panics with "dangling vnode". As the write to disk can sleep, new vnodes may be inserted. If softdep is enabled, resolving the dependencies creates new dirty vnodes and inserts them into the list. To fix the panic, let insmntque() insert new vnodes at the tail of the list. Then vflush() will still catch them while traversing the list in forward direction. OK tedu@ millert@ visa@
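The effect of inserting at the tail can be shown with the TAILQ macros from <sys/queue.h> in userland. The struct node type and flush_all() below are hypothetical; they only mimic a flush pass during which a new dirty entry appears. Because the new entry is appended at the tail, the same forward traversal still reaches it.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical list element standing in for a vnode. */
struct node {
	int id;
	int dirty;
	TAILQ_ENTRY(node) entries;
};
TAILQ_HEAD(nodelist, node);

/*
 * Walk the list forward and "flush" every node. While flushing the
 * node whose id is trigger_id, a new dirty node (extra) shows up and
 * is appended at the tail. Returns the number of nodes flushed.
 */
int
flush_all(struct nodelist *head, int trigger_id, struct node *extra)
{
	struct node *n;
	int flushed = 0;

	assert(head != NULL);
	TAILQ_FOREACH(n, head, entries) {
		if (extra != NULL && n->id == trigger_id) {
			TAILQ_INSERT_TAIL(head, extra, entries);
			extra = NULL;	/* insert only once */
		}
		n->dirty = 0;
		flushed++;
	}
	return flushed;
}
```

Had the new node been inserted at the head instead, the traversal would already be past that position and the node would stay dirty after the pass.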
2020-01-08Convert infinite sleeps to tsleep_nsec(9).Martin Pieuchot
ok bluhm@
2019-12-31Use C99 designated initializers with struct filterops. In addition,Visa Hankala
make the structs const so that the data are put in .rodata. OK mpi@, deraadt@, anton@, bluhm@
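As an illustration of the style this commit adopts, here is a sketch using a made-up operations table. struct filter_ops_sketch, its members and the demo_*() functions are hypothetical and do not reproduce the kernel's struct filterops; the point is only the combination of C99 designated initializers with a const object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operations table; member names are illustrative only. */
struct filter_ops_sketch {
	int	flags;
	int	(*attach)(void *);
	void	(*detach)(void *);
	int	(*event)(void *, long);
};

static void
demo_detach(void *arg)
{
	(void)arg;
}

static int
demo_event(void *arg, long hint)
{
	(void)arg;
	return hint != 0;	/* active whenever a non-zero hint arrives */
}

/*
 * Designated initializers name each member, so the order of members in
 * the struct no longer matters and omitted members are zeroed. Being
 * const, the object can be placed in read-only data.
 */
const struct filter_ops_sketch demo_filtops = {
	.flags	= 1,
	.attach	= NULL,
	.detach	= demo_detach,
	.event	= demo_event,
};

/* Call the event hook through the read-only table. */
int
filter_dispatch(const struct filter_ops_sketch *ops, long hint)
{
	assert(ops != NULL && ops->event != NULL);
	return ops->event(NULL, hint);
}
```

With positional initializers, adding or reordering a member silently shifts every value; with designated initializers the table above stays correct.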
2019-12-26Convert struct vfsops initializer to C99 style.Alexander Bluhm
OK visa@
2019-12-25Use FOREACH macro to iterate over mnt_vnodelist.Alexander Bluhm
OK millert@ visa@ benno@
2019-12-05Convert infinite sleeps to tsleep_nsec(9).Martin Pieuchot
ok jca@
2019-08-05Allow concurrent reads of the f_offset field of struct file byanton
serializing both read/write operations using the existing file mutex. The vnode lock still grants exclusive write access to the offset; the mutex is only used to make the actual write atomic and prevent any concurrent reader from observing intermediate values. ok mpi@ visa@
2019-07-25vinvalbuf(9): tsleep(9) -> tsleep_nsec(9); ok millert@cheloha
2019-07-19vwaitforio(9): tsleep(9) -> tsleep_nsec(9); ok visa@cheloha
2019-07-19getblk(9): tsleep(9) -> tsleep_nsec(9); ok visa@cheloha
2019-07-12Revert anton@ changes about read/write unlockingsolene
https://marc.info/?l=openbsd-cvs&m=156277704122293&w=2 ok anton@
2019-07-10Make read/write of the f_offset field belonging to struct file MP-safe;anton
as part of the effort to unlock the kernel. Instead of relying on the vnode lock, introduce a dedicated lock per file. Exclusive write access is granted using the new foffset_enter and foffset_leave API. A convenience function foffset_get is also available for threads that only need to read the current offset. The lock acquisition order in vn_write has been changed to match the one in vn_read in order to avoid a potential deadlock. This change also gets rid of a documented race in vn_read(). Inspired by the FreeBSD implementation. With help and ok mpi@ visa@
2019-05-13When killing a process, the signal is handled by any thread thatAlexander Bluhm
does not block the signal. If all threads block the signal, we delivered it to the main thread. This does not conform to POSIX. If any thread unblocks the signal, it should be delivered immediately to this thread. Mark such signals pending at the process instead of a single thread. Then any thread can handle it later. OK kettenis@ guenther@
2019-01-22The kernel interpreted bogus lengths in RPC calls during NFS boot.Alexander Bluhm
A malicious rpc.bootparamd could corrupt memory, but the kernel has to trust the local network anyway in a diskless environment. Now in case of an RPC error, the kernel will stop booting with a specific panic. OK claudio@ beck@
2019-01-21Introduce a dedicated entry point data structure for file locks. This new dataanton
structure allows for better tracking of pending lock operations which is essential in order to prevent a use-after-free once the underlying vnode is gone. Inspired by the lockf implementation in FreeBSD. ok visa@ Reported-by: syzbot+d5540a236382f50f1dac@syzkaller.appspotmail.com
2019-01-19Move boottime into the timehands.cheloha
To protect the timehands we first need to protect the basis for all UTC time in the kernel: the boottime. Because the boottime can be changed at any time it needs to be versioned along with the other members of the timehands to enable safe lockless reads when using it for anything. So the global boottime timespec goes away and the static boottimebin becomes a member of the timehands. Instead of reading the global boottime you use one of two interfaces: binboottime(9) or microboottime(9). nanoboottime(9) can trivially be added later, though there are no consumers for it at the moment. This introduces one small change in behavior. We used to advance the reported boottime just before launching kernel threads from main(). This makes it look to userland like we "booted" moments before those threads were launched. Because there is no longer a boottime global we can no longer trivially do this from main(), so the boottime we report to userspace via e.g. kern.boottime will now reflect whatever the time was when we bootstrapped the timehands via inittodr(9). This is usually no more than a minute before the kernel threads are launched from main(). The prior behavior can be restored by adding a new interface to the timecounter layer in a future commit. Based on FreeBSD r303387. Discussed with mpi@ and visa@. ok visa@
2019-01-18Check for negative length in NFS strings. This affects both theAlexander Bluhm
client and server. OK beck@
2019-01-18Check for negative length integers in NFS server. A maliciousAlexander Bluhm
client could crash the server. OK tedu@
2019-01-18Check for negative length integers in NFS client. A maliciousAlexander Bluhm
server could confuse the client file system code. OK beck@
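The three checks above share one idea: a length taken from the wire must be validated as a signed quantity before it is used. A minimal userland sketch follows, assuming a 4-byte big-endian length prefix; decode_len_sketch() and its parameters are hypothetical and are not the NFS code's actual macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical decoder: read a 4-byte big-endian length from buf and
 * validate it. Returns 0 and stores the length in *out on success,
 * or -1 if the length is negative, above maxlen, or larger than the
 * data that actually follows the prefix.
 */
int
decode_len_sketch(const unsigned char *buf, size_t avail, int32_t maxlen,
    int32_t *out)
{
	uint32_t raw;
	int32_t len;

	assert(buf != NULL && out != NULL);
	if (avail < 4)
		return -1;

	raw = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
	    ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
	/* Reinterpret the bits as signed, as a signed decode would. */
	memcpy(&len, &raw, sizeof(len));

	if (len < 0 || len > maxlen)
		return -1;		/* reject negative or oversized */
	if ((size_t)len > avail - 4)
		return -1;		/* claims more data than present */

	*out = len;
	return 0;
}
```

Without the `len < 0` test, a value such as 0xffffffff would pass an upper-bound-only comparison on a signed type and later be used as a size.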
2018-11-30Switch MH_ALIGN to m_align which is the same.Claudio Jeker
OK bluhm@
2018-11-09M_LEADINGSPACE() and M_TRAILINGSPACE() are just wrappers forClaudio Jeker
m_leadingspace() and m_trailingspace(). Convert all callers to call the functions directly and remove the defines. OK krw@, mpi@
2018-09-10Instead of calculating the mbuf packet header length here and there,Alexander Bluhm
put the algorithm into a new function m_calchdrlen(). Also set an uninitialized m_len to 0 in NFS code. OK claudio@
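A sketch of the kind of helper described here, under the assumption that the packet header length is the sum of the data lengths of every buffer in the chain. struct buf_sketch and calc_hdr_len_sketch() are hypothetical stand-ins, not the kernel's struct mbuf or m_calchdrlen().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified buffer chain element. */
struct buf_sketch {
	struct buf_sketch *next;	/* next buffer in the chain */
	int len;			/* bytes of data in this buffer */
	int pkthdr_len;			/* total length, valid in the first buffer */
};

/*
 * Walk the chain, sum the per-buffer lengths, and record the total in
 * the first buffer's packet header field.
 */
void
calc_hdr_len_sketch(struct buf_sketch *m)
{
	struct buf_sketch *n;
	int total = 0;

	assert(m != NULL);
	for (n = m; n != NULL; n = n->next)
		total += n->len;
	m->pkthdr_len = total;
}
```

A helper like this also shows why an uninitialized per-buffer length matters: any garbage value would flow straight into the total.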
2018-07-30Use FNONBLOCK instead of SS_NBIO to check/indicate that the I/O modeMartin Pieuchot
for sockets is non-blocking. This allows us to G/C SS_NBIO. Having to keep the two flags in sync in an mp-safe way is complicated. This change introduces a behavior change in sosplice(): it can now always block. However this should not matter much due to the socket lock being taken beforehand. ok bluhm@, benno@, visa@
2018-07-09Nuke unused define 'nfsm_writereply()'.Kenneth R Westerback
ok beck@ deraadt@ guenther@ mpi@
2018-07-02Use more list macros for v_dirtyblkhd.Alexander Bluhm
OK mpi@
2018-06-21Drop redundant "node == parent node" checks from VOP_RMDIR()Visa Hankala
implementations. Rely on the VFS layer to do the checking. OK mpi@, helg@
2018-06-13Make the VFS layer responsible for preventing the deletionVisa Hankala
of mounted-on directories. OK guenther@, mpi@
2018-06-07Make callers of VOP_CREATE(9) and VOP_MKNOD(9) responsible forVisa Hankala
unlocking the directory vnode. OK mpi@, helg@
2018-06-06Pass the socket to sounlock(); this prepares the ground for per-socketMartin Pieuchot
locking. ok visa@, bluhm@
2018-05-27Drop unnecessary `p' parameter from vget(9).Visa Hankala
OK mpi@