path: root/sys/kern/vfs_subr.c
Age | Commit message | Author
2021-10-25 | Revert commitid: ufM9BcSbXqfLpzBH; | Claudio Jeker
Move vfs_stall_barrier() from the fd layer into vn_lock() and the vfs layer. In some cases it can result in a deadlock while suspending. Discussed with mpi@ and deraadt@
2021-10-24 | use NULL not 0 for pointer values in kern | Jonathan Gray
ok semarie@
2021-10-23 | Sprinkle uvm_obj_destroy() over UVM object recycling code. | Martin Pieuchot
For now, only assert that the tree of pages is empty in uvm_obj_destroy(). This will soon be used to free the per-UVM object lock. While here call uvm_obj_init() when new vnodes are allocated instead of in uvn_attach(). Because vnodes and their associated UVM objects are currently never freed, it isn't easy to know where/when to garbage collect the associated lock. So simply check that the reference count of a given object is 0 in uvn_attach(). Tested by many as part of a bigger diff. ok kettenis@
2021-10-23 | Assert that the KERNEL_LOCK() is held in vref(9). | Martin Pieuchot
This is a guard against pushing the lock too far in UVM's vnode land. ok beck@
2021-10-21 | Move vfs_stall_barrier() from the fd layer into vn_lock() and the vfs layer. | Claudio Jeker
vfs stalling is used by suspend/resume and by vmt(4) to stall any filesystem operation from altering the state on disk. All these operations will call vn_lock() and be stalled. Adjust vfs_stall_barrier() to allow the lock owner to still progress so that suspend can sync the filesystems after stalling vfs operations. OK mpi@
2021-10-20 | revert vnode: remove VLOCKSWORK and check locking when vop_islocked != nullop | Sebastien Marie
(both kernel and userland bits) GENERIC + VFSLCKDEBUG is broken with it.
2021-10-19 | vnode: remove VLOCKSWORK and check locking when vop_islocked != nullop | Sebastien Marie
This flag is currently used to mark or unmark a vnode to actively check vnode locking semantics (when compiled with VFSLCKDEBUG). Currently, the VLOCKSWORK flag isn't properly set for several FS implementations which have full locking support. This commit enables proper checking for them too (cd9660, udf, fuse, msdosfs, tmpfs). Instead of using a particular flag, it directly checks whether v_op->vop_islocked is nullop to decide whether to activate the vnode locking checks. ok mpi@
2021-08-31 | Swap lock flags so that LK_EXCLUSIVE is first like in all other places. | Claudio Jeker
2021-04-28 | Introduce a global vnode_mtx and use it to make vn_lock() safe to be called | Claudio Jeker
without the KERNEL_LOCK. This moves VXLOCK and VXWANT to a mutex protected v_lflag field and also v_lockcount is protected by this mutex. The vn_lock() dance is overly complex and all of this should probably be replaced by a proper lock on the vnode but such a diff is a lot more complex. This is an intermediate step so that at least some calls can be modified to grab the KERNEL_LOCK later or not at all. OK mpi@
2021-01-29 | Use NULL instead of 0 to clear v_socket pointer (which actually clears all | Claudio Jeker
of the v_un pointers). OK jsg@ mvs@
2020-08-23 | Remove unused debug_syncprt, improve debug sysctl handling | kn
"syncprt" is unused since kern/vfs_syscalls.c r1.147 from 2008. Adding new debug sysctls is a bit opaque and looking at kern/kern_sysctl.c the only visible difference between used and stub ctldebug structs in the debugvars[] array is their extern keyword, indicating that it is defined elsewhere. sys/sysctl.h declares all debugN members as extern upfront, but these declarations are not needed. Remove the unused debug sysctl, rename the only remaining one to something meaningful and remove forward declarations from /sys/sysctl.h; this way, adding new debug sysctls is a matter of adding extern and coming up with a name, which is nicer to read on its own and better to grep for. OK mpi
2020-08-22 | Move sysctl(2) CTL_DEBUG from DEBUG to new DEBUG_SYSCTL | kn
Adding "debug.my-knob" sysctls is really helpful to select different code paths and/or log on demand during runtime without recompile, but as this code is under DEBUG, lots of other noise comes with it which is often undesired, at least when looking at specific subsystems only. Adding globals to the kernel and breaking into DDB to change them helps, but that does not work over SSH, hence the need for debug sysctls. Introduces DEBUG_SYSCTL to make use of the "debug" MIB without the rest of DEBUG; it's DEBUG_SYSCTL and not SYSCTL_DEBUG because it's not a general option for all of sysctl(2). OK gnezdo
2020-03-27 | Relax the lockcount assertion in vputonfreelist(). Back when I fixed | anton
several problems with the vnode exclusive lock implementation, I overlooked the fact that a vnode can be in a state where the usecount is zero while the holdcount is still positive. There could still be threads waiting on the vnode lock in uvn_io() as long as the holdcount is positive. "go ahead" mpi@ Reported-by: syzbot+767d6deb1a647850a0ca@syzkaller.appspotmail.com
2020-02-13 | Move the LK_DRAIN logic from VOP_LOCK() to vclean() the only caller of | Claudio Jeker
VOP_LOCK with LK_DRAIN. This simplifies VOP_LOCK() a fair bit. OK visa@
2020-01-20 | struct vops is not modified during runtime so use const which moves each | Claudio Jeker
into the read-only data segment. OK deraadt@ tedu@
2020-01-10 | Convert the vnode list at the mount point into a tailq. During | Alexander Bluhm
unmount this list is traversed and the dirty vnodes are flushed to disk. Forced unmount expects that the list is empty after flushing, otherwise the kernel panics with "dangling vnode". As the write to disk can sleep, new vnodes may be inserted. If softdep is enabled, resolving the dependencies creates new dirty vnodes and inserts them to the list. To fix the panic, let insmntque() insert new vnodes at the tail of the list. Then vflush() will still catch them while traversing the list in forward direction. OK tedu@ millert@ visa@
2019-12-30 | In vcount() a safe loop over vnodes was committed to 4.4BSD in 1994. | Alexander Bluhm
This is not necessary as the loop is restarted after vgone(). Switch to SLIST_FOREACH without _SAFE. OK visa@
2019-12-27 | Convert the speclisth hash buckets into SLIST macros. This makes | Alexander Bluhm
the vnode alias code more readable. OK visa@
2019-12-26 | Fix white spaces. | Alexander Bluhm
2019-12-08 | Convert infinite sleeps to tsleep_nsec(9). | Martin Pieuchot
ok visa@, jca@
2019-08-26 | When a thread tries to exclusively lock a vnode, the same thread must | anton
ensure that any other thread currently trying to acquire the underlying vnode lock has observed that the same vnode is about to be exclusively locked. Such threads must then sleep until the exclusive lock has been released and then try to acquire the lock again. Otherwise, exclusive access to the vnode cannot be guaranteed. Thanks to naddy@ and visa@ for testing; ok visa@ Reported-by: syzbot+374d0e7e2400004957f7@syzkaller.appspotmail.com
2019-07-25 | vinvalbuf(9): tsleep -> tsleep_nsec(9); ok millert@ | cheloha
2019-07-19 | vwaitforio(9): tsleep(9) -> tsleep_nsec(9); ok visa@ | cheloha
2019-06-28 | Skip VFS barrier lock during normal operation to reduce overhead. | Visa Hankala
This removes a system-wide serialization point, which might help finding timing-related bugs. OK deraadt@ anton@
2019-06-09 | Add a temporary workaround to make removal of giant files better | Bob Beck
mlarkin@ noticed we would freeze while removing enormous files because of the amount of work done to invalidate buffers on unlink. This adds a temporary workaround to ensure we give up the lock and yield while doing this. The longer term answer will be to move these buffers to another list and not do the work here. ok deraadt@
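The general shape of such a workaround, pausing at intervals during a long loop, might look like the userspace sketch below. Everything here is an assumption for illustration: `invalidate_all()` and its `interval` parameter are hypothetical, and `sched_yield()` stands in for the kernel giving up its lock and yielding.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/*
 * Process nbufs items, pausing after every `interval` items so other
 * threads can run. In a kernel, the pause would also release and
 * retake the relevant lock. Returns how many times the loop yielded.
 */
static size_t
invalidate_all(size_t nbufs, size_t interval)
{
	size_t i, yields = 0;

	assert(interval > 0);
	for (i = 0; i < nbufs; i++) {
		/* ... invalidate buffer i here ... */
		if ((i + 1) % interval == 0 && i + 1 < nbufs) {
			sched_yield();
			yields++;
		}
	}
	return yields;
}
```

The interval trades latency for throughput: a smaller value keeps the system responsive at the cost of more context switches.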
2019-04-19 | Add a subsystem lock for vfs_lockf.c. This enables calling lf_advlock() | Visa Hankala
and lf_purgelocks() without the kernel lock. OK anton@ mpi@
2019-04-02 | Restrict which filesystems are available for swap. This rules out | Visa Hankala
obvious misconfigurations that cannot work. OK mpi@ tedu@
2019-02-17 | if a write fails, we mark the buffer invalid and throw it away. this can | Ted Unangst
lead to lost errors, where a later fsync will return success. to fix this, set a flag on the vnode indicating a past error has occurred, and return an error for future fsync calls. ok bluhm deraadt visa
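The idea of remembering a past write failure can be modelled in a few lines. This is a simplified sketch, not the kernel change itself: `struct file_state`, `VF_WRITE_ERROR`, `note_write_failure()` and `sync_file()` are hypothetical names, and the choice of `EIO` as the reported error is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define VF_WRITE_ERROR 0x01	/* hypothetical flag name */

/* Hypothetical per-file state, standing in for a vnode. */
struct file_state {
	int flags;
};

/* Called when a write fails and its buffer is thrown away. */
static void
note_write_failure(struct file_state *fs)
{
	assert(fs != NULL);
	fs->flags |= VF_WRITE_ERROR;
}

/* fsync-like call: reports an error if an earlier write was lost. */
static int
sync_file(struct file_state *fs)
{
	assert(fs != NULL);
	if (fs->flags & VF_WRITE_ERROR)
		return EIO;
	/* ... write out the remaining dirty buffers here ... */
	return 0;
}
```

Without the flag, the second call would find no dirty buffers left and report success even though data never reached the disk.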
2019-01-21 | Introduce a dedicated entry point data structure for file locks. This new data | anton
structure allows for better tracking of pending lock operations which is essential in order to prevent a use-after-free once the underlying vnode is gone. Inspired by the lockf implementation in FreeBSD. ok visa@ Reported-by: syzbot+d5540a236382f50f1dac@syzkaller.appspotmail.com
2018-12-23 | Rectify some issues with the noperm mount flag; the root vnode was not | Martin Natano
protected properly and files without any x bit set were accidentally considered executable when checked with access(2). Issues found and reported by deraadt, halex, reyk, tb. ok deraadt
2018-12-07 | free(9) sizes for netcred. | Martin Pieuchot
ok visa@
2018-09-29 | Use atomic operations to update vfc_refcount. Change the field's type | Visa Hankala
to unsigned int. OK deraadt@
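A userspace analogue of an atomically updated unsigned reference count is sketched below using C11 `<stdatomic.h>`. The kernel uses its own atomic primitives, so this is only a stand-in; `struct fsconf`, `fsconf_ref()` and `fsconf_unref()` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a filesystem configuration entry. */
struct fsconf {
	atomic_uint refcount;	/* number of users of this entry */
};

/* Take a reference without holding any lock. */
static void
fsconf_ref(struct fsconf *c)
{
	atomic_fetch_add(&c->refcount, 1);
}

/* Drop a reference and return the count that remains. */
static unsigned int
fsconf_unref(struct fsconf *c)
{
	unsigned int old = atomic_fetch_sub(&c->refcount, 1);

	assert(old > 0);	/* catch underflow of the unsigned counter */
	return old - 1;
}
```

An unsigned counter cannot go negative, so an unbalanced release shows up as a wrap to a huge value; the assertion in `fsconf_unref()` catches that case early.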
2018-09-26 | Move the allocating and freeing of mount points into | Visa Hankala
dedicated functions. OK deraadt@ mpi@
2018-09-22 | Harmonize spacing after ellipses in displayed messages. | Frederic Cambus
We were using spacing after ellipses in an inconsistent way in the installer. Standardize on using "... " everywhere and take into account the cursor position while we are waiting for the task to complete: the cursor is now always positioned after the last dot, and the space is added when displaying completion confirmation. While there, also take cursor position into account in vfs_shutdown(), and remove the extra leading space before ticks in dhclient. OK deraadt@
2018-09-17 | Simplify VFS initialization. | Visa Hankala
Because loadable kernel modules are no longer supported, there is no need to register or unregister filesystem implementations at runtime. Remove vfs_register() and vfs_unregister(), and make vfsinit() call vfs_init routines directly. Replace the linked list of vfsconf structs with the vfsconflist[] array. OK mpi@ bluhm@
2018-09-16 | Move vfsconf lookup code into dedicated functions. | Visa Hankala
OK bluhm@
2018-07-13 | Unveiling unveil(2). | Bob Beck
This brings unveil into the tree, disabled by default. Currently this will return EPERM on all attempts to use it until we are fully certain it is ready for people to start using, but this now allows for others to do more tweaking and experimentation. Still needs to send the unveils across forks and execs before fully enabling. Many thanks to robert@ and deraadt@ for extensive testing. ok deraadt@
2018-07-02 | Use more list macros for v_dirtyblkhd. | Alexander Bluhm
OK mpi@
2018-06-06 | The function dounmount() traverses the mnt_list in forward direction | Alexander Bluhm
to call vfs_busy() for all nested mount points. vfs_stall() called vfs_busy() in reverse order for all mount points. Change the direction of the latter to resolve the lock order conflict. OK visa@
2018-06-04 | Add VB_DUPOK to suppress witness(4) warning of concurrent mount locks. | Philip Guenther
Use that in three places:
- vfs_stall()
- sys_mount()
- dounmount()'s MNT_FORCE-does-recursive-unmounts case
ok deraadt@ visa@
2018-05-27 | Drop unnecessary `p' parameter from vget(9). | Visa Hankala
OK mpi@
2018-05-08 | When looping over mount points, the FOREACH_SAFE macro is not safe. | Alexander Bluhm
The loop variable mp is protected by vfs_busy() so that it cannot be unmounted. But the next mount point nmp could be unmounted while VFS_SYNC() sleeps. As the loop in vfs_stall() does not destroy the mount point, TAILQ_FOREACH_REVERSE without _SAFE is the correct macro to use. OK deraadt@ visa@
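The hazard described here, a cached successor pointer going stale while the loop body sleeps, can be reproduced in a small single-threaded model. The sketch below is an assumption-laden illustration: `struct mnt`, `sync_and_sleep()` and both walk functions are hypothetical, the walk runs forward rather than in reverse for brevity, and removed entries are only marked instead of freed so that they can be inspected safely.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a mount point. */
struct mnt {
	int gone;		/* set once it has been "unmounted" */
	TAILQ_ENTRY(mnt) link;
};
TAILQ_HEAD(mntlist, mnt);

/* Put n fresh mount points on an empty list. */
static void
mntlist_fill(struct mntlist *head, struct mnt *m, size_t n)
{
	size_t i;

	assert(m != NULL);
	TAILQ_INIT(head);
	for (i = 0; i < n; i++) {
		m[i].gone = 0;
		TAILQ_INSERT_TAIL(head, &m[i], link);
	}
}

/* Models a sleep during which the successor of mp gets unmounted. */
static void
sync_and_sleep(struct mntlist *head, struct mnt *mp)
{
	struct mnt *next = TAILQ_NEXT(mp, link);

	if (next != NULL) {
		TAILQ_REMOVE(head, next, link);
		next->gone = 1;
	}
}

/* _SAFE-style walk: the successor is fetched before the body runs. */
static int
walk_cached_next(struct mntlist *head)
{
	struct mnt *mp, *nmp;
	int stale = 0;

	for (mp = TAILQ_FIRST(head); mp != NULL; mp = nmp) {
		nmp = TAILQ_NEXT(mp, link);
		if (mp->gone) {
			stale++;	/* would be a use-after-free */
			continue;
		}
		sync_and_sleep(head, mp);
	}
	return stale;
}

/* Plain walk: the successor is read only after the body returns. */
static int
walk_reread_next(struct mntlist *head)
{
	struct mnt *mp;
	int stale = 0;

	TAILQ_FOREACH(mp, head, link) {
		if (mp->gone) {
			stale++;
			continue;
		}
		sync_and_sleep(head, mp);
	}
	return stale;
}
```

The plain walk is only correct because the body never removes the current element, which matches the reasoning in the commit message.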
2018-05-08 | Move the vfs stall "barrier" logic to a function. FREF() will soon | Martin Pieuchot
change and this has nothing to do with it. ok visa@, bluhm@
2018-05-07 | Print the vp pointer in the vinvalbuf() panic strings. | Alexander Bluhm
OK mpi@
2018-05-02 | Remove proc from the parameters of vn_lock(). The parameter is | Visa Hankala
unnecessary because curproc always does the locking. OK mpi@
2018-04-28 | Clean up the parameters of VOP_LOCK() and VOP_UNLOCK(). It is always | Visa Hankala
curproc that does the locking or unlocking, so the proc parameter is pointless and can be dropped. OK mpi@, deraadt@
2018-03-07 | Remounting file systems read-only does not work reliably. There | Alexander Bluhm
are corner cases where ffs may leak blocks. So better revert and unmount all file systems at reboot. The "init died" panic will be fixed in a different way. OK deraadt@
2018-02-10 | Synchronize filesystems to disk when suspending. Each mountpoint's vnodes | Theo de Raadt
are pushed to disk. Dangling vnodes (unlinked files still in use) and vnodes undergoing change by long-running syscalls are identified -- and such filesystems are marked dirty on-disk while we are suspended (in case power is lost, a fsck will be required). Filesystems without dangling or busy vnodes are marked clean, resulting in faster boots following "battery died" circumstances. Tested by numerous developers, thanks for the feedback.
2017-12-14 | Don't bother using DETACH_FORCE for the softraid luns at reboot | Theo de Raadt
time; the aggressive mountpoint destruction seems to hit insane use-after-frees when we are already far on the way down.
2017-12-14 | Give vflush_vnode() a hint about vnodes we don't need to account as "busy". | Theo de Raadt
Change mountpoint to RDONLY a little later. Seems to improve the rw->ro transition a bit.