path: root/sys/kern/kern_descrip.c
2022-12-05  Theo de Raadt
zap a pile of dangling tabs
2022-08-14  Jonathan Gray
remove unneeded includes in sys/kern
ok mpi@ miod@
2022-01-20  Alexander Bluhm
Shifting signed integers left by 31 is undefined behavior in C.
found by kubsan; joint work with tobhe@; OK miod@
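As a hedged illustration of the issue this entry names (not the actual diff): assuming a 32-bit `int`, `1 << 31` shifts a signed value into the sign bit, which C leaves undefined, while an unsigned left operand makes the same shift well defined. The helper name `bit_mask` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from kern_descrip.c: return a mask
 * with only bit n set, for n in 0..31.
 */
static uint32_t
bit_mask(unsigned int n)
{
	assert(n < 32);
	/* The left operand is unsigned, so reaching bit 31 is well defined. */
	return (uint32_t)1 << n;
}
```

Writing the constant as unsigned is the usual fix when the result is only ever used as a bit pattern.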
2021-10-25  Claudio Jeker
Revert commitid: ufM9BcSbXqfLpzBH;
Move vfs_stall_barrier() from the fd layer into vn_lock() and the vfs layer. In some cases it can result in a deadlock while suspending. Discussed with mpi@ and deraadt@
2021-10-21  Claudio Jeker
Move vfs_stall_barrier() from the fd layer into vn_lock() and the vfs layer.
vfs stalling is used by suspend/resume and by vmt(4) to stall any filesystem operation from altering the state on disk. All these operations will call vn_lock and be stalled. Adjust vfs_stall_barrier() to allow the lock owner to still progress so that suspend can sync the filesystems after stalling vfs operations. OK mpi@
2020-06-11  Visa Hankala
Move FRELE() outside fdplock in dup*(2) code. This avoids a potential lock order issue with the file close path. The FRELE() can trigger the close path during dup*(2) if another thread manages to close the file descriptor simultaneously. This race is possible because the file reference is taken before the file descriptor table is locked for write access.
Vitaliy Makkoveev agrees; OK anton@ mpi@
2020-03-13  anton
In order to unlock flock(2), make writes to the f_iflags field of struct file atomic. This also gets rid of the last kernel lock protected field in the scope of struct file. ok mpi@ visa@
2020-02-26  Visa Hankala
Release the file descriptor table lock before calling closef() in finishdup(). This makes the order of operations similar to that of fdrelease() and removes a case where lock ordering might cause problems. OK anton@, mpi@
2020-02-18  Visa Hankala
Move setting of UF_EXCLOSE file descriptor flag inside finishdup().
This makes it easier to release fdplock before calling closef(). OK mpi@, anton@
2020-02-05  Visa Hankala
Move kernel locking inside knote_fdclose() from finishdup() and fdrelease(). This makes the upper layer of file descriptor closing free of KERNEL_LOCK() when the process does not use kqueue. The kernel locking around fdremove() and knote_fdclose() is no longer needed because kqueue_register() checks if there has been a race with file descriptor close. Moreover, the locking became ineffective against these races when filterops callbacks were allowed to sleep. OK anton@, mpi@
2020-02-01  anton
Make writes to the f_flag field of `struct file' MP-safe using atomic operations. Since the type of f_flag must change in order to use the atomic(9) API, reorder the struct in order to avoid padding; as pointed out by tedu@. ok mpi@ visa@
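A minimal user-space sketch of the idea, assuming C11 `<stdatomic.h>` in place of the kernel's atomic(9) API: bits in a flag word are set and cleared with single atomic read-modify-write operations, so concurrent writers cannot lose each other's updates. `struct xfile`, the `XF_*` values, and the helper names are hypothetical and do not mirror the real `struct file` layout.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical flag bits, chosen only for this sketch. */
#define XF_NONBLOCK	0x0004u
#define XF_APPEND	0x0008u

/* Hypothetical stand-in for the flag word of a file object. */
struct xfile {
	atomic_uint f_flag;
};

/* Atomically set the given bits; other bits are left untouched. */
static void
xfile_setflags(struct xfile *fp, unsigned int bits)
{
	assert(fp != NULL);
	atomic_fetch_or(&fp->f_flag, bits);
}

/* Atomically clear the given bits; other bits are left untouched. */
static void
xfile_clearflags(struct xfile *fp, unsigned int bits)
{
	assert(fp != NULL);
	atomic_fetch_and(&fp->f_flag, ~bits);
}
```

A plain `fp->f_flag |= bits` on a non-atomic field would be a separate load and store, which is where an update from another CPU could be overwritten.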
2020-01-08  Visa Hankala
Skip fdplock when freeing a file descriptor table. The lock is not necessary because other threads cannot access the data structure. This fixes the following lock order issue:
witness: lock order reversal:
 1st 0xfffffd81d821d248 fdlock (&newfdp->fd_fd.fd_lock)
 2nd 0xffff800000fe45b8 primlk (&prime_fpriv->lock)
lock order "&prime_fpriv->lock"(rwlock) -> "&newfdp->fd_fd.fd_lock"(rwlock) first seen at:
#0 witness_checkorder+0x449
#1 rw_enter_write+0x43
#2 dma_buf_fd+0x8c
#3 drm_gem_prime_handle_to_fd+0xed
#4 drmioctl+0xdc
#5 VOP_IOCTL+0x55
#6 vn_ioctl+0x64
#7 sys_ioctl+0x2f6
#8 syscall+0x389
#9 Xsyscall+0x128
lock order "&newfdp->fd_fd.fd_lock"(rwlock) -> "&prime_fpriv->lock"(rwlock) first seen at:
#0 witness_checkorder+0x449
#1 rw_enter_write+0x43
#2 drm_gem_object_release_handle+0x5e
#3 idr_for_each+0xee
#4 drm_gem_release+0x1f
#5 drmclose+0x144
#6 spec_close+0x213
#7 VOP_CLOSE+0x49
#8 vn_closefile+0x9b
#9 fdrop+0x8b
#10 closef+0xaf
#11 fdfree+0xd4
#12 exit1+0x1cf
#13 sys_exit+0x16
#14 syscall+0x389
#15 Xsyscall+0x128
OK mpi@
2020-01-08  Visa Hankala
Unify handling of ioctls FIOSETOWN/SIOCSPGRP/TIOCSPGRP and FIOGETOWN/SIOCGPGRP/TIOCGPGRP. Do this by determining the meaning of the ID parameter inside the sigio code. Also add cases for FIOSETOWN and FIOGETOWN where there have been TIOCSPGRP and TIOCGPGRP before. These changes allow removing the ID translation from sys_fcntl() and sys_ioctl().
Idea from NetBSD
OK mpi@, claudio@
2020-01-06  Visa Hankala
Make kqlist part of filedesc and serialize access to it using fdplock.
This choice of locking is guided by knote_fdclose(). OK mpi@, anton@
2020-01-03  Visa Hankala
Fix a file descriptor close race in kqueue_register()
After inserting a knote, check that the associated file descriptor still references the same file. Remove the knote if the descriptor has changed because otherwise the kqueue becomes inconsistent with the file descriptor table. There is an analogous race in fcntl(F_SETLK). It is already handled, but the code can be simplified by using the same check as in kqueue_register().
Fix inspired by DragonFly BSD
OK mpi@, anton@
2019-08-05  anton
Allow concurrent reads of the f_offset field of struct file by serializing both read/write operations using the existing file mutex. The vnode lock still grants exclusive write access to the offset; the mutex is only used to make the actual write atomic and prevent any concurrent reader from observing intermediate values. ok mpi@ visa@
2019-07-15  Visa Hankala
Do not relock fdp in fdrelease(). This prevents unnecessary locking in the common case. OK mpi@
2019-07-12  solene
Revert anton@ changes about read/write unlocking
https://marc.info/?l=openbsd-cvs&m=156277704122293&w=2
ok anton@
2019-07-10  anton
Make read/write of the f_offset field belonging to struct file MP-safe; as part of the effort to unlock the kernel. Instead of relying on the vnode lock, introduce a dedicated lock per file. Exclusive write access is granted using the new foffset_enter and foffset_leave API. A convenience function foffset_get is also available for threads that only need to read the current offset. The lock acquisition order in vn_write has been changed to match the one in vn_read in order to avoid a potential deadlock. This change also gets rid of a documented race in vn_read(). Inspired by the FreeBSD implementation. With help and ok mpi@ visa@
2019-07-03  Visa Hankala
Lock the kernel when removing file descriptors from the descriptor table. This should prevent a race with kevent when unlocked code closes file descriptors that are fully set up. OK mpi@
2019-06-26  Todd C. Miller
Return EINVAL, not EBADF for fcntl(fd, F_GETLK) of a non-vnode.
Matches the recent F_SETLK change, POSIX and the man page.
2019-06-25  Todd C. Miller
Return EINVAL not EBADF when trying to lock a non-vnode.
This behavior matches POSIX and our own fcntl(2) man page. OK anton@ deraadt@
2019-06-21  Visa Hankala
Make resource limit access MP-safe. So far, the copy-on-write sharing of resource limit structs has been done between processes. By applying copy-on-write also between threads, threads can read rlimits in a nearly lock-free manner. Inspired by code in DragonFly BSD and FreeBSD. OK mpi@, agreement from jmatthew@ and anton@
2019-05-13  Theo de Raadt
dup2(n,n) would rlimit check before handling the n==n shortcut, and incorrectly return EBADF when n>curlim. ok millert guenther tedu
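The ordering problem can be sketched as follows. This is a simplified model, not the kernel code: `dup2_precheck`, its parameters, and the boolean standing in for a descriptor table lookup are assumptions made for illustration. The only point is that the `oldfd == newfd` shortcut is handled before the limit check on `newfd`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model of the argument checks in dup2(2).
 * Returns 0 if the call may proceed, or an errno value.
 * old_is_open stands in for a successful descriptor lookup.
 */
static int
dup2_precheck(int oldfd, int newfd, int curlim, bool old_is_open)
{
	assert(curlim >= 0);
	if (!old_is_open)
		return EBADF;
	/* Shortcut first: duplicating an open fd onto itself is a no-op. */
	if (oldfd == newfd)
		return 0;
	/* Only a genuinely new slot is subject to the descriptor limit. */
	if (newfd < 0 || newfd >= curlim)
		return EBADF;
	return 0;
}
```

With the two checks in the opposite order, an open descriptor numbered above a since-lowered limit would fail `dup2(n, n)` with EBADF, which is the behavior the entry describes as incorrect.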
2018-11-05  anton
trace struct flock; ok visa@
2018-08-24  Visa Hankala
Remove all knotes from a file descriptor before closing the file in fdfree(). This fixes a resource leak with cyclic kqueue references and prevents a kernel stack exhaustion scenario with long kqueue chains. OK mpi@
2018-08-21  Visa Hankala
Use explicit fd indexing to access fd_ofiles, to clarify the code.
OK mpi@
2018-08-20  Visa Hankala
Make fnew() return a new file with only one reference. This makes the API more logical. OK kettenis@ mpi@
2018-08-19  Visa Hankala
Remove a stale/obvious comment.
OK mpi@
2018-08-10  Joel Sing
Update fd_freefile when filtering/closing kqueue descriptors in fdcopy().
Prior to r1.153 of kern_descrip.c, the kqueue descriptors were removed using fdremove(), which reset fd_freefile as appropriate. The new code simply avoids adding the descriptor to the new table; however, this means that fd_freefile can be left with an incorrect value, resulting in a file descriptor allocation "hole". Restore the previous behavior by lowering fd_freefile as appropriate when dropping descriptors. Issue found via golang regress tests. ok deraadt@ mpi@ visa@
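The "hole" can be modeled in a few lines. This is a simplified user-space sketch, not the kernel's fdcopy(): `struct xfdtab`, `XNFD`, `xfdcopy`, and the `skip` array (standing in for "this descriptor is a kqueue") are hypothetical, and `freefile` is assumed to be a hint for the lowest slot that may be free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define XNFD 8	/* hypothetical fixed table size */

/* Hypothetical miniature descriptor table. */
struct xfdtab {
	void	*ofiles[XNFD];	/* NULL means the slot is free */
	int	 freefile;	/* hint: lowest slot that may be free */
};

/*
 * Copy src into dst, leaving out every slot i with skip[i] set.
 * When an in-use slot below the hint is left out, lower the hint so
 * the next allocation reuses that slot instead of skipping over it.
 */
static void
xfdcopy(const struct xfdtab *src, struct xfdtab *dst, const bool *skip)
{
	assert(src != NULL && dst != NULL && skip != NULL);
	dst->freefile = src->freefile;
	for (int i = 0; i < XNFD; i++) {
		if (src->ofiles[i] != NULL && !skip[i]) {
			dst->ofiles[i] = src->ofiles[i];
			continue;
		}
		dst->ofiles[i] = NULL;
		if (src->ofiles[i] != NULL && i < dst->freefile)
			dst->freefile = i;
	}
}
```

Without the final adjustment, a table with slots 0 to 3 in use and slot 1 dropped would keep its hint at 4, and the freed slot 1 would be passed over on the next allocation.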
2018-07-10  Martin Pieuchot
Move socket & pipe specific logic into their ioctl handlers.
ok visa@, tb@
2018-07-07  Visa Hankala
Fix an argument type error that happens when translating fcntl(F_SETOWN) to ioctl(TIOCSPGRP). The ioctl handlers expect a pointer to an int, so read the argument into a local int variable and pass the variable's address to the handler instead of referencing SCARG(uap, arg) directly. OK guenther@, mpi@
2018-07-02  Visa Hankala
Update the file reference count field `f_count' using atomic operations instead of using a mutex for update serialization. Use a per-fdp mutex to manage updating of file instance pointers in the `fd_ofiles' array to let fd_getfile() acquire file references safely with concurrent file reference releases. OK mpi@
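The counter half of this change can be sketched in user space, assuming C11 `<stdatomic.h>` rather than the kernel's own primitives. `struct rfile`, `rfile_ref`, and `rfile_rele` are hypothetical names, and the sketch deliberately leaves out the per-fdp mutex that protects lookups in `fd_ofiles`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted file object. */
struct rfile {
	atomic_uint f_count;
};

/* Take one more reference; the caller must already hold one. */
static void
rfile_ref(struct rfile *fp)
{
	atomic_fetch_add(&fp->f_count, 1);
}

/*
 * Drop one reference. Returns true for exactly one caller: the one
 * that released the last reference and must tear the object down.
 */
static bool
rfile_rele(struct rfile *fp)
{
	unsigned int old = atomic_fetch_sub(&fp->f_count, 1);

	assert(old > 0);	/* dropping below zero is a bug */
	return old == 1;
}
```

Because `atomic_fetch_sub` returns the value before the decrement, the "last reference" decision is made from a single atomic operation and cannot be observed by two releasing threads at once.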
2018-07-02  Visa Hankala
Assert that fdp is locked in fdalloc().
OK mpi@
2018-07-01  Visa Hankala
Lock the file descriptor table when accessing the `fd_ofileflags' array.
This prevents the array from being freed too early. In the function unp_internalize(), the locking also ensures the per-fdp flags stay coherent with the file instance. OK mpi@
2018-06-27  Visa Hankala
Raise file_pool's IPL to prevent deadlocks with the newly unlocked system calls. OK mpi@
2018-06-26  Visa Hankala
Remove a duplicate fd_used() call. The new file descriptor passed to dupfdopen() has already been registered with fd_used() in fdalloc(). The duplicate call distorted the number of open file descriptors returned by getdtablecount(2) if a file was opened via /dev/fd/. While there, assert that the file instance should already be in the file list. OK mpi@
2018-06-25  Mark Kettenis
Implement DRI3/prime support. This allows graphics buffers to be passed between processes using file descriptors. This provides an alternative to exporting them with guessable 32-bit IDs. This implementation does not (yet) allow sharing of graphics buffers between GPUs. ok mpi@, visa@
2018-06-25  Martin Pieuchot
Introduce fnew(), a function to initialize a `struct file'.
Committing now to help refactoring of DRI3 and diskmap rewrite. ok visa@, kettenis@ as part of a larger diff.
2018-06-24  Visa Hankala
Use atomic operations for updating `numfiles'. This makes the file count tracking work without locks. OK kettenis@, deraadt@
2018-06-20  Martin Pieuchot
Unlock sendmsg(2) and sendto(2).
These syscalls can now be executed w/o the KERNEL_LOCK() depending on the kind of socket. The current solution uses a single global mutex to serialize access to, and reference count, 'struct file'. ok visa@, kettenis@
2018-06-18  Martin Pieuchot
Put file descriptors on shared data structures when they are completely set up, take 3. LARVAL fds still exist, but they are no longer marked with a flag and no longer reachable via `fd_ofiles[]' or the global linked list. This allows us to simplify a lot of code grabbing new references to fds. All of this is now possible because dup2(2) refuses to clone LARVAL fds. Note that the `fdplock' could now be released in all open(2)-like syscalls, just like it is done in accept(2).
With inputs from Mathieu Masson, visa@, guenther@ and art@
Previous version ok bluhm@, ok visa@, sthen@
2018-06-17  anton
Move kqueue related fields from struct filedesc to struct kqueue. Solves a panic in knote_processexit() that can occur when the filedesc belonging to the process already has been freed. Similar work has been done in:
- FreeBSD (commit bc1805c6e871c178d0b6516c3baa774ffd77224a)
- DragonFlyBSD (commit ccafe911a3aa55fd5262850ecfc5765cd31a56a2)
Thanks to tb@ for testing. ok kettenis@ mpi@ visa@
2018-06-05  Martin Pieuchot
Revert introduction of fdinsert(), a sanity check triggers when closing a LARVAL file. Found the hard way by sthen@.
2018-06-02  Visa Hankala
Add an assert that makes explicit that finishdup() should receive an inserted fp. OK mpi@
2018-06-02  Martin Pieuchot
Put file descriptors on shared data structures when they are completely set up. LARVAL fds still exist, but they are no longer marked with a flag and no longer reachable via `fd_ofiles[]'. This allows us to simplify a lot of code grabbing new references to fds. All of this is now possible because dup2(2) refuses to clone LARVAL fds. Note that the `fdplock' could now be released in all open(2)-like syscalls, just like it is done in accept(2).
With inputs from Mathieu -, visa@, guenther@ and art@
ok visa@, bluhm@
2018-05-31  Martin Pieuchot
Use IPL_MPFLOOR for mutexes that can be taken w/ and w/o the KERNEL_LOCK().
From Mathieu <naabed at poolp.org>, ok visa@, tb@
2018-05-29  Martin Pieuchot
`f_mtx' must block interrupts as long as it is taken w/ and w/o the KERNEL_LOCK(). Otherwise a deadlock can occur, as found the hard way by tb@. ok tb@, kettenis@, visa@
2018-05-28  Martin Pieuchot
Return EBUSY if dup2(2) is called for a LARVAL file.
This prevents a panic due to a double free if a program exits after having called accept(2) and dup2(2) on the same fd but without the corresponding connect(2). It will also allow us to simplify file descriptor locking. The error code has been chosen to match Linux's behavior. Pointed out by Mathieu on tech@ after a discussion with guenther@. ok visa@
2018-05-08  Martin Pieuchot
Change fd_iterfile() to not return immature fps instead of skipping them later. ok bluhm@, visa@