path: root/sys/kern
Age  Commit message  Author
2022-06-12  Allow sleeping while clearing a sleep timeout  (Visa Hankala)
Since sys/kern/kern_timeout.c r1.84, timeout_barrier() has used sleeping with soft-interrupt-driven timeouts. Adjust the sleep machinery so that the timeout clearing can block in sleep_finish().

This adds one step of recursion inside sleep_finish(). However, the sleep queue handling does not recurse because sleep_finish() completes it before calling timeout_del_barrier().

This fixes the following panic:

panic: kernel diagnostic assertion "(p->p_flag & P_TIMEOUT) == 0" failed: file "sys/kern/kern_synch.c", line 373
Stopped at db_enter+0x10: popq %rbp
db_enter() at db_enter+0x10
panic() at panic+0xbf
__assert() at __assert+0x25
sleep_setup() at sleep_setup+0x1d8
cond_wait() at cond_wait+0x46
timeout_barrier() at timeout_barrier+0x109
timeout_del_barrier() at timeout_del_barrier+0xa2
sleep_finish() at sleep_finish+0x16d
tsleep() at tsleep+0xb2
sys_nanosleep() at sys_nanosleep+0x12d
syscall() at syscall+0x374

OK mpi@ dlg@
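A minimal sketch of the resulting order of operations in sleep_finish(); the P_TIMEOUT flag and p_sleep_to timeout are real names from kern_synch.c, but the surrounding code is illustrative only, not the committed diff:

    /* Sleep queue handling has already completed at this point. */
    if (timo != 0) {
        if (p->p_flag & P_TIMEOUT) {
            /* The timeout already fired; just clear the flag. */
            atomic_clearbits_int(&p->p_flag, P_TIMEOUT);
        } else {
            /*
             * The timeout may still be running on another CPU.
             * Waiting for it can sleep, re-entering the sleep code,
             * but only one level deep because the sleep queue work
             * above is already done.
             */
            timeout_del_barrier(&p->p_sleep_to);
        }
    }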
2022-06-12  kqueue: Fix missing wakeup  (Visa Hankala)
While one thread is running kqueue_scan(), another thread can begin scanning the same kqueue, observe that the event queue is empty, and go to sleep. If the first thread re-inserts a knote for re-processing, the second thread can miss the newly pending event. Wake up the kqueue after a re-insert to correct this. This fixes a Go test hang that jsing@ tracked down to kqueue. Tested in snaps for a week. OK jsing@ mpi@
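A hedged sketch of the shape of the fix in the re-insert path of kern_event.c (the exact context may differ from the committed diff):

    /* Put the knote back on the queue for another scan pass ... */
    kn->kn_status |= KN_QUEUED;
    TAILQ_INSERT_TAIL(&kq->kq_head, kn, kn_tqe);
    kq->kq_count++;
    /*
     * ... and wake any other thread that saw an empty queue and went
     * to sleep before this event became pending again.
     */
    kqueue_wakeup(kq);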
2022-06-06  Simplify solock() and sounlock(). There is no reason to return a value  (Claudio Jeker)
for the lock operation and to pass a value to the unlock operation. sofree() still needs an extra flag to know if sounlock() should be called or not. But sofree() is called less often and mostly without keeping the lock. OK mpi@ mvs@
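For callers the change looks roughly like this (a sketch; the socket access in the middle is a placeholder):

    /* Before: solock() returned a token that sounlock() needed back. */
    s = solock(so);
    /* ... access socket state ... */
    sounlock(so, s);

    /* After: plain lock/unlock, no value threaded through the callers. */
    solock(so);
    /* ... access socket state ... */
    sounlock(so);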
2022-06-02  Stop hiding a few assertions behind the opt-in LOCKF_DIAGNOSTIC option.  (Anton Lindqvist)
This code has already been exercised quite extensively by syzkaller and got decent test coverage.
2022-06-01  Fix ambiguity with lock range end  (Visa Hankala)
When the user requests a lock range that ends at LLONG_MAX, replace the end point with the special EOF value -1. This avoids ambiguity with lf_end in lf_split(). The ambiguity could result in a broken data structure. This change is visible to userspace in a corner case. When a lock range has been requested with an end point at absolute position LLONG_MAX, fcntl(F_GETLK) returns l_len == 0, instead of a positive value, for that range. This seems consistent with FreeBSD and Linux. OK anton@ Reported-by: syzbot+c93afea6c27a3fa3af39@syzkaller.appspotmail.com
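The corner case can be demonstrated from userland with a small test program (hypothetical, written against the behaviour described above; the lock is queried from a child process because F_GETLK ignores the caller's own locks):

    #include <sys/types.h>
    #include <sys/wait.h>
    #include <err.h>
    #include <fcntl.h>
    #include <limits.h>
    #include <stdio.h>
    #include <unistd.h>

    int
    main(void)
    {
        struct flock fl;
        pid_t pid;
        int fd;

        if ((fd = open("testfile", O_RDWR | O_CREAT, 0644)) == -1)
            err(1, "open");

        /* Lock the range from offset 100 up to absolute position LLONG_MAX. */
        fl.l_type = F_WRLCK;
        fl.l_whence = SEEK_SET;
        fl.l_start = 100;
        fl.l_len = LLONG_MAX - 100 + 1;
        if (fcntl(fd, F_SETLK, &fl) == -1)
            err(1, "F_SETLK");

        if ((pid = fork()) == 0) {
            /* Ask which lock would block a whole-file lock. */
            fl.l_type = F_WRLCK;
            fl.l_whence = SEEK_SET;
            fl.l_start = 0;
            fl.l_len = 0;
            if (fcntl(fd, F_GETLK, &fl) == -1)
                err(1, "F_GETLK");
            /* Expected after this change: l_len == 0 ("to EOF"). */
            printf("l_start=%lld l_len=%lld\n",
                (long long)fl.l_start, (long long)fl.l_len);
            _exit(0);
        }
        waitpid(pid, NULL, 0);
        return 0;
    }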
2022-06-01  Fix lock range start when l_whence == SEEK_END and l_len < 0.  (Visa Hankala)
OK anton@
2022-05-30  Replace selwakeup() with KNOTE() in pipe event activation.  (Visa Hankala)
Recommit the reverted change selectively so that only pipes are affected. Leave sockets untouched for now.
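The substitution itself is small; sketched against sys/pipe.c, with the hint value and locking context simplified:

    /* before */
    selwakeup(&cpipe->pipe_sel);

    /* after: activate any registered knotes directly */
    KNOTE(&cpipe->pipe_sel.si_note, 0);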
2022-05-28  oops, wrong value in previous commit  (Theo de Raadt)
2022-05-28  64K of locked memory should be enough for anyone (until we hear a good  (Theo de Raadt)
reason why). Discussed with many, ok millert
2022-05-23  Respect RLIMIT_FSIZE when extending a file via truncate(2)/ftruncate(2).  (Todd C. Miller)
This refactors the common parts of sys_truncate() and sys_ftruncate() into dotruncate(). If the new size of the file is larger than the RLIMIT_FSIZE limit _and_ the file is being extended, not truncated, return EFBIG. Adapted from a diff by Piotr Durlej. With help from and OK by deraadt@ guenther@.
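The user-visible rule can be checked with a small illustrative program (SIGXFSZ is ignored in case the kernel also raises it when the limit is exceeded):

    #include <sys/resource.h>
    #include <err.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int
    main(void)
    {
        struct rlimit rl;
        int fd;

        signal(SIGXFSZ, SIG_IGN);

        if ((fd = open("testfile", O_RDWR | O_CREAT | O_TRUNC, 0644)) == -1)
            err(1, "open");

        rl.rlim_cur = rl.rlim_max = 4096;
        if (setrlimit(RLIMIT_FSIZE, &rl) == -1)
            err(1, "setrlimit");

        /* Extending beyond the limit should fail with EFBIG ... */
        if (ftruncate(fd, 8192) == -1)
            printf("extend to 8192: %s\n", strerror(errno));

        /* ... while shrinking is not affected by RLIMIT_FSIZE. */
        if (ftruncate(fd, 0) == -1)
            err(1, "ftruncate to 0");
        printf("shrink to 0: ok\n");
        return 0;
    }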
2022-05-16  regen  (Vitaliy Makkoveev)
2022-05-16  Unlock umask(2). sys_umask() only modifies `fd_cmask', which  (Vitaliy Makkoveev)
modification is already protected by `fd_lock' rwlock(9). ok bluhm@
2022-05-13  Use the process ps_mtx to protect the process sigacts structure.  (Claudio Jeker)
With this cursig(), postsig() and trapsignal() become safe to be called without KERNEL_LOCK. As a side-effect sleep with PCATCH no longer needs the KERNEL_LOCK either. Since sending a signal can happen from interrupt context raise the ps_mtx IPL to high. Feedback from mpi@ and kettenis@ OK kettenis@
2022-05-12  During coredumps only a single thread should be active, check this  (Claudio Jeker)
by checking that it is a single threaded process or that ps_single is set. OK mpi@
2022-05-12  kqueue: Fix race condition in knote_remove()  (Visa Hankala)
Always fetch the knlist array pointer at the start of every iteration in knote_remove(). This prevents the use of a stale pointer after another thread has simultaneously reallocated the kq_knlist array. Reported and tested by and OK jsing@
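A hedged sketch of the pattern; knlist_first() and knote_drop_one() are hypothetical stand-ins for the real kqueue internals:

    struct knote *kn;

    for (;;) {
        /*
         * Re-read kq->kq_knlist on every pass: dropping a knote below
         * may sleep, and another thread can grow the table meanwhile,
         * reallocating (and freeing) the array read last time around.
         */
        kn = knlist_first(&kq->kq_knlist[fd]);
        if (kn == NULL)
            break;
        knote_drop_one(kn);    /* may sleep */
    }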
2022-05-10  make the CPU frequency scaling duration relative to the load  (Solene Rapenne)
In the pre-change behavior, if the CPU frequency is raised, it stays up for a minimum of 5 cycles (with one cycle being run every 100ms). With this change, the time to keep the frequency raised is incremented at each cycle, up to 5. This means a short load that triggers the frequency increase keeps it raised for less than the previous minimum of 500ms. This only affects the automatic mode when on battery, extending battery life for most interactive use scenarios and idling loads. Tested by many with good results. ok kettenis@
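A minimal sketch of the counter logic described above (variable and function names are illustrative, not the actual sched_bsd.c code):

    /* Runs once per 100ms evaluation cycle in the automatic policy. */
    if (load_is_high) {
        setperf_level(100);        /* raise the CPU frequency */
        if (hold_cycles < 5)
            hold_cycles++;         /* hold time grows up to 500ms */
    } else if (hold_cycles > 0) {
        hold_cycles--;             /* short bursts decay quickly */
    } else {
        setperf_level(0);          /* drop the frequency again */
    }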
2022-05-10  Our read/write lock implementation was not fair to writers. When  (Alexander Bluhm)
multiple IP forwarding threads were processing packets and holding the shared net lock, the exclusive net lock was blocked permanently. This could result in ping times well above 10 seconds. Add the RWLOCK_WRWANT bit to the check mask of readers. Then they cannot grab the lock while a writer is also waiting. This logic was already present in revision 1.3, but got lost during refactoring. When exiting the lock, there is a race when the RWLOCK_WRWANT bit gets cleared. Add a comment that was present until revision 1.8 to document it. The race itself is not easy to fix and had no impact during testing. OK sashan@
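Sketched against kern_rwlock.c, the reader path now refuses the lock whenever a writer holds or wants it (simplified; the real code also handles spinning and sleeping):

    unsigned long owner;

    owner = rwl->rwl_owner;
    while ((owner & (RWLOCK_WRLOCK | RWLOCK_WRWANT)) == 0) {
        /* No writer holds or waits for the lock: try to join as reader. */
        if (rw_cas(&rwl->rwl_owner, owner, owner + RWLOCK_READ_INCR) == 0)
            return;                     /* got the shared lock */
        owner = rwl->rwl_owner;         /* lost the race, retry */
    }
    /* A writer is active or waiting: fall through to the sleep path. */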
2022-05-09Revert "Replace selwakeup() with KNOTE() in pipe and socket event activation."Visa Hankala
The commit caused hangs with NFS. Reported by ajacoutot@ and naddy@
2022-05-06  Replace selwakeup() with KNOTE() in kqueue event activation.  (Visa Hankala)
The deferred activation can now run in an MP-safe task queue.
2022-05-06  Replace selwakeup() with KNOTE() in pipe and socket event activation.  (Visa Hankala)
OK mpi@
2022-05-05  Using mutex initializer for static variable does not compile with  (Alexander Bluhm)
witness. Make ratecheck mutex global. Reported-by: syzbot+9864ba1338526d0e8aca@syzkaller.appspotmail.com
2022-05-04  Introduce mutex for ratecheck(9) and ppsratecheck(9). A global  (Alexander Bluhm)
mutex with spl high for all function calls is used for now. It protects the lasttime and curpps parameter. This solution is MP safe for the usual use case, allows progress, and can be optimized later. Remove a useless #if 1 while there. OK claudio@
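A sketch of the resulting shape of ratecheck() (assuming the global mutex is called ratecheck_mtx; details may differ from the committed code):

    struct mutex ratecheck_mtx = MUTEX_INITIALIZER(IPL_HIGH);

    int
    ratecheck(struct timeval *lasttime, const struct timeval *mininterval)
    {
        struct timeval tv, delta;
        int rv = 0;

        getmicrouptime(&tv);

        mtx_enter(&ratecheck_mtx);
        timersub(&tv, lasttime, &delta);

        /*
         * Check for 0,0 so that the first event is always reported,
         * even when the configured interval is huge.
         */
        if (timercmp(&delta, mininterval, >=) ||
            (lasttime->tv_sec == 0 && lasttime->tv_usec == 0)) {
            *lasttime = tv;
            rv = 1;
        }
        mtx_leave(&ratecheck_mtx);

        return (rv);
    }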
2022-05-01  regen  (Ted Unangst)
2022-05-01  no need to test for toupper function in awk  (Ted Unangst)
ok cheloha millert miod
2022-04-30  Enforce proper memory ordering in refcnt_rele() and refcnt_finalize()  (Visa Hankala)
Make refcnt_rele() and refcnt_finalize() order memory operations so that preceding loads and stores happen before 1->0 transition. Also ensure that loads and stores that depend on the transition really begin only after the transition has occurred. Otherwise the object destructor might not see the object's latest state. OK bluhm@
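The resulting ordering can be written out roughly like this (close to the refcnt(9) semantics described above; an illustration rather than the exact committed code):

    int
    refcnt_rele(struct refcnt *r)
    {
        u_int refs;

        /* Make this thread's earlier loads and stores visible first. */
        membar_exit_before_atomic();
        refs = atomic_dec_int_nv(&r->r_refs);
        KASSERT(refs != ~0u);
        if (refs == 0) {
            /*
             * The caller will destroy the object: its loads and stores
             * must begin only after the 1->0 transition has happened.
             */
            membar_enter_after_atomic();
            return (1);
        }
        return (0);
    }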
2022-04-27  Remove the lock if an identical overlapping one is already present.  (Anton Lindqvist)
This prevents a use-after-free discovered by syzkaller.

ok visa@

Reported-by: syzbot+a2649c1d77e9d2463f33@syzkaller.appspotmail.com
Reported-by: syzbot+182df9087f5f182daa44@syzkaller.appspotmail.com
Reported-by: syzbot+46d03139d7ed5e81ed2f@syzkaller.appspotmail.com
Reported-by: syzbot+892e886a6113db341da1@syzkaller.appspotmail.com
2022-04-27  vgone() is vgonel() with curproc as 2nd argument. Use vgonel() like the  (Claudio Jeker)
other call in vop_generic_revoke(). OK semarie@
2022-04-26  Bump __mp_lock_spinout to INT_MAX.  (Dave Voutila)
The previous value set years ago was causing amd64 kernels to spin out when run with MP_LOCKDEBUG during boot. ok kettenis@
2022-04-11  Keep `fdp' locked until we finish the second loop of unp_externalize().  (Vitaliy Makkoveev)
This prevents descriptors from being closed concurrently on receiver side. ok bluhm@ claudio@
2022-04-07  Fix kernel builds with pseudo-device rd  (Theo Buehler)
Make the cf_attach member of struct cfdata const and sprinkle a few const into subr_autoconf.c to make this work. Fixes the compilation of sys/dev/rd.c with newly const rd_ca. ok miod (who had a similar diff)
2022-04-02  Update an old comment  (Mike Larkin)
The old comment only mentioned that tty_nmea was used for time, but subsequently position data was added to this line discipline.
2022-04-02  whitespace fix  (Mike Larkin)
2022-03-31  Move knote_processexit() call from exit1() to the reaper().  (Todd C. Miller)
This fixes a problem where NOTE_EXIT could be received before the process was officially a zombie and thus not immediately waitable. OK deraadt@ visa@
2022-03-27  sys/vnode.h cleanup for vnode_hold_list, vnode_free_list, struct freelst  (Sebastien Marie)
vnode_hold_list and vnode_free_list aren't used outside kern/vfs_subr.c. Move `struct freelst` to where it is used in kern/vfs_subr.c. No intended behaviour changes. Survived a release(8) build. ok millert@
2022-03-25  add an exception to the CPU_ID_AA64ISAR0 in pledged applications so that  (Robert Nagy)
libcrypto can access this sysctl on arm64 without restrictions to determine cpu features. ok deraadt@, kettenis@
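What this enables, roughly: a pledged arm64 process can still read the register through sysctl(2). A hypothetical illustration (the MIB, header, and value type are assumptions based on the commit text):

    #include <sys/types.h>
    #include <sys/sysctl.h>
    #include <machine/cpu.h>    /* CPU_ID_AA64ISAR0, arm64 only */
    #include <err.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    int
    main(void)
    {
        int mib[2] = { CTL_MACHDEP, CPU_ID_AA64ISAR0 };
        uint64_t isar0;
        size_t len = sizeof(isar0);

        if (pledge("stdio", NULL) == -1)
            err(1, "pledge");

        /* Allowed even under pledge, so libcrypto can probe CPU features. */
        if (sysctl(mib, 2, &isar0, &len, NULL, 0) == -1)
            err(1, "sysctl");
        printf("ID_AA64ISAR0_EL1 = 0x%016llx\n", (unsigned long long)isar0);
        return 0;
    }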
2022-03-21  Header netinet/in_pcb.h includes sys/mutex.h now. Recommit mutex  (Alexander Bluhm)
for PCB tables. It does not break userland build anymore. pf_socket_lookup() calls in_pcbhashlookup() in the PCB layer. To run pf in parallel, make parts of the stack MP safe. Protect the list and hashes in the PCB tables with a mutex. Note that the protocol notify functions may call pf via tcp_output(). As the pf lock is a sleeping rw_lock, we must not hold a mutex. To solve this for now, collect these PCBs in inp_notify list and protect it with exclusive netlock. OK sashan@
2022-03-18  Cleanup reference counting. Remove #ifdef DIAGNOSTIC to keep the  (Alexander Bluhm)
code similar in non DIAGNOSTIC case. Rename refcnt variable to refs for consistency with r_refs. Add KASSERT() in refcnt_finalize(). OK visa@
2022-03-18  Use the refcnt API with struct plimit.  (Visa Hankala)
OK bluhm@ dlg@
2022-03-17  Use the refcnt API with struct ucred.  (Visa Hankala)
OK bluhm@
2022-03-16  Remove an unneeded include.  (Visa Hankala)
2022-03-16  Use the refcnt API in kqueue.  (Visa Hankala)
OK dlg@ bluhm@
2022-03-16  Add refcnt_shared() and refcnt_read()  (Visa Hankala)
refcnt_shared() checks whether the object has multiple references. When refcnt_shared() returns zero, the caller is the only reference holder. refcnt_read() returns a snapshot of the counter value. refcnt_shared() suggested by dlg@. OK dlg@ mvs@
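A minimal usage sketch (struct foo, f_refcnt and foo_copy() are hypothetical):

    /* Copy-on-write style update: only copy while others hold references. */
    if (refcnt_shared(&foo->f_refcnt))
        foo = foo_copy(foo);        /* private copy, safe to modify */

    /* Diagnostics only: a snapshot that may be stale by the time it prints. */
    printf("foo: %u reference(s)\n", refcnt_read(&foo->f_refcnt));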
2022-03-14  Unbreak the tree, revert commitid aZ8fm4iaUnTCc0ul  (Theo Buehler)
This reverts the commit protecting the list and hashes in the PCB tables with a mutex since the build of sysctl(8) breaks, as found by kettenis. ok sthen
2022-03-14  pf_socket_lookup() calls in_pcbhashlookup() in the PCB layer. To  (Alexander Bluhm)
run pf in parallel, make parts of the stack MP safe. Protect the list and hashes in the PCB tables with a mutex. Note that the protocol notify functions may call pf via tcp_output(). As the pf lock is a sleeping rw_lock, we must not hold a mutex. To solve this for now, collect these PCBs in inp_notify list and protect it with exclusive netlock. OK sashan@
2022-03-11  Revert part of rev 1.293. Using cursig() to deliver masked signals  (Claudio Jeker)
to the debugger can cause a loop between the debugger and cursig() if the signal is masked. cursig() has no way to know which signal was already delivered to the debugger and so it delivers the same signal over and over again. Instead handle traps to masked signals directly in trapsignal. This is what rev 1.293 was mostly about. If SIGTRAP was masked by the process, breakpoints no longer worked since the signal delivery to the debugger did not happen. Handling this case in trapsignal solves both the problem with the loop and the delivery of masked traps. Problem reported and fix tested by matthieu@ OK kettenis@ mpi@
2022-03-10  Use atomic load and store functions to access refcnt and wait  (Alexander Bluhm)
variables. Although not necessary everywhere, using atomic functions exclusively for variables marked as atomic is clearer. OK mvs@ visa@
2022-02-25  Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com  (Philip Guenther)
Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
2022-02-25  add setrtable to pledge("id"). from Matthew Martin  (Ted Unangst)
ok deraadt
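A hypothetical illustration of what the "id" promise now covers (the header for setrtable() and the existence of rtable 4 are assumptions):

    #include <sys/types.h>
    #include <sys/socket.h>    /* setrtable() */
    #include <err.h>
    #include <unistd.h>

    int
    main(void)
    {
        if (pledge("stdio id", NULL) == -1)
            err(1, "pledge");

        /* Switching the routing table is now permitted under "id". */
        if (setrtable(4) == -1)
            err(1, "setrtable");
        return 0;
    }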
2022-02-25  Move pr_attach and pr_detach to a new structure pr_usrreqs that can  (Philip Guenther)
then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this. Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts. ok mvs@ bluhm@
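The resulting arrangement looks roughly like this; pr_usrreqs, pru_attach, pru_detach and pru_control come from the commit text, while the protocol names are purely illustrative:

    /* One table, shared by every protosw entry of the (hypothetical) family. */
    const struct pr_usrreqs foo_usrreqs = {
        .pru_attach  = foo_attach,
        .pru_detach  = foo_detach,
        .pru_control = foo_control,    /* proper prototype, no casts needed */
    };

    const struct protosw foosw[] = {
        {
            .pr_type    = SOCK_STREAM,
            .pr_usrreqs = &foo_usrreqs,
            /* ... */
        },
        {
            .pr_type    = SOCK_DGRAM,
            .pr_usrreqs = &foo_usrreqs,    /* same table, shared */
            /* ... */
        },
    };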
2022-02-24  regen  (Vitaliy Makkoveev)