path: root/sys/kern/kern_sig.c
Age | Author | Commit message
3 days | Claudio Jeker | Do not clear P_WSLEEP in ptsignal's SIGCONT handling. cursig() no longer
stops threads while called during the sleep transition and so there is no need to clear P_WSLEEP. OK mpi@
7 days | Claudio Jeker | No need to call unsleep() if p_wchan is NULL.
OK mpi@
11 days | Claudio Jeker | Adjust deep logic in cursig() to handle sig_stop specially.
If any other signal is pending the stop signal should be deferred. Now cursig() uses ffs() to select the signal and so higher numbered signals like SIGUSR1 would be ignored when going to sleep. So handle default stop signals specially in the deep case, stash them and only use them if no other signal is pending. Fix for signal-stress regress (problem reported by anton@) With and OK mpi@
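The selection rule described in this commit can be sketched in userland C. This is only an illustration under stated assumptions, not the kernel code: the function name, the TOY_ constants, and the signal numbers are hypothetical. It picks the lowest pending signal with ffs(), but stashes a default stop signal and returns it only if nothing else is pending.

```c
#include <strings.h>	/* ffs() */

/* Hypothetical signal numbers, chosen only for this sketch. */
#define TOY_SIGTSTP	18
#define TOY_SIGUSR1	30

/* Bit for signal n in a pending mask (signals are 1-based). */
#define TOY_SIGMASK(n)	(1U << ((n) - 1))

/*
 * Return the signal to act on, or 0 if none is pending.
 * pending:  bitmask of pending, unblocked signals
 * stopmask: bitmask of signals whose default action is to stop
 */
int
toy_pick_signal(unsigned int pending, unsigned int stopmask)
{
	int stashed = 0;

	while (pending != 0) {
		/* ffs() returns the 1-based index of the lowest set bit. */
		int signum = ffs((int)pending);
		unsigned int bit = TOY_SIGMASK(signum);

		pending &= ~bit;
		if (stopmask & bit) {
			/* Defer the stop signal; remember the first one. */
			if (stashed == 0)
				stashed = signum;
			continue;
		}
		return signum;
	}
	/* Only stop signals (or nothing) were pending. */
	return stashed;
}
```

With SIGTSTP and SIGUSR1 both pending, the sketch returns the higher numbered SIGUSR1 rather than the stop signal, which is the case the commit message says was previously missed.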
11 days | Claudio Jeker | Remove unreachable check for orphaned process groups in cursig.
setsigctx() now does this check and clears sig_stop in that case and instead sets sig_ignore. So the check in cursig that is based on sig_stop can never be true. OK mpi@
2024-11-06 | Claudio Jeker | Factor out the ptrace trap into proc_trap() and simplify the signal
delivery in cursig() a lot since most of that is no longer needed. On top of this properly handle sending a blocked signal from gdb to the debugged process by putting the signal into the proc p_siglist. OK kettenis@
2024-11-05 | Claudio Jeker | Unlock ptsignal by using the ps_mtx instead of KERNEL_LOCK to ensure
the process is not modified during signal delivery. This also unlocks psignal and prsignal since those are simple wrappers around ptsignal. OK mpi@
2024-11-05 | Jonathan Gray | remove VATTR_NULL() define, directly call vattr_null()
There used to be a predefined null vattr for !DIAGNOSTIC but that was removed in vnode.h rev 1.84 in 2007. ok semarie@ miod@
2024-11-04 | Claudio Jeker | Properly handle stop signals in cursig if deep.
In setsigctx() set sig_stop to 1 if the process should be stopped. In cursig() still return early if deep but then in sleep_signal_check() use this information to call proc_stop and stop the proc. This should fix the problem in the waitid regress test. OK mpi@
2024-10-22 | Claudio Jeker | Protect the ps_pgrp pointer by either the KERNEL_LOCK or the ps_mtx.
This should be enough to be on the safe side when unlocking ptsignal where a pr->ps_pgrp->pg_jobc == 0 check happens. OK mpi@ kettenis@
2024-10-17 | Claudio Jeker | Shortcut cursig when called during sleep setup.
Add deep flag as function argument which is used by the sleep API but nowhere else. Both calls to sleep_signal_check() should skip the ugly bits of cursig(). In cursig() if deep once it is clear a signal will be taken keep the signal on the thread siglist and return. sleep_signal_check() will then return EINTR or ERESTART based on the signal context. There is no reason to do more in this special case. Especially stop/cont and the ptrace trap must be skipped here. Once the call makes it to userret the signal will be picked up again and handled in a safe location. Stopping signals need some additional logic since we don't want to abort the sleep just to stop a process. Since our SIGSTOP handling requires a major rewrite this will be postponed until then. OK mpi@
2024-10-15 | Claudio Jeker | Indicate that a process has stopped by setting PS_STOPPED flag
The checks in dowait6 and orphanpg using ps_mainproc are flawed and fail if the mainproc called pthread_exit before the other threads. Adding the flag in proc_stop_sweep is racy but the best we have right now. This fixes regress/sys/kern/signal/sig-stop3. OK mpi@
2024-10-09 | Claudio Jeker | Clear ps_xsig when continuing after a PS_TRACED stop.
Also remove the ps_xsig handling in setrunnable() it is in the wrong spot and causes signals to be delivered over and over again. Attaching to an already stopped process is affected by this. The SIGSTOP sent by ptrace is now ignored in ptsignal() and as a result gdb will hang in wait4() until a SIGCONT is delivered to the process. After that all works as usual. OK mpi@
2024-10-09 | Claudio Jeker | Convert prsignal() into a real function
Also do not use ps_mainproc as the thread the signal is sent to. Sending a signal to ps_mainproc may not work reliably if it already exited. Use TAILQ_FIRST(&pr->ps_threads) instead but first check that the process has not yet entered exit1(). OK mpi@
2024-10-01 | Claudio Jeker | Adjust ptrace interface to properly support single threaded continue.
Introduce P_TRACESINGLE flag to instruct the trapped thread to not wakeup the other threads (via single_thread_clear). This must be done like this since ptrace must wake just the single thread to ensure it runs first and gets the ps_xsig value from ptrace. Modern gdb depends on this for multi-threaded processes, when a breakpoint is hit gdb fixes up the trapping instruction and then single steps over it with only that thread. After that single step gdb continues with all threads. If all threads are run like now it is possible that one of the other threads hits a breakpoint before the single step is done which results in an assertion in gdb (because that is not expected). OK mpi@
2024-08-10 | Jonathan Gray | spelling; ok claudio@
2024-08-06 | Claudio Jeker | Stop using KERNEL_LOCK to protect the per process kqueue list
Instead of the KERNEL_LOCK use the ps_mtx for most operations. If the ps_klist is modified an additional global rwlock (kqueue_ps_list_lock) is required. This includes the knotes with NOTE_FORK and NOTE_EXIT since in either case a ps_klist is changed. In the NOTE_FORK | NOTE_TRACK case the call to kqueue_register() can sleep, which is why a global rwlock is used. Adjust the reaper() to call knote_processexit() without KERNEL_LOCK. Double lock idea from visa@ OK mvs@
2024-07-29 | Claudio Jeker | Move the signal related kqueue filters to kern_event.c.
Since proc and signal filters share the same klist it makes sense to keep them together. OK mvs@
2024-07-29 | Claudio Jeker | Replace per thread P_CONTINUED with per process PS_CONTINUED flag
dowait6() can only look at per process state so switch this over. Right now SIGCONT handling in ptsignal is recursive and not quite right but this is a step in the right direction. It fixes dowait6() handling for multithreaded processes where the main thread exited. OK mpi@
2024-07-24 | Claudio Jeker | KASSERT that the ps_single proc has P_SUSPSINGLE cleared.
Requested by kettenis@ and guenther@
2024-07-22 | Claudio Jeker | Rename PS_STOPPED to PS_STOPPING. I want to use PS_STOPPED to indicate
that a process has been stopped so make room for that. OK kettenis@
2024-07-10 | Claudio Jeker | Kill the runfast and run label and inline those bits. No functional change.
OK mpi@
2024-07-09 | Claudio Jeker | Reshuffle the switch cases in ptsignal and single_thread_set to be
in the order needed for future changes. No functional change. OK mpi@
2024-06-03 | Claudio Jeker | Remove the now unused s argument to SCHED_LOCK and SCHED_UNLOCK.
The SPL level is now tracked by the mutex and we no longer need to track this in the callers. OK miod@ mlarkin@ tb@ jca@
2024-05-22 | Claudio Jeker | In the big p_stat switch in ptsignal do not call return but instead
use one of the gotos. In this case goto out with mask and prop set to 0. OK jca@
2024-05-20 | Claudio Jeker | Rework interaction between sleep API and exit1() and start unlocking ps_threads
This diff adjusts how single_thread_set() accounts the threads by using ps_threadcnt as initial value and counting all threads out that are already parked. In single_thread_check call exit1() before decreasing ps_singlecount this is now done in exit1(). exit1() and thread_fork() ensure that ps_threadcnt is updated with the pr->ps_mtx held and in exit1() also account for exiting threads since exit1() can sleep. OK mpi@
2024-05-08 | Claudio Jeker | Rework how action SIG_HOLD is handled in ptsignal.
Since we want to unlock sigsuspend, ptsignal needs to double check in the SSLEEP case that the signal being delivered is still masked or unmasked. Remove the early return for action SIG_HOLD so that the SSLEEP case can properly recheck the sigmask. On top of this update siglist only in one place at the end of ptsignal this now includes the clearing of signals for the SA_CONT and SA_STOP cases. OK mpi@
2024-05-07 | Claudio Jeker | In Rev 1.296 the update of the siglist was moved to the end of ptsignal().
One atomic_clearbits_int() hiding in SSTOP was missed when converting all the exceptions that cleared the siglist again. Instead of clearing the bits the mask needs to be set to 0 so that it is properly ignored. OK mpi@
2024-04-18 | Claudio Jeker | If a proc has P_WEXIT set do not stop it, let it exit since it is already
mostly dead. This is more like belts and suspenders since a proc in exit1() will not receive signals anymore and so proc_stop() should not be reachable. This is even the case when sigexit() is called and a coredump() is happening. OK mpi@
2024-04-10 | Claudio Jeker | Unlock dosigsuspend() and with that some aspects of ppoll and pselect
Change p_sigmask from atomic back to non-atomic updates. All changes to p_sigmask are only allowed by curproc (the owner). There is no need for atomic instructions here. p_sigmask is mostly accessed by curproc with the exception of ptsignal(). In ptsignal() p_sigmask is now only read once unless a SSLEEP proc gets the signal. In that case recheck the p_sigmask before wakeup to ensure that no unnecessary wakeup happens. Add some KASSERT(p == curproc) to ensure this precondition. sigabort() is special since it is also called by ddb but apart from that only works for curproc. With and OK mvs@ OK mpi@
2024-03-30 | Martin Pieuchot | Prevent a recursion inside wakeup(9) when scheduler tracepoints are enabled.
Tracepoints like "sched:enqueue" and "sched:unsleep" were called from inside the loop iterating over sleeping threads as part of wakeup_proc(). When such tracepoints were enabled they could result in another wakeup(9) possibly corrupting the sleepqueue. Rewrite wakeup(9) in two stages, first dequeue threads from the sleepqueue then call setrunnable() and possible tracepoints for each of them. This requires moving unsleep() outside of setrunnable() because it messes with the sleepqueue. ok claudio@
2024-02-25 | Theo de Raadt | New accounting flag ABTCFI to indicate signal SIGILL + code ILL_BTCFI
has occurred in the process. ok various people
2024-01-17 | Kurt Miller | Fix core file writing when a file mapped into memory has later been truncated
to be smaller than the mapping. Record which memory segments are backed by vnodes while walking the uvm map and later suppress EFAULT errors caused by the underlying file being truncated. okay miod@
2023-10-06 | Claudio Jeker | In sys___thrsigdivert() switch tsleep_nsec() to use the nowake ident
channel instead of inventing an own one. OK kettenis@ mvs@
2023-09-29 | Claudio Jeker | Extend single_thread_set() mode with additional flag attributes.
The mode can now be or-ed with SINGLE_DEEP or SINGLE_NOWAIT to alter the behaviour of single_thread_set(). This allows explicit control of the SINGLE_DEEP behaviour. If SINGLE_DEEP is set the deep flag is passed to the initial check call and by that the check will error out instead of suspending (SINGLE_UNWIND) or exiting (SINGLE_EXIT). The SINGLE_DEEP flag is required in calls to single_thread_set() outside of userret. E.g. at the start of sys_execve because the proc is not allowed to call exit1() in that location. SINGLE_NOWAIT skips the wait at the end of single_thread_set() and therefore returns BEFORE all threads have been parked. Currently this is only used by the ptrace code and should not be used anywhere else. Not waiting for all threads to settle is asking for trouble. This solves an issue by using SINGLE_UNWIND in the coredump case where the code should actually exit in case another thread crashed moments earlier. Also the SINGLE_UNWIND in pledge_fail() is now marked SINGLE_DEEP since the call to pledge_fail() is for sure not at the kernel boundary. OK mpi@
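How a mode argument with or-ed flag attributes might be split apart can be sketched as follows. The numeric values, the TOY_ names, and the decode helper are assumptions made for illustration; the commit message does not give the actual constants or the way single_thread_set() separates them.

```c
/* Assumed values for this sketch only; not taken from the commit. */
#define TOY_SINGLE_SUSPEND	0x01	/* other threads sleep where they are */
#define TOY_SINGLE_UNWIND	0x02	/* other threads unwind to the kernel boundary */
#define TOY_SINGLE_EXIT		0x03	/* other threads exit */
#define TOY_SINGLE_MASK		0x0f	/* bits holding the base mode */
#define TOY_SINGLE_DEEP		0x10	/* caller is not at the kernel boundary */
#define TOY_SINGLE_NOWAIT	0x20	/* do not wait for the other threads to park */

struct toy_single_req {
	int mode;	/* one of SUSPEND, UNWIND, EXIT */
	int deep;	/* nonzero if TOY_SINGLE_DEEP was or-ed in */
	int nowait;	/* nonzero if TOY_SINGLE_NOWAIT was or-ed in */
};

/* Split a combined argument into the base mode and its flag attributes. */
struct toy_single_req
toy_decode_single_flags(int flags)
{
	struct toy_single_req req;

	req.mode = flags & TOY_SINGLE_MASK;
	req.deep = (flags & TOY_SINGLE_DEEP) != 0;
	req.nowait = (flags & TOY_SINGLE_NOWAIT) != 0;
	return req;
}
```

Under these assumptions, a caller such as the pledge_fail() path described above would pass TOY_SINGLE_UNWIND | TOY_SINGLE_DEEP, and the decoded request would carry mode UNWIND with deep set.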
2023-09-19 | Claudio Jeker | Before coredump or in pledge_fail use SINGLE_UNWIND to stop all threads.
SINGLE_UNWIND unwinds to the kernel boundary. On the other hand SINGLE_SUSPEND will sleep inside tsleep(9) and other sleep functions. Since the code will exit1() very soon after it is better to already unwind. Now one could argue that for coredumps all threads should stop asap to get a clean dump. Using SINGLE_UNWIND the sleep will fail with ERESTART and no copyout should happen in that case. This is a bit of a workaround since SINGLE_SUSPEND has a small race where single_thread_wait() returns before all threads are really stopped. When SINGLE_EXIT is called quickly after this can blow up inside sleep_finish. Reported-by: syzbot+3ef066fcfaf991f2ac2c@syzkaller.appspotmail.com OK mpi@ kettenis@
2023-09-13 | Claudio Jeker | Revert commitid: yfAefyNWibUyjkU2, ESyyH5EKxtrXGkS6 and itscfpFvJLOj8mHB;
The change to the single thread API results in crashes inside exit1() as found by Syzkaller. There seems to be a race in the exit codepath. What exactly fails is not really clear, therefore revert for now. This should fix the following Syzkaller reports:
Reported-by: syzbot+38efb425eada701ca8bb@syzkaller.appspotmail.com
Reported-by: syzbot+ecc0e8628b3db39b5b17@syzkaller.appspotmail.com
and maybe more.
Reverted commits:
----------------------------
Protect ps_single, ps_singlecnt and ps_threadcnt by the process mutex.
The single thread API needs to lock the process to enter single thread mode and does not need to stop the scheduler. This code changes ps_singlecount from a count down to zero to ps_singlecnt which counts up until equal to ps_threadcnt (in which case all threads are properly asleep). Tested by phessler@, OK mpi@ cheloha@
----------------------------
Change how ps_threads and p_thr_link are locked away from using SCHED_LOCK.
The per process thread list can be traversed (read) by holding either the KERNEL_LOCK or the per process ps_mtx (instead of SCHED_LOCK). Abusing the SCHED_LOCK for this makes it impossible to split up the scheduler lock into something more fine grained. Tested by phessler@, ok mpi@
----------------------------
Fix SCHED_LOCK() leak in single_thread_set()
In the (q->p_flag & P_WEXIT) branch is a continue that did not release the SCHED_LOCK. Refactor the code a bit to simplify the places SCHED_LOCK is grabbed and released. Reported-by: syzbot+ea26d351acfad3bb3f15@syzkaller.appspotmail.com OK kettenis@
2023-09-09 | Claudio Jeker | Fix SCHED_LOCK() leak in single_thread_set()
In the (q->p_flag & P_WEXIT) branch is a continue that did not release the SCHED_LOCK. Refactor the code a bit to simplify the places SCHED_LOCK is grabbed and released. Reported-by: syzbot+ea26d351acfad3bb3f15@syzkaller.appspotmail.com OK kettenis@
2023-09-08 | Claudio Jeker | Change how ps_threads and p_thr_link are locked away from using SCHED_LOCK.
The per process thread list can be traversed (read) by holding either the KERNEL_LOCK or the per process ps_mtx (instead of SCHED_LOCK). Abusing the SCHED_LOCK for this makes it impossible to split up the scheduler lock into something more fine grained. Tested by phessler@, ok mpi@
2023-09-04 | Claudio Jeker | Protect ps_single, ps_singlecnt and ps_threadcnt by the process mutex.
The single thread API needs to lock the process to enter single thread mode and does not need to stop the scheduler. This code changes ps_singlecount from a count down to zero to ps_singlecnt which counts up until equal to ps_threadcnt (in which case all threads are properly asleep). Tested by phessler@, OK mpi@ cheloha@
2023-08-16 | Claudio Jeker | Move SCHED_LOCK after sleep_signal_check.
sleep_signal_check() is there to look for pending signals / single thread requests which were posted before sleep_setup() finished. Once p_stat is set to SSLEEP the wakeup and delivery of signals is taken care of by ptsignal and single_thread_set(). Moving the SCHED_LOCK further down allows to cleanup cursig() and to remove a SCHED_LOCK recursion in single_thread_check(). OK mpi@
2023-08-13 | Claudio Jeker | Fix P_WSLEEP handling when continuing SSTOP-ed processes
When continuing a process on the sleep queue just let it switch to p_stat = SSLEEP even when P_WSLEEP is set. Once a proc is SSTOP-ed in sleep_finish() a valid sleep point has been reached and there is no need to make the process runnable again (which results in some hairy race conditions). Instead simply clear P_WSLEEP since a stopped proc reached the sleep state and there is no race with wakeup() anymore. OK mpi@
2023-08-11 | Claudio Jeker | Move the single_thread_check() to the start of userret().
This way threads stopped by SINGLE_SUSPEND will check for pending signals right after being released instead of returning to userland first. The same order of check is already used in sleep_signal_check(). OK mpi@
2023-07-14 | Claudio Jeker | struct sleep_state is no longer used, remove it.
Also remove the priority argument to sleep_finish() the code can use the p_flag P_SINTR flag to know if the signal check is needed or not. OK cheloha@ kettenis@ mpi@
2023-07-11 | Claudio Jeker | Rework sleep_setup()/sleep_finish() to no longer hold the scheduler lock
between calls. Instead of forcing an atomic operation across multiple calls use a three step transaction.
1. setup sleep state by calling sleep_setup()
2. recheck sleep condition to ensure that the event did not fire before sleep_setup() registered the proc onto the sleep queue
3. call sleep_finish() to either sleep or keep on running based on the step 2 outcome and any possible signal delivery
To make this work wakeup from signals, single thread api and wakeup(9) need to be aware if a process is between step 1 and step 3 so that the process is not enqueued back onto the runqueue while going to sleep. Introduce the p_flag P_WSLEEP to detect this situation. On top of this remove the spl dance in msleep() which is no longer required. It is ok to process interrupts between step 1 and 3. OK mpi@ cheloha@
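The three step transaction above can be modelled with a small single-threaded toy in C. Everything here is hypothetical (struct toy_proc, the toy_ functions, the wsleep and woken fields) and it ignores locking, sleep queues, and signals entirely. It only illustrates why a marker like P_WSLEEP is useful: a wakeup that arrives between step 1 and step 3 is recorded instead of touching the run queue.

```c
#include <stdbool.h>

enum toy_state { TOY_RUNNING, TOY_SLEEPING };

struct toy_proc {
	enum toy_state state;
	bool wsleep;	/* stands in for P_WSLEEP: between step 1 and step 3 */
	bool woken;	/* a wakeup arrived while wsleep was set */
};

/* Step 1: register the intent to sleep. */
void
toy_sleep_setup(struct toy_proc *p)
{
	p->wsleep = true;
	p->woken = false;
}

/* A wakeup from another context (wakeup, signal, single thread request). */
void
toy_wakeup(struct toy_proc *p)
{
	if (p->wsleep)
		p->woken = true;	/* not asleep yet: just cancel the sleep */
	else if (p->state == TOY_SLEEPING)
		p->state = TOY_RUNNING;	/* really asleep: make it runnable */
}

/*
 * Step 3: sleep only if the step 2 recheck said so and no wakeup raced in.
 * Returns true if the proc went to sleep.
 */
bool
toy_sleep_finish(struct toy_proc *p, bool do_sleep)
{
	p->wsleep = false;
	if (do_sleep && !p->woken) {
		p->state = TOY_SLEEPING;
		return true;
	}
	return false;
}
```

In this model step 2 is whatever the caller does between toy_sleep_setup() and toy_sleep_finish() to compute do_sleep; a toy_wakeup() call in that window makes toy_sleep_finish() return false and leaves the proc running.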
2023-07-10 | Theo de Raadt | Allow unveiled programs to dump core (in the default, classic, into . way)
by passing BYPASSUNVEIL just for this vnode. The coredump() code is quite careful, so this will be fine. ok kn kettenis semarie
2023-06-28 | Claudio Jeker | First step at removing struct sleep_state.
Pass the timeout and sleep priority not only to sleep_setup() but also to sleep_finish(). With that sls_timeout and sls_catch can be removed from struct sleep_state. The timeout is now setup first thing in sleep_finish() and no longer as last thing in sleep_setup(). This should not cause a noticeable difference since the code run between sleep_setup() and sleep_finish() is minimal. OK kettenis@
2023-04-03 | Claudio Jeker | Reduce indent in single_thread_check_locked() by inverting initial
if () check which just returns. OK mpi@
2023-02-10 | Visa Hankala | Adjust knote(9) API
Make knote(9) lock the knote list internally, and add knote_locked(9) for the typical situation where the list is already locked. Remove the KNOTE(9) macro to simplify the API. Manual page OK jmc@ OK mpi@ mvs@
2023-01-31 | Theo de Raadt | On systems without xonly mmu hardware-enforcement, we can still mitigate
against classic BROP with a range-checking wrapper in front of copyin() and copyinstr() which ensures the userland source doesn't overlap the main program text, ld.so text, signal tramp text (its mapping is hard to distinguish so it comes along for the ride), or libc.so text. ld.so tells the kernel libc.so text range with msyscall(2). The range checking for 2-4 elements is done without locking (because all 4 ranges are immutable!) and is inexpensive. write(sock, &open, 400) now fails with EFAULT. No programs have been discovered which require reading their own text segments with a system call. On a machine without mmu enforcement, a test program reports the following:
                userland    kernel
ld.so           readable    unreadable
mmap xz         unreadable  unreadable
mmap x          readable    readable
mmap nrx        readable    readable
mmap nwx        readable    readable
mmap xnwx       readable    readable
main            readable    unreadable
libc unmapped?  readable    unreadable
libc mapped     readable    unreadable
ok kettenis, additional help from miod
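The kind of overlap test such a wrapper performs can be sketched in plain C. This is an assumption-laden illustration, not the kernel implementation: struct toy_range, toy_xonly_check(), and the half-open range convention are all hypothetical. The caller would run it before copying from a userland address and skip the copy on a nonzero result.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* A protected text range, half-open: [start, end). Hypothetical type. */
struct toy_range {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Return EFAULT if the source buffer [addr, addr + len) overlaps any of
 * the n protected ranges, 0 otherwise. No locking is needed as long as
 * the ranges never change after they are set up.
 */
int
toy_xonly_check(const struct toy_range *r, size_t n, uintptr_t addr,
    size_t len)
{
	uintptr_t end = addr + len;
	size_t i;

	if (len == 0)
		return 0;		/* nothing would be read */
	if (end < addr)
		return EFAULT;		/* address arithmetic wrapped */
	for (i = 0; i < n; i++) {
		if (addr < r[i].end && end > r[i].start)
			return EFAULT;	/* source touches protected text */
	}
	return 0;
}
```

With at most four ranges the loop is a handful of comparisons, which matches the commit's remark that the check is inexpensive.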
2023-01-02 | Philip Guenther | Add tfind_user(), for getting a proc* given a user-space TID and
the process* that it should be part of. Use that in clock_get{time,res}(), thrkill(), and ptrace(). ok jca@ miod@ mpi@ mvs@