src - OpenBSD base system

Age	Commit message	Author
3 days	Do not clear P_WSLEEP in ptsignal's SIGCONT handling. cursig() no longer	Claudio Jeker
	stops threads while called during the sleep transition and so there is no need to clear P_WSLEEP. OK mpi@
7 days	No need to call unsleep() if p_wchan is NULL.	Claudio Jeker
	OK mpi@
11 days	Adjust deep logic in cursig() to handle sig_stop specially.	Claudio Jeker
	If any other signal is pending, the stop signal should be deferred. Now cursig() uses ffs() to select the signal and so higher numbered signals like SIGUSR1 would be ignored when going to sleep. So handle default stop signals specially in the deep case, stash them and only use them if no other signal is pending. Fix for signal-stress regress (problem reported by anton@). With and OK mpi@
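The selection rule described in this commit can be sketched in userland C. This is a minimal model, not the kernel's cursig(): `sig_bit()` and `pick_signal()` are hypothetical names, and the pending set is assumed to be a plain 32-bit mask.

```c
#include <assert.h>
#include <strings.h>	/* ffs() */

/* Hypothetical helper: mask bit for signal number sig (1-based). */
static unsigned int
sig_bit(int sig)
{
	assert(sig >= 1 && sig <= 32);
	return 1U << (sig - 1);
}

/*
 * Pick the signal to act on from a pending set.
 * stopmask marks the signals whose default action stops the process.
 * Returns 0 if nothing is pending.
 */
static int
pick_signal(unsigned int pending, unsigned int stopmask)
{
	int stashed = 0;

	while (pending != 0) {
		int sig = ffs((int)pending);	/* lowest numbered pending signal */
		unsigned int bit = sig_bit(sig);

		pending &= ~bit;
		if (stopmask & bit) {
			if (stashed == 0)
				stashed = sig;	/* remember it, keep looking */
			continue;
		}
		return sig;	/* any non-stop signal wins over a stop signal */
	}
	return stashed;	/* only stop signals, or nothing, were pending */
}
```

With a stop signal numbered 17 and another signal numbered 30 both pending, `pick_signal()` returns 30; the stashed 17 is only returned when it is the sole pending signal.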
11 days	Remove unreachable check for orphaned process groups in cursig.	Claudio Jeker
	setsigctx() now does this check, clears sig_stop in that case and sets sig_ignore instead. So the check in cursig that is based on sig_stop can never be true. OK mpi@
2024-11-06	Factor out the ptrace trap into proc_trap() and simplify the signal	Claudio Jeker
	delivery in cursig() a lot since most of that is no longer needed. On top of this properly handle sending a blocked signal from gdb to the debugged process by putting the signal into the proc p_siglist. OK kettenis@
2024-11-05	Unlock ptsignal by using the ps_mtx instead of KERNEL_LOCK to ensure	Claudio Jeker
	the process is not modified during signal delivery. This also unlocks psignal and prsignal since those are simple wrappers around ptsignal. OK mpi@
2024-11-05	remove VATTR_NULL() define, directly call vattr_null()	Jonathan Gray
	There used to be a predefined null vattr for !DIAGNOSTIC but that was removed in vnode.h rev 1.84 in 2007. ok semarie@ miod@
2024-11-04	Properly handle stop signals in cursig if deep.	Claudio Jeker
	In setsigctx() set sig_stop to 1 if the process should be stopped. In cursig() still return early if deep but then in sleep_signal_check() use this information to call proc_stop and stop the proc. This should fix the problem in the waitid regress test. OK mpi@
2024-10-22	Protect the ps_pgrp pointer by either the KERNEL_LOCK or the ps_mtx.	Claudio Jeker
	This should be enough to be on the safe side when unlocking ptsignal where a pr->ps_pgrp->pg_jobc == 0 check happens. OK mpi@ kettenis@
2024-10-17	Shortcut cursig when called during sleep setup.	Claudio Jeker
	Add deep flag as function argument which is used by the sleep API but nowhere else. Both calls to sleep_signal_check() should skip the ugly bits of cursig(). In cursig() if deep once it is clear a signal will be taken keep the signal on the thread siglist and return. sleep_signal_check() will then return EINTR or ERESTART based on the signal context. There is no reason to do more in this special case. Especially stop/cont and the ptrace trap must be skipped here. Once the call makes it to userret the signal will be picked up again and handled in a safe location. Stopping signals need some additional logic since we don't want to abort the sleep just to stop a process. Since our SIGSTOP handling requires a major rewrite this will be postponed until then. OK mpi@
2024-10-15	Indicate that a process has stopped by setting PS_STOPPED flag	Claudio Jeker
	The checks in dowait6 and orphanpg using ps_mainproc are flawed and fail if the mainproc called pthread_exit before the other threads. Adding the flag in proc_stop_sweep is racy but the best we have right now. This fixes regress/sys/kern/signal/sig-stop3. OK mpi@
2024-10-09	Clear ps_xsig when continuing after a PS_TRACED stop.	Claudio Jeker
	Also remove the ps_xsig handling in setrunnable(); it is in the wrong spot and causes signals to be delivered over and over again. Attaching to an already stopped process is affected by this. The SIGSTOP sent by ptrace is now ignored in ptsignal() and as a result gdb will hang in wait4() until a SIGCONT is delivered to the process. After that all works as usual. OK mpi@
2024-10-09	Convert prsignal() into a real function	Claudio Jeker
	Also do not use ps_mainproc as the thread the signal is sent to. Sending a signal to ps_mainproc may not work reliably if it already exited. Use TAILQ_FIRST(&pr->ps_threads) instead but first check that the process has not yet entered exit1(). OK mpi@
2024-10-01	Adjust ptrace interface to properly support single threaded continue.	Claudio Jeker
	Introduce P_TRACESINGLE flag to instruct the trapped thread to not wakeup the other threads (via single_thread_clear). This must be done like this since ptrace must wake just the single thread to ensure it runs first and gets the ps_xsig value from ptrace. Modern gdb depends on this for multi-threaded processes, when a breakpoint is hit gdb fixes up the trapping instruction and then single steps over it with only that thread. After that single step gdb continues with all threads. If all threads are run like now it is possible that one of the other threads hits a breakpoint before the single step is done which results in an assertion in gdb (because that is not expected). OK mpi@
2024-08-10	spelling; ok claudio@	Jonathan Gray

2024-08-06	Stop using KERNEL_LOCK to protect the per process kqueue list	Claudio Jeker
	Instead of the KERNEL_LOCK use the ps_mtx for most operations. If the ps_klist is modified an additional global rwlock (kqueue_ps_list_lock) is required. This includes the knotes with NOTE_FORK and NOTE_EXIT since in either case a ps_klist is changed. In the NOTE_FORK | NOTE_TRACK case the call to kqueue_register() can sleep, which is why a global rwlock is used. Adjust the reaper() to call knote_processexit() without KERNEL_LOCK. Double lock idea from visa@ OK mvs@
2024-07-29	Move the signal related kqueue filters to kern_event.c.	Claudio Jeker
	Since proc and signal filters share the same klist it makes sense to keep them together. OK mvs@
2024-07-29	Replace per thread P_CONTINUED with per process PS_CONTINUED flag	Claudio Jeker
	dowait6() can only look at per process state so switch this over. Right now SIGCONT handling in ptsignal is recursive and not quite right but this is a step in the right direction. It fixes dowait6() handling for multithreaded processes where the main thread exited. OK mpi@
2024-07-24	KASSERT that the ps_single proc has P_SUSPSINGLE cleared.	Claudio Jeker
	Requested by kettenis@ and guenther@
2024-07-22	Rename PS_STOPPED to PS_STOPPING. I want to use PS_STOPPED to indicate	Claudio Jeker
	that a process has been stopped so make room for that. OK kettenis@
2024-07-10	Kill the runfast and run label and inline those bits. No functional change.	Claudio Jeker
	OK mpi@
2024-07-09	Reshuffle the switch cases in ptsignal and single_thread_set to be	Claudio Jeker
	in the order needed for future changes. No functional change. OK mpi@
2024-06-03	Remove the now unused s argument to SCHED_LOCK and SCHED_UNLOCK.	Claudio Jeker
	The SPL level is now tracked by the mutex and we no longer need to track this in the callers. OK miod@ mlarkin@ tb@ jca@
2024-05-22	In the big p_stat switch in ptsignal do not call return but instead	Claudio Jeker
	use one of the gotos. In this case goto out with mask and prop set to 0. OK jca@
2024-05-20	Rework interaction between sleep API and exit1() and start unlocking ps_threads	Claudio Jeker
	This diff adjusts how single_thread_set() accounts the threads by using ps_threadcnt as initial value and counting all threads out that are already parked. In single_thread_check call exit1() before decreasing ps_singlecount; this is now done in exit1(). exit1() and thread_fork() ensure that ps_threadcnt is updated with the pr->ps_mtx held and in exit1() also account for exiting threads since exit1() can sleep. OK mpi@
2024-05-08	Rework how action SIG_HOLD is handled in ptsignal.	Claudio Jeker
	Since we want to unlock sigsuspend, ptsignal needs to double check in the SSLEEP case that the signal being delivered is still masked or unmasked. Remove the early return for action SIG_HOLD so that the SSLEEP case can properly recheck the sigmask. On top of this update siglist only in one place at the end of ptsignal; this now includes the clearing of signals for the SA_CONT and SA_STOP cases. OK mpi@
2024-05-07	In Rev 1.296 the update of the siglist was moved to the end of ptsignal().	Claudio Jeker
	One atomic_clearbits_int() hiding in SSTOP was missed when converting all the exceptions that cleared the siglist again. Instead of clearing the bits the mask needs to be set to 0 so that it is properly ignored. OK mpi@
2024-04-18	If a proc has P_WEXIT set do not stop it, let it exit since it is already	Claudio Jeker
	mostly dead. This is more like belts and suspenders since a proc in exit1() will not receive signals anymore and so proc_stop() should not be reachable. This is even the case when sigexit() is called and a coredump() is happening. OK mpi@
2024-04-10	Unlock dosigsuspend() and with that some aspects of ppoll and pselect	Claudio Jeker
	Change p_sigmask from atomic back to non-atomic updates. All changes to p_sigmask are only allowed by curproc (the owner). There is no need for atomic instructions here. p_sigmask is mostly accessed by curproc with the exception of ptsignal(). In ptsignal() p_sigmask is now only read once unless a SSLEEP proc gets the signal. In that case recheck the p_sigmask before wakeup to ensure that no unnecessary wakeup happens. Add some KASSERT(p == curproc) to ensure this precondition. sigabort() is special since it is also called by ddb but apart from that only works for curproc. With and OK mvs@ OK mpi@
2024-03-30	Prevent a recursion inside wakeup(9) when scheduler tracepoints are enabled.	Martin Pieuchot
	Tracepoints like "sched:enqueue" and "sched:unsleep" were called from inside the loop iterating over sleeping threads as part of wakeup_proc(). When such tracepoints were enabled they could result in another wakeup(9) possibly corrupting the sleepqueue. Rewrite wakeup(9) in two stages, first dequeue threads from the sleepqueue then call setrunnable() and possible tracepoints for each of them. This requires moving unsleep() outside of setrunnable() because it messes with the sleepqueue. ok claudio@
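The two-stage idea in this commit can be illustrated with a small userland sketch. It is only a model under stated assumptions: `struct sleeper`, its fields and `wakeup_two_stage()` are hypothetical, and a singly linked list stands in for the kernel's sleep queue.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a sleeping thread. */
struct sleeper {
	int chan;		/* wait channel the thread sleeps on */
	int runnable;		/* set once the thread has been woken */
	struct sleeper *next;
};

/*
 * Wake every sleeper on chan in two stages, so that the per-thread
 * work never runs while the queue is being traversed.
 * Returns the number of threads woken.
 */
static int
wakeup_two_stage(struct sleeper **queue, int chan)
{
	struct sleeper *woken = NULL, **tail = &woken, **pp = queue, *p;
	int n = 0;

	assert(queue != NULL);

	/* Stage 1: unlink all matching entries onto a private list. */
	while ((p = *pp) != NULL) {
		if (p->chan == chan) {
			*pp = p->next;
			p->next = NULL;
			*tail = p;
			tail = &p->next;
		} else
			pp = &p->next;
	}

	/* Stage 2: the queue is consistent again, do the per-thread work. */
	while ((p = woken) != NULL) {
		woken = p->next;
		p->next = NULL;
		p->runnable = 1;	/* models setrunnable() and tracepoints */
		n++;
	}
	return n;
}
```

Because stage 2 only touches the private list, anything it triggers, including a nested call to `wakeup_two_stage()`, sees a queue that is no longer being iterated.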
2024-02-25	New accounting flag ABTCFI to indicate signal SIGILL + code ILL_BTCFI	Theo de Raadt
	has occurred in the process. ok various people
2024-01-17	Fix core file writing when a file mapped into memory has later been truncated	Kurt Miller
	to be smaller than the mapping. Record which memory segments are backed by vnodes while walking the uvm map and later suppress EFAULT errors caused by the underlying file being truncated. okay miod@
2023-10-06	In sys___thrsigdivert() switch tsleep_nsec() to use the nowake ident	Claudio Jeker
	channel instead of inventing its own. OK kettenis@ mvs@
2023-09-29	Extend single_thread_set() mode with additional flag attributes.	Claudio Jeker
	The mode can now be or-ed with SINGLE_DEEP or SINGLE_NOWAIT to alter the behaviour of single_thread_set(). This allows explicit control of the SINGLE_DEEP behaviour. If SINGLE_DEEP is set the deep flag is passed to the initial check call and by that the check will error out instead of suspending (SINGLE_UNWIND) or exiting (SINGLE_EXIT). The SINGLE_DEEP flag is required in calls to single_thread_set() outside of userret. E.g. at the start of sys_execve because the proc is not allowed to call exit1() in that location. SINGLE_NOWAIT skips the wait at the end of single_thread_set() and therefore returns BEFORE all threads have been parked. Currently this is only used by the ptrace code and should not be used anywhere else. Not waiting for all threads to settle is asking for trouble. This solves an issue by using SINGLE_UNWIND in the coredump case where the code should actually exit in case another thread crashed moments earlier. Also the SINGLE_UNWIND in pledge_fail() is now marked SINGLE_DEEP since the call to pledge_fail() is for sure not at the kernel boundary. OK mpi@
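A mode with or-ed flag attributes is typically decoded by masking. The sketch below shows that pattern only; the numeric values, the SINGLE_MASK name, `struct single_req` and `decode_single_mode()` are assumptions made for illustration and are not taken from this commit.

```c
#include <assert.h>

/* Illustrative values only; the real definitions live in the kernel headers. */
#define SINGLE_SUSPEND	0x01
#define SINGLE_UNWIND	0x02
#define SINGLE_EXIT	0x03
#define SINGLE_MASK	0x0f	/* assumed: bits that hold the base mode */
#define SINGLE_DEEP	0x10
#define SINGLE_NOWAIT	0x20

/* Hypothetical decoded form of the flags argument. */
struct single_req {
	int mode;	/* SINGLE_SUSPEND, SINGLE_UNWIND or SINGLE_EXIT */
	int deep;	/* caller is not at the kernel boundary */
	int wait;	/* wait until all other threads are parked */
};

static struct single_req
decode_single_mode(int flags)
{
	struct single_req r;

	r.mode = flags & SINGLE_MASK;
	assert(r.mode == SINGLE_SUSPEND || r.mode == SINGLE_UNWIND ||
	    r.mode == SINGLE_EXIT);
	r.deep = (flags & SINGLE_DEEP) != 0;
	r.wait = (flags & SINGLE_NOWAIT) == 0;
	return r;
}
```

For example, `decode_single_mode(SINGLE_UNWIND | SINGLE_DEEP)` yields mode SINGLE_UNWIND with `deep` set and `wait` still enabled.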
2023-09-19	Before coredump or in pledge_fail use SINGLE_UNWIND to stop all threads.	Claudio Jeker
	SINGLE_UNWIND unwinds to the kernel boundary. On the other hand SINGLE_SUSPEND will sleep inside tsleep(9) and other sleep functions. Since the code will exit1() very soon after it is better to already unwind. Now one could argue that for coredumps all threads should stop asap to get a clean dump. Using SINGLE_UNWIND the sleep will fail with ERESTART and no copyout should happen in that case. This is a bit of a workaround since SINGLE_SUSPEND has a small race where single_thread_wait() returns before all threads are really stopped. When SINGLE_EXIT is called quickly after this can blow up inside sleep_finish. Reported-by: syzbot+3ef066fcfaf991f2ac2c@syzkaller.appspotmail.com OK mpi@ kettenis@
2023-09-13	Revert commitid: yfAefyNWibUyjkU2, ESyyH5EKxtrXGkS6 and itscfpFvJLOj8mHB;	Claudio Jeker
	The change to the single thread API results in crashes inside exit1() as found by Syzkaller. There seems to be a race in the exit codepath. What exactly fails is not really clear, therefore revert for now. This should fix the following Syzkaller reports:
	Reported-by: syzbot+38efb425eada701ca8bb@syzkaller.appspotmail.com
	Reported-by: syzbot+ecc0e8628b3db39b5b17@syzkaller.appspotmail.com
	and maybe more.
	Reverted commits:
	----------------------------
	Protect ps_single, ps_singlecnt and ps_threadcnt by the process mutex. The single thread API needs to lock the process to enter single thread mode and does not need to stop the scheduler. This code changes ps_singlecount from a count down to zero to ps_singlecnt which counts up until equal to ps_threadcnt (in which case all threads are properly asleep). Tested by phessler@, OK mpi@ cheloha@
	----------------------------
	Change how ps_threads and p_thr_link are locked away from using SCHED_LOCK. The per process thread list can be traversed (read) by holding either the KERNEL_LOCK or the per process ps_mtx (instead of SCHED_LOCK). Abusing the SCHED_LOCK for this makes it impossible to split up the scheduler lock into something more fine grained. Tested by phessler@, ok mpi@
	----------------------------
	Fix SCHED_LOCK() leak in single_thread_set() In the (q->p_flag & P_WEXIT) branch is a continue that did not release the SCHED_LOCK. Refactor the code a bit to simplify the places SCHED_LOCK is grabbed and released. Reported-by: syzbot+ea26d351acfad3bb3f15@syzkaller.appspotmail.com OK kettenis@
2023-09-09	Fix SCHED_LOCK() leak in single_thread_set()	Claudio Jeker
	In the (q->p_flag & P_WEXIT) branch is a continue that did not release the SCHED_LOCK. Refactor the code a bit to simplify the places SCHED_LOCK is grabbed and released. Reported-by: syzbot+ea26d351acfad3bb3f15@syzkaller.appspotmail.com OK kettenis@
2023-09-08	Change how ps_threads and p_thr_link are locked away from using SCHED_LOCK.	Claudio Jeker
	The per process thread list can be traversed (read) by holding either the KERNEL_LOCK or the per process ps_mtx (instead of SCHED_LOCK). Abusing the SCHED_LOCK for this makes it impossible to split up the scheduler lock into something more fine grained. Tested by phessler@, ok mpi@
2023-09-04	Protect ps_single, ps_singlecnt and ps_threadcnt by the process mutex.	Claudio Jeker
	The single thread API needs to lock the process to enter single thread mode and does not need to stop the scheduler. This code changes ps_singlecount from a count down to zero to ps_singlecnt which counts up until equal to ps_threadcnt (in which case all threads are properly asleep). Tested by phessler@, OK mpi@ cheloha@
2023-08-16	Move SCHED_LOCK after sleep_signal_check.	Claudio Jeker
	sleep_signal_check() is there to look for pending signals / single thread requests which were posted before sleep_setup() finished. Once p_stat is set to SSLEEP the wakeup and delivery of signals is taken care of by ptsignal and single_thread_set(). Moving the SCHED_LOCK further down allows cleaning up cursig() and removing a SCHED_LOCK recursion in single_thread_check(). OK mpi@
2023-08-13	Fix P_WSLEEP handling when continuing SSTOP-ed processes	Claudio Jeker
	When continuing a process on the sleep queue just let it switch to p_stat = SSLEEP even when P_WSLEEP is set. Once a proc is SSTOP-ed in sleep_finish() a valid sleep point has been reached and there is no need to make the process runnable again (which results in some hairy race conditions). Instead simply clear P_WSLEEP since a stopped proc reached the sleep state and there is no race with wakeup() anymore. OK mpi@
2023-08-11	Move the single_thread_check() to the start of userret().	Claudio Jeker
	This way threads stopped by SINGLE_SUSPEND will check for pending signals right after being released instead of returning to userland first. The same order of check is already used in sleep_signal_check(). OK mpi@
2023-07-14	struct sleep_state is no longer used, remove it.	Claudio Jeker
	Also remove the priority argument to sleep_finish(); the code can use the p_flag P_SINTR flag to know if the signal check is needed or not. OK cheloha@ kettenis@ mpi@
2023-07-11	Rework sleep_setup()/sleep_finish() to no longer hold the scheduler lock	Claudio Jeker
	between calls. Instead of forcing an atomic operation across multiple calls use a three step transaction.
	1. setup sleep state by calling sleep_setup()
	2. recheck sleep condition to ensure that the event did not fire before sleep_setup() registered the proc onto the sleep queue
	3. call sleep_finish() to either sleep or keep on running based on the step 2 outcome and any possible signal delivery
	To make this work wakeup from signals, single thread api and wakeup(9) need to be aware if a process is between step 1 and step 3 so that the process is not enqueued back onto the runqueue while going to sleep. Introduce the p_flag P_WSLEEP to detect this situation. On top of this remove the spl dance in msleep() which is no longer required. It is ok to process interrupts between step 1 and 3. OK mpi@ cheloha@
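The three step transaction and the role of a "going to sleep" flag can be modelled in a few lines of userland C. This is a deliberately simplified, single-threaded toy: `struct model_proc`, its fields and the `model_*()` functions are hypothetical and do not mirror the kernel's data structures or locking.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of one thread; field names are made up for this sketch. */
struct model_proc {
	int on_sleepq;	/* registered on the sleep queue */
	int wsleep;	/* between setup and finish (models P_WSLEEP) */
	int asleep;	/* actually sleeping */
	int enqueued;	/* put on the run queue by a wakeup */
};

/* Step 1: register on the sleep queue and open the transition window. */
static void
model_sleep_setup(struct model_proc *p)
{
	assert(p != NULL);
	p->on_sleepq = 1;
	p->wsleep = 1;
}

/* A wakeup from wakeup(9), a signal or the single thread API. */
static void
model_wakeup(struct model_proc *p)
{
	if (!p->on_sleepq)
		return;
	p->on_sleepq = 0;
	if (p->wsleep)
		return;		/* still running: leave the run queue alone */
	p->asleep = 0;
	p->enqueued = 1;
}

/*
 * Step 3: sleep only if the step 2 recheck asked for it (do_sleep)
 * and no wakeup raced in since step 1. Returns 1 if the thread slept.
 */
static int
model_sleep_finish(struct model_proc *p, int do_sleep)
{
	p->wsleep = 0;
	if (do_sleep && p->on_sleepq) {
		p->asleep = 1;
		return 1;
	}
	p->on_sleepq = 0;
	return 0;
}
```

In this model a wakeup that arrives between `model_sleep_setup()` and `model_sleep_finish()` only removes the thread from the sleep queue, so the finish step keeps running instead of sleeping and nothing is enqueued twice.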
2023-07-10	Allow unveiled programs to dump core (in the default, classic, into . way)	Theo de Raadt
	by passing BYPASSUNVEIL just for this vnode. The coredump() code is quite careful, so this will be fine. ok kn kettenis semarie
2023-06-28	First step at removing struct sleep_state.	Claudio Jeker
	Pass the timeout and sleep priority not only to sleep_setup() but also to sleep_finish(). With that sls_timeout and sls_catch can be removed from struct sleep_state. The timeout is now setup first thing in sleep_finish() and no longer as last thing in sleep_setup(). This should not cause a noticeable difference since the code run between sleep_setup() and sleep_finish() is minimal. OK kettenis@
2023-04-03	Reduce indent in single_thread_check_locked() by inverting initial	Claudio Jeker
	if () check which just returns. OK mpi@
2023-02-10	Adjust knote(9) API	Visa Hankala
	Make knote(9) lock the knote list internally, and add knote_locked(9) for the typical situation where the list is already locked. Remove the KNOTE(9) macro to simplify the API. Manual page OK jmc@ OK mpi@ mvs@
2023-01-31	On systems without xonly mmu hardware-enforcement, we can still mitigate	Theo de Raadt
	against classic BROP with a range-checking wrapper in front of copyin() and copyinstr() which ensures the userland source doesn't overlap the main program text, ld.so text, signal tramp text (its mapping is hard to distinguish so it comes along for the ride), or libc.so text. ld.so tells the kernel libc.so text range with msyscall(2). The range checking for 2-4 elements is done without locking (because all 4 ranges are immutable!) and is inexpensive. write(sock, &open, 400) now fails with EFAULT. No programs have been discovered which require reading their own text segments with a system call. On a machine without mmu enforcement, a test program reports the following:
	                 userland     kernel
	ld.so            readable     unreadable
	mmap xz          unreadable   unreadable
	mmap x           readable     readable
	mmap nrx         readable     readable
	mmap nwx         readable     readable
	mmap xnwx        readable     readable
	main             readable     unreadable
	libc unmapped?   readable     unreadable
	libc mapped      readable     unreadable
	ok kettenis, additional help from miod
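A range check of the kind this commit describes boils down to an interval overlap test against a handful of fixed ranges. The following is a minimal userland sketch, not the kernel wrapper: `struct xrange` and `overlaps_protected()` are hypothetical names, and ranges are assumed to be half-open, [start, end).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical protected text range, half-open: [start, end). */
struct xrange {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Return 1 if the source buffer [addr, addr + len) overlaps any of
 * the n protected ranges, 0 otherwise. A caller modelling the wrapper
 * would fail with EFAULT when this returns 1.
 */
static int
overlaps_protected(const struct xrange *r, size_t n, uintptr_t addr,
    size_t len)
{
	uintptr_t end;
	size_t i;

	assert(r != NULL || n == 0);
	if (len == 0)
		return 0;	/* an empty buffer touches nothing */
	end = addr + len;
	if (end < addr)
		return 1;	/* address wraparound, reject */
	for (i = 0; i < n; i++) {
		if (addr < r[i].end && r[i].start < end)
			return 1;
	}
	return 0;
}
```

With only a few ranges that never change after setup, a linear scan like this needs no locking, which matches the cost argument made in the commit message.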
2023-01-02	Add tfind_user(), for getting a proc* given a user-space TID and	Philip Guenther
	the process* that it should be part of. Use that in clock_get{time,res}(), thrkill(), and ptrace(). ok jca@ miod@ mpi@ mvs@