|
Bring these values in sync with the `tid' builtin which already includes
the offset. This is necessary to build scripts comparing them, like:
tracepoint:sched:enqueue
{
	@ts[arg0] = nsecs;
}

tracepoint:sched:on__cpu
/@ts[tid]/
{
	latency = nsecs - @ts[tid];
}
Discussed with and ok bluhm@
|
|
"makes perfect sense to me" chris
ok gnezdo jca
|
|
|
|
- Move the "hack" involving P_SINTR to avoid grabbing the SCHED_LOCK()
recursively closer to where it is necessary, in proc_stop()
- Introduce proc_unstop(), the symmetric routine to proc_stop(), which
manipulates `ps_xsig', and use it whenever a SSTOPed thread needs to be
awakened.
- Manipulate `ps_xsig' only in proc_stop/unstop()
ok kettenis@
|
|
Removed some trailing whitespace while there.
ok gkoehler@
|
|
This creates too many false positives when setting pool_debug=2.
Prodded by deraadt@, ok mvs@
|
|
mitigation the algorithm was still accounting for the offline CPUs, leading to
a code path that would never be reached.
This should allow better frequency scaling on systems with many CPUs.
The frequency should scale up if one of two conditions is true:
- at least one CPU has less than 25% idle CPU time
- the average idle time across all CPUs is under 33%
The second condition was never met because offline CPUs are always accounted
as 100% idle.
A bit more explanation about the auto scaling in case someone wants to improve
this later: when either condition is met, the CPU frequency is set to maximum
and a counter is set to 5; the function then runs again 100ms later and
decrements the counter if neither condition is met anymore. Once the counter
reaches 0 the frequency is set to minimum. This means that it can take up to
100ms to scale up and up to 500ms to scale down.
ok brynet@
looks good tedu@
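A hedged, self-contained sketch of the scaling decision described above; the
function and variable names are made up, and the real code works on the
kernel's per-CPU idle statistics rather than a plain array.

#include <stdio.h>

#define SCALE_HOLD	5	/* 5 ticks of 100ms: up to 500ms to scale down */

/*
 * Decide the next frequency setting from the idle percentages of the
 * online CPUs; returns 100 (maximum) or 0 (minimum).  Hypothetical code.
 */
static int
perf_decision(const int *idle_pct, int ncpu_online)
{
	static int hold;	/* countdown before dropping to minimum */
	int i, sum = 0, busy = 0;

	for (i = 0; i < ncpu_online; i++) {
		sum += idle_pct[i];
		if (idle_pct[i] < 25)
			busy = 1;	/* condition 1: a CPU is nearly saturated */
	}
	if (busy || sum < 33 * ncpu_online) {	/* condition 2: average idle < 33% */
		hold = SCALE_HOLD;
		return 100;
	}
	if (hold > 0 && --hold > 0)
		return 100;	/* keep the maximum until the counter runs out */
	return 0;
}

int
main(void)
{
	int idle[4] = { 90, 20, 80, 85 };	/* CPU 1 is busy */

	/* in the kernel this would run every 100ms from a timeout */
	printf("setperf = %d\n", perf_decision(idle, 4));
	return 0;
}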
|
|
Using different fields to remember in which runqueue or sleepqueue
threads currently are will make it easier to split the SCHED_LOCK().
With this change, the (potentially boosted) sleeping priority no longer
overwrites the thread priority. This lets us get rid of the logic required
to synchronize `p_priority' with `p_usrpri'.
Tested by many, ok visa@
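A tiny hedged sketch of the idea of separate fields; the names below are
hypothetical stand-ins, not necessarily the committed ones.

/*
 * Hypothetical illustration: one field per role instead of a single
 * overloaded p_priority that had to be kept in sync with p_usrpri.
 */
struct thread_prio {
	unsigned char	p_usrpri;	/* priority computed from estcpu/nice */
	unsigned char	p_runpri;	/* priority while sitting on a runqueue */
	unsigned char	p_slppri;	/* (possibly boosted) sleeping priority */
};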
|
|
The design is fairly simple: events, in the form of descriptors on a
ring, are being produced in any kernel context and being consumed by
a userland process reading /dev/dt.
Code and hooks are all guarded under '#if NDT > 0' so this commit
shouldn't introduce any change as long as dt(4) is disabled in GENERIC.
ok kettenis@, visa@, jasper@, deraadt@
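A hedged sketch of the consumer side of that design: a userland process reads
fixed-size event records off the ring via /dev/dt. The record layout below
(struct dt_record) is made up, and the ioctls needed to select probes and
start recording are omitted.

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

struct dt_record {			/* hypothetical event layout */
	unsigned int		dr_probe;	/* probe number */
	unsigned long long	dr_tid;		/* thread id */
	unsigned long long	dr_nsecs;	/* timestamp */
};

int
main(void)
{
	struct dt_record r;
	int fd = open("/dev/dt", O_RDONLY);

	if (fd == -1)
		return 1;
	/* events produced in any kernel context show up here, in order */
	while (read(fd, &r, sizeof(r)) == (ssize_t)sizeof(r))
		printf("probe %u tid %llu at %llu ns\n",
		    r.dr_probe, r.dr_tid, r.dr_nsecs);
	close(fd);
	return 0;
}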
|
|
Convert those to a consolidated status when needed in wait4(), kevent(),
and sysctl()
Pass exit code and signal separately to exit1()
(This also serves as prep for adding waitid(2))
ok mpi@
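For context, a minimal sketch of what a "consolidated status" is: the separate
exit code and termination signal are folded into the single status word that
WEXITSTATUS()/WTERMSIG() decode. The helper name is made up; the encoding is
the traditional wait(2) one.

#include <sys/wait.h>
#include <assert.h>
#include <signal.h>

/*
 * Hypothetical helper: fold a separate exit code and signal into the
 * classic wait(2) status word.
 */
static int
consolidate_status(int exitcode, int signum)
{
	if (signum != 0)
		return signum;			/* terminated by a signal */
	return (exitcode & 0xff) << 8;		/* normal exit */
}

int
main(void)
{
	int st = consolidate_status(2, 0);

	assert(WIFEXITED(st) && WEXITSTATUS(st) == 2);
	st = consolidate_status(0, SIGTERM);
	assert(WIFSIGNALED(st) && WTERMSIG(st) == SIGTERM);
	return 0;
}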
|
|
The new way needs more thought.
|
|
This eliminates a forced context switch to the idle proc. In addition,
sched_exit() no longer needs to sum proc runtime because mi_switch()
will do it.
OK mpi@ a while ago
|
|
added to the runqueue of a CPU.
This fixes out-of-sync cases where the priority of a thread didn't reflect
the runqueue it was sitting in, leading to unnecessary context switches.
ok visa@
|
|
This refactoring will help future scheduler locking, in particular to
shrink the SCHED_LOCK().
No intended behavior change.
ok visa@
|
|
Changing the scheduling priority of a process happens rarely, so it isn't
strictly necessary to update the current priority of every thread instantly.
Moreover, resched_proc() isn't well suited to perform this action: it doesn't
consider the state of each thread, nor does it move threads to another
runqueue.
ok visa@
|
|
- `p_estcpu' and `p_usrpri' represent the priority and are now only set
in a single function.
- Call resched_proc() after updating the priority and stop calling it
from schedclock() since `spc_curpriority' should match curproc's priority.
- Rename updatepri() to match decay_cpu() and stop updating per-thread
member.
- Merge two resched_proc() calls into one inside setrunnable().
Tweak and ok visa@
|
|
It currently creates a lock ordering problem because SCHED_LOCK() is taken
by hardclock(). That means the "priorities" of a thread should be moved
out of the SCHED_LOCK() first in order to make progress.
Reported-by: syzbot+8e4863b3dde88eb706dc@syzkaller.appspotmail.com
via anton@ as well as by kettenis@
|
|
Note that hardclock(9) still increments p_{u,s,i}ticks without holding a
lock.
ok visa@, cheloha@
|
|
tick boundary of schedlock().
This reduces the contention on the SCHED_LOCK() when the current thread
is already spinning.
Prompted by deraadt@, ok visa@
|
|
objects that readers can access without locking. This provides a basis
for read-copy-update operations.
Readers access SMR-protected shared objects inside an SMR read-side
critical section, where sleeping is not allowed. To reclaim
an SMR-protected object, the writer has to ensure mutual exclusion of
other writers, remove the object's shared reference and wait until
read-side references cannot exist any longer. As an alternative to
waiting, the writer can schedule a callback that gets invoked when
reclamation is safe.
The mechanism relies on CPU quiescent states to determine when an
SMR-protected object is ready for reclamation.
The <sys/smr.h> header additionally provides an implementation of
singly- and doubly-linked lists that can be used together with SMR.
These lists allow lockless read access with a concurrent writer.
Discussed with many
OK mpi@ sashan@
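A hedged sketch of the reader/writer pattern described above, using the
interface from <sys/smr.h> as documented in smr_call(9); treat the exact
macro names and the surrounding locking as assumptions rather than a
drop-in example.

#include <sys/param.h>
#include <sys/malloc.h>
#include <sys/mutex.h>
#include <sys/smr.h>

struct entry {
	SMR_LIST_ENTRY(entry)	e_link;
	struct smr_entry	e_smr;
	int			e_key;
};

/* initialized at attach time with SMR_LIST_INIT() and mtx_init() */
SMR_LIST_HEAD(entry_head, entry);
struct entry_head	entry_list;
struct mutex		entry_mtx;

/* Reader: no lock taken, but no sleeping inside the critical section. */
int
entry_exists(int key)
{
	struct entry *e;
	int found = 0;

	smr_read_enter();
	SMR_LIST_FOREACH(e, &entry_list, e_link) {
		if (e->e_key == key) {
			found = 1;
			break;
		}
	}
	smr_read_leave();
	return found;
}

static void
entry_reclaim(void *arg)
{
	free(arg, M_DEVBUF, sizeof(struct entry));
}

/*
 * Writer: exclude other writers, unlink the object, then defer the free
 * until no read-side reference can exist any longer.
 */
void
entry_remove(struct entry *e)
{
	mtx_enter(&entry_mtx);
	SMR_LIST_REMOVE_LOCKED(e, e_link);
	mtx_leave(&entry_mtx);

	smr_init(&e->e_smr);
	smr_call(&e->e_smr, entry_reclaim, e);
}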
|
|
Idle threads are never placed on the runqueue so their priority doesn't
matter.
This fixes an accounting bug where top(1) would report a high CPU usage
for Idle threads of secondary CPUs right after booting. That's because
schedcpu() would give 100% CPU time to the Idle thread until "real"
threads get scheduled on the corresponding CPU.
Issue reported by bluhm@, ok visa@, kettenis@
|
|
ptsignal() has to be called with the kernel lock held. As ensuring the
locking in mi_switch() is not easy, and deferring the signaling using
the task API is not possible because of lock order issues in
mi_switch(), move the CPU time checking into a periodic timer where
the kernel can be locked without issues.
With this change, each process has a dedicated resource check timer.
The timer gets activated only when a CPU time limit is set. Because the
checking is not done as frequently as before, some precision is lost.
Use of timers adapted from FreeBSD.
OK tedu@
Reported-by: syzbot+2f5d62256e3280634623@syzkaller.appspotmail.com
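A hedged sketch of the mechanism: a per-process timeout(9) handler that
periodically compares consumed CPU time against the limit and signals the
process, running in a context where the kernel lock can be taken. The
container struct and its fields are made up for illustration.

#include <sys/param.h>
#include <sys/proc.h>
#include <sys/signalvar.h>
#include <sys/timeout.h>

struct cputime_check {			/* hypothetical container */
	struct timeout	cc_to;
	struct process	*cc_pr;
	time_t		cc_limit;	/* RLIMIT_CPU soft limit, in seconds */
	time_t		cc_secs;	/* CPU seconds consumed so far, updated
					   elsewhere from the accumulated runtime */
};

void
cputime_tick(void *arg)
{
	struct cputime_check *cc = arg;

	if (cc->cc_secs >= cc->cc_limit)
		prsignal(cc->cc_pr, SIGXCPU);	/* kernel lock is fine here */
	timeout_add_sec(&cc->cc_to, 1);		/* rearm: coarser than the old
						   per-tick check */
}

void
cputime_check_start(struct cputime_check *cc)
{
	/* armed only when a CPU time limit is actually set */
	timeout_set(&cc->cc_to, cputime_tick, cc);
	timeout_add_sec(&cc->cc_to, 1);
}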
|
|
ok visa@
|
|
The distinction between preempt() and yield() stays as it is useful
to know whether a thread decided to yield by itself or the kernel told
it to go away.
ok tedu@, guenther@
|
|
Calling sched_choosecpu() at this moment often results in moving the thread
to a different CPU. This does not help the scheduler and creates a domino
effect, resulting in kernel threads moving to other CPUs.
Tested by many without performance impact. Simon Mages measured a small
performance improvement and a smaller variance with an http proxy.
Discussed with kettenis@, ok martijn@, beck@, visa@
|
|
Recursions are currently known and marked as XXXSMP.
Please report any assert to bugs@
|
|
|
|
|
|
have any direct symbols used. Tested for indirect use by compiling
amd64/i386/sparc64 kernels.
ok tedu@ deraadt@
|
|
ok tedu@ deraadt@
|
|
situations where e.g. web browsing is CPU intensive but intermittently idle.
subject to further refinement and tuning.
|
|
ok doug tedu
|
|
MD code needs excess #ifndef SMALL_KERNEL
|
|
introduce a new sysctl, hw.perfpolicy, that governs the policy.
when set to anything other than manual, hw.setperf then becomes read only.
phessler was heading in this direction, but this is slightly different. :)
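A hedged example of reading the new knob programmatically; it assumes the
HW_PERFPOLICY MIB name from <sys/sysctl.h>. From the shell, `sysctl
hw.perfpolicy` does the same thing.

#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>

int
main(void)
{
	int mib[2] = { CTL_HW, HW_PERFPOLICY };
	char policy[32];
	size_t len = sizeof(policy);

	if (sysctl(mib, 2, policy, &len, NULL, 0) == -1)
		return 1;
	printf("hw.perfpolicy=%s\n", policy);	/* e.g. "auto" or "manual" */
	return 0;
}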
|
|
PS_{ZOMBIE,EMBRYO} on the process instead of peeking into the process's
thread data. This eliminates the need for the thread-level SDEAD state.
Change kvm_getprocs() (both the sysctl() and kvm backends) to report the
"most active" scheduler state for the process's threads.
tweaks kettenis@
feedback and ok matthew@
|
|
to the process's vmspace and filedescs. struct proc continues to
keep copies of the pointers, copying them on fork, clearing them
on exit, and (for vmspace) refreshing on exec.
Also, make uvm_swapout_threads() thread aware, eliminating p_swtime
in kernel.
particular testing by ajacoutot@ and sebastia@
|
|
ok matthew@ deraadt@
|
|
ok deraadt@
|
|
.h files to pull it in, if needed
ok tedu
|
|
ok blambert@ krw@ tedu@ miod@
|
|
of per-rthread. Handling of per-thread tick and runtime counters
inspired by how FreeBSD does it.
ok kettenis@
|
|
- move the P_TRACED and P_INEXEC flags, and p_oppid, p_ptmask, and
p_ptstat members from struct proc to struct process
- sort the PT_* requests into those that take a PID vs those that
can also take a TID
- stub in PT_GET_THREAD_FIRST and PT_GET_THREAD_NEXT
ok kettenis@
|
|
declared in .h files, not in each .c. Apply that rule to endtsleep(),
scheduler_start(), updatepri(), and realitexpire()
ok deraadt@ tedu@
|
|
biglock in mi_switch and just check if we're holding the biglock.
The idea is that the first entry point into the kernel uses KERNEL_PROC_LOCK
and recursive calls use KERNEL_LOCK. This assumption is violated in at
least one place and has been causing confusion for lots of people.
Initial bug report and analysis from Pedro.
kettenis@ beck@ oga@ thib@ dlg@ ok
|
|
into struct process.
ok tedu@ deraadt@
|
|
rwlock misuse. In particular, this commit makes the following
changes:
1. i386 and amd64 now count the number of active mutexes so that
assertwaitok(9) can detect attempts to sleep while holding a mutex.
2. i386 and amd64 check that we actually hold the mutex passed to
mtx_leave().
3. Calls to rw_exit*() now call rw_assert_{rd,wr}lock() as
appropriate.
ok krw@, oga@; "sounds good to me" deraadt@; assembly bits double
checked by pirofti@
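A toy, single-threaded illustration of points 1 and 2 above: count how many
mutexes are currently held so that an assertwaitok()-style check can catch
"about to sleep while holding a mutex", and so that leaving a mutex that is
not held trips an assertion. The real code tracks ownership per mutex and per
CPU; this is only the shape of the idea.

#include <assert.h>

static int mutexes_held;

static void
toy_mtx_enter(void)
{
	mutexes_held++;
}

static void
toy_mtx_leave(void)
{
	assert(mutexes_held > 0);	/* point 2: must actually hold one */
	mutexes_held--;
}

static void
toy_assertwaitok(void)
{
	assert(mutexes_held == 0);	/* point 1: no sleeping with a mutex held */
}

int
main(void)
{
	toy_mtx_enter();
	toy_mtx_leave();
	toy_assertwaitok();		/* passes: nothing held anymore */
	return 0;
}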
|
|
|
|
|
|
- Split up choosing of cpu between fork and "normal" cases. Fork is
very different and should be treated as such.
- Instead of implicitly choosing a cpu in setrunqueue, do it outside
where it actually makes sense.
- Just because a cpu is marked as idle doesn't mean it will still be idle soon.
There could be a thundering herd effect if we call wakeup from an
interrupt handler, so subtract cpus with queued processes when
deciding which cpu is actually idle.
- some simplifications allowed by the above.
kettenis@ ok (except one bugfix that was not in the initial diff)
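A self-contained toy illustration of the "subtract cpus with queued
processes" step; the bitmasks and names are made up, the kernel uses its own
cpuset type for this.

#include <stdio.h>

int
main(void)
{
	unsigned int idle_cpus   = 0x0f;	/* CPUs 0-3 flagged idle */
	unsigned int queued_cpus = 0x05;	/* CPUs 0 and 2 already have
						   queued processes */
	unsigned int candidates = idle_cpus & ~queued_cpus;

	if (candidates == 0)
		printf("no truly idle CPU, keep the thread where it is\n");
	else
		printf("first truly idle CPU: %d\n", __builtin_ctz(candidates));
	return 0;
}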
|
|
- Split up run queues so that every cpu has one.
- Make setrunqueue choose the cpu where we want to make this process
runnable (this should be refined and less brutal in the future).
- When choosing the cpu where we want to run, make some kind of educated
guess where it will be best to run (very naive right now).
Other:
- Set operations for sets of cpus.
- load average calculations per cpu.
- sched_is_idle() -> curcpu_is_idle()
tested, debugged and prodded by many@
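A hedged, self-contained sketch of a per-CPU run queue of the kind described
above: an array of priority queues plus a bitmap of the non-empty ones, so
picking the next thread is a find-first-set away. Names and sizes are
illustrative only.

#include <sys/queue.h>
#include <stdio.h>

#define NQS	32			/* one queue per 4 priority levels */

struct thread {
	TAILQ_ENTRY(thread)	t_link;
	int			t_pri;	/* 0..127, lower is better */
};
TAILQ_HEAD(runq, thread);

struct percpu_sched {
	struct runq	spc_qs[NQS];
	unsigned int	spc_whichqs;	/* bit i set => spc_qs[i] is non-empty */
};

static void
setrunqueue_sketch(struct percpu_sched *spc, struct thread *t)
{
	int q = t->t_pri >> 2;

	TAILQ_INSERT_TAIL(&spc->spc_qs[q], t, t_link);
	spc->spc_whichqs |= 1U << q;
}

static struct thread *
choosethread_sketch(struct percpu_sched *spc)
{
	struct thread *t;
	int q;

	if (spc->spc_whichqs == 0)
		return NULL;			/* nothing runnable: go idle */
	q = __builtin_ctz(spc->spc_whichqs);	/* best non-empty queue */
	t = TAILQ_FIRST(&spc->spc_qs[q]);
	TAILQ_REMOVE(&spc->spc_qs[q], t, t_link);
	if (TAILQ_EMPTY(&spc->spc_qs[q]))
		spc->spc_whichqs &= ~(1U << q);
	return t;
}

int
main(void)
{
	struct percpu_sched spc;
	struct thread a = { .t_pri = 50 }, b = { .t_pri = 10 };
	int i;

	spc.spc_whichqs = 0;
	for (i = 0; i < NQS; i++)
		TAILQ_INIT(&spc.spc_qs[i]);
	setrunqueue_sketch(&spc, &a);
	setrunqueue_sketch(&spc, &b);
	printf("next to run: pri %d\n", choosethread_sketch(&spc)->t_pri);
	return 0;
}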
|