path: root/sys/kern
Age | Commit message | Author
2017-12-14  make sched_barrier use cond_wait/cond_signal.  (David Gwynne)
previously the code was using a percpu flag to manage the sleeps/wakeups, which means multiple threads waiting for a barrier on a cpu could race. moving to a cond struct on the stack fixes this. while here, get rid of the sbar taskq and just use systqmp instead. the barrier tasks are short, so there's no real downside. ok mpi@
2017-12-14  Don't bother using DETACH_FORCE for the softraid luns at reboot  (Theo de Raadt)
time; the aggressive mountpoint destruction seems to hit insane use-after-frees when we are already far on the way down.
2017-12-14  Give vflush_vnode() a hint about vnodes we don't need to account as "busy".  (Theo de Raadt)
Change mountpoint to RDONLY a little later. Seems to improve the rw->ro transition a bit.
2017-12-14  i forgot to convert timeout_proc_barrier to cond_signal  (David Gwynne)
2017-12-14  replace the bare sleep state handling in barriers with wait cond code  (David Gwynne)
2017-12-14  add code to provide simple wait condition handling.  (David Gwynne)
this will be used to replace the bare sleep_state handling in a bunch of places, starting with the barriers.
2017-12-12  sync  (Theo de Raadt)
2017-12-12  pledge()'s 2nd argument becomes char *execpromises, which becomes the  (Theo de Raadt)
pledge for a new execve image immediately upon start. Also introduces "error", which makes violations return -1 ENOSYS instead of killing the program. ("error" may not be handed to a setuid/setgid program, which may be missing/ignoring syscall return values and would continue with inconsistent state.) Discussion with many; florian has used this to improve the strictness of a daemon.
2017-12-11  Format the vnode lists of ddb show mount properly in columns.  (Alexander Bluhm)
OK krw@
2017-12-11  In uvm Chuck decided backing store would not be allocated proactively  (Theo de Raadt)
for blocks re-fetchable from the filesystem. However at reboot time, filesystems are unmounted, and since processes lack backing store they are killed. Since the scheduler is still running, in some cases init is killed... which drops us to ddb [noted by bluhm]. Solution is to convert filesystems to read-only [proposed by kettenis]. The tale follows: sys_reboot() should pass proc * to MD boot() to vfs_shutdown() which completes current IO with vfs_busy VB_WRITE|VB_WAIT, then calls VFS_MOUNT() with MNT_UPDATE | MNT_RDONLY, soon teaching us that *fs_mount() calls a copyin() late... so store the sizes in vfsconflist[] and move the copyin() to sys_mount()... and notice nfs_mount copyin() is size-variant, so kill legacy struct nfs_args3. Next we learn ffs_mount()'s MNT_UPDATE code is sharp and rusty especially wrt softdep, so fix some bugs and add ~MNT_SOFTDEP to the downgrade. Some vnodes need a little more help, so tie them to &dead_vnops. ffs_mount calling DIOCCACHESYNC is causing a bit of grief still, but this issue is separate and will be dealt with in time. couple hundred reboots by bluhm and myself, advice from guenther and others at the hut
2017-12-10  Move SB_SPLICE, SB_WAIT and SB_SEL to `sb_flags', serialized by solock().  (Martin Pieuchot)
SB_KNOTE remains the only bit set on `sb_flagsintr' as it is set/unset in contexts related to kqueue(2) where we'd like to avoid grabbing solock(). While here add some KERNEL_LOCK()/UNLOCK() dances around selwakeup() and csignal() to mark which remaining functions need to be addressed in the socket layer. ok visa@, bluhm@
2017-12-09  More precision in pledge sysctl report  (Theo de Raadt)
2017-12-04  Change __mp_lock_held() to work with an arbitrary CPU info structure and  (Martin Pieuchot)
extend ddb(4) "ps /o" output to print which CPU is currently holding the KERNEL_LOCK(). Tested by dhill@, ok visa@
2017-12-04  Use _kernel_lock_held() instead of __mp_lock_held(&kernel_lock).  (Martin Pieuchot)
ok visa@
2017-11-28  Raise the IPL of the sbar taskq to avoid lock order issues  (Visa Hankala)
with the kernel lock. Fixes a deadlock seen by Hrvoje Popovski and dhill@. OK mpi@, dhill@
2017-11-28  deadproc_mutex is only taken _before_ kernel_lock; exclude it from  (Philip Guenther)
WITNESS checking as (our) witness code isn't smart enough to let that by. ok visa@
2017-11-28  sync  (Philip Guenther)
2017-11-28  Delete fktrace(2). The consequences of it were not thought through  (Philip Guenther)
sufficiently and at least one horrific security hole was the result. ok deraadt@ beck@
2017-11-27  Fix comment typo  (Philip Guenther)
2017-11-24  add timeout_barrier, which is like intr_barrier and taskq_barrier.  (David Gwynne)
if you're trying to free something that a timeout is using, you have to wait for that timeout to finish running before doing the free. timeout_del can stop a timeout from running in the future, but it doesn't know if a timeout has finished being scheduled and is now running. previously you could know that timeouts are not running by simply masking softclock interrupts on the cpu running the kernel. however, code is now running outside the kernel lock, and timeouts can run in a thread instead of softclock. timeout_barrier solves the first problem by taking the kernel lock and then masking softclock interrupts. that is enough to ensure that any further timeout processing is waiting for those resources to run again. the second problem is solved by having timeout_barrier insert work into the thread. when that work runs, that means all previous work running in that thread has completed. fixes and ok visa@, who thinks this will be useful for his work too.
2017-11-23  Constify protocol tables and remove an assert now that ip_deliver() is  (Martin Pieuchot)
mp-safe. ok bluhm@, visa@
2017-11-23  We want `sb_flags' to be protected by the socket lock rather than the  (Martin Pieuchot)
KERNEL_LOCK(), so change asserts accordingly. This is now possible since sblock()/sbunlock() are always called with the socket lock held. ok bluhm@, visa@
2017-11-17  permit IPV6_V6ONLY in sockopt  (Aaron Bieber)
OK deraadt@
2017-11-14  Push the NET_LOCK into ifioctl() and use the NET_RLOCK in ifioctl_get().  (Theo Buehler)
In particular, this allows SIOCGIF* requests to run in parallel. lots of help & ok mpi, ok visa, sashan
2017-11-14  Fix the initial check of the checkorder and lock operations  (Visa Hankala)
so that statically initialized locks get properly enrolled to the validator. OK mpi@
2017-11-14  remove MALLOC_DEBUG  (David Gwynne)
the code has rotted, and obviously hasn't been used for ages. it is also hard to make mpsafe. if we need something like this again it would be better to do it from scratch. ok tedu@ visa@
2017-11-13  add taskq_barrier  (David Gwynne)
taskq_barrier guarantees that any task that was running on the taskq has finished by the time taskq_barrier returns. it is similar to intr_barrier. this is needed for use in ifq_barrier as part of an upcoming change.
2017-11-04  raw_init() is dead and <net/raw_cb.h> doesn't need to be included there.  (Martin Pieuchot)
2017-11-04  Make it possible for multiple threads to enter kqueue_scan() in parallel.  (Martin Pieuchot)
This is a requirement to use a sleeping lock inside kqueue filters. It is now possible, but not recommended, to sleep inside ``f_event''. Threads iterating over the list of pending events are now recognizing and skipping other threads' markers. knote_acquire() and knote_release() must be used to "own" a knote to make sure no other thread is sleeping with a reference on it. Acquire and marker logic taken from DragonFly but the KERNEL_LOCK() is still serializing the execution of the kqueue code. This also enables the NET_LOCK() in socket filters. Tested by abieber@ & juanfra@, run by naddy@ in a bulk, ok visa@, bluhm@
2017-11-02  Move PRU_DETACH out of pr_usrreq into per proto pr_detach  (Florian Obser)
functions to pave way for more fine grained locking. Suggested by, comments & OK mpi
2017-10-30  Let witness(4) differentiate between taskq mutexes to avoid  (Visa Hankala)
reporting an error in a scenario like the following: 1. mtx_enter(&tqa->tq_mtx); 2. IRQ 3. mtx_enter(&tqb->tq_mtx); Found by Hrvoje Popovski, OK mpi@
2017-10-29  Move NET_{,UN}LOCK into individual slowtimo functions.  (Florian Obser)
Direction suggested by mpi OK mpi, visa
2017-10-24  Use membar_enter_after_atomic(9) and membar_exit_before_atomic(9).  (Martin Pieuchot)
Micro-optimization useful to x86 archs where the cmpxchg{q,l} instruction used by rw_enter(9) and rw_exit(9) already include an implicit memory barrier. From Mateusz Guzik, ok visa@, mikeb@, kettenis@
2017-10-17  Add a machine-independent implementation for the mplock.  (Visa Hankala)
This reduces code duplication and makes it easier to instrument lock primitives. The MI mplock uses the ticket lock code that has been in use on amd64, i386 and sparc64. These are the architectures that now switch to the MI code. The lock_machdep.c files are unhooked from the build but not removed yet, in case something goes wrong. OK mpi@, kettenis@
2017-10-17  Print the pid of the most recent program that failed to send a log  (Martin Pieuchot)
via sendsyslog(2) along with the corresponding errno. Helps when troubleshooting which program is triggering an error, like an overflow. ok bluhm@
2017-10-14  Split sys_ptrace() by request type:  (Philip Guenther)
- control operations: trace_me, attach, detach, step, kill, continue. Manipulate process relation/state or send a signal - kernel-state get/set: thread list, event mask, trace state. About the process and don't require target to be stopped, need copyin/out - user-state get/set: memory, register, window cookie. Often thread-specific, require target to be stopped, need copyin/out sys_ptrace() changes to handle request checking, copyin/out to kernel buffers with size check and zeroing, and dispatching to the routines above for the real work. This simplifies the permission checks and copyin/out handling and will simplify lock handling in the future. Inspired in part by FreeBSD. ok mpi@ visa@
2017-10-12  Print the word pledge in the kernel log when there is a violation.  (Alexander Bluhm)
This should make it easier to figure out what is going on. Note that the pledgecode it shows is only a guess at which pledge(2) promise might help. OK deraadt@ semarie@
2017-10-12  Use a temporary variable in rw_status() to dereference the  (Martin Pieuchot)
volatile member of the struct only once. Not forcing a memory read on every access (3 in this function) might reduce cache traffic in some cases. Micro-optimization and diff provided by Mateusz Guzik. ok visa@
2017-10-12  Move sysctl_mq() where it can safely mess with mbuf queue internals.  (Martin Pieuchot)
ok visa@, bluhm@, deraadt@
2017-10-11  Move `kq_count' increase/decrease close to the corresponding TAILQ_*  (Martin Pieuchot)
insert/remove operation. No functional change for the moment. However this helps to make this code mp-safe. Note that markers are still not, and won't be, counted. ok visa@, jsing@, bluhm@
2017-10-11  Move kq_kev from struct kqueue to the stack.  (Martin Pieuchot)
This makes the set of events per-thread without having to lock anything. From Dragonfly 10f6680a4f6684751aaae0965abfe140f19e9231 ok kettenis@, visa@, bluhm@
2017-10-09  Reduces the scope of the NET_LOCK() in sysctl(2) path.  (Martin Pieuchot)
Exposes per-CPU counters to real parallelism. ok visa@, bluhm@, jca@
2017-10-09  Make _kernel_lock_held() always succeed after panic(9).  (Martin Pieuchot)
ok visa@
2017-10-07  In "tty", permitting TIOCSTART is fine  (Theo de Raadt)
2017-10-07  permit SYS___set_tcb, upcoming code will require this  (Theo de Raadt)
2017-09-29  New ddb(4) command: kill.  (Martin Pieuchot)
Send an uncatchable SIGABRT to the process specified by the pid argument. Useful in case of CPU exhaustion to kill the DoSing process and generate a core for later inspection. ok phessler@, visa@, kettenis@, miod@
2017-09-27  guenther sleep-committed the version without #ifdefs  (Theo de Raadt)
2017-09-27  amd64 needs FS.base values (the TCB pointer) to be validated, as noncanonical  (Philip Guenther)
addresses will cause a fault on load by the kernel. Problem observed by Maxime Villard ok kettenis@ deraadt@
2017-09-25  sendsyslog should take a const char * everywhere.  (Marc Espie)
okay bluhm@, deraadt@
2017-09-15  Coverity complains that top == NULL was checked and further down  (Alexander Bluhm)
top->m_pkthdr.len was accessed without check. See CID 1452933. In fact top cannot be NULL there and the condition was always false. m_getuio() never reserved space for the header. The correct check is m == top to find the first mbuf. OK visa@