src - OpenBSD base system

Age	Commit message	Author
2021-06-04	regen	mvs

2021-06-04	Unlock connect(2). Again.	mvs
	ok mpi@
2021-06-02	Use the same logic in all copies of gpt_chk_mbr(), relaxing the	Kenneth R Westerback
	media length check to allow EFI GPT partitions to be smaller than the entire disk. Consistently use GPTSECTOR instead of randomly tossing in some literal '1's. ok kettenis@
2021-06-02	Enable pool cache on knote pool	Visa Hankala
	Use the pool cache to reduce the overhead of memory management in function kqueue_register(). When EV_ADD is given, kqueue_register() pre-allocates a knote to avoid potential sleeping in the middle of the critical section that spans from knote lookup to insertion. However, the pre-allocation is useless if the lookup finds a matching knote. The cost of knote allocation will become significant with kqueue-based poll(2) and select(2) because the frequency of allocation will increase. Most of the cost appears to come from the locking inside the pool. The pool cache amortizes it by using CPU-local caches of free knotes as buffers. OK dlg@ mpi@
2021-06-02	regen	mvs

2021-06-02	Unlock setrtable(2). Local copy of `ps_rtableid' used to make checks	mvs
	consistent. ok mpi@
2021-06-02	kernel: introduce per-CPU panic(9) message buffers	cheloha
	Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each platform for use by panic(9). The first panic on a given CPU writes its message to this buffer. Subsequent panics on a given CPU print the panic message to the console but do not modify the buffer.
	This aids debugging in two cases:
	- If 2+ CPUs panic simultaneously there is no risk of garbled messages in the panic buffer.
	- If a CPU panics and then the operator causes a second panic while using ddb(4), the operator can still recall the first failure on a particular CPU.
	Misc. changes to support this bigger change:
	- Set panicstr atomically to identify the first CPU to reach panic().
	- Tweak db_show_panic_cmd() to print all panic messages across all CPUs. Prefix the first panic with an asterisk ('*').
	- Prefer db_printf() to printf() during a panic if we have it. Apparently it disturbs less global state.
	- On amd64, tweak fault() to write the local panic buffer. This needs more work.
	Prompted by bluhm@ and deraadt@. Mostly written by deraadt@. Discussed with bluhm@, deraadt@ and kettenis@.
	Born from a discussion on tech@ about making panic(9) more MP-safe: https://marc.info/?l=openbsd-tech&m=162086462316143&w=2
	ok kettenis@, visa@, bluhm@, deraadt@
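The buffer behaviour described in this commit can be sketched in userland C. This is only an illustration under stated assumptions: `struct cpu_sketch`, `panic_sketch()`, `panic_message()`, `first_panic` and `NCPU` are hypothetical names, and C11 atomics stand in for whatever primitive the kernel actually uses. Only the 512-byte size and the "first panic wins" rule come from the message above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

#define NCPU		4	/* hypothetical CPU count for the sketch */
#define PANICBUF_LEN	512	/* buffer size named in the commit message */

/* Hypothetical stand-in for the per-CPU cpu_info structure. */
struct cpu_sketch {
	char panicbuf[PANICBUF_LEN];
};

static struct cpu_sketch cpus[NCPU];

/* Points at the message of the first CPU to panic, NULL until then. */
static _Atomic(const char *) first_panic = NULL;

/*
 * Record a panic on the given CPU. Returns 1 if this call claimed the
 * global "first panic" slot, 0 otherwise.
 */
int
panic_sketch(int cpu, const char *fmt, ...)
{
	const char *expected = NULL;
	va_list ap;

	assert(cpu >= 0 && cpu < NCPU);

	/* Only the first panic on this CPU fills the local buffer. */
	if (cpus[cpu].panicbuf[0] == '\0') {
		va_start(ap, fmt);
		vsnprintf(cpus[cpu].panicbuf, PANICBUF_LEN, fmt, ap);
		va_end(ap);
	}

	/* Compare-and-swap so exactly one CPU becomes the first. */
	return atomic_compare_exchange_strong(&first_panic, &expected,
	    cpus[cpu].panicbuf);
}

/* Return the message recorded for a CPU (empty if it never panicked). */
const char *
panic_message(int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	return cpus[cpu].panicbuf;
}
```

A second call to `panic_sketch()` on the same CPU leaves that CPU's buffer untouched, so the first failure stays available for inspection.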
2021-06-01	Make spoofed disklabel boundstart and boundend default to the bounds	Kenneth R Westerback
	of the usable LBA range defined by the GPT header. And then shrink them to the bounds of the first OpenBSD partition if one is found. While here simplify the logic, eliminate some superfluous variables and reduce use of magic numbers. Improvement suggested by sobrado@ ok kettenis@
2021-05-31	Redefine ADJFREQ_MIN to avoid undefined behaviour (when not using -fwrapv)	Visa Hankala
	Change the definition of ADJFREQ_MIN so that it does not shift a negative value. Such shifting is undefined in standard C. This came up when cross-compiling the kernel using ports clang. The shifting becomes defined when compiling with option -fwrapv. Base clang enables this option by default. OK naddy@ cheloha@
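A minimal sketch of the kind of change described, assuming hypothetical names (`FREQ_MAX`, `FREQ_MIN`, `clamp_freq()`) and an illustrative constant rather than the kernel's actual definitions. It shows how the lower bound can be derived without shifting a negative value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical upper bound: a positive value shifted left is defined. */
#define FREQ_MAX	(500000000LL << 32)

/* Undefined in standard C: left shift of a negative value. */
/* #define FREQ_MIN_UB	(-500000000LL << 32) */

/* Defined: shift the positive value, then negate the result. */
#define FREQ_MIN	(-FREQ_MAX)

/* Clamp a requested adjustment to the permitted range. */
int64_t
clamp_freq(int64_t freq)
{
	static_assert(FREQ_MIN < 0, "lower bound must be negative");

	if (freq < FREQ_MIN)
		return FREQ_MIN;
	if (freq > FREQ_MAX)
		return FREQ_MAX;
	return freq;
}
```

Negating the already-shifted positive constant gives the same numeric bound on two's complement targets while staying within what standard C defines, independent of `-fwrapv`.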
2021-05-30	Declare all struct protosw as constant.	Alexander Bluhm
	OK mvs@
2021-05-28	Add f_modify and f_process callbacks to socket filterops.	Visa Hankala
	This makes kqueue use the extended callback interface with socket event filters. Now one level of nested kernel locking is avoided, and the callbacks run without splhigh(). The filterops no longer check NOTE_SUBMIT, and use a fixed locking pattern instead. The f_event routines are always called with solock(), whereas f_modify and f_process are always called without the lock. OK mpi@
2021-05-27	Relax criteria for recognizing GPT formatted media by allowing the	Kenneth R Westerback
	EFI GPT partition (0xEE) in the protective MBR to be smaller than the actual size of the media. This allows GPT disk images dd'ed onto larger physical media to be recognized by fdisk(8) and the kernel. Feedback from kettenis@ on various earlier versions.
2021-05-26	Fix the return value for the FUTEX_WAIT/FUTEX_WAIT_PRIVATE futex(2)	Mark Kettenis
	operation. System calls should return -1 and set errno when they fail. They should not return an errno value directly. This matches how the Linux version of futex(2) behaves and what Mesa expects. This fixes a bug in Mesa where a timeout wouldn't be reported properly. Technically this is an ABI break. But libc and libpthread were changed to be compatible with both the old and new ABI, and code outside of base almost certainly expects Linux compatible behaviour. If you have not rebuilt libc in the last few days, upgrade using a snap. Mesa issue discovered by jsg@ ok mpi@, deraadt@
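The return convention this commit refers to can be illustrated with a small userland sketch. `raw_wait()` and `wait_sketch()` are hypothetical names, not part of futex(2) or libc. The first mimics an operation that hands back an error number directly, and the second wraps it in the usual "-1 plus errno" form.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for an operation that reports failure by
 * returning an error number directly (0 means success).
 */
static int
raw_wait(int should_time_out)
{
	return should_time_out ? ETIMEDOUT : 0;
}

/*
 * Wrapper following the usual system call convention: return 0 on
 * success, or -1 with errno set on failure.
 */
int
wait_sketch(int should_time_out)
{
	int error = raw_wait(should_time_out);

	assert(error >= 0);	/* error numbers are non-negative */
	if (error != 0) {
		errno = error;
		return -1;
	}
	return 0;
}
```

A caller that tests `ret == -1 && errno == ETIMEDOUT` sees the timeout with the wrapper, but would miss it if the error number were returned directly.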
2021-05-26	Use `so_lock' to protect key management (PF_KEY) sockets. This can be	mvs
	done because we have no cases where one thread should lock two sockets simultaneously. tested by yasuoka@ ok bluhm@ markus@
2021-05-25	As network features are not added dynamically, the domain structures	Alexander Bluhm
	are constant. Having more const makes MP review easier. More pointers are mapped read-only in the kernel image. OK deraadt@ mvs@
2021-05-19	In ttyinfo() check that ps_vmspace isn't NULL before calculating the	Mark Kettenis
	resident set size. This replicates what the sysctl code does and fixes a kernel crash reported by robert@ ok deraadt@
2021-05-18	Move potential sleeping m_getclr(9) out of `unp_lock' within unp_bind().	mvs
	ok mpi@
2021-05-17	Increase the default buffer space used on PF_UNIX sockets to 8k.	Claudio Jeker
	Additionally make the values tuneable via sysctl. OK deraadt@ mvs@
2021-05-16	panic does not require a \n at the end. When one is provided, it looks wrong.	Theo de Raadt

2021-05-14	Whitespace tweaks and a couple of stray u_int* in gpt_chk_mbr().	Kenneth R Westerback
	No intentional functional change.
2021-05-14	Tweak the two copies of gpt_chk_mbr() to return the index of the MBR	Kenneth R Westerback
	0xEE (DOSPTYP_EFI) partition, or -1 if no usable such partition is found. Adopt a consistent idiom to capture the index for future use. Clean up the gpt_chk_mbr() logic to make it clearer what constraints are being applied when looking for the DOSPTYP_EFI partition. No intentional functional change.
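The "return the index, or -1" idiom can be sketched as follows. This is not the kernel's gpt_chk_mbr(): `struct mbr_part`, `find_efi_part()` and the constants are hypothetical, and the constraints applied here (exactly one 0xEE entry, no other entry in use) are simplifying assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NPART		4	/* primary entries in an MBR */
#define PTYP_EFI	0xEE	/* protective GPT partition type */
#define PTYP_UNUSED	0x00

/* Hypothetical, simplified MBR partition entry. */
struct mbr_part {
	uint8_t  type;
	uint32_t start;		/* first sector */
	uint32_t size;		/* length in sectors */
};

/*
 * Return the index of the single 0xEE partition, or -1 if there is
 * none, more than one, or any other partition is in use.
 */
int
find_efi_part(const struct mbr_part *parts)
{
	int i, efi = -1, nefi = 0, nother = 0;

	assert(parts != NULL);
	for (i = 0; i < NPART; i++) {
		if (parts[i].type == PTYP_EFI) {
			nefi++;
			efi = i;	/* capture the index for later use */
		} else if (parts[i].type != PTYP_UNUSED)
			nother++;
	}
	if (nefi == 1 && nother == 0)
		return efi;
	return -1;
}
```

Returning the index rather than a boolean lets the caller go straight to the matching entry without scanning the table a second time.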
2021-05-13	Do `so_rcv' cleanup with sblock() held.	mvs
	solock() should be taken before sblock(). soreceive() grabs solock() and then locks `so_rcv'. But later it releases solock() before calling uiomove(9). So a concurrent thread which performs soshutdown() could break the soreceive() loop. But `so_rcv' is still locked by sblock() so this soshutdown() thread will sleep in sorflush() at the sblock() call. The soshutdown() thread doesn't release solock() after the sblock() call so it does not matter where `so_rcv' is released - it will be locked until the solock() release. That's why this strange looking code works fine. Moving sbunlock() to just after the `so_rcv' cleanup affects nothing but makes the code consistent and easier to understand. ok mpi@
2021-05-13	Use NULL instead of 0 for mbuf(9) pointers.	mvs
	ok millert@
2021-05-13	Assign NULL instead of 0 to `control' within sendit(). It's mbuf(9)	mvs
	pointer. ok deraadt@
2021-05-13	Move ktrfds() below fdpunlock(). This fixes lock order issue between	mvs
	vn_lock(9) and fdplock(). Reported-by: syzbot+2300a1bedc425f6f851e@syzkaller.appspotmail.com ok visa@
2021-05-12	regen	Martin Pieuchot

2021-05-12	Revert unlock of connect(2), bind(2), listen(2) and shutdown(2).	Martin Pieuchot
	At least one of them causes a deadlock involving `unplock' and mbuf allocations ('mbufpl') as reported by millert@.
2021-05-11	timeout_barrier(9), timeout_del_barrier(9): remove kernel lock	cheloha
	In timeout_barrier(9) we take/release the kernel lock to ensure that the given timeout has finished running (if it had been running at all). This approach is inefficient. If we put a barrier timeout on the queue and wait for it to run in cond_wait(9) we can block instead of spinning for the kernel lock. We already do this for process-context timeouts in timeout_barrier(9) anyway. Discussed with dlg@, visa@, and mpi@. ok dlg@
2021-05-11	regen	mvs

2021-05-11	Unlock shutdown(2).	mvs
	ok mpi@
2021-05-11	regen	mvs

2021-05-11	Unlock listen(2).	mvs
	ok mpi@
2021-05-11	regen	mvs

2021-05-11	Unlock connect(2).	mvs
	ok mpi@
2021-05-11	regen	mvs

2021-05-11	Unlock bind(2).	mvs
	ok mpi@
2021-05-10	Revert previous, it introduced a regression with breakpoints in gdb.	Martin Pieuchot

2021-05-08	Spoof GPT partitions of type 21686148-6449-6e6f-744e-656564454649 (a.k.a.	Kenneth R Westerback
	"IdontNeedEFI", a.k.a. "BIOS boot") as FS_BOOT. Often used to contain the second stage boot loader binary on disk images. Makes it easier to recognize/overwrite/remove the contents. Not yet supported in fdisk(8). Example image provided by mlarkin@
2021-05-06	regen	anton

2021-05-06	Unlock lseek(2).	anton
	In August 2019 I tried to unlock lseek which failed since the vnode lock could not be acquired without holding the kernel lock back then. claudio@ recently made it possible to acquire a vnode lock without holding the kernel lock. The kernel lock is still required around VOP_GETATTR() as the underlying file system implementations are not MP-safe. ok claudio@
2021-05-06	Refactor routines to stop/unstop processes and save the corresponding signal.	Martin Pieuchot
	- Move the "hack" involving P_SINTR to avoid grabbing the SCHED_LOCK() recursively closer to where it is necessary, in proc_stop()
	- Introduce proc_unstop(), the symmetric routine to proc_stop(), which manipulates `ps_xsig' and use it whenever a SSTOPed thread needs to be awakened.
	- Manipulate `ps_xsig' only in proc_stop/unstop()
	ok kettenis@
2021-05-04	Reorder the integer sysctl functions. Then the traditional 4.4BSD	Alexander Bluhm
	comment 'As above...' makes sense again. Improve comments for sysctl_int_bounded() and sysctl_bounded_arr(). OK gnezdo@ mvs@
2021-05-04	As the unbounded feature in sysctl_int_bounded() is no longer used,	Alexander Bluhm
	remove it. This also fixes a defective check of the dynamic boundary in sysctl_sysvshm(). OK mvs@ gnezdo@
2021-05-04	syscalls.c, init_sysent.c, syscall.h, syscallargs.h: regen	cheloha
	Regen after unlocking getitimer(2) and setitimer(2). ok anton@, mpi@
2021-05-04	getitimer(2), setitimer(2): unlock syscalls	cheloha
	With the changes in kern_time.c v1.150, neither getitimer(2) nor setitimer(2) needs the kernel lock anymore. ok anton@, mpi@
2021-05-01	Update the remaining SYSCTL_INT_READONLY cases	gnezdo
	OK mvs@
2021-05-01	Implement per-socket `so_lock' rwlock(9) and use it to protect routing	mvs
	(PF_ROUTE) sockets. This can be done because we have no cases where one thread should lock two sockets simultaneously. Compared to the previous version, rtm_senddesync_timer() execution was moved to process context. Also, this time `so_lock' is used for routing sockets only, but in the future it will be used for other socket types too. tested by claudio@ ok claudio@ bluhm@
2021-05-01	Retire OpenBSD/sgi.	Visa Hankala
	OK deraadt@
2021-04-30	Rearrange the implementation of bounded sysctl. The primitive	Alexander Bluhm
	functions are sysctl_int() and sysctl_rdint(). This brings back the 4.4BSD implementation. Then sysctl_int_bounded() builds the magic for range checks on top. sysctl_bounded_arr() is a wrapper around it to support multiple variables. Introduce macros that describe the meaning of the magic boundary values. Use these macros in obvious places. input and OK gnezdo@ mvs@
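The layering described here (a range check built on a plain integer setter, plus a table-driven wrapper for several variables) can be sketched in userland C. Everything below is a hypothetical illustration: `set_int_bounded()`, `set_bounded_arr()` and `struct bounded_var` are invented names with invented signatures, not the kernel's sysctl_int_bounded() or sysctl_bounded_arr(), and the choice of error numbers is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical range-checked setter. Returns 0 on success or EINVAL
 * when the new value falls outside [minimum, maximum].
 */
int
set_int_bounded(int *valp, int newval, int minimum, int maximum)
{
	assert(valp != NULL);
	if (newval < minimum || newval > maximum)
		return EINVAL;
	*valp = newval;
	return 0;
}

/* Hypothetical table entry describing one bounded variable. */
struct bounded_var {
	int *var;
	int minimum;
	int maximum;
};

/*
 * Apply the range check to entry `idx' of a table of variables.
 * An unknown index yields EOPNOTSUPP (an assumption of this sketch).
 */
int
set_bounded_arr(const struct bounded_var *tbl, size_t n, size_t idx,
    int newval)
{
	assert(tbl != NULL);
	if (idx >= n)
		return EOPNOTSUPP;
	return set_int_bounded(tbl[idx].var, newval, tbl[idx].minimum,
	    tbl[idx].maximum);
}
```

Keeping the bounds in a table means each variable's limits are declared once, next to the variable, instead of being repeated at every call site.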
2021-04-30	When terminating via pledge_fail() stop all threads, before issuing a	Theo de Raadt
	(delayed action) sigabort() and disabling all syscalls for this process (i.e. all threads). Without this, multiple threads crashed over top of each other, giving a poor debugging experience. We keep using sigabort() rather than sigexit(), to keep the debugging process good. Diagnosed from a report from brynet, and followup discussion with many.