src - OpenBSD base system

Age	Commit message	Author
2022-01-01	copyright++;	Jonathan Gray

2021-12-29	Do not allow send/receive of kcov descriptors as the file descriptor can	Anton Lindqvist
	be kept alive longer than expected, causing syzkaller to no longer be able to enable remote coverage. ok visa@ Reported-by: syzbot+ab2016d729cda7b0d003@syzkaller.appspotmail.com
2021-12-26	Rework garbage collector for unix(4) sockets.	Vitaliy Makkoveev
	Currently the unix(4) socket garbage collector always destroys any socket for which "fp->f_count == unp->unp_msgcount" holds. This is wrong because unix(4) sockets that sit within an SCM_RIGHTS message but were closed on the sender side also satisfy this equation. Such sockets are not in a loop, and if the garbage collector kills them before they are received, we get a kernel panic. FreeBSD has already reworked its garbage collector to fix this issue [1]. The logic is pretty simple, so import it into our garbage collector. 1. https://reviews.freebsd.org/D23142 ok bluhm@
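The difference between "referenced only by in-flight messages" and "unreachable" can be illustrated with a small userland model. This is only a sketch under assumptions: `struct model_sock`, `model_gc()`, the `gcrefs`/`dead` scratch fields and the fixed `NSOCK` table are hypothetical names invented here, and the three passes are a simplified reading of the approach described in the commit message, not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

#define NSOCK 3

/* Simplified stand-in for a unix(4) socket and its file (hypothetical). */
struct model_sock {
	int  f_count;          /* all references to the file */
	int  msgcount;         /* references held by in-flight SCM_RIGHTS messages */
	int  gcrefs;           /* scratch: references not explained by candidates */
	bool dead;             /* scratch: still a candidate for collection */
	bool inflight[NSOCK];  /* inflight[j]: socket j sits in our receive buffer */
};

void
model_gc(struct model_sock *s)
{
	bool changed;
	int i, j;

	/* Pass 1: a socket referenced only by messages is merely a candidate. */
	for (i = 0; i < NSOCK; i++) {
		assert(s[i].msgcount <= s[i].f_count);
		s[i].dead = (s[i].f_count == s[i].msgcount);
		s[i].gcrefs = s[i].f_count;
	}

	/* Pass 2: discount references that come from candidates' buffers. */
	for (i = 0; i < NSOCK; i++) {
		if (!s[i].dead)
			continue;
		for (j = 0; j < NSOCK; j++)
			if (s[i].inflight[j] && s[j].dead)
				s[j].gcrefs--;
	}

	/*
	 * Pass 3: a candidate with references left is held by a live socket's
	 * buffer. Revive it and give back the references it holds, until
	 * nothing changes. Whatever stays dead is an unreachable loop.
	 */
	do {
		changed = false;
		for (i = 0; i < NSOCK; i++) {
			if (!s[i].dead || s[i].gcrefs == 0)
				continue;
			s[i].dead = false;
			changed = true;
			for (j = 0; j < NSOCK; j++)
				if (s[i].inflight[j] && s[j].dead)
					s[j].gcrefs++;
		}
	} while (changed);
}
```

In this model a socket that was closed by its sender but still waits in a live socket's receive buffer survives, while a socket that only holds a reference to itself is collected.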
2021-12-25	kqueue: Invalidate revoked vnodes' knotes on the fly	Visa Hankala
	When a tty device is revoked, the associated knotes should be invalidated. Otherwise the user processes can keep on receiving events from the device. It appears tricky to do the invalidation as part of revocation in a way that does not allow unwanted event registration or clutter the tty code. For now, make the knotes invalid lazily before delivery. OK mpi@
2021-12-24	Make poll/select version of filt_solisten() more similar to soo_poll().	Visa Hankala
	OK mpi@
2021-12-23	sync	Philip Guenther

2021-12-23	Roll the syscalls that have an off_t argument to remove the explicit padding.	Philip Guenther
	Switch libc and ld.so to the generic stubs for these calls. WARNING: reboot to updated kernel before installing libc or ld.so! Time for a story... When gcc (back in 1.x days) first implemented long long, it didn't (always) pass 64bit arguments in 'aligned' registers/stack slots, with the result that argument offsets didn't match structure offsets. This affected the nine system calls that pass off_t arguments: ftruncate lseek mmap mquery pread preadv pwrite pwritev truncate To avoid having to do custom ASM wrappers for those, BSD put an explicit pad argument in so that the off_t argument would always start on an even slot and thus be naturally aligned. Thus those odd wrappers in lib/libc/sys/ that use __syscall() and pass an extra '0' argument. The ABIs for different CPUs eventually settled how things should be passed on each and gcc 2.x followed them. The only arch now where it helps is landisk, which needs to skip the last argument register if it would be the first half of a 64bit argument. So: add new syscalls without the pad argument and on landisk do that skipping directly in the syscall handler in the kernel. Keep compat support for the existing syscalls long enough for the transition. ok deraadt@
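The effect of the pad can be shown with a little slot arithmetic. The sketch below assumes a hypothetical 32-bit calling convention in which every argument occupies one or two 32-bit slots; `packed_slot()` and `aligned_slot()` are illustrative names, not kernel or libc functions.

```c
#include <assert.h>

/* Slot where a 64-bit argument starts if it simply follows its predecessors. */
int
packed_slot(int slots_before)
{
	assert(slots_before >= 0);
	return slots_before;
}

/* Slot where it starts if the convention requires an even (aligned) slot. */
int
aligned_slot(int slots_before)
{
	assert(slots_before >= 0);
	return (slots_before + 1) & ~1;
}
```

For an lseek-style call with one int before the offset, the two conventions disagree (slot 1 versus slot 2). Adding one explicit pad argument puts two slots before the offset, and both conventions then agree on slot 2, which is why the padded wrappers worked regardless of how the compiler passed long long.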
2021-12-23	Use TAILQ_FOREACH to traverse the disk list in sysctl_diskinit().	Alexander Bluhm
	OK anton@
2021-12-22	While malloc sleeps, the disk list could change during sysctl. Then	Alexander Bluhm
	allocated memory could be too short for the list of disks. Retry allocating enough space until it did not change. The disk list and duid memory are protected by kernel lock. Use asserts to mark this explicitly. Reported-by: syzbot+807423f6868bbfb836bc@syzkaller.appspotmail.com OK anton@ mpi@
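The retry described above follows a common pattern: record the size, allocate (which may sleep), then check that the size is still the same. The following userland sketch is only an illustration; `disk_count`, `pending_changes`, `sleeping_alloc()` and `alloc_for_disks()` are hypothetical stand-ins, and the real code walks the kernel's disk list under the kernel lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

size_t disk_count = 2;	/* hypothetical stand-in for the length of the disk list */
int pending_changes;	/* test knob: how often the list grows during allocation */

/* Models an allocation that sleeps and lets the list change meanwhile. */
static void *
sleeping_alloc(size_t n, size_t size)
{
	if (pending_changes > 0) {
		pending_changes--;
		disk_count++;
	}
	return calloc(n ? n : 1, size);
}

/* Returns a buffer sized for the list as it is after the allocation. */
void *
alloc_for_disks(size_t elem_size, size_t *countp)
{
	void *buf;
	size_t seen;

	assert(elem_size > 0 && countp != NULL);
	for (;;) {
		seen = disk_count;		/* size observed before sleeping */
		buf = sleeping_alloc(seen, elem_size);
		if (buf == NULL)
			return NULL;
		if (seen == disk_count)		/* unchanged: the buffer fits */
			break;
		free(buf);			/* changed: try again */
	}
	*countp = seen;
	return buf;
}
```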
2021-12-21	Let malloc return an error as opposed to panicking when sysctl	Anton Lindqvist
	kern.shminfo.shmseg is set to something ridiculously large. ok kettenis@ millert@ Reported-by: syzbot+9f1b201cdbc97b19c7f5@syzkaller.appspotmail.com
2021-12-20	Make filt_dead() selectively inactive with EVFILT_EXCEPT	Visa Hankala
	When a knote uses the dead event filter, the knote's file descriptor is not supposed to point to an object with pending out-of-band data. Make the knote inactive so that userspace will not receive a spurious event. However, kqueue-based poll(2) should still receive HUP notifications. This lets the system use dead_filtops with fewer strings attached relative to the filter type.
2021-12-20	Run seltrue/dead event filter in modify and process callbacks	Visa Hankala
	Do not assume event status in the modify and process callbacks. Instead always run the event filter so that it has a chance to set knote flags. The filter can also indicate event inactivity.
2021-12-15	Adjust pty and tty event filters	Visa Hankala
	* Implement EVFILT_EXCEPT for ttys for HUP condition detection. This filter is used when pollfd.events has no read/write events. * Add HUP condition detection to filt_ptcwrite() and filt_ttywrite() to reflect ptcpoll() and ttpoll(). Only poll(2) and select(2) can utilize the code; kevent(2) should behave as before with EVFILT_WRITE. * Clear EV_EOF and __EV_HUP if the EOF/HUP condition ends. OK mpi@
2021-12-14	Cover all state checks and updates with spltty() in filt_ttyread().	Visa Hankala

2021-12-13	acct(4) ac_tty shouldn't need NODEV from sys/param.h (which is kernel API),	Theo de Raadt
	-1 is sufficient to indicate the process had no controlling tty, removing one more sys/param.h include in our userland ok millert
2021-12-13	Revise EVFILT_EXCEPT filters	Visa Hankala
	Restrict the circumstances where EVFILT_EXCEPT filters trigger: * when out-of-band data is present and NOTE_OOB is requested. * when the channel is fully closed and consumer is poll(2). This should clarify the logic and suppress events that kqueue-based poll(2) does not expect. OK mpi@
2021-12-13	Prevent kevent(2) use of EVFILT_EXCEPT with FIFOs and pipes	Visa Hankala
	Currently, the only intended direct usage of the EVFILT_EXCEPT filter is with NOTE_OOB to detect out-of-band data in ptys and sockets. NOTE_OOB does not apply to FIFOs or pipes. Prevent the user from registering the filter with these file types. The filter code is for the kernel's internal use. OK mpi@
2021-12-12	Add vnode parameter to VOP_STRATEGY()	Visa Hankala
	Pass the device vnode as a parameter to VOP_STRATEGY() to allow calling the correct vop_strategy callback. Now the vnode is also available in the callback. OK mpi@
2021-12-11	Clarify usage of __EV_POLL and __EV_SELECT	Visa Hankala
	Make __EV_POLL specific to kqueue-based poll(2), to remove overlap with __EV_SELECT that only select(2) uses. OK millert@ mpi@
2021-12-10	Revert "kbind(2): disable system call if not initialized before	Philip Guenther
	first __tfork(2)" The immediate issue is that a process linked with -znow will still perform lazy relocation on objects loaded with dlopen(), but there are possibly other dark corners to plumb to find a better invariant. Problem reported by thfr@
2021-12-09	We only have one syscall table: inline sysent/SYS_MAXSYSCALL and	Philip Guenther
	SYS_syscall as the nosys() function into the MD syscall entry routines and the SYSCALL_DEBUG support. Adjust alpha's syscall check to match the other archs. Also, make sysent const to get it into .rodata. With that, 'struct emul' is unused: delete it and all its references ok millert@
2021-12-08	Fix select(2) exceptfds handling of FIFOs and pipes	Visa Hankala
	Prevent select(2) from indicating an exceptional condition when the other end of a FIFO or pipe is closed. Originally, select(2) returned an exceptfds event only with a pty or socket that has out-of-band data pending. millert@ says that OpenBSD diverged from this by accident when poll(2) and select(2) were changed to use the same backend code in year 2003. OK millert@
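The intended behaviour can be observed from userland with a short probe. This is a sketch using only standard pipe(2) and select(2); `struct pipe_select_result` and `select_closed_pipe()` are names made up for the example, and the outcome depends on the kernel it runs on.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

struct pipe_select_result {
	int readable;		/* read end reported in readfds */
	int exceptional;	/* read end reported in exceptfds */
};

/* Polls the read end of a pipe whose write end has been closed. */
int
select_closed_pipe(struct pipe_select_result *res)
{
	struct timeval tv = { 0, 0 };	/* do not block */
	fd_set rfds, efds;
	int fds[2], n;

	assert(res != NULL);
	if (pipe(fds) == -1)
		return -1;
	close(fds[1]);			/* close the other end */

	FD_ZERO(&rfds);
	FD_ZERO(&efds);
	FD_SET(fds[0], &rfds);
	FD_SET(fds[0], &efds);
	n = select(fds[0] + 1, &rfds, NULL, &efds, &tv);
	if (n != -1) {
		res->readable = FD_ISSET(fds[0], &rfds) != 0;
		res->exceptional = FD_ISSET(fds[0], &efds) != 0;
	}
	close(fds[0]);
	return n == -1 ? -1 : 0;
}
```

On a system that behaves as this commit describes, the descriptor is reported as readable (end of file) but not as having an exceptional condition.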
2021-12-07	Delete the last emulation callbacks: we're Just ELF, so declare	Philip Guenther
	exec_elf_fixup() and coredump_elf() in <sys/exec_elf.h> and call them and the MD setregs() directly in kern_exec.c and kern_sig.c Also delete e_name[] (only used by sysctl), e_errno (unused), and e_syscallnames[] (only used by SYSCALL_DEBUG) and constipate syscallnames to 'const char *const[]' ok kettenis@
2021-12-07	Continue to delete emulation support: we only have one sigcode and	Philip Guenther
	sigobject. Just use the existing globals for the former and use a global for the latter. ok jsg@ kettenis@
2021-12-07	Add EVFILT_EXCEPT filter for pipes	Visa Hankala
	The kqueue-based select(2) needs the filter to replicate the old exceptfds behaviour. The upcoming new poll(2) code will use the filter for POLLHUP condition checking when the events bitmap is clear of read/write events. OK anton@
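From userland, the case of an events bitmap with no read/write events looks like the sketch below: poll(2) is asked for nothing, yet still reports a hangup. The helper name `hup_revents_with_no_events()` is hypothetical; only pipe(2), poll(2) and close(2) are real interfaces.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Returns the revents reported for the read end of a pipe whose write end
 * is closed when no events were requested, or -1 on error.
 */
int
hup_revents_with_no_events(void)
{
	struct pollfd pfd;
	int fds[2], n;

	if (pipe(fds) == -1)
		return -1;
	close(fds[1]);		/* hang up the write side */

	pfd.fd = fds[0];
	pfd.events = 0;		/* no read/write events requested */
	pfd.revents = 0;
	n = poll(&pfd, 1, 0);	/* zero timeout: just sample the state */
	close(fds[0]);
	if (n == -1)
		return -1;
	assert(n == 0 || n == 1);
	return pfd.revents;
}
```

POLLHUP is an output-only flag, so it is expected in the result even though `events` was zero.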
2021-12-07	Continue to delete emulation support: since we're Just ELF, the size	Philip Guenther
	of the auxinfo is fixed: provide ELF_AUX_WORDS in <sys/exec_elf.h> as a replacement for emul->e_arglen ok millert@
2021-12-07	Make `unp_msgcount' and `unp_file' protected by `unp_gc_lock'	Vitaliy Makkoveev
	rwlock(9). This saves us from races caused by unlocked access to `f_count', which cause a live socket to be falsely marked as dead. We always modify `f_count' and `unp_msgcount' together, so the `f_count' modification should also pass the `unp_gc_rwlock' before the `unp_msgcount' increment and after the `unp_msgcount' decrement. The locked `unp_file' assignment avoids us from drain unp_gc() run. This moves unp_gc() locking back to when these variables were protected by the same lock that was held for the whole garbage collector run, but it uses another lock, not `unp_lock'. ok kettenis@ bluhm@
2021-12-06	Start to delete emulation support: since we're Just ELF, make	Philip Guenther
	copyargs() return 0/1 and merge elf_copyargs() into it. Rename ep_emul_arg and ep_emul_argp to have clearer meaning and type and eliminate ep_emul_argsize as no longer necessary. Make sure ep_auxinfo (nee ep_emul_argp) is initialized as powerpc64 always uses it in setregs(). ok semarie@ deraadt@ kettenis@
2021-12-05	kbind(2): disable system call if not initialized before first __tfork(2)	Scott Soule Cheloha
	To unlock kbind(2) we need to protect ps_kbind_addr and ps_kbind_cookie. The simplest way to do this is to disallow kbind(2) initialization after the first __tfork(2) call. If the first thread does not initialize the kbind(2) variables before __tfork(2) then we disable kbind(2) during that first __tfork(2) call. This is guenther@'s patch, I'm just committing it. Discussed with guenther@, deraadt@, kettenis@, and mpi@. ok kettenis@, positive response from mpi@, "I am busy" guenther@
2021-12-02	firstc() and nextc() use an int of global static storage. Make this	Theo de Raadt
	a pointer to a local variable to allow concurrent use if that ever needs to happen in the future. ok mpi kettenis
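The change is the usual move from hidden static iterator state to caller-owned state. The sketch below shows the idea on a hypothetical flat buffer; `struct ring`, `ring_first()` and `ring_next()` are illustrative names and do not mirror the clist functions' real signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flat buffer standing in for a clist. */
struct ring {
	const char *buf;
	size_t len;
};

/* Starts an iteration; *left (owned by the caller) tracks remaining bytes. */
const char *
ring_first(const struct ring *r, size_t *left)
{
	assert(r != NULL && left != NULL);
	*left = r->len;
	return *left > 0 ? r->buf : NULL;
}

/* Advances an iteration; returns NULL once every byte has been visited. */
const char *
ring_next(const char *cp, size_t *left)
{
	assert(cp != NULL && left != NULL && *left > 0);
	if (--*left == 0)
		return NULL;
	return cp + 1;
}
```

Because the counter lives in the caller's local variable, two iterations can be interleaved (or run concurrently) without disturbing each other, which a single static counter would not allow.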
2021-12-01	late allocation of clist in putc() and b_to_q() hasn't been required in	Theo de Raadt
	a decade, because all tty drivers preallocate. ok kettenis
2021-11-30	Prevent select(2) from blocking if registering found pending events.	Visa Hankala
	OK mpi@
2021-11-29	regen	Vitaliy Makkoveev

2021-11-29	Unlock accept(2) and accept4(2) syscalls. Unlock them both because they	Vitaliy Makkoveev
	follow the same code path. ok bluhm@
2021-11-29	kqueue: Revise badfd knote handling	Visa Hankala
	When closing a file descriptor and converting the poll/select knotes into badfd knotes, keep the knotes attached to the by-fd table. This should prevent kqueue_purge() from returning before the kqueue has become quiescent. This in turn should fix a KASSERT(TAILQ_EMPTY(&kq->kq_head)) panic in KQRELE() that bluhm@ has reported. The badfd conversion is only needed when a poll/select scan is ongoing. The system can skip the conversion if the knote is not part of the active event set. The code of this commit skips the conversion when the fd is closed by the same thread that has done the fd polling. This can be improved but should already cover typical fd usage patterns. As badfd knotes now hold slots in the by-fd table, kqueue_register() clears them. poll/select use kqueue_register() to set up a new scan; any found fd close notification is a leftover from the previous scan. The new badfd handling should be free of accidental knote accumulation. This obsoletes kqpoll_dequeue() and lowers kqpoll_init() overhead. Re-enable lazy removal of poll/select knotes because the panic should no longer happen. OK mpi@
2021-11-26	Mark exit1() and sigexit() as non-returning	Visa Hankala
	The late 1990s reasons for avoiding __dead with exit1() should not apply with the current compilers. This fixes compiler warnings about uninitialized variables in trap.c on mips64. Discussed with guenther@ and miod@
2021-11-24	Fix type of count.	Visa Hankala

2021-11-24	Simplify arithmetic on the main path.	Visa Hankala

2021-11-24	Remove unneeded <sys/stdarg.h>.	Visa Hankala
	OK guenther@
2021-11-24	Refactor postsig_done(). Pass the catchmask and signal reset flag to the	Claudio Jeker
	function. This will make unlocking cursig() & postsig() a bit easier. OK mpi@
2021-11-24	Minor code cleanup. Move a comment to the right place, move a function	Claudio Jeker
	to get a better order of functions. Also reduce the size of sigprop to NSIG from NSIG+1. NSIG is defined as 33 and so includes the extra element for this array. OK mpi@
2021-11-24	Add a few dt(4) TRACEPOINTS to SMR. Should help to better understand what	Claudio Jeker
	goes on in SMR. OK mpi@
2021-11-22	Revert poll(2) back to the original implementation	Visa Hankala
	The translation to and from kqueue still has major shortcomings. Discussed with deraadt@
2021-11-22	Translate POLLNVAL in ppollcollect()	Visa Hankala
	This makes the kqueue-based poll(2) behave more similarly to the old code when a monitored file descriptor is closed by another thread. OK mpi@
2021-11-22	Let futex_wait() run without kernel lock	Visa Hankala
	The KERNEL_LOCK() is no longer necessary with rwsleep() and PCATCH because the sleep machinery now does the locking internally. OK mpi@
2021-11-19	Make futexes work in shared anonymous memory.	Mark Kettenis
	ok mpi@
2021-11-17	When unp_connect() releases both solock() and vnode(9) locks, the socket we	Vitaliy Makkoveev
	were connected to could be closed by a concurrent thread. Check the connection state and return ECONNREFUSED if the connection was lost. ok bluhm@
2021-11-16	Use nowake when poll/select has empty fd set	Visa Hankala
	When the fd set is empty, the code waits for a signal or timeout. Wakeups from the kqueue are neither expected nor wanted. OK cheloha@, millert@, anton@, mpi@
2021-11-16	Move UNIX domain sockets garbage collector out of `unp_lock'.	Vitaliy Makkoveev
	Except for `unp_ino', this leaves only per-socket data protected by `unp_lock'. The `unp_ino' protection is not a big deal and will be done with mutex(9) in a future diff. The garbage collector flags were moved from `unp_flags' to `unp_gcflags'. Two new locks are introduced to protect garbage collector data. The `unp_gc_lock' rwlock(9) protects `unp_defer', `unp_gcing', `unp_gcflags' and the `unp_link' list. The `unp_df_lock' protects the `ud_link' list. We need to simultaneously lock `unp_gc_lock' and `unp_lock'. When we perform unp_attach() or unp_detach() we link the PCB to the `unp_link' list with `unp_lock' held. But when unp_gc() walks the `unp_link' list with `unp_gc_lock' held, it should lock the socket while it performs the `so_rcv' buffer scan, and the lock order should be the opposite. In a future diff `unp_lock' will be replaced by per-socket `so_lock', so it's better to enforce the `unp_gc_lock' -> `unp_lock' (solock()) lock order and release `unp_lock' in the unp_attach() and unp_detach() paths. The previously committed diffs made this safe. The `unp_df_lock' is introduced because the `unp_lock' and `unp_gc_lock' states are unknown when unp_discard() is called. Since it touches only the `ud_link' list, the re-lock dances are unwanted in this path. Also this keeps the M_WAITOK allocation outside rwlock(9) when unp_discard() is called from the unp_externalize() error path. ok bluhm@
2021-11-15	Copy p_p->ps_pledge into a local variable (called pledge) in every function	Theo de Raadt
	which checks PLEDGE_* bits more than once. Some functions are called without locking, and this avoids misinterpreting bits which have some coupled behaviour. ok cheloha kettenis