src - OpenBSD base system

Age	Commit message	Author
2022-02-24	Unlock getsockname(2) syscall. For inet and UNIX sockets it fills passed	Vitaliy Makkoveev
	'sockaddr' structure with socket's address. For key management and route domain sockets it just returns error. ok bluhm@
2022-02-22	Since other exported commandnames were increased to 24 and graduated into	Theo de Raadt
	proper strings, adapt struct acct's ac_comm similarly. While here increase ac_mem to 32-bits, increase ac_flag from 8 to 32 bits for future extensions, add ac_pid for forensics, and reorder the structure to avoid compiler pads. More work remains in the sa(8) command to use ac_pid better. This is a flag day for the acct file format, new/old files/tools are incompatible. ok bluhm millert
2022-02-22	Start using new _MAXCOMLEN (a proper string expanded to 24 bytes	Theo de Raadt
	including the NUL), in all internal interfaces, and expose this in ktrace, core, or proc.h visibility. ok millert
2022-02-22	Delete unnecessary #includes of <sys/domain.h> and/or <sys/protosw.h>	Philip Guenther
	net/if_pppx.c pointed out by jsg@. ok gnezdo@ deraadt@ jsg@ mpi@ millert@
2022-02-21	anscestors -> ancestors	Jonathan Gray

2022-02-21	consisitent -> consistent	Jonathan Gray

2022-02-21	expliclitly -> explicitly	Jonathan Gray

2022-02-19	The suspend/resume code sleeps-not-allowed phases are protected with	Theo de Raadt
	cold=2. Use the same strategy in a similar phase during hibernate.
2022-02-19	tsleep() prints a stack trace when cold==2. The suspend/resume code has	Theo de Raadt
	phases where sleeps are not allowed, and this is used to discover it. msleep() needs the same check.
2022-02-17	Writes to the ps_flags field of struct process should be atomic.	Rob Pierce
	Ok deraadt@ guenther@
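The idea of updating a shared flags word atomically can be sketched in portable C11. This is a minimal illustration, not the kernel change itself: `struct proc_sketch`, the `SK_*` flag values, and the helper names are hypothetical, and `<stdatomic.h>` stands in for whatever atomic primitives the kernel uses.

```c
#include <stdatomic.h>

/* Hypothetical flag bits, for illustration only. */
#define SK_TRACED	0x01u
#define SK_EXITING	0x04u

/* Hypothetical stand-in for a structure with a shared flags field. */
struct proc_sketch {
	atomic_uint flags;
};

/* Set bits with a single atomic read-modify-write. */
static inline void
flags_set(struct proc_sketch *p, unsigned int bits)
{
	atomic_fetch_or(&p->flags, bits);
}

/* Clear bits with a single atomic read-modify-write. */
static inline void
flags_clear(struct proc_sketch *p, unsigned int bits)
{
	atomic_fetch_and(&p->flags, ~bits);
}

/* Read the current flags value. */
static inline unsigned int
flags_get(struct proc_sketch *p)
{
	return atomic_load(&p->flags);
}
```

A plain `p->flags |= bits` is a separate load, OR, and store, so two concurrent writers can lose each other's update; the atomic form avoids that.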
2022-02-16	return unique errors (I chose some errno values.. ) for the various	Theo de Raadt
	failure modes. Also, pack the code a little bit, easier to read.
2022-02-16	Reduce code duplication in socket event filters.	Visa Hankala
	OK mpi@
2022-02-16	unifdef PROC_PC	Jonathan Gray
	ok guenther@ rob@
2022-02-16	If the lid is closed, suspend_finish() now returns EAGAIN, so go to the top	Theo de Raadt
	and restart the suspend all over again. This was previously done by issuing a task to the acpi thread, but this is simpler. (I want to try to duplicate these tests earlier in the resume path...)
2022-02-16	change MD gosleep() and sleep_finish() to return int, the MI code will be	Theo de Raadt
	able to react to this suitably.
2022-02-15	Reintroduce ps state flag 'c' indicating chrooted process (via PS_BITS).	Rob Pierce
	Ok deraadt@
2022-02-15	Since acpitoshiba brightness button processing no longer plays games	Theo de Raadt
	with AML parsing outside the acpi thread, the locking-release dance around wsdisplay_{suspend,resume} can be removed. ok kettenis
2022-02-15	when the MI suspend code encounters problems, we need a way to	Theo de Raadt
	reset the MD state before bailing out. New MD function sleep_abort() does that.
2022-02-15	unifdef TIOCHPCL, 4.3BSD compat ioctl	Jonathan Gray
	ok deraadt@ guenther@
2022-02-15	MI disable_lid_wakeups() is not needed, x86 systems can do this	Theo de Raadt
	in sleep_resume(), which seems sensible for other future systems also
2022-02-14	Introduce a signal context that is used to pass signal related information	Claudio Jeker
	from cursig() to postsig() or the caller itself. This will simplify locking. Also alter sigactsfree() a bit and move it into process_zap() so ps_sigacts is always a valid pointer. OK semarie@
2022-02-14	update sbchecklowmem() to better detect actual mbuf memory usage.	David Gwynne
	previously sbchecklowmem() (and sonewconn()) would look at the mbuf and mbuf cluster pools to see if they were approaching their hard limits. based on how many mbufs/clusters were allocated against the limits, socket operations would start to fail with ENOBUFS until utilisation went down. mbufs and clusters have changed a lot since then though. there are now many mbuf cluster pools, not just one for 2k clusters. because of this the mbuf layer now limits the amount of memory all the mbuf pools can allocate backend pages from rather than limit the individual pools. this means sbchecklowmem() ends up looking at the default pool hard limit, which is UINT_MAX, which in turn means sbchecklowmem() probably never applies backpressure. this is made worse on multiprocessor systems where per cpu caches of mbuf and cluster pool items are enabled because the number of in use pool items is distorted by the cpu caches. this switches sbchecklowmem to looking at the page allocations made by all the pools instead. the big benefit of this is that the page allocations are much more representative of the overall mbuf memory usage in the system. the downside is that the backend page allocation accounting does not see idle memory held by pools. pools cannot release partially free pages to the page backend (obviously), and pools cache idle items to avoid thrashing on the backend page allocator. this means the page allocation level is higher than the memory used by actual in-flight mbufs. however, this can also be a benefit. the backend page allocation is a kind of smoothed out "trend" line. mbuf utilisation over short periods can be extremely bursty because of things like rx ring dequeue and fill cycles, or large socket sends. if you're trying to grow socket buffers while these things are happening, luck becomes an important factor in whether it will work or not. 
because pools cache idle items, the backend page utilisation better represents the overall trend of activity in the system and will give more consistent behaviour here. this diff is deliberately simple. we're basically going from "no limits" to "some sort of limit" for sockets again, so keeping the code simple means it should be easy to understand and tweak in the future. ok djm@ visa@ claudio@
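The approach described above, judging memory pressure from total backend page allocation against a single limit, might look roughly like this. It is only a sketch under stated assumptions: the 60%/80% thresholds, the hysteresis, and every name here (`lowmem_state`, `check_lowmem`, `pages_used_pct`) are hypothetical and not taken from the commit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed thresholds, for illustration only. */
#define LOWMEM_ENTER_PCT	80
#define LOWMEM_LEAVE_PCT	60

/* Hypothetical state: remembers whether we are currently "low". */
struct lowmem_state {
	bool low;
};

/* Percentage of the shared limit consumed by page allocations. */
static unsigned int
pages_used_pct(size_t allocated, size_t limit)
{
	if (limit == 0)
		return 100;
	return (unsigned int)(((unsigned long long)allocated * 100) / limit);
}

/*
 * Report memory pressure based on pages allocated by all pools together.
 * The state only changes outside the 60..80 band, so short bursts
 * around a single threshold do not make the answer flap.
 */
bool
check_lowmem(struct lowmem_state *st, size_t allocated, size_t limit)
{
	unsigned int used = pages_used_pct(allocated, limit);

	if (used < LOWMEM_LEAVE_PCT)
		st->low = false;
	else if (used > LOWMEM_ENTER_PCT)
		st->low = true;

	return st->low;
}
```

A caller would refuse to grow a socket buffer, for example by returning ENOBUFS, while `check_lowmem()` reports true.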
2022-02-13	Move some MI pieces out of suspend_mp/resume_mp	Theo de Raadt
	ok kettenis
2022-02-13	Use knote_modify() and knote_process() in obvious places.	Visa Hankala

2022-02-13	Rename knote_modify() to knote_assign()	Visa Hankala
	This avoids verb overlap with f_modify.
2022-02-12	Reduce code duplication in pipe event filters	Visa Hankala
	Use the f_event callback for checking event state within the pipe event filters. This enables the same f_modify and f_process functions to handle the different filter types. OK anton@
2022-02-11	Inline klist_empty() for more economic machine code.	Visa Hankala
	OK mpi@
2022-02-11	the sleep_clocks() hook is not needed because the architectures which	Theo de Raadt
	need to do this can do it a few moments later in a different hook
2022-02-10	Duplicate "park disk" code, so that the SUSPEND case can be MI, it is only	Theo de Raadt
	HIBERNATE that needs to be in MD code. ok gkoehler
2022-02-08	The suspend/resume code is a sticky mess of MI, MD, and ACPI sequencing.	Theo de Raadt
	This splits out the MI sequencing, backing it with per-architecture helper functions. Further steps will be necessary because ACPI and MD are too tightly coupled, but soon we'll be able to use this code for more architectures (which depends on figuring out the lowest-level cpu sleeping method) ok kettenis
2022-02-08	use sizeof(long) - 1 in m_pullup to determine payload alignment.	David Gwynne
	this makes it consistent with the rest of the network stack when determining alignment. ok bluhm@
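The `sizeof(long) - 1` expression works as an alignment mask because `sizeof(long)` is a power of two on the platforms in question. A minimal sketch, with hypothetical helper names that are not from the m_pullup code:

```c
#include <stddef.h>
#include <stdint.h>

/* Low bits that must be zero for an address to be long-aligned. */
#define LONG_ALIGN_MASK	(sizeof(long) - 1)

/* Nonzero if addr is aligned to sizeof(long). */
static inline int
is_long_aligned(uintptr_t addr)
{
	return (addr & LONG_ALIGN_MASK) == 0;
}

/* Offset of addr within its sizeof(long)-sized slot. */
static inline size_t
long_align_offset(uintptr_t addr)
{
	return (size_t)(addr & LONG_ALIGN_MASK);
}
```

Copying data so that the payload keeps the same offset within a `long`-sized slot preserves whatever alignment the original data had.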
2022-02-08	poll(2): Switch to kqueue backend	Visa Hankala
	Implement the poll(2) system call on top of the kqueue subsystem. This obsoletes the old, non-MP-safe poll backend. On entering poll(2), the new code translates each pollfd array entry into a set of knotes. When these knotes receive events through kqueue, the events are translated back to pollfd format. Entries in the pollfd array can refer to the same file descriptor with overlapping event masks. To allow such overlap with knotes, use an extra kn_pollid key that separates knotes of different pollfd entries. Adapted from DragonFly BSD, initial implementation by mpi@. Tested in snaps for three weeks. OK mpi@
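The overlapping-entry case mentioned above can be shown from user space. This sketch uses only the standard poll(2) interface; it says nothing about the kernel's knote handling, and the function name `poll_same_fd_twice` is made up for illustration.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Poll the read end of a pipe through two pollfd entries that name the
 * same descriptor with overlapping event masks.
 * Returns the number of entries reported ready, or -1 on error.
 */
int
poll_same_fd_twice(void)
{
	int fds[2];
	struct pollfd pfd[2];
	int nready;

	if (pipe(fds) == -1)
		return -1;

	/* Make the read end readable before polling. */
	if (write(fds[1], "x", 1) != 1) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	/* Two entries, same descriptor, overlapping masks. */
	pfd[0].fd = fds[0];
	pfd[0].events = POLLIN;
	pfd[0].revents = 0;
	pfd[1].fd = fds[0];
	pfd[1].events = POLLIN | POLLPRI;
	pfd[1].revents = 0;

	/* Timeout 0: the data is already in the pipe. */
	nready = poll(pfd, 2, 0);

	close(fds[0]);
	close(fds[1]);
	return nready;
}
```

Each entry is reported independently, so both entries count as ready. A backend that keyed its bookkeeping on the descriptor alone could not keep the two entries apart, which is the problem the extra per-entry key addresses.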
2022-02-07	Delete STACKGAPLEN: this exec-time allocation at the top of the	Philip Guenther
	original thread's stack hasn't been used since 2015. ok miod@ deraadt@
2022-02-06	Simplify cursig() a bit and make sure that signals are always sent to	Claudio Jeker
	the parent of ptraced processes. Especially ignore the signal mask set by sigprocmask(2) in that case. In userret() alter the testcase for when to call cursig() which is only there to avoid taking the KERNEL_LOCK when returning from a MP safe syscall. This can be revisited once cursig() is MP safe. Problem with debugging signal handlers found by kurt@ Tested and OK kurt@, OK mpi@
2022-02-04	whitelist resolv.conf for stat. go dns library does this.	Ted Unangst
	ok deraadt
2022-01-28	When it's the possessive of 'it', it's spelled "its", without the	Philip Guenther
	apostrophe.
2022-01-25	Capture a repeated pattern into sysctl_securelevel_int function	Greg Steuck
	A few variables in the kernel are only writeable before securelevel is raised. It makes sense to handle them with less code. OK sthen@ bluhm@
2022-01-20	snprintf(9) allows NULL string if size is 0. But doing NULL pointer	Alexander Bluhm
	arithmetic is undefined behavior. Check that size is positive before adding to pointer. While there, use NUL char for string termination. found by kubsan; joint work with tobhe@; OK millert@
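The pattern described here, allowing a NULL buffer when the size is 0 while never doing arithmetic on that NULL pointer, can be sketched with a small bounded-copy helper. `bounded_copy` is a hypothetical function written for illustration; it is not the kernel's snprintf(9) code.

```c
#include <stddef.h>

/*
 * Copy src into buf, writing at most size bytes including the NUL.
 * buf may be NULL when size is 0. Returns the length of src, i.e. the
 * number of characters that would be written given enough space.
 */
size_t
bounded_copy(char *buf, size_t size, const char *src)
{
	char *end = NULL;
	size_t n;

	/* Only form buf + size - 1 when there really is a buffer. */
	if (size > 0)
		end = buf + size - 1;

	for (n = 0; src[n] != '\0'; n++) {
		if (end != NULL && buf < end)
			*buf++ = src[n];
	}

	/* Terminate with the NUL character, not a NULL pointer. */
	if (end != NULL)
		*buf = '\0';

	return n;
}
```

Calling `bounded_copy(NULL, 0, "hello")` only measures the string, and no pointer is ever derived from the NULL argument.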
2022-01-20	Shifting signed integers left by 31 is undefined behavior in C.	Alexander Bluhm
	found by kubsan; joint work with tobhe@; OK miod@
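With a 32-bit `int`, the expression `1 << 31` produces a value that is not representable in the signed type, which the C standard leaves undefined. The usual fix is to shift an unsigned operand. A minimal sketch with hypothetical helper names:

```c
#include <stdint.h>

/*
 * Undefined with a 32-bit int, because the result does not fit:
 *	int bad = 1 << 31;
 */

/* Well defined: the shifted operand is unsigned. */
static inline uint32_t
top_bit(void)
{
	return 1U << 31;
}

/* Single-bit mask for bit n; n is reduced into the range 0..31. */
static inline uint32_t
bit_mask(unsigned int n)
{
	return (uint32_t)1 << (n & 31);
}
```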
2022-01-20	initial support for drm sync files, fences associated with file	Jonathan Gray
	descriptors for explicit fencing; tested with libdrm's amdgpu_test syncobj timeline tests and vkcube on intel broadwell with Mesa 21.3 (which hangs without sync file support after the 'anv: Assume syncobj support' Mesa commit); feedback and ok visa@
2022-01-18	Properly handle read-only clusters in m_pullup(9).	Alexander Bluhm
	If the first mbuf of a chain in m_pullup is a cluster, check if the cluster is read-only (shared or an external buffer). If so, don't touch it and create a new mbuf for the pullup data. This restores original 4.4BSD m_pullup, that not only returned contiguous mbuf data of the specified length, but also converted read-only clusters into writeable memory. The latter feature was lost during some refactoring. from ehrhardt@; tested by weerd@; OK stsp@ bluhm@ claudio@
2022-01-17	Allow more memory ranges in hibernate	Mike Larkin
	The previous limit of VM_PHYSSEG_MAX ranges (16) was proving too small for newer machines. This diff reorganizes the hibernate signature block to allow for 22 ranges by removing the kernel version comparison and replacing it with a SHA of several unique kernel features (the version string and several addresses of functions not inside the same .o). Reported by claudio@, who also helped fix some issues in the diff. Input from deraadt@ as well. Tested by myself and claudio on a variety of machines. Only compile tested on i386 as I have no more S4-capable i386 hardware anymore. ok claudio@
2022-01-11	regen	Vitaliy Makkoveev

2022-01-11	Unlock getpeername(2). For inet and unix sockets it follows the code	Vitaliy Makkoveev
	which was unlocked with accept(2) unlocking. For key management and route domain sockets it just copies the read-only data. ok bluhm@
2022-01-11	move kern_unveil.c to use DPRINTF()	Sebastien Marie
	Changes the way printf debug is done in kern_unveil.c. Currently, each printf() is enclosed in #ifdef DEBUG_UNVEIL. It moves to using DPRINTF(), and reduces the number of #ifdef inside the file. Also changes some strings to use __func__ instead of using the function name verbatim. ok visa@
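The general shape of such a change, one conditional macro definition instead of an #ifdef around every call, can be sketched as follows. This is a user-space illustration: `DEBUG_SKETCH`, the macro body, and `path_depth` are assumptions of this example and are not copied from kern_unveil.c.

```c
#include <stdio.h>

/* One #ifdef at the top replaces an #ifdef around every debug print. */
#ifdef DEBUG_SKETCH
#define DPRINTF(...)	fprintf(stderr, __VA_ARGS__)
#else
#define DPRINTF(...)	do { } while (0)
#endif

/* Hypothetical helper: count the '/' separators in a path. */
int
path_depth(const char *path)
{
	int depth = 0;
	const char *p;

	for (p = path; *p != '\0'; p++) {
		if (*p == '/')
			depth++;
	}

	/* __func__ keeps the message right if the function is renamed. */
	DPRINTF("%s: path %s has depth %d\n", __func__, path, depth);
	return depth;
}
```

Building with `-DDEBUG_SKETCH` enables the output; without it the call compiles away.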
2022-01-09	Add an UNVEIL_USERSET flag which is set when a unveil node is added via	Claudio Jeker
	unveil(2). It is not set for nodes that are added as a result of a file being added via unveil(2). Use this flag to test if backtracking should be done or not. Also introduce UNVEIL_MASK which checks if any user flags are set and is used to properly return EACCES vs ENOENT. This fixes a problem where unveil("/", "r") & unveil("/usr/bin/id", "rx") cause an error when read accessing "/usr/bin". It also makes sure that unveil(path, "") will return ENOENT for any access of anything under path. Reported by and OK semarie@
2022-01-07	hibernate_clear_signature() is only used by hibernate_resume(), so	Philip Guenther
	pass in the already read hibernate_info instead of reading it again. ok deraadt@
2022-01-07	Extract the slice from the zeroth swap device instead of assuming	Philip Guenther
	it's the 'b' slice and (sanity) check against the partition count. Also, make the "is union hibernate_info too large?" a compile time check. ok deraadt@
2022-01-04	Use the device we read the hibernate signature from for the entire	Philip Guenther
	resume. This fixes setups where a umass device no longer attaching at resume results in a softraid device being renumbered so the hibernate-time device is no longer correct. ok mlarkin@ jsing@
2022-01-02	immediatly -> immediately	Theo Buehler