src - OpenBSD base system

Age	Commit message	Author
2016-11-02	poison the TAILQ_ENTRY in items in the per cpu pool cache.	David Gwynne

2016-11-02	add poisoning of items on the per cpu caches.	David Gwynne
	it copies the existing pool code, except it works on pool_list structures instead of pool_item structures. after this id like to poison the words used by the TAILQ_ENTRY in the pool_list struct that arent used until a list of items is moved into the global depot.
2016-11-02	use a TAILQ to maintain the list of item lists used by the percpu code.	David Gwynne
	it makes it more readable, and fixes a bug in pool_list_put where it was returning the next item in the current list rather than the next list to be freed.
2016-11-02	add per cpu caches for free pool items.	David Gwynne
	this is modelled on whats described in the "Magazines and Vmem: Extending the Slab Allocator to Many CPUs and Arbitrary Resources" paper by Jeff Bonwick and Jonathan Adams. the main semantic borrowed from the paper is the use of two lists of free pool items on each cpu, and only moving one of the lists in and out of a global depot of free lists to mitigate against a cpu thrashing against that global depot. unlike slabs, pools do not maintain or cache constructed items, which allows us to use the items themselves to build the free list rather than having to allocate arrays to point at constructed pool items. the per cpu caches are built on top of the cpumem api. this has been kicked a bit by hrvoje popovski and simon mages (thank you). im putting it in now so it is easier to work on and test. ok jmatthew@
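The two-lists-per-cpu idea described above can be sketched roughly as follows. This is a single-threaded illustration, not the pool code itself: the names `struct item`, `struct cpu_cache`, `struct depot`, `cache_put`, `cache_get` and the `LIST_MAX` limit are all hypothetical, and the locking a real global depot needs is left out.

```c
#include <stddef.h>

#define LIST_MAX 4	/* hypothetical: items per list before it counts as full */

/* A free item doubles as a list node, so no side array is needed. */
struct item {
	struct item *next;	/* next free item in the same list */
	struct item *next_list;	/* only meaningful on a list head in the depot */
	size_t nitems;		/* length of the list starting at this item */
};

struct cpu_cache {
	struct item *active;	/* list currently being filled and drained */
	struct item *prev;	/* the second list, either full or empty */
};

struct depot {
	struct item *lists;	/* stack of full lists (locked in real code) */
	size_t nlists;
};

static size_t
list_len(struct item *l)
{
	return (l == NULL ? 0 : l->nitems);
}

/* Free an item into the cpu's cache; only a whole list moves to the depot. */
void
cache_put(struct cpu_cache *c, struct depot *d, struct item *it)
{
	if (list_len(c->active) >= LIST_MAX) {
		if (c->prev != NULL) {
			c->prev->next_list = d->lists;
			d->lists = c->prev;
			d->nlists++;
		}
		c->prev = c->active;
		c->active = NULL;
	}
	it->next = c->active;
	it->next_list = NULL;
	it->nitems = list_len(c->active) + 1;
	c->active = it;
}

/* Allocate from the cache; NULL means the caller must go to the pool. */
struct item *
cache_get(struct cpu_cache *c, struct depot *d)
{
	struct item *it;

	if (c->active == NULL) {
		if (c->prev != NULL) {
			c->active = c->prev;
			c->prev = NULL;
		} else if (d->lists != NULL) {
			c->active = d->lists;
			d->lists = c->active->next_list;
			d->nlists--;
		} else
			return (NULL);
	}
	it = c->active;
	c->active = it->next;
	return (it);
}
```

Because a cpu only touches the depot when both of its lists are full (on put) or both are empty (on get), a workload that alternates between a few frees and a few allocations stays entirely within the local lists.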
2016-10-27	For consistency, allow symlinkat(2) in the same way as symlink(2);	Ingo Schwarze
	no need to wait until the first program using it breaks... "could make sense" semarie@ (and thanks for the cluestick) OK deraadt@
2016-10-27	use ncpusfound to size the percpu allocations.	David Gwynne
	ncpus is used on half the architectures to indicate the number of cpus that have been hatched, and is used on them in things like ddb to figure out how many cpus to shut down again. ncpusfound is incremented during autoconf on MP machines to show how big ncpus will probably become. percpu is initted after autoconf but before cpus are hatched, so this works well.
2016-10-27	refactor m_pullup a bit.	David Gwynne
	the most important change is that if the requested data is already in the first mbuf in the chain, return quickly. if that isnt true, the code will try to use the first mbuf to fit the requested data. if that isnt true, it will prepend an mbuf, and maybe a cluster, to fit the requested data. m_pullup will now try to maintain the alignment of the original payload, even when prepending a new mbuf for it. ok mikeb@
2016-10-27	add a new pool for 2k + 2 byte (mcl2k2) clusters.	David Gwynne
	a certain vendor likes to make chips that specify the rx buffer sizes in kilobyte increments. unfortunately it places the ethernet header on the start of the rx buffer, which means if you give it a mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos mcl2k clusters are always allocated on 2k boundaries (cos they pack into pages well). that in turn means the ip header wont be aligned correctly. the current workaround on these chips has been to let non-strict alignment archs just use the normal 2k cluster, but use whatever cluster can fit 2k + 2 on strict archs. that turns out to be the 4k cluster, meaning we waste nearly 2k of space on every packet. properly aligning the ethernet header and ip headers gives a performance boost, even on non-strict archs.
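The alignment arithmetic behind the extra two bytes can be worked through in a few lines. This is only a sketch: the helper names `ip_hdr_misalignment` and `cluster_bytes_needed` are hypothetical, while the 14 byte ethernet header and the 2 byte `ETHER_ALIGN` pad are the usual values.

```c
#include <stddef.h>

#define ETHER_HDR_LEN	14	/* dst mac + src mac + ethertype */
#define ETHER_ALIGN	2	/* pad that puts the ip header on a 4 byte boundary */

/*
 * How far off a 4 byte boundary the ip header lands when the ethernet
 * header starts `pad` bytes into a buffer located at `buf_addr`.
 */
size_t
ip_hdr_misalignment(size_t buf_addr, size_t pad)
{
	return ((buf_addr + pad + ETHER_HDR_LEN) % 4);
}

/* Bytes a cluster must hold for a chip that wants `rxbuf` bytes after the pad. */
size_t
cluster_bytes_needed(size_t rxbuf, size_t pad)
{
	return (rxbuf + pad);
}
```

With a buffer on a 2k boundary and no pad, the ip header sits 2 bytes off alignment. With the pad it is aligned, but a chip that insists on a 2048 byte rx buffer then needs 2050 bytes, which no longer fits a 2k cluster.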
2016-10-24	avoid using realloc in the name of things that dont work like realloc.	David Gwynne
	cpumem_realloc and counters_realloc actually allocated new per cpu data for new cpus, they didnt resize the existing allocation. specifically, this renames cpumem_realloc to cpumem_malloc_ncpus, and counters_realloc to counters_alloc_ncpus. ok (and with some fixes by) bluhm@
2016-10-24	move the mbstat structure to percpu counters	David Gwynne
	each cpus counters still have to be protected by splnet, but this is better than a single set of counters protected by a global mutex. ok bluhm@
2016-10-24	non-MP vs MP codepaths were confusingly split between the .c and .h file.	Theo de Raadt
	Unify these by placing #ifdef MULTIPROCESSOR inside the functions, then collapse further to reduce _KERNEL blocks ok dlg
2016-10-23	unbreak by fixing obvious pastos	Christian Weisgerber

2016-10-23	handle non-INET6 kernels in some way	Theo de Raadt

2016-10-23	dns hijacking must be af specific. move it into the port check function,	Ted Unangst
	and redirect inet6 sockets to the ::1 flavor of localhost.
2016-10-22	Factor out pr->ps_vmspace into a local variable for fill_kproc()	Philip Guenther
	ok jsing@ kettenis@
2016-10-22	Adjust allocpid() to take into account lastpid	Philip Guenther
	ok jsing@ kettenis@
2016-10-22	Delete dead copy of pr->ps_vmspace; uvmspace_exec() can change it anyway	Philip Guenther
	ok kettenis@ jsing@
2016-10-21	pledge changes needed to support pledging vmd(8) on i386, forthcoming.	Mike Larkin
	ok deraadt@
2016-10-21	consistently zero the allocated memory in both the MP and UP cases.	David Gwynne
	from markus@
2016-10-21	add generalised access to per cpu data structures and counters.	David Gwynne
	both the cpumem and counters api simply allocates memory for each cpu in the system that can be used for arbitrary per cpu data (via cpumem), or a versioned set of counters per cpu (counters). there is an alternate backend for uniprocessor systems that basically turns the percpu data access into an immediate access to a single allocation. there is also support for percpu data structures that are available at boot time by providing an allocation for the boot cpu. after autoconf, these allocations have to be resized to provide for all cpus that were enumerated by boot. ok mpi@
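One way to picture "a versioned set of counters per cpu" is a generation number that the owning cpu bumps around each update, so a reader can detect that it raced with a writer and retry. The sketch below assumes that scheme; it is not the kernel's counters api, and `struct cpu_counters`, `sketch_counters_add` and `sketch_counters_read` are hypothetical names. It uses C11 atomics with default ordering for simplicity where real code would use cheaper barriers.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NCPUS		4	/* hypothetical fixed cpu count */
#define NCOUNTERS	2

struct cpu_counters {
	atomic_uint gen;			/* odd while an update is in progress */
	_Atomic uint64_t c[NCOUNTERS];
};

/* Called by the cpu that owns `cc`; no lock is shared with other cpus. */
void
sketch_counters_add(struct cpu_counters *cc, int which, uint64_t v)
{
	atomic_fetch_add(&cc->gen, 1);		/* generation becomes odd */
	atomic_fetch_add(&cc->c[which], v);
	atomic_fetch_add(&cc->gen, 1);		/* generation becomes even */
}

/* Sum every cpu's counters into out[], retrying a cpu if it changed. */
void
sketch_counters_read(struct cpu_counters *all, uint64_t *out)
{
	uint64_t snap[NCOUNTERS];
	unsigned int g;
	int cpu, i;

	for (i = 0; i < NCOUNTERS; i++)
		out[i] = 0;

	for (cpu = 0; cpu < NCPUS; cpu++) {
		struct cpu_counters *cc = &all[cpu];

		do {
			g = atomic_load(&cc->gen);
			for (i = 0; i < NCOUNTERS; i++)
				snap[i] = atomic_load(&cc->c[i]);
		} while ((g & 1) || g != atomic_load(&cc->gen));

		for (i = 0; i < NCOUNTERS; i++)
			out[i] += snap[i];
	}
}
```

The point of the design is that updates never contend on shared state; only the comparatively rare reader pays for consistency by rereading.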
2016-10-19	Change process_{domem,auxv_offset}() to take a process instead of a proc.	Philip Guenther
	Make process_auxv_offset() take and release a reference of the vmspace like process_domem() does. ok kettenis@
2016-10-19	Change pmap_proc_iflush() to take a process instead of a proc	Philip Guenther
	powerpc: rename second argument of pmap_proc_iflush() to match other archs ok kettenis@
2016-10-15	Process groups can't be removed if a zombie process is in them, so	Philip Guenther
	ispidtaken() can rely on pgfind() for all pgrp checks and can simply use zombiefind() for the zombie check ok jca@
2016-10-10	white space fixes.	David Gwynne
	no functional change
2016-10-10	copy the offset of data inside mbufs in m_copym().	David Gwynne
	this is cheap since it is basic math. it also means that payloads which have been aligned carefully will also be aligned in their copy. ok yasuoka@ claudio@
2016-10-09	With systrace and procfs gone, process_checkioperm() and process_domem()	Philip Guenther
	are for option PTRACE only ok kettenis@
2016-10-09	sowakeup() is only called from sorwakeup() and sowwakeup(). Both	Alexander Bluhm
	have an splsoftassert(IPL_SOFTNET) now, so sowakeup() does not need to call splsoftnet() anymore. From mpi@'s netlock diff; OK mikeb@
2016-10-08	upon further review, port numbers go all the way up to ushort max	Ted Unangst

2016-10-08	initialize the port variable before sysctl, since it's also read out.	Ted Unangst

2016-10-08	Add ktracing of the fds returned by pipe() and socketpair()	Philip Guenther
	ok deraadt@
2016-10-07	introduce a sysctl to hijack dns sockets. when set to a port number,	Ted Unangst
	all dns socket connections will be redirected to localhost:port. this could be a sockopt on the listening socket, but sysctl is an easier interface to work with right now. ok deraadt
2016-10-06	Remove redundant comments that say a function must be called at	Alexander Bluhm
	splsoftnet() if the function does a splsoftassert(IPL_SOFTNET) anyway.
2016-10-06	Separate splsoftnet() from variable initialization.	Alexander Bluhm
	From mpi@'s netlock diff; OK mikeb@
2016-10-06	In pledge_namei_wlpath() if resolvpath() errors out early it will not	Jonathan Gray
	set variables that will be later used as the size argument to free() calls. This should be harmless as free returns early if the address is NULL without checking the size. Initialise these variables before the call to ensure they are never passed to another function uninitialised. ok tedu@ millert@ deraadt@
2016-10-05	Display the process's PID with p->p_p->ps_pid, not p->p_pid.	Philip Guenther
	Use a local variable struct process *pr to simplify expressions ok deraadt@
2016-10-05	Display/test/use the process PID, not the thread's TID, in a few places.	Philip Guenther
	ok mpi@ mikeb@
2016-10-03	avoid holding timeout_mutex while interacting with the scheduler.	David Gwynne
	as noted by haesbaert, this is necessary to avoid deadlocks because the scheduler can call back into the timeout subsystem while its holding its own locks. this happened in two places. firstly, in softclock() it would take timeout_mutex to find pending work. if that pending work needs a process context, it would queue the work for the thread and call wakeup, which enters the scheduler locks. if another cpu is trying to tsleep (or msleep) with a timeout specified, the sleep code would be holding the sched lock and call timeout_add, which takes timeout_mutex. this is solved by deferring the wakeup to after timeout_mutex is left. this also has the benefit of mitigating the number of wakeups done per softclock tick. secondly, the timeout worker thread takes timeout_mutex and calls msleep when there's no work to do (ie, the queue is empty). msleep will take the sched locks. again, if another cpu does a tsleep with a timeout, you get a deadlock. to solve this im using sleep_setup and sleep_finish to sleep on an empty queue, which is safe to do outside the lock as it is comparisons of the queue head pointers, not derefs of the contents of the queue. as long as the sleeps and wakeups are ordered correctly with the enqueue and dequeue operations under the mutex, this all works. you can think of the queue as a single descriptor ring, and the wakeup as an interrupt. the second deadlock was identified by guenther@ ok tedu@ mpi@
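The first fix, deferring the wakeup until timeout_mutex has been released, can be modelled in a few lines. This is a single-threaded toy, not the kernel code: `struct fake_mutex`, `struct softclock_model`, `model_wakeup` and `model_softclock` are hypothetical, and the "mutex" is just a flag so the model can record whether a wakeup ever happened while it was held.

```c
#include <stdbool.h>

struct fake_mutex {
	bool held;
};

struct softclock_model {
	struct fake_mutex timeout_mtx;
	int pending_proc;		/* expired timeouts needing process context */
	int queued;			/* work handed to the worker thread */
	int wakeups;			/* total wakeups issued */
	int wakeups_under_mutex;	/* wakeups issued with the mutex held */
};

static void
model_wakeup(struct softclock_model *s)
{
	/* In the kernel this is where the scheduler locks would be taken. */
	if (s->timeout_mtx.held)
		s->wakeups_under_mutex++;
	s->wakeups++;
}

void
model_softclock(struct softclock_model *s)
{
	bool need_wakeup = false;

	s->timeout_mtx.held = true;		/* enter timeout_mutex */
	while (s->pending_proc > 0) {
		s->pending_proc--;
		s->queued++;
		need_wakeup = true;		/* remember, do not wake yet */
	}
	s->timeout_mtx.held = false;		/* leave timeout_mutex */

	if (need_wakeup)
		model_wakeup(s);		/* one wakeup, outside the mutex */
}
```

Besides never entering the scheduler with the mutex held, the deferred form issues a single wakeup per tick no matter how many timeouts were queued.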
2016-10-02	Add va_nlink information to struct kinfo_file (so bump the shlib minor)	Philip Guenther
	from Sebastien Marie
2016-09-30	Drop a now unneeded variable initialization; spotted by bluhm@	Jeremie Courreges-Anglas

2016-09-30	Make read(2) return EISDIR on directories.	Jeremie Courreges-Anglas
	Years ago Theo made read(2) return 0 on directories, instead of dumping the directory content. Another behavior is allowed as an extension by POSIX, returning an EISDIR error, as used on a few other systems. This behavior is deemed more useful as it helps spotting errors. This implies that it might break some setups. Ports bulk builds by ajacoutot@ and naddy@, ok millert@ bluhm@ naddy@ deraadt@
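A small probe makes the behaviour change observable from userland. The helper name `read_dir_errno` is hypothetical; it only uses open(2), read(2) and close(2), and what it returns depends on the system it runs on.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to read(2) from `path`.  Returns -1 if it could not be opened,
 * 0 if the read did not fail, and the errno value if it did.
 */
int
read_dir_errno(const char *path)
{
	char buf[64];
	int fd, error = 0;

	fd = open(path, O_RDONLY);
	if (fd == -1)
		return (-1);
	if (read(fd, buf, sizeof(buf)) == -1)
		error = errno;
	close(fd);
	return (error);
}
```

On a kernel with this commit, `read_dir_errno("/")` should yield `EISDIR`; with the older behaviour described above it would yield 0 because the read returned 0 bytes rather than failing.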
2016-09-28	Cast enum to u_int when doing a bounds check to avoid a clang warning that	Mark Kettenis
	the comparison is always true. ok jca@, tedu@
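The pattern looks roughly like this. The enum and function are hypothetical stand-ins for the kernel code that was changed; the idea is that a single unsigned comparison rejects both negative and too-large values without an explicit `>= 0` test for the compiler to complain about.

```c
/* Hypothetical enum; the last member is the number of valid values. */
enum color {
	COLOR_RED,
	COLOR_GREEN,
	COLOR_BLUE,
	COLOR_MAX
};

/*
 * If the compiler picks an unsigned type for the enum, a check such as
 * `c >= 0` is always true and clang warns about it.  Casting to unsigned
 * folds the lower and upper bound into one comparison instead.
 */
int
color_valid(enum color c)
{
	return ((unsigned int)c < COLOR_MAX);
}
```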
2016-09-27	move from RB macros to RBT functions	David Gwynne

2016-09-26	Regen	Jeremie Courreges-Anglas

2016-09-26	unbalenced->unbalanced	Jeremie Courreges-Anglas

2016-09-25	Make a move towards ending 4 decades of kernel snooping.	Theo de Raadt
	Add sysctl kern.allowkmem (default 0) which controls the ability to open /dev/mem or /dev/kmem at securelevel > 0. Over 15 years we converted 99% of utilities in the tree to operate on sysctl-nodes (either by themselves or via code hiding in the guts of -lkvm). pstat -d and -v & procmap are affected and continued use of them will require kern.allowkmem=1 in /etc/sysctl.conf. acpidump (and its buddy sendbug) are affected, but we'll work out a solution soon. There will be some impact in ports. ok kettenis guenther
2016-09-24	move knhash size to event.h, use it for hashfree. from Mathieu -	Ted Unangst
	ok guenther
2016-09-24	introduce hashfree() function to free hash tables, with sizes.	Ted Unangst
	ok guenther
2016-09-22	Introduce a new 'softclock' thread that will be used to execute timeout	Martin Pieuchot
	callbacks needing a process context. The function timeout_set_proc(9) has to be used instead of timeout_set(9) when a timeout callback needs a process context. Note that if such a timeout is waiting (that is, sleeping) for a non-negligible amount of time, it might delay other timeouts needing a process context. dlg@ agrees with this as a temporary solution. Manpage tweaks from jmc@ ok kettenis@, bluhm@, mikeb@
2016-09-21	sysctl KERN_ARND is no longer used (in ports, it only occurs in fallback	Theo de Raadt
	paths of libevent). This interface was the first generation of what eventually became getentropy(2) and arc4random(3) -- june 1997! Ports scan by sthen, general agreement guenther
2016-09-20	Protect soshutdown() with splsoftnet() to define one layer where	Alexander Bluhm
	we enter networking code. Fixes an splassert() found by David Hill. OK mikeb@