path: root/sys/kern
Age  Commit message (Author)
2017-07-22  Introduce jiffies, a volatile unsigned long version of our ticks variable (Mark Kettenis)
for use by the linux compatibility APIs in drm(4). While I hate infecting code in sys/kern with this, untangling all the consequences of having different types and different signedness is too much for me right now. The best strategy may be to change ticks itself to be long, but that needs some careful auditing. ok deraadt@
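A minimal sketch of the idea (not the committed diff; where jiffies lives and where it is bumped are assumptions):

    extern volatile int ticks;          /* existing kernel tick counter */
    volatile unsigned long jiffies;     /* unsigned long view for the drm(4) linux compat code */

    /* assumed to be incremented next to the existing ticks++ in hardclock():
     *         ticks++;
     *         jiffies++;
     */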
2017-07-20  When receiving a struct sockaddr from userland, enforce that memory (Alexander Bluhm)
for sa_len and sa_family is provided. This will make handling of socket name mbufs within the kernel safer. issue reported by Ilja Van Sprundel; OK claudio@
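The shape of such a check, as a sketch (the caller and the length variable name are assumptions, not the committed code):

    /* reject a sockaddr from userland that is too short to even contain
     * the sa_len and sa_family header fields */
    if (buflen < offsetof(struct sockaddr, sa_data))
            return (EINVAL);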
2017-07-20  Initialize a local variable to not leak kernel stack info to userland (Martin Pieuchot)
if TIOCGPGRP fails. Issue found by Ilja van Sprundel. ok bluhm@, millert@, deraadt@
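The general pattern of such a fix, as a sketch (variable and helper names are illustrative, not the actual diff):

    pid_t pgid = -1;    /* initialized: a failing ioctl must not leave
                         * stack garbage behind for the copyout below */
    error = do_tiocgpgrp(fp, &pgid);    /* hypothetical helper */
    if (error == 0)
            error = copyout(&pgid, uaddr, sizeof(pgid));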
2017-07-20  If pool_get() sleeps while allocating additional memory for socket (Alexander Bluhm)
splicing, another process may allocate it in the meantime. Then one of the splicing structures leaked in sosplice(). Recheck that no struct sosplice exists after a potential sleep. reported by Ilja Van Sprundel; OK mpi@
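A sketch of the recheck described above (locking elided; do not read this as the exact committed code):

    if (so->so_sp == NULL) {
            struct sosplice *sp;

            /* PR_WAITOK means pool_get() may sleep... */
            sp = pool_get(&sosplice_pool, PR_WAITOK | PR_ZERO);
            /* ...so somebody else may have attached one meanwhile. */
            if (so->so_sp == NULL)
                    so->so_sp = sp;
            else
                    pool_put(&sosplice_pool, sp);
    }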
2017-07-20  Extend the scope of the socket lock in soo_stat() to protect `so_state' (Martin Pieuchot)
and `so_rcv'. ok bluhm@, claudio@, visa@
2017-07-20  Prepare filt_soread() to be locked. No functional change. (Martin Pieuchot)
ok bluhm@, claudio@, visa@
2017-07-19  Uninitialized variable can leak kernel memory. (Theo de Raadt)
Found by Ilja Van Sprundel ok kettenis
2017-07-19  Move KTRPOINT call up. The length variable i is getting aligned and so (Claudio Jeker)
uninitialised data can be dumped into the ktrace message. Found by Ilja Van Sprundel OK bluhm@
2017-07-18  Both syslog(3) and syslogd(8) truncate the message at 8192 bytes. (Alexander Bluhm)
Do the same in sendsyslog(2) and document the behavior. reported by Ilja Van Sprundel; OK millert@ deraadt@
2017-07-18  soreserve() modifies `so_snd' and `so_rcv', so assert that it is called (Martin Pieuchot)
with the socket lock. This change is safe because sbreserve() already asserts that the lock is held, but it acts as implicit documentation and indicates that I looked at the function.
2017-07-13  Do not unlock the netlock in the goto out error path before it has (Alexander Bluhm)
been acquired in sosend(). Fixes a kernel lock assertion panic. OK visa@ mpi@
2017-07-12  Invalidate read-ahead buffers when read short (Mike Belopuhov)
Buffercache performs read-ahead for cluster reads by extending the length of an original read operation to MAXPHYS (64k). Upon I/O completion, the length is trimmed, the buffer is returned to the filesystem and the remaining data is cached. However, under certain circumstances, the underlying hardware may fail to do a complete I/O operation and return with a non-zero residual length (i.e. data that wasn't read). The residual length may exceed the size of the original request and must be re-adjusted to uphold the contract with the caller, e.g. the filesystem. At the same time, read-ahead buffers that cover chunks of memory corresponding to the residual length must be invalidated and not cached. Discussed at length during d2k17, ok tedu
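Conceptually the completion path does something like this (a sketch with assumed variable names, not the committed code):

    /* on completion of a clustered read (sketch) */
    if (bp->b_resid > 0) {
            /* never report more residual than the caller asked for */
            if (bp->b_resid > original_bcount)
                    bp->b_resid = original_bcount;
            /* the read-ahead part was not filled: invalidate it instead
             * of caching short data */
            SET(rabp->b_flags, B_INVAL);
    }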
2017-07-12  Do not call fo_ioctl() in syscalls that do, or will, take the socket (Martin Pieuchot)
lock. Prevents a future lock recursion since soo_ioctl() will need to grab the lock. ok bluhm@, visa@
2017-07-12  Compute the level of contention only once. (Visa Hankala)
Suggested by and OK dlg@
2017-07-12  When there is no contention on a pool cache lock, lower the number (Visa Hankala)
of items that a cache list is allowed to hold. This lets the cache release resources back to the common pool after pressure on the cache has decreased. OK dlg@
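The tuning direction of these two changes, as a rough sketch (all field names here are invented for illustration):

    /* run periodically for each pool cache (sketch) */
    new_contention = pc->pc_contention_counter;
    if (new_contention == pc->pc_prev_contention && pc->pc_maxitems > 8)
            pc->pc_maxitems /= 2;                   /* idle: let lists shrink */
    pc->pc_prev_contention = new_contention;        /* computed only once */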
2017-07-10  make malloc(9) mpsafe by using a mutex instead of splvm. (David Gwynne)
this is almost a straightforward change of spl ops with mutex ops, except the accounting has been shuffled around. memory is counted as used before an attempt to allocate it from uvm is made to prevent overcommitting memory. this is modelled on how pools limit allocations. the uvm bits have been eyeballed by kettenis@ who says they should be safe. visa@ found some nits which have been fixed. tested by chris@ and amit kulkarni ok kettenis@ visa@ mpi@
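A sketch of the ordering described above (the mutex name is an assumption and the allocation call is simplified; ks_memuse/ks_limit are the kmemstats fields):

    mtx_enter(&malloc_mtx);
    if (ksp->ks_memuse + allocsize > ksp->ks_limit) {
            mtx_leave(&malloc_mtx);
            return (NULL);                  /* would overcommit */
    }
    ksp->ks_memuse += allocsize;            /* reserve before allocating */
    mtx_leave(&malloc_mtx);

    va = km_alloc(allocsize, &kv_any, &kp_dirty, &kd_waitok);   /* may sleep */
    if (va == NULL) {
            mtx_enter(&malloc_mtx);
            ksp->ks_memuse -= allocsize;    /* hand the reservation back */
            mtx_leave(&malloc_mtx);
    }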
2017-07-08  Revert grabbing the socket lock in kqueue filters. (Martin Pieuchot)
It is unsafe to sleep while iterating the list of pending events in kqueue_scan(). Reported by abieber@ and juanfra@
2017-07-04  some of this code was written in an era when spaces cost extra. (Ted Unangst)
add a little breathing room.
2017-07-04  Always hold the socket lock when calling sblock(). (Martin Pieuchot)
Implicitly protects `so_state' with the socket lock in sosend(). ok visa@, bluhm@
2017-07-04  Assert that the socket lock is held when `so_state' is modified. (Martin Pieuchot)
ok bluhm@, visa@
2017-07-04  Assert that the socket lock is held when `so_qlen' is modified. (Martin Pieuchot)
ok bluhm@, visa@
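The pattern these locking commits establish, shown as a fragment (not a specific diff):

    /* any code path that touches so_state, so_qlen, so_snd or so_rcv
     * now starts by asserting that the socket lock is held */
    soassertlocked(so);
    so->so_state |= SS_ISCONNECTED;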
2017-07-03  Do not grab the socket lock in doaccept() twice. Pass NOTE_SUBMIT (Alexander Bluhm)
to KNOTE() as we are already holding the lock. Fixes "panic: rw_enter: netlock locking against myself" reported by Gregor Best and reproduced with src/regress/lib/libtls/gotls. OK millert@
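The call shape when the caller already holds the lock, as a fragment:

    /* the socket lock is held here; NOTE_SUBMIT tells the kqueue filter
     * not to take it again */
    soassertlocked(so);
    KNOTE(&so->so_rcv.sb_sel.si_note, NOTE_SUBMIT);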
2017-07-03  Protect `so_state', `so_error' and `so_qlen' with the socket lock in (Martin Pieuchot)
kqueue filters. ok millert@, bluhm@, visa@
2017-06-29  Due to risks known for decades, TIOCSTI now performs no action, and simply (Theo de Raadt)
returns EIO. The base system has been cleaned of TIOCSTI uses (collaboration between anton and me), and the ports tree appears mostly clean. A few stragglers may be discovered and cleaned up later... In a month or so, we should see if the #define can be removed entirely. ok anton tedu, support from millert
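In the tty ioctl switch the change amounts to something like this (sketch):

    case TIOCSTI:
            return (EIO);   /* input injection no longer performed */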
2017-06-27  Add missing solock()/sounlock() dances around sbreserve(). (Martin Pieuchot)
While here document an abuse of parent socket's lock. Problem reported by krw@, analysis and ok bluhm@
2017-06-26  Assert that the corresponding socket is locked when manipulating socket (Martin Pieuchot)
buffers. This is one step towards unlocking the TCP input path. Note that the functions asserting the socket lock are not necessarily MP-safe; not all the fields of 'struct socket' are protected. Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to tell when a filter needs to lock the underlying data structures. Logic and name taken from NetBSD. Tested by Hrvoje Popovski. ok claudio@, bluhm@, mikeb@
2017-06-23  set the alignment of the per cpu cache structures to CACHELINESIZE. (David Gwynne)
hardcoding 64 is too optimistic.
2017-06-23  change the semantics for calculating when to grow the size of a cache list. (David Gwynne)
previously it would figure out if there's enough items overall for all the cpus to have full active and inactive free lists. this included currently allocated items, which pools wont actually hold on a free list and cannot predict when they will come back. instead, see if there's enough items in the idle lists in the depot that could instead go on all the free lists on the cpus. if there's enough idle items, then we can grow. tested by hrvoje popovski and amit kulkarni ok visa@
2017-06-22  calculate a "sum" based upon pointers to functions all over the kernel, (Theo de Raadt)
so that an unhibernate kernel can detect if it is running with the kernel it booted. ok mlarkin
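The idea, as a hypothetical sketch (the function name and the actual list of pointers are illustrative):

    /* fold addresses of well-known kernel functions into a value that
     * differs between kernel builds */
    u_long
    kernel_func_sum(void)
    {
            u_long sum = 0;

            sum += (u_long)&printf;
            sum += (u_long)&tsleep;
            sum += (u_long)&malloc;
            /* ... many more, spread all over the kernel ... */
            return (sum);
    }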
2017-06-21  Permit TIOCSTAT on a tty. (Theo de Raadt)
2017-06-20  In ddb print socket bit field so_state in hex to match SS_ defines. (Alexander Bluhm)
2017-06-20  Do not touch file pointers for which FILE_IS_USABLE() is false. (Gerhard Roth)
They might not be fully constructed. ok mpi@ deraadt@ bluhm@
2017-06-20  Convert sodidle() to timeout_set_proc(9), it needs a process context (Martin Pieuchot)
to grab the rwlock. Problem reported by Rivo Nurges. ok bluhm@
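A sketch of the conversion (the timeout member name is an assumption):

    /* run the idle handler from a process context so it can sleep on
     * the socket rwlock, instead of from soft interrupt context */
    timeout_set_proc(&so->so_idleto, sodidle, so);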
2017-06-19  dynamically scale the size of the per cpu cache lists. (David Gwynne)
if the lock around the global depot of extra cache lists is contended a lot in between the gc task runs, consider growing the number of entries a free list can hold. the size of the list is bounded by the number of pool items the current set of pages can represent to avoid having cpus starve each other. im not sure this semantic is right (or the least worst) but we're putting it in now to see what happens. this also means reality matches the documentation i just committed in pool_cache_init.9. tested by hrvoje popovski and amit kulkarni ok visa@
2017-06-19  Terminate pledge log(9) with newline. This fixes dmesg(8) output. (Alexander Bluhm)
found by regress/sys/kern/pledge/generic; OK deraadt@
2017-06-16  add garbage collection of unused lists of percpu cached items. (David Gwynne)
the cpu caches in pools amortise the cost of accessing global structures by moving lists of items around instead of individual items. excess lists of items are stored in the global pool struct, but these idle lists never get returned back to the system for use elsewhere. this adds a timestamp to the global idle list, which is updated when the idle list stops being empty. if the idle list hasn't been empty for a while, it means the per cpu caches arent using the idle entries and they can be recovered. timestamping the pages prevents recovery of a lot of items that may be used again shortly. eg, rx ring processing and replenishing from rate limited interrupts tends to allocate and free items in large chunks, which the timestamping smooths out. gc'ed lists are returned to the pool pages, which in turn get gc'ed back to uvm. ok visa@
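Roughly, the reclaim decision looks like this (a sketch; the field and helper names are invented for illustration):

    /* in the periodic pool gc (sketch) */
    if (pp->pr_cache_nlists > 0 &&
        ticks - pp->pr_cache_nonempty_ticks > POOL_CACHE_GC_TICKS) {
            pl = pool_cache_list_take(pp);  /* detach one idle list */
            pool_cache_list_put(pp, pl);    /* its items go back to the pages */
    }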
2017-06-16  split returning an item to the pool pages out of pool_put as pool_do_put. (David Gwynne)
this lets pool_cache_list_put return items to the pages. currently, if pool_cache_list_put is called while the per cpu caches are enabled, the items on the list will be put straight back onto another list in the cpu cache. this also avoids counting puts for these items twice. a put for the items has already been counted when the items went to a cpu cache, it doesnt need to be counted again when it goes back to the pool pages. another side effect of this is that pool_cache_list_put can take the pool mutex once when returning all the items in the list with pool_do_put, rather than once per item. ok visa@
2017-06-15  report contention on caches global data to userland. (David Gwynne)
2017-06-15  white space tweaks. no functional change. (David Gwynne)
2017-06-15  implement the backend of the sysctls that report pool cache info. (David Gwynne)
KERN_POOL_CACHE reports info about the global cache info, like how long the lists of cache items the cpus build should be and how many of these lists are idle on the pool struct. KERN_POOL_CACHE_CPUS reports counters from each cpu. the counters are for how many item and list operations the cache has handled on a cpu. the sysctl provides an array of ncpusfound * struct kinfo_pool_cache_cpu, not a single struct kinfo_pool_cache_cpu. tested by hrvoje popovski ok mikeb@ millert@
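From userland the new node could be read along these lines (a sketch; error handling omitted, and the mib layout should be checked against sysctl(2)):

    #include <sys/types.h>
    #include <sys/sysctl.h>
    #include <sys/pool.h>
    #include <stdlib.h>

    int pool_index = 1;     /* which pool to inspect; assumed known */
    int ncpusfound = 4;     /* normally taken from the hw.ncpufound sysctl */
    int mib[4] = { CTL_KERN, KERN_POOL, KERN_POOL_CACHE_CPUS, pool_index };
    size_t len = ncpusfound * sizeof(struct kinfo_pool_cache_cpu);
    struct kinfo_pool_cache_cpu *kpcc = malloc(len);

    /* the kernel fills one kinfo_pool_cache_cpu per cpu found */
    sysctl(mib, 4, kpcc, &len, NULL, 0);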
2017-06-14  tweak sysctl_string and sysctl_tstring to use size_t for lengths, not int (David Gwynne)
theyre both wrappers around sysctl__string, which is where half the fix is too.
2017-06-13  when enabling cpu caches, check the item size against the right thing (David Gwynne)
lists of free items on the per cpu caches are built out of the pool items as struct pool_cache_items, not struct pool_cache. make the KASSERT in pool_cache_init check that properly.
2017-06-13  use size_t for the size of things in memory, not int. (David Gwynne)
this tweaks the len argument to sysctl_rdstring, sysctl_struct, and sysctl_rdstruct. there's probably more to fix. ok millert@
2017-06-12  Pledge is fairly done, so the kernel printf's can be converted to log() (Theo de Raadt)
calls. They'll be a little less visible, but still in the system logs. ok bluhm
2017-06-08  ASLR, W^X, and guard pages trigger processor traps that result in (Alexander Bluhm)
SIGILL, SIGBUS, SIGSEGV signals. Make such memory violations visible in lastcomm(1). This also works if a program tries to hide them with a signal handler. Manual kill -SEGV does not generate false positives. OK deraadt@
2017-06-08  make rb_n2e return a struct rb_entry *, not void * (David Gwynne)
maybe this will help prevent misassignment in the future.
2017-06-08  use unsigned long instead of caddr_t to move between nodes and entries. (David Gwynne)
this removes the need for sys/param.h. this code can be built with only sys/tree.h, which in turn only needs sys/_null.h.
2017-06-08  add RBT_SET_LEFT, RBT_SET_RIGHT, and RBT_SET_PARENT (David Gwynne)
these are provided so an RBT and its topology can be copied without having to reinsert the copied nodes into a new tree. there are two reasons RBT_LEFT/RIGHT/PARENT macros cant be used like RB_LEFT/RIGHT/PARENT for this. firstly, RBT_LEFT and co are functions that return a pointer value, they dont provide access to the pointer itself for use as an lvalue that you can assign to. secondly, RBT entries dont store pointers to other nodes, they point to the RBT_ENTRY structures inside other nodes. this means that RBT_SET_LEFT and co have to get an offset from the node to the RBT_ENTRY and store that.
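A sketch of how a tree copy might use the new setters (the tree name and node variables are illustrative):

    /* dst is the copy of src; lcopy, rcopy and pcopy are the copies of
     * src's left child, right child and parent */
    RBT_SET_LEFT(mytree, dst, lcopy);
    RBT_SET_RIGHT(mytree, dst, rcopy);
    RBT_SET_PARENT(mytree, dst, pcopy);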
2017-06-07  Add an acct(5) flag for pledge violations. Then lastcomm(1) shows (Alexander Bluhm)
when something went wrong. This makes it possible to monitor whether the system is under attack and whether the attack has been prevented by OpenBSD pledge(2). OK deraadt@ millert@ jmc@
2017-06-07  Assert that the KERNEL_LOCK() is held when messing with routing, (Martin Pieuchot)
pfkey and unix sockets. ok claudio@