src - OpenBSD base system

Age	Commit message	Author
2020-05-23	Prevent km_alloc() from returning garbage if pagelist is empty.	jan
	ok bluhm@, visa@
2020-04-23	Document uvmexp.nswget without relying on implementation details.	Martin Pieuchot
	Prompted by a question from schwarze@ ok deraadt@, schwarze@, visa@
2020-04-04	Tweak the code that wakes up uvm_pmalloc sleepers in the page daemon.	Mark Kettenis
	Although there are open questions about whether we should flag failures with UVM_PMA_FAIL or not, we really should only wake up a sleeper if we unlink the pma. For now only do that if pages were actually freed in the requested region. Prompted by: CID 1453061 Logically dead code which should be fixed by this commit. ok (and together with) beck@
2020-03-25	Do not test against NULL a variable which is dereferenced before that.	Martin Pieuchot
	CID 1453116 ok kettenis@
2020-03-24	Use FALLTHROUGH in uvm_total() like it is done in uvm_loadav().	Martin Pieuchot
	CID 1453262.
2020-03-04	Do not count pages mapped as PROT_NONE against the RLIMIT_DATA limit.	Mark Kettenis
	Instead count (and check the limit) when their protection gets flipped from PROT_NONE to something that permits access. This means that mprotect(2) may now fail if changing the protection would exceed RLIMIT_DATA. This helps code (such as Chromium's JavaScript interpreter) that reserves large chunks of address space but populates it sparsely. ok deraadt@, otto@, kurt@, millert@, robert@
2020-02-18	Cleanup <sys/kthread.h> and <sys/proc.h> includes.	Martin Pieuchot
	Do not include <sys/kthread.h> where it is not needed and stop including <sys/proc.h> in it. ok visa@, anton@
2020-01-20	struct vops is not modified during runtime so use const which moves each	Claudio Jeker
	into read-only data segment. OK deraadt@ tedu@
2020-01-16	Use list for freeing pages in uvn_flush() to optimize freeing chunks of	Mark Kettenis
	contiguous pages. ok beck@
2020-01-04	Add uvm_anfree_list() to free anons as a list of pages. Use this in	Bob Beck
	the amap code to free pages as a list instead of one at a time to allow for more efficient freeing. Most of the work done at elk lakes, with testing by me and mlarkin and kettenis. Considerably speeds up a test program which zeros a big pile of memory and then exits. ok kettenis@
2020-01-01	Add uvm_pmr_remove_1strange_reverse to efficiently free pages	Bob Beck
	in reverse order from uvm. Use it in uvm_pmr_freepageq when the pages appear to be in reverse order. This greatly improves cases of massive page freeing as noticed by mlarkin@ in his ongoing efforts to have the most gigantish buffer cache on the planet. Most of this work done by me with help and polish from kettenis@ at e2k19. Follow on commits to this will make use of this for more efficient freeing of amaps and a few other things. ok kettenis@ deraadt@
2019-12-30	convert infinite msleep(9) to msleep_nsec(9)	Jonathan Gray
	ok mpi@
2019-12-25	Hook up the shrinker for inteldrm(4). This is a "light" version that only	Mark Kettenis
	drops graphics buffers that are cached and not in active use. Help from beck@ for pointing out how to hook this up to our pagedaemon. ok jsg@
2019-12-18	Set vm_map's pmap in uvm_map_setup().	Visa Hankala
	OK guenther@, kettenis@, mpi@
2019-12-18	Use separate rwlock initializations for userland ("vmspace") and kernel	Mark Kettenis
	maps. This lets witness know that these really are different classes avoiding false positives when detecting lock order reversals. ok guenther@, visa@, mpi@
2019-12-12	Header cleanup.	Martin Pieuchot
	- reduces gratuitous differences with NetBSD,
	- merges multiple '#ifdef _KERNEL' blocks,
	- kills unused 'struct vm_map_intrsafe'
	- turns 'union vm_map_object' into an anonymous union (following NetBSD)
	- move questionable vm_map_modflags() into uvm/uvm_map.c
	- remove guards around MAX_KMAPENT, it is defined & used only once
	- document lock differences
	- fix tab vs space
	ok mlarkin@, visa@
2019-12-09	Many people have crossed the ABI, so re-enable "syscall call-from" checking.	Theo de Raadt

2019-12-09	improve comment for uvm_map_inentry_pc(), the underlying	Theo de Raadt
	non-writeable / syscall checker.
2019-12-08	Convert infinite sleeps to {m,t}sleep_nsec(9).	Martin Pieuchot
	ok visa@, jca@
2019-12-08	Remove an unnecessary #ifndef PMAP_EXCLUDE_DECLS. It was last utilized	Visa Hankala
	by sparc pmap. OK mpi@ guenther@ kettenis@
2019-12-06	Sync KVE_ET_* and UVM_ET_* flags.	Martin Pieuchot
	ok guenther@
2019-12-05	Move uvmexp_print() to a better place.	Martin Pieuchot
	ok mlarkin@
2019-12-05	Remove clause #3 from mrg@NetBSD license.	Martin Pieuchot
	On May 29, 2008, Matthew R. Green removed it in NetBSD: github.com/IIJ-NetBSD/netbsd-src/commit/7ea20401d535da9996394136ef ok deraadt@
2019-12-04	Fix a bad offset calculation in uvm_share.	Mike Larkin
	Syzkaller found a bug in uvm_share when using a vmd(8) mmap region with an offset that ended up making an overlap with a previous vmm(4) uvm_map range. This diff reworks the range and offset calculation in uvm_share. Only vmm(4) uses this, so there should be no visible effects outside vmm(4) environments. Syzkaller also went sorta crazy on this one, finding multiple reproducers for the same bug with just slightly different parameters, thus the multiple "Reported-by" lines below. ok stefan@, anton@ Reported-by: syzbot+2c625ab1b8e964da644a@syzkaller.appspotmail.com Reported-by: syzbot+1300829862412751462d@syzkaller.appspotmail.com Reported-by: syzbot+27cfad3394f34528cbec@syzkaller.appspotmail.com Reported-by: syzbot+3e700c5698177f91cce1@syzkaller.appspotmail.com
2019-12-02	Stop supporting UVM_FLAG_TRYLOCK in uvm_mapanon(), it is not used.	Martin Pieuchot
	ok tedu@, visa@
2019-11-30	temporarily neuter the syscall-callfrom check as a few people	Theo de Raadt
	haven't crossed over the ABI break as easily as expected.
2019-11-29	Add uvm_objfree function to free all pages in a uvm_obj in one go.	Bob Beck
	Use this in the buffer cache to free all the pages from a buffer, resulting in a considerable speedup when throwing away pages from the buffer cache. Lots of work done with mlarkin and kettenis. ok kettenis@ deraadt@
2019-11-29	Split out the code that removes a page from uvm objects and clears the flags	Mark Kettenis
	into a separate uvm_pageclean() function and call it from uvm_pagefree(). ok mpi@, guenther@, beck@
2019-11-29	Repurpose the "syscalls must be on a writeable page" mechanism to	Theo de Raadt
	enforce a new policy: system calls must be in pre-registered regions. We have discussed more strict checks than this, but none satisfy the cost/benefit based upon our understanding of attack methods, anyways let's see what the next iteration looks like.
	This is intended to harden (translation: attackers must put extra effort into attacking) against a mixture of W^X failures and JIT bugs which allow syscall misinterpretation, especially in environments with polymorphic-instruction/variable-sized instructions. It fits in a bit with libc/libcrypto/ld.so random relink on boot and no-restart-at-crash behaviour, particularly for remote problems. Less effective once on-host since the libraries can be read.
	For static-executables the kernel registers the main program's PIE-mapped exec section as valid, as well as the randomly-placed sigtramp page. For dynamic executables ELF ld.so's exec segment is also labelled valid; ld.so then has enough information to register libc's exec section as valid via call-once msyscall(2).
	For dynamic binaries, we continue to permit the main program exec segment because "go" (and potentially a few other applications) have embedded system calls in the main program. Hopefully at least go gets fixed soon.
	We declare the concept of embedded syscalls a bad idea for numerous reasons, as we notice the ecosystem has many static-syscall-in-base-binary programs which are dynamically linked against libraries which in turn use libc, which contains another set of syscall stubs. We've been concerned about adding even one additional syscall entry point... but go's approach tends to double the entry-point attack surface.
	This was started at a nano-hackathon in Bob Beck's basement 2 weeks ago during a long discussion with mortimer trying to hide from the SSL scream-conversations, and finished in more comfortable circumstances next to a wood-stove at Elk Lakes cabin with UVM scream-conversations.
	ok guenther kettenis mortimer, lots of feedback from others; conversations about go with jsing tb sthen
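
The policy in this commit ("system calls must be in pre-registered regions") can be modelled in userland as a range lookup on the address of the syscall instruction. The following is only an illustrative sketch: `struct sketch_region` and `sketch_pc_allowed()` are hypothetical names, and the real check lives in the kernel's uvm map code and is not shown in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one registered region: [start, end). */
struct sketch_region {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Return true if pc, the address a system call was issued from,
 * falls inside any of the n registered regions. In the policy
 * described above, a false result would mean the process is killed.
 */
bool
sketch_pc_allowed(const struct sketch_region *r, size_t n, uintptr_t pc)
{
	assert(r != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (pc >= r[i].start && pc < r[i].end)
			return true;
	}
	return false;
}
```

For a dynamic binary the registered set in this model would hold ld.so's exec segment, libc's exec section, the sigtramp page, and (for now) the main program's exec segment, as listed in the commit message.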
2019-11-28	uvm_pagealloc_contig() doesn't exist and shouldn't exist	Philip Guenther
	ok kettenis@
2019-11-28	Remove end of line whitespace.	Mike Larkin
	No code change.
2019-11-27	Add dummy msyscall(2) system call which is currently a noop. This will	Theo de Raadt
	be used by kernel and ld.so in the near future. Adding the system call earlier will reduce the number of people who try to build through and encounter agony. ok kettenis guenther
2019-11-26	Fix a panic string that had the wrong function name and an improperly	Mike Larkin
	wrapped line. No code change.
2019-11-26	Fix a bunch of lines that had trailing whitespace.	Mike Larkin
	No code change.
2019-11-05	Kill uvm_deallocate(9) and use uvm_unmap() directly.	Martin Pieuchot
	ok kettenis@, semarie@, deraadt@
2019-11-02	Revert previous, a race is present and can be triggered with golang.	Martin Pieuchot
	Found by jsing@
2019-11-02	Start documenting which locking primitives apply to uvm_map members.	Martin Pieuchot
	ok kettenis@
2019-11-01	Push the KERNEL_LOCK() down in uvm_map_inentry().	Martin Pieuchot
	The lookup in uvm_map_inentry_fix() is already serialized by the vm_map_lock and such lookup is already executed w/o the KERNEL_LOCK(). ok kettenis@, deraadt@
2019-11-01	Keep local function definitions in C files.	Martin Pieuchot

2019-09-09	Inform about system call memory write protection and stack mapping	Alexander Bluhm
	violations in system accounting. This will help to find misbehaving programs and possible attacks. The flags bit field is full, so recycle the PDP-11 compatibility on VAX. lastcomm(1) prints the AMAP flag as 'M'. daily(8) prints a list of affected processes. OK deraadt@
2019-07-18	R.I.P. UVM_WAIT(). Use tsleep_nsec(9) directly.	cheloha
	UVM_WAIT() doesn't provide much of a useful abstraction. All callers tsleep forever and no callers set PCATCH, so only 2 of 4 parameters are actually used. Might as well just use tsleep_nsec(9) directly and make the uvm code a bit less specialized. Suggested by mpi@. ok mpi@ visa@ millert@
2019-07-03	Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).	cheloha
	Equivalent to their unsuffixed counterparts except that (a) they take a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not zero) indicates that a timeout should not be set. For now, zero nanoseconds is not a strictly valid invocation: we log a warning on DIAGNOSTIC kernels if we see such a call. We still sleep until the next tick in such a case, however. In the future this could become some sort of poll... TBD. To facilitate conversions to these interfaces: add inline conversion functions to sys/time.h for turning your timeout into nanoseconds. Also do a few easy conversions for warmup and to demonstrate how further conversions should be done. Lots of input from mpi@ and ratchov@. Additional input from tedu@, deraadt@, mortimer@, millert@, and claudio@. Partly inspired by FreeBSD r247787. positive feedback from deraadt@, ok mpi@
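
The timeout convention described here can be illustrated with a small userland sketch. Everything below is an assumption for illustration: `SKETCH_INFSLP` stands in for INFSLP (UINT64_MAX, per the commit message), and `sketch_msec_to_nsec()` is a hypothetical analogue of the inline conversion functions the commit adds to sys/time.h, whose actual names and overflow behaviour are not given in this log. The sketch saturates on overflow and rejects zero with an assertion, which is stricter than the logged warning the commit describes.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for INFSLP: "do not set a timeout". */
#define SKETCH_INFSLP	UINT64_MAX

/*
 * Hypothetical conversion helper: milliseconds to nanoseconds,
 * saturating at UINT64_MAX instead of wrapping on overflow.
 */
uint64_t
sketch_msec_to_nsec(uint64_t msec)
{
	if (msec > UINT64_MAX / 1000000ULL)
		return UINT64_MAX;
	return msec * 1000000ULL;
}

/*
 * Return 1 if nsecs means "sleep until woken", 0 for a finite
 * timeout. Zero is not a strictly valid timeout in the interface
 * described above, so this sketch asserts against it.
 */
int
sketch_sleeps_forever(uint64_t nsecs)
{
	assert(nsecs != 0);
	return nsecs == SKETCH_INFSLP;
}
```

Under this convention a caller that used to pass a zero tick count to mean "forever" passes the infinite-sleep constant instead, and a finite timeout is converted explicitly into nanoseconds at the call site.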
2019-07-01	Document which mechanisms protect some fields used w/o KERNEL_LOCK().	Martin Pieuchot
	ok visa@, semarie@
2019-06-21	Make resource limit access MP-safe. So far, the copy-on-write sharing	Visa Hankala
	of resource limit structs has been done between processes. By applying copy-on-write also between threads, threads can read rlimits in a nearly lock-free manner. Inspired by code in DragonFly BSD and FreeBSD. OK mpi@, agreement from jmatthew@ and anton@
2019-06-14	The addition of writeable-syscall checking near MAP_STACK checking	Theo de Raadt
	damaged the error messages. Repair that, passing distinct format strings for the two cases. ok beck
2019-06-01	Refactor the MAP_STACK feature, and introduce another similar variation:	Theo de Raadt
	Lookup the address that a syscall instruction is executed from, and kill the process if that page is writeable. This brings an aspect of W^X behaviour to W|X mappings (in JITs not yet adapted to W^X). The goal is to remove simple attack methods and force use of ret2libc or other more complicated means. ok kettenis stefan visa
2019-05-16	Handle a bit more work without taking the kernel lock. This should avoid	Mark Kettenis
	taking the kernel lock when operating on the kernel_map when called from all kernel memory allocation interfaces. ok visa@, mlarkin@
2019-05-15	free size for amap; ok visa@	anton

2019-05-11	move the noise about W^X mapping failure inside the sysctl kern.wxabort	Theo de Raadt
	knob, since we found a program which tests RWX mapping then changes execution behaviour to non-W^X. (that program is chrome, as v8 is heading towards W^X compliance with mprotect RW/RX swaps, and also has jitless components in development.) ok sthen kettenis robert
2019-05-10	simplify logic after wakeup since this variable is only manipulated	Bob Beck
	under lock ok guenther@