src - OpenBSD base system

Age	Commit message	Author
2021-01-19	(re)Introduce locking for amaps & anons.	Martin Pieuchot
	A rwlock is attached to every amap and is shared with all its anons. The same lock will be used by multiple amaps if they have anons in common. This should be enough to get the upper part of the fault handler out of the KERNEL_LOCK(), which seems to bring up to a 20% improvement in builds. This is based/copied/adapted from the most recent work done in NetBSD, which is an evolution of the preceding simple_lock scheme. Tested by many, thanks! ok kettenis@, mvs@
2021-01-16	Move `access_type' to the fault context.	Martin Pieuchot
	Fix a regression where the value wasn't correctly overwritten for wired mappings, introduced in the previous refactoring. ok mvs@
2021-01-11	Assert that the KERNEL_LOCK() is held in uao_set_swslot().	Martin Pieuchot
	ok kettenis@
2021-01-09	Enforce range with sysctl_int_bounded in swap_encrypt_ctl	gnezdo
	OK millert@
2021-01-02	uvm: uvm_fault_lower(): don't sleep on lbolt	cheloha
	We can simulate the current behavior without lbolt by sleeping for 1 second on the &nowake channel. ok mpi@
2020-12-28	Use per-CPU counters for fault and stats counters reached in uvm_fault().	Martin Pieuchot
	ok kettenis@, dlg@
2020-12-15	Remove the assertion in uvm_km_pgremove().	Martin Pieuchot
	At least some initialization code on i386 calls it w/o KERNEL_LOCK(). Found the hard way by jungle Boogie and Hrvoje Popovski.
2020-12-14	Grab the KERNEL_LOCK() or ensure it's held when poking at swap data structures.	Martin Pieuchot
	This will allow uvm_fault_upper() to enter swap-related functions without holding the KERNEL_LOCK(). ok jmatthew@
2020-12-08	Use a while loop instead of goto in uvm_fault().	Martin Pieuchot
	ok jmatthew@, tb@
2020-12-07	Convert the per-process thread list into a SMR_TAILQ.	Martin Pieuchot
	Currently all iterations are done under KERNEL_LOCK() and therefore use the *_LOCKED() variant. From and ok claudio@
2020-12-02	Document that the page queue must only be locked if the page is managed.	Martin Pieuchot
	ok kettenis@
2020-12-01	Turn uvm_pagealloc() mp-safe by checking uvmexp global with pageqlock held.	Martin Pieuchot
	Use a new flag, UVM_PLA_USERESERVE, to tell uvm_pmr_getpages() that using kernel reserved pages is allowed. Merge duplicated checks waking the pagedaemon into uvm_pmr_getpages(). Add two more pages to the amount reserved for the kernel to compensate for the fact that the pagedaemon may now consume an additional page. Document locking of some uvmexp fields. ok kettenis@
2020-11-27	Set the correct IPL for `pageqlock' now that it is grabbed from interrupt.	Martin Pieuchot
	Reported by Aisha Tammy. ok kettenis@
2020-11-24	Grab the `pageqlock' before calling uvm_pageclean() as intended.	Martin Pieuchot
	Document which global data structures require this lock and add some asserts where the lock should be held. Some code paths are still incorrect and should be revisited. ok jmatthew@
2020-11-19	Move logic handling lower faults, case 2, to its own function.	Martin Pieuchot
	No functional change. ok kettenis@, jmatthew@, tb@
2020-11-16	Remove Case2 goto, use a simple if () instead.	Martin Pieuchot
	ok tb@, jmatthew@
2020-11-13	Use a helper to look for existing mapping & return if there's an anon.	Martin Pieuchot
	Separate fault handling code for type 1 and 2 and reduce differences with NetBSD. ok tb@, jmatthew@, kettenis@
2020-11-13	Move the logic dealing with faults 1A & 1B to its own function.	Martin Pieuchot
	Some minor documentation improvements and style nits but this should not contain any functional change. ok tb@
2020-11-13	Introduce amap_adjref_anons(), a helper to reference count amaps.	Martin Pieuchot
	Reduce code duplication, reduce differences with NetBSD and simplify the upcoming locking diff. ok jmatthew@
2020-11-06	Remove unused `anon' argument from uvmfault_unlockall().	Martin Pieuchot
	It won't be used when amap and anon locking is introduced. This "fixes" passing an unrelated/uninitialized pointer in an error path in case of memory shortage. ok kettenis@
2020-10-26	Fix a deadlock between uvn_io() and uvn_flush(). While faulting on a	anton
	page backed by a vnode, uvn_io() will end up being called in order to populate newly allocated pages using I/O on the backing vnode. Before performing the I/O, newly allocated pages are flagged as busy by uvn_get(), that is before uvn_io() tries to lock the vnode. Such pages could then end up being flushed by uvn_flush() which already has acquired the vnode lock. Since such pages are flagged as busy, uvn_flush() will wait for them to be flagged as not busy. This will never happen as uvn_io() cannot make progress until the vnode lock is released. Instead, grab the vnode lock before allocating and flagging pages as busy in uvn_get(). This does extend the scope in uvn_get() in which the vnode is locked but resolves the deadlock. ok mpi@ Reported-by: syzbot+e63407b35dff08dbee02@syzkaller.appspotmail.com
2020-10-24	We will soon have DRM on powerpc64.	Mark Kettenis

2020-10-21	move the backwards-stack vm_minsaddr check from hppa trap.c to uvm_grow(),	Theo de Raadt
	within the correct #ifdef of course. ok kettenis
2020-10-21	Constify and use C99 initializer for "struct uvm_pagerops".	Martin Pieuchot
	While here put some KERNEL_ASSERT_LOCKED() in the functions called from the page fault handler. The removal of locking of `uobj' will need to be revisited and these are a good indicator that something is missing and that many comments are lying. ok kettenis
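The "constify and use C99 initializer" pattern named in this commit can be illustrated with a minimal userland sketch. The structure, its members and the helper functions below are hypothetical stand-ins, not the kernel's actual "struct uvm_pagerops" definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for a pager operations table. */
struct sketch_pagerops {
	void	(*pgo_init)(void);
	int	(*pgo_get)(int);
	int	(*pgo_put)(int);
};

/* Hypothetical "get" operation: simply echoes the page number. */
static int
sketch_aobj_get(int pageno)
{
	return pageno;
}

/*
 * C99 designated initializer: only the members that are implemented are
 * named, the remaining ones are implicitly NULL.  Declaring the table
 * const lets the compiler place it in read-only data.
 */
const struct sketch_pagerops sketch_aobj_ops = {
	.pgo_get = sketch_aobj_get,
};

/* Dispatch through the table, returning -1 if the operation is absent. */
int
sketch_pager_get(const struct sketch_pagerops *ops, int pageno)
{
	assert(ops != NULL);
	if (ops->pgo_get == NULL)
		return -1;
	return ops->pgo_get(pageno);
}
```

Naming the members makes the table robust against reordering of the structure, which a positional initializer is not.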
2020-10-21	Move the top part of uvm_fault() (lookups, checks, etc) in their own function.	Martin Pieuchot
	The name, uvm_fault_check(), and logic come from NetBSD as reducing the diff with their tree is useful to learn from their experience and backport fixes. No functional change intended. ok kettenis@
2020-10-20	Remove guard, uao_init() is called only once and no other function uses one.	Martin Pieuchot
	ok kettenis@
2020-10-19	Clear vmspace pointer in struct process before calling uvmspace_free(9).	Mark Kettenis
	ok patrick@, mpi@
2020-10-19	Serialize accesses to "struct vmspace" and document its refcounting.	Martin Pieuchot
	The underlying vm_space lock is used as a substitute for the KERNEL_LOCK() in uvm_grow() to make sure `vm_ssize' is not corrupted. ok anton@, kettenis@
2020-10-13	typo in comment	Martin Pieuchot

2020-10-12	Use KASSERT() instead of if(x) panic() for NULL dereference checks.	Martin Pieuchot
	Improves readability and reduces the difference with NetBSD without compromising debuggability on RAMDISK. While here also use local variables to help with future locking and reference counting. ok semarie@
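The conversion described in this commit can be sketched in userland as follows. SKETCH_KASSERT(), struct sketch_anon and both functions are hypothetical illustrations; the macro only imitates the general shape of the kernel's KASSERT(9) and is not its actual definition.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userland stand-in for the kernel's KASSERT(9). */
#define SKETCH_KASSERT(cond)						\
	do {								\
		if (!(cond)) {						\
			fprintf(stderr,					\
			    "assertion \"%s\" failed: %s:%d\n",		\
			    #cond, __FILE__, __LINE__);			\
			abort();					\
		}							\
	} while (0)

/* Hypothetical object with a reference count. */
struct sketch_anon {
	int	an_ref;
};

/* Old style: explicit test followed by a hand-written panic message. */
int
sketch_ref_old(struct sketch_anon *anon)
{
	if (anon == NULL) {
		fprintf(stderr, "sketch_ref_old: NULL anon\n");
		abort();
	}
	return ++anon->an_ref;
}

/* New style: the condition itself documents the invariant. */
int
sketch_ref_new(struct sketch_anon *anon)
{
	SKETCH_KASSERT(anon != NULL);
	return ++anon->an_ref;
}
```

Both variants stop on a NULL pointer; the second states the expected condition in one line and reports the failing expression and location automatically.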
2020-10-09	Remove unnecessary includes.	Martin Pieuchot
	ok deraadt@
2020-10-07	Do not release the KERNEL_LOCK() when mmap(2)ing files.	Martin Pieuchot
	Previous attempt to unlock amap & anon exposed a race in vnode reference counting. So be conservative with the code paths that we're not fully moving out of the KERNEL_LOCK() to allow us to concentrate on one area at a time. The panic reported was:
	    panic: vref used where vget required
	    db_enter() at db_enter+0x5
	    panic() at panic+0x129
	    vref(ffffff03b20d29e8) at vref+0x5d
	    uvn_attach(1010000,ffffff03a5879dc0) at uvn_attach+0x11d
	    uvm_mmapfile(7,ffffff03a5879dc0,2,1,13,100000012) at uvm_mmapfile+0x12c
	    sys_mmap(c50,ffff8000225f82a0,1) at sys_mmap+0x604
	    syscall() at syscall+0x279
	Note that this change has no effect as long as mmap(2) is still executed with ze big lock. ok kettenis@
2020-10-04	Recent changes for PROT_NONE pages to not count against resource limits,	Theo de Raadt
	failed to note this also guarded against heavy amap allocations in the MAP_SHARED case. Bring back the checks for MAP_SHARED from semarie, ok kettenis https://syzkaller.appspot.com/bug?extid=d80de26a8db6c009d060
2020-09-29	Introduce a helper to check if all available swap is in use.	Martin Pieuchot
	This reduces code duplication, reduces the diff with NetBSD and will help to introduce locks around global variables. ok cheloha@
2020-09-25	Use KASSERT() instead of if(x) panic() for sanity checks.	Martin Pieuchot
	Reduce the diff with NetBSD. ok kettenis@, deraadt@
2020-09-24	Remove trailing white spaces.	Martin Pieuchot

2020-09-22	Spell inline correctly.	Martin Pieuchot
	Reduce differences with NetBSD. ok mvs@, kettenis@
2020-09-22	Kill outdated comment, pmap_enter(9) doesn't sleep.	Martin Pieuchot
	ok kettenis@
2020-09-14	Since the issues with calling uvm_map_inentry_fix() without holding the	Mark Kettenis
	kernel lock are fixed now, push the kernel lock down again. ok deraadt@
2020-09-13	Include <sys/systm.h> directly instead of relying on uvm_map.h to pull it.	Martin Pieuchot

2020-09-12	Add tracepoints in the page fault handler and when entries are added to maps.	Martin Pieuchot
	ok kettenis@
2020-07-06	fix spelling	Theo de Raadt

2020-07-06	Add support for timecounting in userland.	Paul Irofti
	This diff exposes parts of clock_gettime(2) and gettimeofday(2) to userland via libc, liberating processes from the need for a context switch every time they want to count the passage of time. If a timecounter clock can be exposed to userland then it needs to set its tc_user member to a non-zero value. Tested with one or multiple counters per architecture. The timing data is shared through a pointer found in the new ELF auxiliary vector AUX_openbsd_timekeep containing timehands information that is frequently updated by the kernel. Timing differences between the last kernel update and the current time are adjusted in userland by the tc_get_timecount() function inside the MD usertc.c file. This permits a much more responsive environment, quite visible in browsers, office programs and gaming (apparently one is able to fly in Minecraft now). Tested by robert@, sthen@, naddy@, kmos@, phessler@, and many others! OK from at least kettenis@, cheloha@, naddy@, sthen@
2020-06-24	kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)	cheloha
	time_second(9) and time_uptime(9) are widely used in the kernel to quickly get the system UTC or system uptime as a time_t. However, time_t is 64-bit everywhere, so it is not generally safe to use them on 32-bit platforms: you have a split-read problem if your hardware cannot perform atomic 64-bit reads. This patch replaces time_second(9) with gettime(9), a safer successor interface, throughout the kernel. Similarly, time_uptime(9) is replaced with getuptime(9). There is a performance cost on 32-bit platforms in exchange for eliminating the split-read problem: instead of two register reads you now have a lockless read loop to pull the values from the timehands. This is really not too bad in the grand scheme of things, but compared to what we were doing before it is several times slower. There is no performance cost on 64-bit (__LP64__) platforms. With input from visa@, dlg@, and tedu@. Several bugs squashed by visa@. ok kettenis@
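The split-read problem and the "lockless read loop" mentioned in this message can be sketched in userland with C11 atomics. Everything below is a hypothetical illustration of a generation-counter read loop, not the kernel's timehands code: the structure, the function names and the choice to store the 64-bit value as two 32-bit halves are all assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: a 64-bit time stored as two 32-bit halves. */
struct sketch_timehands {
	_Atomic uint32_t th_generation;	/* 0 while an update is in flight */
	_Atomic uint32_t th_sec_lo;
	_Atomic uint32_t th_sec_hi;
};

void
sketch_init(struct sketch_timehands *th)
{
	atomic_init(&th->th_generation, 1);
	atomic_init(&th->th_sec_lo, 0);
	atomic_init(&th->th_sec_hi, 0);
}

/* Writer: invalidate, store both halves, publish a new generation. */
void
sketch_update(struct sketch_timehands *th, int64_t now)
{
	uint32_t gen;

	assert(th != NULL);
	gen = atomic_load_explicit(&th->th_generation, memory_order_relaxed);
	atomic_store_explicit(&th->th_generation, 0, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&th->th_sec_lo, (uint32_t)now,
	    memory_order_relaxed);
	atomic_store_explicit(&th->th_sec_hi,
	    (uint32_t)((uint64_t)now >> 32), memory_order_relaxed);
	if (++gen == 0)		/* never publish the "in flight" value */
		gen = 1;
	atomic_store_explicit(&th->th_generation, gen, memory_order_release);
}

/* Reader: retry until both halves were read under one generation. */
int64_t
sketch_gettime(struct sketch_timehands *th)
{
	uint32_t gen, lo, hi;

	do {
		gen = atomic_load_explicit(&th->th_generation,
		    memory_order_acquire);
		lo = atomic_load_explicit(&th->th_sec_lo,
		    memory_order_relaxed);
		hi = atomic_load_explicit(&th->th_sec_hi,
		    memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
	} while (gen == 0 || gen != atomic_load_explicit(
	    &th->th_generation, memory_order_relaxed));

	return (int64_t)(((uint64_t)hi << 32) | lo);
}
```

The reader never takes a lock: it pays for consistency with a possible retry, which is the extra cost on 32-bit platforms that the message describes.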
2020-05-23	Prevent km_alloc() from returning garbage if pagelist is empty.	jan
	ok bluhm@, visa@
2020-04-23	Document uvmexp.nswget without relying on implementation details.	Martin Pieuchot
	Prompted by a question from schwarze@ ok deraadt@, schwarze@, visa@
2020-04-04	Tweak the code that wakes up uvm_pmalloc sleepers in the page daemon.	Mark Kettenis
	Although there are open questions about whether we should flag failures with UVM_PMA_FAIL or not, we really should only wake up a sleeper if we unlink the pma. For now only do that if pages were actually freed in the requested region. Prompted by: CID 1453061 Logically dead code which should be fixed by this commit. ok (and together with) beck@
2020-03-25	Do not test against NULL a variable which is dereferenced before that.	Martin Pieuchot
	CID 1453116 ok kettenis@
2020-03-24	Use FALLTHROUGH in uvm_total() like it is done in uvm_loadav().	Martin Pieuchot
	CID 1453262.
2020-03-04	Do not count pages mapped as PROT_NONE against the RLIMIT_DATA limit.	Mark Kettenis
	Instead count (and check the limit) when their protection gets flipped from PROT_NONE to something that permits access. This means that mprotect(2) may now fail if changing the protection would exceed RLIMIT_DATA. This helps code (such as Chromium's JavaScript interpreter) that reserves large chunks of address space but populates it sparsely. ok deraadt@, otto@, kurt@, millert@, robert@