path: root/sys/uvm
Age | Commit message | Author
2021-01-19(re)Introduce locking for amaps & anons.Martin Pieuchot
A rwlock is attached to every amap and is shared with all its anons. The same lock will be used by multiple amaps if they have anons in common. This should be enough to get the upper part of the fault handler out of the KERNEL_LOCK(), which seems to bring up to 20% improvement in builds. This is based/copied/adapted from the most recent work done in NetBSD, which is an evolution of the preceding simple_lock scheme. Tested by many, thanks! ok kettenis@, mvs@
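The lock-sharing scheme described above can be sketched in userspace C as follows. This is a minimal illustration, not the kernel's code: the struct and function names (`anon_share_lock`, `anons_share_lock`, the `busy` field standing in for a real rwlock) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch of the shared amap/anon lock scheme: every amap owns a lock
 * object, and each anon points at the lock of an amap that references
 * it.  Amaps with anons in common therefore serialize on one lock.
 */
struct lock {
	int busy;		/* stand-in for a real rwlock */
};

struct amap {
	struct lock am_lock;	/* owned by the amap */
};

struct anon {
	struct lock *an_lock;	/* borrowed from an owning amap */
};

/* Attach an anon to an amap: the anon inherits the amap's lock. */
void
anon_share_lock(struct anon *an, struct amap *am)
{
	an->an_lock = &am->am_lock;
}

/* Two anons contend iff they resolve to the same lock object. */
int
anons_share_lock(const struct anon *a, const struct anon *b)
{
	return a->an_lock == b->an_lock;
}
```

Because the anon only borrows a pointer, locking "the anon" and locking "its amap" take the same lock exactly once, which is what lets the fault handler drop the KERNEL_LOCK() for this path.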
2021-01-16Move `access_type' to the fault context.Martin Pieuchot
Fix a regression where the value wasn't correctly overwritten for wired mappings, introduced in the previous refactoring. ok mvs@
2021-01-11Assert that the KERNEL_LOCK() is held in uao_set_swslot().Martin Pieuchot
ok kettenis@
2021-01-09Enforce range with sysctl_int_bounded in swap_encrypt_ctlgnezdo
OK millert@
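The bounded-sysctl pattern the commit above switches to can be sketched in userspace C. The function below (`int_bounded_update`) is a simplified, hypothetical stand-in, not sysctl_int_bounded()'s real signature: a new value is accepted only when it lies within the given range, otherwise the variable is left untouched and an error is returned.

```c
#include <assert.h>
#include <errno.h>

/*
 * Simplified sketch of range-enforced integer updates: reject values
 * outside [minv, maxv] and leave the target variable unchanged.
 */
int
int_bounded_update(int *var, int newval, int minv, int maxv)
{
	if (newval < minv || newval > maxv)
		return EINVAL;
	*var = newval;
	return 0;
}
```

Centralizing the range check in one helper avoids a family of open-coded, subtly different validations across sysctl handlers.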
2021-01-02uvm: uvm_fault_lower(): don't sleep on lboltcheloha
We can simulate the current behavior without lbolt by sleeping for 1 second on the &nowake channel. ok mpi@
2020-12-28Use per-CPU counters for fault and stats counters reached in uvm_fault().Martin Pieuchot
ok kettenis@, dlg@
2020-12-15Remove the assertion in uvm_km_pgremove().Martin Pieuchot
At least some initialization code on i386 calls it w/o KERNEL_LOCK(). Found the hard way by jungle Boogie and Hrvoje Popovski.
2020-12-14Grab the KERNEL_LOCK() or ensure it's held when poking at swap data structures.Martin Pieuchot
This will allow uvm_fault_upper() to enter swap-related functions without holding the KERNEL_LOCK(). ok jmatthew@
2020-12-08Use a while loop instead of goto in uvm_fault().Martin Pieuchot
ok jmatthew@, tb@
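The refactoring above replaces a label-and-goto retry with structured control flow. A minimal illustration (function names hypothetical, not the actual uvm_fault() code):

```c
#include <assert.h>

/* Before: a retry path implemented with a label and goto. */
int
fault_goto(int budget)
{
	int tries = 0;
ReFault:
	tries++;
	if (--budget > 0)
		goto ReFault;		/* jump back on retry */
	return tries;
}

/* After: the same control flow expressed as a while loop. */
int
fault_while(int budget)
{
	int tries = 0;

	while (1) {
		tries++;
		if (--budget > 0)
			continue;	/* retry */
		break;			/* done */
	}
	return tries;
}
```

Both functions perform identical work; the loop form makes the retry boundary explicit and keeps local declarations ahead of the code that repeats.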
2020-12-07Convert the per-process thread list into a SMR_TAILQ.Martin Pieuchot
Currently all iterations are done under KERNEL_LOCK() and therefore use the *_LOCKED() variant. From and ok claudio@
2020-12-02Document that the page queue must only be locked if the page is managed.Martin Pieuchot
ok kettenis@
2020-12-01Turn uvm_pagealloc() mp-safe by checking uvmexp global with pageqlock held.Martin Pieuchot
Use a new flag, UVM_PLA_USERESERVE, to tell uvm_pmr_getpages() that using kernel reserved pages is allowed. Merge duplicated checks waking the pagedaemon into uvm_pmr_getpages(). Add two more pages to the amount reserved for the kernel to compensate for the fact that the pagedaemon may now consume an additional page. Document locking of some uvmexp fields. ok kettenis@
2020-11-27Set the correct IPL for `pageqlock' now that it is grabbed from interrupt.Martin Pieuchot
Reported by AIsha Tammy. ok kettenis@
2020-11-24Grab the `pageqlock' before calling uvm_pageclean() as intended.Martin Pieuchot
Document which global data structures require this lock and add some asserts where the lock should be held. Some code paths are still incorrect and should be revisited. ok jmatthew@
2020-11-19Move logic handling lower faults, case 2, to its own function.Martin Pieuchot
No functional change. ok kettenis@, jmatthew@, tb@
2020-11-16Remove Case2 goto, use a simple if () instead.Martin Pieuchot
ok tb@, jmatthew@
2020-11-13Use a helper to look for existing mapping & return if there's an anon.Martin Pieuchot
Separate fault handling code for type 1 and 2 and reduce differences with NetBSD. ok tb@, jmatthew@, kettenis@
2020-11-13Move the logic dealing with faults 1A & 1B to its own function.Martin Pieuchot
Some minor documentation improvements and style nits, but this should not contain any functional change. ok tb@
2020-11-13Introduce amap_adjref_anons(), a helper to reference count amaps.Martin Pieuchot
Reduce code duplication, reduce differences with NetBSD and simplify upcoming locking diff. ok jmatthew@
2020-11-06Remove unused `anon' argument from uvmfault_unlockall().Martin Pieuchot
It won't be used when amap and anon locking is introduced. This "fixes" passing an unrelated/uninitialized pointer in an error path in case of memory shortage. ok kettenis@
2020-10-26Fix a deadlock between uvn_io() and uvn_flush().anton
While faulting on a page backed by a vnode, uvn_io() will end up being called in order to populate newly allocated pages using I/O on the backing vnode. Before performing the I/O, newly allocated pages are flagged as busy by uvn_get(), that is, before uvn_io() tries to lock the vnode. Such pages could then end up being flushed by uvn_flush(), which has already acquired the vnode lock. Since such pages are flagged as busy, uvn_flush() will wait for them to be flagged as not busy. This will never happen, as uvn_io() cannot make progress until the vnode lock is released. Instead, grab the vnode lock before allocating and flagging pages as busy in uvn_get(). This does extend the scope in uvn_get() in which the vnode is locked, but resolves the deadlock. ok mpi@ Reported-by: syzbot+e63407b35dff08dbee02@syzkaller.appspotmail.com
2020-10-24We will soon have DRM on powerpc64.Mark Kettenis
2020-10-21move the backwards-stack vm_minsaddr check from hppa trap.c to uvm_grow(),Theo de Raadt
within the correct #ifdef of course. ok kettenis
2020-10-21Constify and use C99 initializer for "struct uvm_pagerops".Martin Pieuchot
While here put some KERNEL_ASSERT_LOCKED() in the functions called from the page fault handler. The removal of locking of `uobj' will need to be revisited; these are good indicators that something is missing and that many comments are lying. ok kettenis
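The constification above relies on C99 designated initializers. A hedged sketch of the pattern (the `pagerops` struct and `dummy_get` below are illustrative stand-ins, not the real `struct uvm_pagerops` layout):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical method table, standing in for struct uvm_pagerops. */
struct pagerops {
	void (*pgo_init)(void);		/* init pager */
	int  (*pgo_get)(int);		/* get pages */
};

static int
dummy_get(int n)
{
	return n + 1;
}

/*
 * C99 designated initializers name each field, so the table survives
 * field reordering, and `const' lets the compiler place the function
 * pointers in read-only memory.  Unnamed members are zero-initialized.
 */
const struct pagerops example_pager = {
	.pgo_get = dummy_get,		/* .pgo_init is left NULL */
};
```

With positional initializers, inserting a new member into the struct silently shifts every table; with designated initializers the compiler matches fields by name.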
2020-10-21Move the top part of uvm_fault() (lookups, checks, etc) in their own function.Martin Pieuchot
The name, uvm_fault_check(), and logic come from NetBSD, as reducing the diff with their tree is useful to learn from their experience and backport fixes. No functional change intended. ok kettenis@
2020-10-20Remove guard, uao_init() is called only once and no other function uses it.Martin Pieuchot
ok kettenis@
2020-10-19Clear vmspace pointer in struct process before calling uvmspace_free(9).Mark Kettenis
ok patrick@, mpi@
2020-10-19Serialize accesses to "struct vmspace" and document its refcounting.Martin Pieuchot
The underlying vm_space lock is used as a substitute to the KERNEL_LOCK() in uvm_grow() to make sure `vm_ssize' is not corrupted. ok anton@, kettenis@
2020-10-13typo in commentMartin Pieuchot
2020-10-12Use KASSERT() instead of if(x) panic() for NULL dereference checks.Martin Pieuchot
Improves readability and reduces the difference with NetBSD without compromising debuggability on RAMDISK. While here also use local variables to help with future locking and reference counting. ok semarie@
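The KASSERT() conversion above can be sketched in userspace, with assert() standing in for the kernel's panic machinery (in the kernel, a failed KASSERT panics and reports the failed expression, file, and line). The `deref_checked` function is a hypothetical example, not code from the tree:

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in: the kernel macro panics instead of aborting. */
#define KASSERT(e)	assert(e)

/*
 * Before: if (p == NULL) panic("deref_checked: null pointer");
 * After: a one-line assertion with a uniform failure message.
 */
int
deref_checked(const int *p)
{
	KASSERT(p != NULL);
	return *p;
}
```

The macro form reads better than scattered open-coded `if (x) panic(...)` blocks and keeps the diagnostics consistent across the tree.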
2020-10-09Remove unnecessary includes.Martin Pieuchot
ok deraadt@
2020-10-07Do not release the KERNEL_LOCK() when mmap(2)ing files.Martin Pieuchot
Previous attempt to unlock amap & anon exposed a race in vnode reference counting. So be conservative with the code paths that we're not fully moving out of the KERNEL_LOCK() to allow us to concentrate on one area at a time. The panic reported was: ....panic: vref used where vget required ....db_enter() at db_enter+0x5 ....panic() at panic+0x129 ....vref(ffffff03b20d29e8) at vref+0x5d ....uvn_attach(1010000,ffffff03a5879dc0) at uvn_attach+0x11d ....uvm_mmapfile(7,ffffff03a5879dc0,2,1,13,100000012) at uvm_mmapfile+0x12c ....sys_mmap(c50,ffff8000225f82a0,1) at sys_mmap+0x604 ....syscall() at syscall+0x279 Note that this change has no effect as long as mmap(2) is still executed with ze big lock. ok kettenis@
2020-10-04Recent changes for PROT_NONE pages to not count against resource limits,Theo de Raadt
failed to note that this also guarded against heavy amap allocations in the MAP_SHARED case. Bring back the checks for MAP_SHARED from semarie, ok kettenis https://syzkaller.appspot.com/bug?extid=d80de26a8db6c009d060
2020-09-29Introduce a helper to check if all available swap is in use.Martin Pieuchot
This reduces code duplication, reduces the diff with NetBSD and will help to introduce locks around global variables. ok cheloha@
2020-09-25Use KASSERT() instead of if(x) panic() for sanity checks.Martin Pieuchot
Reduce the diff with NetBSD. ok kettenis@, deraadt@
2020-09-24Remove trailing white spaces.Martin Pieuchot
2020-09-22Spell inline correctly.Martin Pieuchot
Reduce differences with NetBSD. ok mvs@, kettenis@
2020-09-22Kill outdated comment, pmap_enter(9) doesn't sleep.Martin Pieuchot
ok kettenis@
2020-09-14Since the issues with calling uvm_map_inentry_fix() without holding theMark Kettenis
kernel lock are fixed now, push the kernel lock down again. ok deraadt@
2020-09-13Include <sys/systm.h> directly instead of relying on uvm_map.h to pull it.Martin Pieuchot
2020-09-12Add tracepoints in the page fault handler and when entries are added to maps.Martin Pieuchot
ok kettenis@
2020-07-06fix spellingTheo de Raadt
2020-07-06Add support for timecounting in userland.Paul Irofti
This diff exposes parts of clock_gettime(2) and gettimeofday(2) to userland via libc, liberating processes from the need for a context switch every time they want to count the passage of time. If a timecounter clock can be exposed to userland, then it needs to set its tc_user member to a non-zero value. Tested with one or multiple counters per architecture. The timing data is shared through a pointer found in the new ELF auxiliary vector AUX_openbsd_timekeep containing timehands information that is frequently updated by the kernel. Timing differences between the last kernel update and the current time are adjusted in userland by the tc_get_timecount() function inside the MD usertc.c file. This permits a much more responsive environment, quite visible in browsers, office programs and gaming (apparently one is able to fly in Minecraft now). Tested by robert@, sthen@, naddy@, kmos@, phessler@, and many others! OK from at least kettenis@, cheloha@, naddy@, sthen@
2020-06-24kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)cheloha
time_second(9) and time_uptime(9) are widely used in the kernel to quickly get the system UTC or system uptime as a time_t. However, time_t is 64-bit everywhere, so it is not generally safe to use them on 32-bit platforms: you have a split-read problem if your hardware cannot perform atomic 64-bit reads. This patch replaces time_second(9) with gettime(9), a safer successor interface, throughout the kernel. Similarly, time_uptime(9) is replaced with getuptime(9). There is a performance cost on 32-bit platforms in exchange for eliminating the split-read problem: instead of two register reads you now have a lockless read loop to pull the values from the timehands. This is really not *too* bad in the grand scheme of things, but compared to what we were doing before it is several times slower. There is no performance cost on 64-bit (__LP64__) platforms. With input from visa@, dlg@, and tedu@. Several bugs squashed by visa@. ok kettenis@
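The "lockless read loop" mentioned above can be sketched in userspace C. This is an illustrative model, not the kernel's timehands code: the `timehand` struct, its fields, and `read_stable` are hypothetical. A writer bumps a generation counter around its update; a reader retries until it sees the same, stable generation before and after copying the two 32-bit halves, so it can never observe a torn 64-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a timehands snapshot on a 32-bit machine. */
struct timehand {
	volatile unsigned gen;		/* bumped by the writer around updates */
	uint32_t sec_hi, sec_lo;	/* 64-bit count, split into two words */
};

/*
 * Lockless read: retry until the generation is nonzero (no update in
 * flight) and unchanged across the copy of both halves.
 */
int64_t
read_stable(const struct timehand *th)
{
	unsigned gen;
	uint32_t hi, lo;

	do {
		gen = th->gen;
		hi = th->sec_hi;
		lo = th->sec_lo;
	} while (gen == 0 || gen != th->gen);	/* raced with a writer: retry */

	return ((int64_t)hi << 32) | lo;
}
```

This is the cost the commit describes: two plain register reads become a short retry loop, still far cheaper than a lock and correct on hardware without atomic 64-bit loads.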
2020-05-23Prevent km_alloc() from returning garbage if pagelist is empty.jan
ok bluhm@, visa@
2020-04-23Document uvmexp.nswget without relying on implementation details.Martin Pieuchot
Prompted by a question from schwarze@ ok deraadt@, schwarze@, visa@
2020-04-04Tweak the code that wakes up uvm_pmalloc sleepers in the page daemon.Mark Kettenis
Although there are open questions about whether we should flag failures with UVM_PMA_FAIL or not, we really should only wake up a sleeper if we unlink the pma. For now, only do that if pages were actually freed in the requested region. Prompted by CID 1453061 ("Logically dead code"), which should be fixed by this commit. ok (and together with) beck@
2020-03-25Do not test against NULL a variable which is dereferenced before that.Martin Pieuchot
CID 1453116 ok kettenis@
2020-03-24Use FALLTHROUGH in uvm_total() like it is done in uvm_loadav().Martin Pieuchot
CID 1453262.
2020-03-04Do not count pages mapped as PROT_NONE against the RLIMIT_DATA limit.Mark Kettenis
Instead count (and check the limit) when their protection gets flipped from PROT_NONE to something that permits access. This means that mprotect(2) may now fail if changing the protection would exceed RLIMIT_DATA. This helps code (such as Chromium's JavaScript interpreter) that reserves large chunks of address space but populates it sparsely. ok deraadt@, otto@, kurt@, millert@, robert@