path: root/sys/uvm
2024-04-13  correct indentation  (Jonathan Gray)
no functional change, found by smatch warnings ok miod@ bluhm@
2024-04-10  Use uvmpd_dropswap() in the case of swap shortage.  (Martin Pieuchot)
ok kn@, kettenis@, miod@
2024-04-10  Call uao_dropswap() instead of rerolling it.  (Martin Pieuchot)
ok kn@, kettenis@, miod@
2024-04-06  Prevent accounting bug when an anon w/ swap slot is passed to uvm_anon_release()  (Martin Pieuchot)
uvm_anon_release() is always called for anons that have an associated page, so decrementing `uvmexp.swpgonly' is incorrect. This happened because the page was cleared before calling uvm_anfree(). Reported by many including mvs@, miod@ and robert@ ok kettenis@, miod@
2024-04-05  delete msyscall stub  (Theo de Raadt)
2024-04-05  On machines lacking xonly support hardware, we emulate xonly in the  (Theo de Raadt)
copyin(9) layer below system calls, using a 4-entry lookup; the 4th entry is libc.so text. We were assuming, or rather insisting, that on all our architectures libc.so text is treated as xonly, even if the linker was behind in its game. Since msyscall(2) is gone, the kernel no longer has information about the start,len of the libc.so text segment. But we can instead use the (same) start,len range of pinsyscalls() for this purpose. ld.so is passing the same text-range to the kernel in this position. Regression tests run by anton discovered that libc.so text had become copyin-readable. ok kettenis
2024-04-05  Ensure the base,len range provided by ld.so is definitely in the map.  (Theo de Raadt)
Being outside the map doesn't seem like it can do anything bad. Discussed with kettenis
2024-04-03  Stop grabbing the kernel lock in kbind(2).  (Mark Kettenis)
ok mpi@
2024-04-03  pmap_virtual_space() and pmap_steal_memory() are mutually exclusive, so  (Miod Vallat)
make sure only one of them is prototyped and only one of them is implemented. ok mpi@ kettenis@
2024-04-02  Delete the msyscall mechanism entirely, since mimmutable+pinsyscalls has  (Theo de Raadt)
replaced it with a more strict mechanism, which happens to be lockless O(1) rather than micro-lock O(1)+O(log N). Also nop-out the sys_msyscall(2) guts, but leave the syscall around for a bit longer so that people can build through it, since ld.so(1) still wants to call it.
2024-03-30  Document that pmemrange control data are protected by `uvm.fpageqlock'.  (Martin Pieuchot)
2024-03-28  Delete pinsyscall(2) [which was specific only to SYS_execve] now  (Theo de Raadt)
that it has been replaced with pinsyscalls(2) [which tells the kernel the location of all system calls in libc.so] floated to various people before release, but it was prudent to wait.
2024-03-27  Initialize uvm_km_pages.mtx before use.  (Kurt Miller)
okay mpi@ miod@
2024-03-24  Cleanup uvmpd_tune() & document global variable ownership.  (Martin Pieuchot)
- Stop calling uvmpd_tune() inside uvm_pageout(). OpenBSD currently doesn't support adding RAM. `uvmexp.npages' is immutable after boot.
- Document that `uvmexp.freemin' and `uvmexp.freetarg' are immutable.
- Reduce the scope of the `uvm_pageq_lock' lock. It serializes accesses to `uvmexp.active' and `uvmexp.inactive'.
ok kettenis@
2024-02-21  Only return EPERM for immutable regions for the nasty operations  (Theo de Raadt)
of madvise() and msync() which damage the region. The sync ones are allowed to proceed (even if most of them are nops...) based on issues noted by anton and semarie
2024-02-13  Remove sanity checks from uvm_pagefree(). The first thing this function does  (Miod Vallat)
is invoke uvm_pageclean(), which performs the exact same sanity check, so one set of checks is enough. ok mpi@
2024-02-03  Remove Softdep.  (Bob Beck)
Softdep has been a no-op for some time now, this removes it to get it out of the way. Flensing mostly done in Tallinn, with some help from krw@ ok deraadt@
2024-01-21  workaround for the static non-PIE "instbin" program on the install  (Theo de Raadt)
media is no longer needed, due to fix in libc/dlfcn/init.c thanks kettenis and gkoehler
2024-01-21  For minherit(MAP_INHERIT_ZERO) upon readonly memory return EPERM.  (Theo de Raadt)
ok kettenis
2024-01-21  madvise(2) and msync(2) have some memory/mapping destructive ops which should  (Theo de Raadt)
not be allowed upon immutable memory, instead return EPERM. Some of these ops are not destructive in OpenBSD, but they are destructive on other systems, so we take the "all ops" are illegal approach. Related to this, it should not be allowed to minherit(MAP_INHERIT_ZERO) immutable regions, or vice versa, calling mimmutable() upon MAP_INHERIT_ZERO regions, because such a range will be zero'd post-fork in the child. These now also return EPERM. Adjusting the madvise / msync behaviour upon immutable memory brings us closer to the behaviour of the mimmutable clone "mseal" being proposed by google for inclusion in Linux. ok kettenis
2024-01-21  oops, brain scrambled trying to squeeze the ifdef into a bad place  (Theo de Raadt)
2024-01-21  some bizarre glitch related to ramdisk instbin static binaries, their  (Theo de Raadt)
mutable mapping is not working right, so temporarily bring back the RW -> R *only* for ramdisk kernels
2024-01-20  Early during mimmutable(2) development, we had a big problem with the  (Theo de Raadt)
chrome v8_flags variable's placement in bss, and as a workaround made it possible to demote a mimmutable mapping's permissions from RW to R. Further mimmutable-related work in libc's malloc created the same problem, which led to a better design: objects could be placed into .openbsd.mutable region, and then at runtime their permission and immutability could be manipulated better. So the RW to R demotion logic is no longer being used, and now this semantic is being deleted. ok kettenis
2024-01-19  remove the guts of pinsyscall(2), it just returns 0 now.  (Theo de Raadt)
It has been made redundant by the introduction of pinsyscalls(2) which handles all system calls, rather than just 1.
2024-01-17  Fix core file writing when a file mapped into memory has later been truncated  (Kurt Miller)
to be smaller than the mapping. Record which memory segments are backed by vnodes while walking the uvm map and later suppress EFAULT errors caused by the underlying file being truncated. okay miod@
2024-01-16  The kernel will now read pinsyscall tables out of PT_OPENBSD_SYSCALLS in  (Theo de Raadt)
the main program or ld.so, and accept a submission of that information for libc.so from ld.so via pinsyscalls(2). At system call invocation, the syscall number is matched to the specific address it must come from. ok kettenis, gnezdo, testing of variations by many people
2023-12-07  Add a stub pinsyscalls() system call that simply returns 0 for now,  (Theo de Raadt)
before future work where ld.so(1) will need this new system call. Putting this in the kernel ahead of time will save some grief. ok kettenis
2023-12-05  Cast uvmexp.swpages to long before multiplying by 99 to avoid integer  (Claudio Jeker)
overflows on systems with big swap partitions. OK kettenis@ miod@
2023-10-27  Make out-of-swap checks more robust.  (Martin Pieuchot)
Consider that the swap space is full when 99% of it is filled with pages that are no longer present in memory. This prevents deadlocks when out-of-swap if some swap ranges had I/O errors and have been marked as 'bad', or if some pages are unreachable by the pagedaemon and still holding some slots. Also introduce uvm_swapisfilled() to check if there are some free slots in the swap. Note that we consider the swap space completely filled if it is not possible to write a full cluster. This prevents deadlocks if a few slots are never allocated. ok miod@
2023-10-27  Do not decrement the swap counter if the anon is associated to a "bad" slot.  (Martin Pieuchot)
When such anon is freed its content is obviously not living in swap. ok miod@
2023-10-24  Merge two equivalent if blocks.  (Martin Pieuchot)
No functional change, ok tb@
2023-10-16  Consider required constraint when moving pages from active to inactive lists.  (Martin Pieuchot)
Make sure low pages are deactivated first when there is a shortage of inactive pages. Without this the system can have a ton of high pages on the active list and never swapout anything if there's a shortage of low pages. This prevents a deadlock on amd64 reported and tested by bluhm@. ok kettenis@
2023-09-16  Allow counters_read(9) to take an optional scratch buffer.  (Martin Pieuchot)
Using a scratch buffer makes it possible to take a consistent snapshot of per-CPU counters without having to allocate memory. Makes ddb(4) show uvmexp command work in OOM situations. ok kn@, mvs@, cheloha@
2023-09-05  Address the case 2b version of inconsistent view across threads of  (Philip Guenther)
a page undergoing copy-on-write faulting. We fixed the case 1b version in rev 1.125 (2022-02-01), but missed this other path. jsg@ noted that in NetBSD Chuck Silvers had a relevant commit, their rev 1.234 (2023-08-13), which looks like it fixed both cases due to their refactoring of common code into a uvmfault_promote() function. ok mpi@ jca@
2023-09-02  Zap anon pages mappings in uvm_anon_release() instead of in the fault handler.  (Martin Pieuchot)
This makes all code paths deactivating or freeing anons consistent. No objection from the usual suspects.
2023-08-18  Move the loadavg calculation to sched_bsd.c as update_loadav()  (Claudio Jeker)
With this uvm_meter() is no more and update_loadav() uses a simple timeout instead of getting called via schedcpu(). OK deraadt@ mpi@ cheloha@
2023-08-12  Add sanity checks in uvm_pagelookup().  (Martin Pieuchot)
ok kettenis@
2023-08-11  Kill unused variable in uvm_aio_aiodone_pages().  (Martin Pieuchot)
2023-08-03  Remove the per-cpu loadavg calculation.  (Claudio Jeker)
The current scheduler usage is highly questionable and probably not helpful. OK kettenis@ cheloha@ deraadt@
2023-08-03  Mark the exponential constants for load average calculation as const.  (Claudio Jeker)
OK cheloha@
2023-08-02  uvm_loadav: don't recompute schedstate_percpu.spc_nrun  (Scott Soule Cheloha)
We track the nrun value in schedstate_percpu.spc_nrun. There is no reason to walk the allproc list to recompute it. Prompted by claudio@. Thread: https://marc.info/?l=openbsd-tech&m=169059099426049&w=2 ok claudio@
2023-08-02  Remove unused vm_map_upgrade() & vm_map_downgrade().  (Martin Pieuchot)
Upgrade/downgrade operations on a `vmmaplk' are no longer necessary since vm_map_busy() completely unlocks it (r1.318 of uvm/uvm_map.c). ok kettenis@
2023-08-01  The swapper left the building a long time ago. Now with the issue in  (Claudio Jeker)
inteldrm fixed we should be able to remove this unneeded wakeup for good. OK mvs@ cheloha@ deraadt@
2023-06-21  Revert "schedcpu, uvm_meter(9): make uvm_meter() an independent timeout"  (Scott Soule Cheloha)
Sometimes causes boot hang after mounting root partition. Thread 1: https://marc.info/?l=openbsd-misc&m=168736497407357&w=2 Thread 2: https://marc.info/?l=openbsd-misc&m=168737429214370&w=2
2023-06-20  schedcpu, uvm_meter(9): make uvm_meter() an independent timeout  (Scott Soule Cheloha)
uvm_meter(9) should not base its periodic uvm_loadav() call on the UTC clock. It also no longer needs to periodically wake up proc0 because proc0 doesn't do any work. schedcpu() itself may change or go away, but as kettenis@ notes we probably can't completely remove the concept of a "load average" from OpenBSD, given its long Unix heritage. So, (1) remove the uvm_meter() call from schedcpu(), (2) make uvm_meter() an independent timeout started alongside schedcpu() during scheduler_start(), and (3) delete the vestigial periodic proc0 wakeup. With input from deraadt@, kettenis@, and claudio@. deraadt@ cautions that this change may confuse administrators who hold the load average in high regard. Thread: https://marc.info/?l=openbsd-tech&m=168710929409153&w=2 general agreement with this direction from kettenis@ ok claudio@
2023-05-30  spelling  (Jonathan Gray)
ok jmc@ guenther@ tb@
2023-05-20  Do not grab the `vmmaplk' recursively, prevent a self-deadlock.  (Martin Pieuchot)
Change the semantic of vm_map_busy() to be able to completely unlock the `vmmaplk' instead of downgrading it to a read lock in mlock(2). This is necessary because uvm_fault_wire() tries to re-grab the same lock. We now keep track of the thread currently holding the vmmap busy to ensure it can relock & unbusy the vmmap. The new pattern becomes:

    vm_map_lock(map);
    vm_map_busy(map);   /* prevent other threads from grabbing an exclusive lock */
    vm_map_unlock(map);

    /*
     * Do some stuff generally requiring a tsleep(9).
     */

    vm_map_lock(map);
    vm_map_unbusy(map); /* allow other threads to make progress after unlock */
    vm_map_unlock(map);

Fix adapted from NetBSD's r1.249 of uvm/uvm_map.c. Issue reported by Jacqueline Jolicoeur exposed by a "wallet refresh" of the Monero App. Panic hand-copied below:

    sleep_finish()
    rw_enter()
    uvmfault_lookup()
    uvm_fault_check()
    uvm_fault()
    uvm_fault_wire()
    uvm_map_pageable_wire()
    sys_mlock()

This version skips bumping the map's timestamp if the lock is acquired by the thread that marked the VM map busy. This prevents a KASSERT() reported by bluhm@ triggered by regress/misc/posixtestsuite conformance/interfaces/mmap/18-1 ok kettenis@
2023-05-13  Put back in the simplification of the aiodone daemon.  (Martin Pieuchot)
Previous "breakage" of the swap on arm64 has been found to be an issue on one machine, the rockpro/arm64, related to a deadlock built into the sdmmc(4) stack interacting with swapping code, both running under KERNEL_LOCK(). This issue is easily reproducible on -current, and entering swap when building LLVM on a rockpro crashes the machine by memory corruption. Tested by mlarkin@ on octeon & i386, by myself on amd64 & arm64 and by sthen@ on i386 port bulk. ok beck@ some time ago.
Previous commit message:
Simplify the aiodone daemon which is only used for async writes.
- Remove unused support for asynchronous read, including error conditions
- Grab the proper lock for each page that has been written to swap. This allows enabling an assertion in uvm_page_unbusy().
- Move the uvm_anon_release() call outside of uvm_page_unbusy() and assert for the different anon cases.
ok beck@, kettenis@
2023-05-09  Inline once-used variable to sync all uvm_map_clean() callers  (Klemens Nanni)
OK mpi
2023-04-26  Backout previous commit:  (Alexander Bluhm)
Do not grab the `vmmaplk' recursively, prevent a self-deadlock. It causes panic: uvm_map_pageable_wire: stale map Found by regress/misc/posixtestsuite conformance/interfaces/mmap/18-1 requested by deraadt@