|
Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1.
This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.
Discussed with and OK dlg@, OK mpi@
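(Illustrative sketch, not from the commit: with the wrapper gone, a mutex
implementation calls the checker itself. The macro names follow witness(4);
the exact argument lists and helper names here are assumptions.)
	void
	mtx_enter(struct mutex *mtx)
	{
		/* consult the lock-order checker before acquiring */
		WITNESS_CHECKORDER(MUTEX_LOCK_OBJECT(mtx), LOP_EXCLUSIVE, NULL);
		__mtx_enter(mtx);	/* the actual acquisition */
		/* record the acquisition with the checker */
		WITNESS_LOCK(MUTEX_LOCK_OBJECT(mtx), LOP_EXCLUSIVE);
	}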
|
|
obvious misconfigurations that cannot work.
OK mpi@ tedu@
|
|
disabled in the process. Rather than tying it to KERNBASE, make it simply
-1, which makes it even more invalid.
ok tedu
|
|
MAP_CONCEAL'd memory is not written to disk in the event of a core dump.
It may grow other qualities in the future.
Wanted by libressl, probably useful elsewhere, too.
Prompted by deraadt@, concept from deraadt@/kettenis@. With input from
deraadt@, cjeker@, kettenis@, otto@, bcook@, matthew@, guenther@, djm@,
and tedu@.
ok otto@ deraadt@
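(A minimal userland sketch, assuming the flag is simply OR'd into the
mmap(2) flags; error handling trimmed to the essentials.)
	#include <sys/mman.h>
	#include <err.h>

	/* anonymous memory that is skipped when a core dump is written */
	unsigned char *key = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
	    MAP_PRIVATE | MAP_ANON | MAP_CONCEAL, -1, 0);
	if (key == MAP_FAILED)
		err(1, "mmap");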
|
|
objects that readers can access without locking. This provides a basis
for read-copy-update operations.
Readers access SMR-protected shared objects inside an SMR read-side
critical section, where sleeping is not allowed. To reclaim
an SMR-protected object, the writer has to ensure mutual exclusion of
other writers, remove the object's shared reference and wait until
read-side references cannot exist any longer. As an alternative to
waiting, the writer can schedule a callback that gets invoked when
reclamation is safe.
The mechanism relies on CPU quiescent states to determine when an
SMR-protected object is ready for reclamation.
The <sys/smr.h> header additionally provides an implementation of
singly- and doubly-linked lists that can be used together with SMR.
These lists allow lockless read access with a concurrent writer.
Discussed with many
OK mpi@ sashan@
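(A reader/writer sketch of the rules above; the struct, list and lock
names are hypothetical, while smr_read_enter(), smr_read_leave(),
smr_call() and the SMR_LIST_* macros are the interface described here.)
	/* Reader: no sleeping between smr_read_enter() and smr_read_leave(). */
	smr_read_enter();
	SMR_LIST_FOREACH(f, &foo_list, f_entry) {
		if (f->f_id == id)
			break;
	}
	smr_read_leave();

	/* Writer: exclude other writers, remove the shared reference, then
	 * defer the free until no read-side reference can remain. */
	rw_enter_write(&foo_lock);
	SMR_LIST_REMOVE_LOCKED(f, f_entry);
	rw_exit_write(&foo_lock);
	smr_call(&f->f_smr, foo_free, f);	/* or: smr_barrier(); free it */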
|
|
was never updated.
from Amit Kulkarni
|
|
sp must be on a MAP_STACK page. Relax the check a bit -- the sp may be
on a PROT_NONE page. Can't see how an attacker can leverage that situation.
(New perl build process contains a "how many call frames can my stack
hold" checker, and this triggers via the MAP_STACK fault rather than
the normal access check. The MAP_STACK check still has a kernel printf
as we hunt for applications which map stacks poorly. Interestingly the
perl code has a knob to disable similar printing alerts on Windows, which
apparently has a feature somewhat like MAP_STACK!)
ok tedu guenther kettenis
|
|
instead
From Pamela Mosiejczuk, many thanks!
OK phessler@ deraadt@
|
|
introduced with __MAP_NOFAULT. The regression let uvm_fault() run
without proper locking and rechecking of state after map version change
if page zero-fill was chosen.
OK kettenis@ deraadt@
Reported-by: syzbot+9972088c1026668c6c5c@syzkaller.appspotmail.com
|
|
about shared resources which no program should see. only a few pieces of
software use it, generally poorly thought out. they are being fixed, so
mincore() can be deleted.
ok guenther tedu jca sthen, others
|
|
another process is doing. We don't want that, so instead have it
always return that memory is in core.
ok deraadt kettenis
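(Sketch of the observable effect, assuming addr points at one mapped page.)
	#include <sys/mman.h>
	#include <unistd.h>

	char vec[1];
	size_t pagesize = sysconf(_SC_PAGESIZE);
	if (mincore(addr, pagesize, vec) == 0 && (vec[0] & 1))
		;	/* now always reported "in core" */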
|
|
physio(9) to prevent another thread from unmapping the memory and triggering
an assertion or even corrupting random physical memory pages.
ok deraadt@
Should fix:
Reported-by: syzbot+b8e7faf688f8c9d341b1@syzkaller.appspotmail.com
Reported-by: syzbot+b6a9255faa0605669432@syzkaller.appspotmail.com
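(A sketch of the wire/unwire bracket around the raw transfer, assuming an
uvm_vslock(9)-style interface; names and arguments are approximate.)
	/* Wire the user pages so a concurrent munmap() in another thread
	 * cannot pull them out from under the transfer. */
	error = uvm_vslock(p, iov->iov_base, todo, prot);
	if (error)
		goto done;
	/* ... start and await the raw I/O ... */
	uvm_vsunlock(p, iov->iov_base, todo);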
|
|
|
|
inteldrm driver to add support for the I915_MMAP_WC flag.
ok deraadt@, jsg@
|
|
ok jsg@ (who pointed out the kern_pledge.c change was necessary as well)
|
|
fd_getfile(9) is mpsafe. Note that sys_mmap(2) isn't actually unlocked
currently. However this diff has been tested with it unlocked, and I
hope to unlock it for real soon-ish.
ok visa@, mpi@
|
|
the start of the range of pages that we're changing. Prevents a panic from
a somewhat convoluted test case that anton@ came up with.
ok guenther@, anton@
|
|
kernel calls to ensure that the UVM cache for memory mapped files is
up to date.
ok mpi@
|
|
unusedNN.
Missing man page bits pointed out by
jmc@. Ports source scan by sthen@.
ok deraadt@ guenther@
|
|
|
|
"Buffer cache pages are wired but not counted as such. Therefore we
have to set the wire count on the pages to 0 before we call
uvm_pagefree() on them, just like we do in buf_free_pages().
Otherwise the wired pages counter goes negative. While there, also
sprinkle some KASSERTs in there that buf_free_pages() has as well."
ok beck@ (again)
|
|
unnecessary because curproc always does the locking.
OK mpi@
|
|
curproc that does the locking or unlocking, so the proc parameter
is pointless and can be dropped.
OK mpi@, deraadt@
|
|
ok visa@
|
|
stack buffer. With a page-aligned buffer, creating a MAP_STACK sub-region
would undo the PROT_NONE guard. Ignore that last page.
(We could check if the last page is non-RW before choosing to skip it. But
we've already elected to grow STK sizes to compensate. Always ignoring the
last page makes it a non-MAP_STACK guard page which can be opportunistically
discovered)
ok semarie stefan kettenis
|
|
the brk area anyway.
- Use a larger hint bound to spread the allocations more for the 32-bit case
- Simplified the overly abstracted brk/stack allocator and switched off
guard pages for the brk case. This allows i386 some extra space,
depending on memory usage patterns.
- Reduce brk area on i386 to give the rnd space more room
ok stefan@ sthen@
|
|
Other parts of uvm/pmap check for proper prot flags
already. This fixes the qemu startup problems that
semarie@ reported on tech@.
|
|
syscall) confirm the stack register points at MAP_STACK memory, otherwise
SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
to create a MAP_STACK sub-region which satisfies alignment requirements.
Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
contents of the region -- there is no mprotect() equivalent operation, so
there is no MAP_STACK-adding gadget.
This opportunistic software-emulation of a stack protection bit makes
stack-pivot operations during ROPchain fragile (kind of like removing a
tool from the toolbox).
original discussion with tedu, uvm work by stefan, testing by mortimer
ok kettenis
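(Userland consequence, as a sketch: any memory a program will point the
stack register at must now come from a MAP_STACK mapping.)
	/* e.g. a hand-rolled coroutine or alternate signal stack */
	void *stk = mmap(NULL, stksz, PROT_READ | PROT_WRITE,
	    MAP_PRIVATE | MAP_ANON | MAP_STACK, -1, 0);
	/* pivoting sp into non-MAP_STACK memory now draws SIGSEGV at the
	 * next syscall or page fault */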
|
|
direction, otherwise we might break the loop prematurely; ok stefan@
|
|
issues with upcoming NFSnode's locks.
ok visa@
|
|
protection cannot block the final SIGABRT.
While here apply the same logic to ddb(4)'s kill command.
From semarie@, ok deraadt@
|
|
revoked while syncing disk, so the processes lose their executable
pages. Instead of killing them with a SIGBUS after a page fault,
just sleep. This should prevent init from dying without pages,
followed by a kernel panic.
initial diff from tedu@; OK deraadt@ tedu@
|
|
The account flag `ASU' will no longer be set, but that makes suser()
mpsafe since it no longer messes with a per-process field.
No objection from millert@, ok tedu@, bluhm@
|
|
|
|
no other process which could free it. Better panic in malloc(9)
or pool_get(9) instead of sleeping forever.
tested by visa@ patrick@ Jan Klemkow
suggested by kettenis@; OK deraadt@
|
|
so diffs in snapshots can exercise the change in a less disruptive way
idea with sthen, ok kettenis tom others
|
|
ok millert@ sthen@
|
|
ok deraadt@ krw@
|
|
that is attempted.
Minor cleanups:
- Eliminate some always false and always true tests against MAP_ANON
- We treat anon mappings with neither MAP_{SHARED,PRIVATE} as MAP_PRIVATE
so explicitly indicate that
ok kettenis@ beck@
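(Concretely, per the note above, the following two calls are equivalent.)
	p1 = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_ANON, -1, 0);
	p2 = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_PRIVATE, -1, 0);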
|
|
Tested by Hrvoje Popovski.
|
|
when WITNESS is enabled
ok visa@ kettenis@
|
|
according to POSIX. Bring regression test and kernel in line for
amd64 and i386. Other architectures have to follow.
OK deraadt@ kettenis@
|
|
with the RS780E chipset.
OK kettenis@, jsg@
|
|
A deadlock can occur when uvm_km_thread(), running without the KERNEL_LOCK(),
is interrupted by a non-MPSAFE handler while holding the pool's mutex. At
that moment, if another CPU is holding the KERNEL_LOCK() and wants to grab the
pool mutex, like in sys_kbind(), kaboom!
This is a temporary solution; a more general approach regarding mutexes and
un-KERNEL_LOCK()ed threads is being discussed.
Deadlock reported by sthen@, ok kettenis@
|
|
Recursions are still marked as XXXSMP.
ok deraadt@, bluhm@
|
|
found by jmc@
|
|
the particular use before init was in uvm_init step 6, which calls
kmeminit to set up malloc(9), which calls uvm_km_zalloc, which calls
pmap_enter, which calls pool_get, which tries to allocate a page
using km_alloc, which isn't initialised until step 9 in uvm_init.
uvm_km_page_init calls kthread_create though, which uses malloc
internally, so it can't be reordered before malloc init.
to cope with this, uvm_km_page_init is split up. it sets up the
subsystem, and is called before kmeminit. the thread init is moved
to uvm_km_page_lateinit, which is called after kmeminit in uvm_init.
|
|
PZERO used to be a special value in the first BSD releases but since
the introduction of tsleep(9) there's no way to tell if a thread is
going to sleep for a "short" period of time.
This removes the only (ab)use of ``p_priority'' outside the scheduler
logic, which will help move away from a priority-based scheduler.
ok visa@
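(For reference, the priority in question is tsleep(9)'s second argument;
the identifiers in this sketch are hypothetical.)
	/* PZERO here no longer lets the scheduler infer a "short" sleep;
	 * it is just a priority, optionally OR'd with PCATCH. */
	error = tsleep(&sc->sc_wait, PZERO | PCATCH, "example", 0);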
|
|
ok kettenis@
|
|
between mount locks and inode locks, which may have been recorded in either order
ok visa@
|