net/if_pppx.c pointed out by jsg@
ok gnezdo@ deraadt@ jsg@ mpi@ millert@
|
|
The drm subsystem implements graphics buffers as uvm objects backed by
anonymous memory, thus drm locks and aobj locks share the same "vmobjlock"
type.
uvm_obj_wire() is only called from sys/dev/pci/drm/, so instead of changing
drm's lock init/alloc routines to allow duplicate locks in general,
enter uvm's vmobjlock with RW_DUPOK in this function to allow duplicate
lock types per thread in this specific call path alone.
Fixes the following WITNESS report when booting/starting X (as seen already
in other unrelated bugs@ reports):
wsdisplay0: screen 1-5 added (std, vt100 emulation)
witness: acquiring duplicate lock of same type: "&uobj->vmobjlock"
1st uobjlk
2nd uobjlk
Starting stack trace...
witness_checkorder(fffffd83b625f9b0,9,0) at witness_checkorder+0x8ac
rw_enter(fffffd83b625f9a0,1) at rw_enter+0x68
uvm_obj_wire(fffffd843c39e948,0,40000,ffff800033b70428) at uvm_obj_wire+0x46
shmem_get_pages(ffff800008008500) at shmem_get_pages+0xb8
__i915_gem_object_get_pages(ffff800008008500) at __i915_gem_object_get_pages+0x6d
i915_gem_fault(ffff800008008500,ffff800033b707c0,10009b000,a43d6b1c000,ffff800033b70740,1,35ba896911df1241,ffff8000000aa078,ffff8000000aa178) at i915_gem_fault+0x203
drm_fault(ffff800033b707c0,a43d6b1c000,ffff800033b70740,1,0,0,7eca45006f70ee0,ffff800033b707c0) at drm_fault+0x156
uvm_fault(fffffd843a7cf480,a43d6b1c000,0,2) at uvm_fault+0x179
upageflttrap(ffff800033b70920,a43d6b1c000) at upageflttrap+0x62
usertrap(ffff800033b70920) at usertrap+0x129
recall_trap() at recall_trap+0x8
end of kernel
end trace frame: 0x7f7ffffdc7c0, count: 246
End of stack trace.
Input kettenis
OK mpi
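A minimal sketch of the approach, assuming OpenBSD's rw_enter(9) flags; the
signature and body are abbreviated, not the verbatim diff:

    int
    uvm_obj_wire(struct uvm_object *uobj, voff_t start, voff_t end,
        struct pglist *pageq)
    {
            /*
             * RW_DUPOK tells WITNESS that holding a second lock of the
             * shared "vmobjlock" type in this thread is intentional,
             * silencing the duplicate-lock report for this path alone.
             */
            rw_enter(uobj->vmobjlock, RW_WRITE | RW_DUPOK);
            /* ... wire the pages in [start, end) onto pageq ... */
            rw_exit(uobj->vmobjlock);
            return (0);
    }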
|
|
|
|
ok mpi@
|
|
The (known) lock order reversals which now occur more reliably and much
earlier on WITNESS boots with this diff knock out syzkaller reports, since
syzkaller stops at the first "crash report":
https://syzkaller.appspot.com/bug?id=81b39e970cd2eb21b97d1b31746c693e300fd2dd
|
|
This is an updated version of uvm_map.c r1.283 "Unwire with map lock held".
The previous version introduced a use-after-free by not unlocking vm_map
locks in uvm_map_teardown(), resulting in dangling references on the
reaper's lock list (thanks visa!).
Lock and unlock the map around uvm_map_teardown() instead.
This code path holds the last reference, hence the lock isn't strictly
needed except for satisfying upcoming locking assertions.
Tested on amd64, arm64, i386, macppc, octeon, sparc64.
This time also with WITNESS enabled (except on sparc64 which builds but does
not boot with WITNESS; this is a known issue).
OK mpi visa
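A hedged sketch of the corrected code path (argument list abbreviated, not
the verbatim diff):

    vm_map_lock(map);
    uvm_unmap_remove(map, map->min_offset, map->max_offset,
        &dead_entries, TRUE, FALSE);
    /*
     * Unlock again before the map is freed so no dangling reference
     * is left on the reaper's lock list.
     */
    vm_map_unlock(map);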
|
|
WITNESS builds broke^W^Wkernels panic on boot as reported by anton and bluhm.
Booting bsd.mp in single-user mode inside VMM shows:
root on sd0a (5f9e458ed30b39ab.a) swap on sd0b dump on sd0b
Enter pathname of shell or RETURN for sh:
witness: lock order reversal:
1st 0xfffffd801f8ce468 vmmaplk (&map->lock)
2nd 0xfffffd801b8162c0 inode (&ip->i_lock)
lock order "&ip->i_lock"(rrwlock) -> "&map->lock"(rwlock) first seen at:
#0 rw_enter_read+0x38
#1 uvmfault_lookup+0x8a
#2 uvm_fault_check+0x32
#3 uvm_fault+0xfb
#4 kpageflttrap+0x12c
#5 kerntrap+0x91
#6 alltraps_kern_meltdown+0x7b
#7 copyout+0x53
#8 ffs_read+0x1f6
#9 VOP_READ+0x41
#10 vn_rdwr+0xa1
#11 vmcmd_map_readvn+0xa0
#12 exec_process_vmcmds+0x88
#13 sys_execve+0x732
#14 start_init+0x26f
#15 proc_trampoline+0x1c
lock order data w1 -> w2 missing
# exit
kernel: protection fault trap, code=0
Stopped at witness_checkorder+0x312: movl 0x10(%r14),%ecx
gkoehler reported faults on poisoned addresses on a macppc dual G5.
|
|
WITNESS builds broke as reported by anton and bluhm:
root on sd0a (5ec49b3ad23eb2d4.a) swap on sd0b dump on sd0b
kernel: protection fault trap, code=0
Stopped at witness_checkorder+0x4ec: movl 0x10(%r12),%ecx
https://syzkaller.appspot.com/bug?id=be02b290a93c648986c35370a271aad4135a5044
https://syzkaller.appspot.com/text?tag=CrashLog&x=136e9aa4700000
|
|
Introduce vm_map_assert_{wrlock,rdlock,anylock,unlocked}() in rwlock(9)
fashion and back up function comments about locking assumptions with proper
assertions.
Also add new comments/assertions based on code analysis and sync with
NetBSD as much as possible.
vm_map_lock() and vm_map_lock_read() are used for exclusive and shared
access respectively; currently no code path is purely protected by
vm_map_lock_read() alone, i.e. functions called with a read lock held by the
caller are also called with a write lock elsewhere.
Thus only vm_map_assert_{wrlock,anylock}() are used as of now.
This should help with unlocking UVM-related syscalls.
Tested as part of a larger diff through
- amd64 package bulk build by naddy
- amd64, arm64, powerpc64 base builds and regress by bluhm
- amd64 and sparc64 base builds and regress by me
Input mpi
Feedback OK kettenis
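A minimal sketch of two of the assertions in rwlock(9) fashion; the real
functions may additionally special-case interrupt-safe maps:

    void
    vm_map_assert_wrlock(struct vm_map *map)
    {
            rw_assert_wrlock(&map->lock);
    }

    void
    vm_map_assert_anylock(struct vm_map *map)
    {
            rw_assert_anylock(&map->lock);
    }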
|
|
uvm_unmap_remove() effectively requires its caller to lock the vm map.
Even though uvm_map_teardown() is only called after a map's last reference
is dropped and is thus safe from other threads accessing the map, grab the
map's lock in uvm_map_teardown() to satisfy upcoming lock assertions in
uvm_unmap_remove().
Tested as part of a larger diff through
- amd64 package bulk builds by naddy
- amd64, arm64, powerpc64 base builds and regress by bluhm
- amd64 and sparc64 base builds and regress by me
Feedback mpi
OK kettenis
|
|
subset of the request permissions, so when forcing an initial RO
fault for CoW also clamp the access_type.
problem reported by bluhm@
based on a suggestion from miod@
ok kettenis@
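A hedged sketch of the clamping; `flt' stands for the fault context in
uvm_fault() and the condition name is hypothetical:

    if (force_ro_cow) {                 /* hypothetical condition */
            flt->enter_prot &= ~PROT_WRITE;
            flt->access_type &= ~PROT_WRITE;    /* the new clamp */
    }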
|
|
No object change.
OK millert
|
|
can't be written to while any thread can see the original version
of the page via a not-yet-flushed stale TLB entry: pmaps can indicate
they do this correctly by defining __HAVE_PMAP_MPSAFE_ENTER_COW;
uvm will force the initial CoW fault to be read-only otherwise.
Set that on amd64 and fix the problem case in pmap_enter() by putting
a read-only mapping in place, shooting the TLB entry, then fixing
it to the final read-write entry so this thread can continue without
re-faulting.
reported by jsing@ from https://github.com/golang/go/issues/34988
assisted by discussion in https://reviews.freebsd.org/D14347
tweaks from jsing@ and kettenis@
ok jsing@ mpi@ kettenis@
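A hedged sketch of the amd64 pmap_enter() sequence; macro and helper names
follow the amd64 pmap but this is not the verbatim diff:

    /* 1. install the mapping read-only first */
    *pte = npte & ~PG_RW;
    /* 2. shoot the stale TLB entry on all CPUs */
    pmap_tlb_shootpage(pmap, va, shootself);
    pmap_tlb_shootwait();
    /* 3. fix it to the final read-write entry; this thread then
     *    continues without re-faulting */
    *pte = npte;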
|
|
ok visa@
|
|
kern.wxabort=1 logs and kills programs after W^X violations.
At least sigexit() -> coredump() as well as the non-atomic increment of
ps_wxcounter require protection, so grab the big lock for the entire block.
This is part of the effort to unlock mmap(2)'s MAP_ANON case.
Feedback mvs claudio kettenis deraadt
OK kettenis
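A minimal sketch of the locked block, assuming the uvm_wxabort knob; the
log text is illustrative:

    KERNEL_LOCK();
    pr->ps_wxcounter++;             /* non-atomic, needs serialization */
    if (uvm_wxabort) {
            /* message shortened for illustration */
            log(LOG_NOTICE, "%s(%d): W^X violation\n",
                pr->ps_comm, pr->ps_pid);
            sigexit(p, SIGABRT);    /* does not return */
    }
    KERNEL_UNLOCK();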
|
|
The swap code path in uvm_aio_aiodone() is not holding the corresponding
page lock and shouldn't as long as anons are locked inside uvm_page_unbusy()
to handle the PG_RELEASED case.
Reported by Ralf Horstmann on bugs@
|
|
There is no functional change as the former is just a wrapper around the
latter. However, upper layers of UVM do not need to mess with the internals
of the page allocator.
This will also help when a page cache is introduced to reduce contention
on the global mutex serializing access to pmemrange's data.
ok kettenis@, kn@, tb@
|
|
boundaries: hppa has 8-byte PLT entries that sometimes do that.
ok kettenis@
|
|
Reduce differences with NetBSD, no functional changes.
|
|
Tested by many during the past months, thanks!
ok sthen@
|
|
Switch libc and ld.so to the generic stubs for these calls.
WARNING: reboot to updated kernel before installing libc or ld.so!
Time for a story...
When gcc (back in 1.x days) first implemented long long, it didn't (always)
pass 64bit arguments in 'aligned' registers/stack slots, with the result that
argument offsets didn't match structure offsets. This affected the nine system
calls that pass off_t arguments:
ftruncate lseek mmap mquery pread preadv pwrite pwritev truncate
To avoid having to do custom ASM wrappers for those, BSD put an explicit pad
argument in so that the off_t argument would always start on an even slot and
thus be naturally aligned. Thus those odd wrappers in lib/libc/sys/ that use
__syscall() and pass an extra '0' argument.
The ABIs for different CPUs eventually settled how things should be passed on
each and gcc 2.x followed them. The only arch now where it helps is landisk,
which needs to skip the last argument register if it would be the first half of
a 64bit argument. So: add new syscalls without the pad argument and on landisk
do that skipping directly in the syscall handler in the kernel. Keep compat
support for the existing syscalls long enough for the transition.
ok deraadt@
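For illustration, the historical padded wrapper pattern being retired looked
roughly like this (exact file contents may differ):

    int
    ftruncate(int fd, off_t length)
    {
            /* the explicit 0 keeps the 64bit off_t in an even slot */
            return (__syscall((quad_t)SYS_ftruncate, fd, 0, length));
    }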
|
|
Pass the correct entry to uvm_fault_unwire_locked().
Reported-by: syzbot+bb2f63f076618e9ed0d3@syzkaller.appspotmail.com
ok kettenis@, deraadt@
|
|
Fix a NULL dereference introduced in previous, reported by anton@ and
Benjamin Baier.
Reported-by: syzbot+c172bd335801b67e515b@syzkaller.appspotmail.com
|
|
Like the per-amap lock, the `vmobjlock' is principally used to serialize
access to objects in the fault handler, allowing faults occurring on
different CPUs and different objects to be processed in parallel.
The fault handler now acquires the `vmobjlock' of a given UVM object as
soon as it finds one. For now a write-lock is always acquired even if
some operations could use a read-lock.
Every pager, corresponding to a different kind of UVM object, now expects
the UVM object to be locked, and some operations, like *_get(), return it
unlocked. This is enforced by assertions checking for rw_write_held().
The KERNEL_LOCK() is now pushed to the VFS boundary in the vnode pager.
To ensure the correct amap or object lock is held when modifying a page
many uvm_page* operations are now asserting for the "owner" lock.
However, fields of the "struct vm_page" are still being protected by the
global `pageqlock'. To prevent lock ordering issues with the new
`vmobjlock' and to reduce differences with NetBSD this lock is now taken
and released for each page instead of around the whole loop.
This commit does not remove the KERNEL_LOCK/UNLOCK() dance. Unlocking
will follow if there is no fallout.
Ported from NetBSD, tested by many, thanks!
ok kettenis@, kn@
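A hedged sketch of both sides of the convention:

    /* fault handler: lock the object as soon as it is found */
    if (uobj != NULL)
            rw_enter(uobj->vmobjlock, RW_WRITE);

    /* pager entry points assert it, e.g. at the top of a *_get() */
    KASSERT(rw_write_held(uobj->vmobjlock));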
|
|
Pass the device vnode as a parameter to VOP_STRATEGY() to allow calling
the correct vop_strategy callback. Now the vnode is also available
in the callback.
OK mpi@
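A hedged sketch of the call-site change:

    /* before: the callback had to derive the vnode on its own */
    VOP_STRATEGY(bp);
    /* after: the device vnode is passed explicitly and reaches the
     * vop_strategy callback */
    VOP_STRATEGY(vp, bp);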
|
|
first __tfork(2)"
The immediate issue is that a process linked with -znow will still
perform lazy relocation on objects loaded with dlopen(), but there
are possibly other dark corners to plumb to find a better invariant.
Problem reported by thfr@
|
|
prints the end which is in the next page. Subtract 1 to avoid confusion.
|
|
Thread: https://marc.info/?l=openbsd-tech&m=163884527530326&w=2
ok deraadt@
|
|
To unlock kbind(2) we need to protect ps_kbind_addr and
ps_kbind_cookie.
The simplest way to do this is to disallow kbind(2) initialization
after the first __tfork(2) call. If the first thread does not
initialize the kbind(2) variables before __tfork(2) then we disable
kbind(2) during that first __tfork(2) call.
This is guenther@'s patch, I'm just committing it.
Discussed with guenther@, deraadt@, kettenis@, and mpi@.
ok kettenis@, positive response from mpi@, "I am busy" guenther@
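A hedged sketch of the __tfork(2) side; the sentinel name is hypothetical:

    /*
     * If the first thread never initialized kbind(2), disable it for
     * the process before going multi-threaded.
     */
    if (pr->ps_kbind_addr == 0)
            pr->ps_kbind_addr = BOGO_PC;    /* hypothetical sentinel */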
|
|
ok millert mpi
|
|
Reduce differences with NetBSD, tested by many as part of a larger diff.
ok kettenis@
|
|
No functional change.
Reduce differences with NetBSD, tested by many as part of a larger diff.
|
|
For now, only assert that the tree of pages is empty in uvm_obj_destroy().
This will soon be used to free the per-UVM object lock.
While here, call uvm_obj_init() when new vnodes are allocated instead of
in uvn_attach(). Because vnodes and their associated UVM objects are
currently never freed, it isn't easy to know where/when to garbage
collect the associated lock. So simply check that the reference count of
a given object is 0 in uvn_attach().
Tested by many as part of a bigger diff.
ok kettenis@
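A minimal sketch of the assertion, with tree/field names as in uvm_object.h:

    void
    uvm_obj_destroy(struct uvm_object *uobj)
    {
            KASSERT(RBT_EMPTY(uvm_objtree, &uobj->memt));
    }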
|
|
(both kernel and userland bits)
GENERIC + VFSLCKDEBUG is broken with it.
|
|
This flag is currently used to mark or unmark a vnode to actively
check vnode locking semantics (when compiled with VFSLCKDEBUG).
Currently, the VLOCKSWORK flag isn't properly set for several FS
implementations which have full locking support. This commit enables
proper checking for them too (cd9660, udf, fuse, msdosfs, tmpfs).
Instead of using a particular flag, directly check whether
v_op->vop_islocked is nullop to decide whether to activate the vnode
locking checks.
ok mpi@
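A hedged sketch of the check; the macro name is illustrative:

    #ifdef VFSLCKDEBUG
    /* a vnode participates in locking checks iff its filesystem
     * implements vop_islocked */
    #define VISLOCKABLE(vp) ((vp)->v_op->vop_islocked != nullop)
    #endif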
|
|
ok mpi@
|
|
used in the near future (by mpi@) to improve the locking for uvm objects.
Introducing this function now will allow me to call it in the
appropriate place in the drm code.
ok mpi@, jsg@
|
|
Do not allow a faulting thread to sleep on a contended vnode lock, to
prevent lock ordering issues with the upcoming per-uobj lock.
Also reduce the sleep value for VM_PAGER_AGAIN from 1sec to 5nsec to not add
visible slowdown when starting a multi-threaded application with threads that
fault on the same vnode (chromium, firefox, etc).
Tested by anton@, tb@, robert@ and gnezdo@
ok anton@, tb@
Reported-by: syzbot+e63407b35dff08dbee02@syzkaller.appspotmail.com
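A hedged sketch of the non-sleeping acquisition in the fault path:

    /* fail instead of sleeping on a contended vnode lock */
    if (vn_lock(vp, LK_EXCLUSIVE | LK_NOWAIT) != 0)
            return (VM_PAGER_AGAIN);        /* fault will be retried */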
|
|
This fix (ab)uses the vnode lock to serialize access to some fields of
the corresponding pages associated with the UVM vnode object, and this
will create new deadlocks with the introduction of a per-uobj lock.
ok anton@
|
|
This is possible now that amaps & anons are protected by a per-map rwlock.
Tested by many as part of a bigger diff.
ok kettenis@
|
|
Some pmaps (x86, hppa) and the buffer cache rely on UVM objects to allocate
and manipulate pages. These objects should not be manipulated by uvm_fault()
and do not currently require the same locking enforcement.
Use the dummy pagers to explicitly document which UVM functions are meant to
manipulate UVM objects (uobj) that do not need the upcoming `vmobjlock' and
instead still rely on the KERNEL_LOCK().
Tested by many as part of a larger diff.
ok kettenis@, beck@
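A minimal sketch of such a dummy pager; the variable name follows the
commit's description:

    /*
     * No pgo_fault/pgo_get: these uobjs are never handled by
     * uvm_fault() and still rely on the KERNEL_LOCK().
     */
    const struct uvm_pagerops pmap_pager = {
            /* all methods left NULL on purpose */
    };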
|
|
In amap_copy() make the new amap share the source amap's lock right at
the beginning and only allocate a new one if no anon has been referenced.
Issue reported by Thomas L. <tom.longshine at web dot de> on bugs@.
ok tb@
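A hedged sketch; the lock-reference helper is hypothetical:

    /* share the source amap's lock before any anon is looked at */
    amap->am_lock = srcamap->am_lock;
    amap_lock_ref(amap->am_lock);       /* hypothetical helper */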
|
|
ok deraadt@ kettenis@
|
|
- Use atomic operations for increment/decrement
- Rewrite the loop from uao_swap_off() to only keep a reference to the
next item in the list.
ok jmatthew@
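A minimal sketch using atomic(9) operations; uao_free() and the field
names follow uvm_aobj.c:

    atomic_inc_int(&uobj->uo_refs);                 /* reference */
    if (atomic_dec_int_nv(&uobj->uo_refs) == 0)     /* last detach */
            uao_free(aobj);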
|
|
- add wscons devices
- build radeondrm and add MD uvm bits to support it.
|
|
This change introduced or exposed a leak of anons which results in system
freezes.
anton@ observed a high number of INUSE for anonpl and semarie@ saw multiple
processes waiting in the fault handler on "flt_noramX", probably the one
related to allocating an anon.
|
|
For example, uvm_objinit() becomes uvm_obj_init(). Reduce differences
between the trees and ease porting new functions needed for UVM object
locking.
No functional change.
|
|
Reduce the difference with NetBSD.
ok kettenis@
|
|
This is possible now that amaps & anons are protected by a per-map rwlock.
ok kettenis@, jmatthew@
|
|
ok kettenis@
|