Age | Commit message | Author |
|
previously the code was using a percpu flag to manage the sleeps/wakeups,
which means multiple threads waiting for a barrier on a cpu could
race. moving to a cond struct on the stack fixes this.
while here, get rid of the sbar taskq and just use systqmp instead.
the barrier tasks are short, so there's no real downside.
ok mpi@
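a hedged sketch of the pattern, using the cond_init()/cond_wait()/
cond_signal() API from cond(9); the barrier function and wait channel
are illustrative, not the exact kernel code:

    /*
     * each waiter keeps its own cond on the stack, so two threads
     * issuing a barrier against the same cpu no longer race on a
     * shared percpu flag.
     */
    static void
    barrier_task(void *arg)
    {
            struct cond *c = arg;

            cond_signal(c);         /* all earlier work has run */
    }

    void
    example_barrier(void)           /* illustrative caller */
    {
            struct cond c;
            struct task t;

            cond_init(&c);
            task_set(&t, barrier_task, &c);
            task_add(systqmp, &t);  /* barrier tasks are short */
            cond_wait(&c, "sbar");  /* sleep until the task runs */
    }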
|
|
time; the aggressive mountpoint destruction seems to hit insane
use-after-frees when we are already far on the way down.
|
|
Change mountpoint to RDONLY a little later. Seems to improve the
rw->ro transition a bit.
|
|
|
|
|
|
this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.
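As a sketch of the interface: one side waits on a struct cond, the
other signals it when the work is done. The hand-off function below is
made up for illustration.

    void
    waiter(void)
    {
            struct cond c;

            cond_init(&c);
            hand_off(&c);             /* hypothetical: pass &c along */
            cond_wait(&c, "example"); /* sleep until cond_signal(&c) */
    }

    void
    signaller(struct cond *c)
    {
            /* the work the waiter depends on happens here */
            cond_signal(c);
    }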
|
|
|
|
pledge for a new execve image immediately upon start. Also introduces
"error" which makes violations return -1 ENOSYS instead of killing the
program ("error" may not be handed to a setuid/setgid program, which
may be missing/ignoring syscall return values and would continue with
inconsistent state)
Discussion with many
florian has used this to improve the strictness of a daemon
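A small userland illustration of the "error" behaviour: with "error"
in the promise string, a violating syscall fails with ENOSYS instead
of killing the process. A minimal sketch, runnable on OpenBSD:

    #include <sys/socket.h>
    #include <err.h>
    #include <errno.h>
    #include <stdio.h>
    #include <unistd.h>

    int
    main(void)
    {
            /* NULL execpromises: execve(2) images are not restricted */
            if (pledge("stdio error", NULL) == -1)
                    err(1, "pledge");

            /* "inet" was not pledged; with "error" this returns -1 */
            if (socket(AF_INET, SOCK_STREAM, 0) == -1 &&
                errno == ENOSYS)
                    printf("violation reported as ENOSYS\n");

            return 0;
    }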
|
|
OK krw@
|
|
for blocks re-fetchable from the filesystem. However at reboot time,
filesystems are unmounted, and since processes lack backing store they
are killed. Since the scheduler is still running, in some cases init is
killed... which drops us to ddb [noted by bluhm]. Solution is to convert
filesystems to read-only [proposed by kettenis]. The tale follows:
sys_reboot() should pass proc * to MD boot() to vfs_shutdown() which
completes current IO with vfs_busy VB_WRITE|VB_WAIT, then calls VFS_MOUNT()
with MNT_UPDATE | MNT_RDONLY, soon teaching us that *fs_mount() calls a
copyin() late... so store the sizes in vfsconflist[] and move the copyin()
to sys_mount()... and notice nfs_mount copyin() is size-variant, so kill
legacy struct nfs_args3. Next we learn ffs_mount()'s MNT_UPDATE code is
sharp and rusty, especially wrt softdep, so fix some bugs and add
~MNT_SOFTDEP to the downgrade. Some vnodes need a little more help,
so tie them to &dead_vnops.
ffs_mount calling DIOCCACHESYNC is causing a bit of grief still but
this issue is separate and will be dealt with in time.
couple hundred reboots by bluhm and myself, advice from guenther and
others at the hut
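A rough sketch of the downgrade loop described above; the list names
follow sys/mount.h, but the loop body is simplified and the exact
VFS_MOUNT() arguments are an assumption:

    struct mount *mp;

    /* walk mounts leaf-to-root, quiescing writers first */
    TAILQ_FOREACH_REVERSE(mp, &mountlist, mntlist, mnt_list) {
            if (vfs_busy(mp, VB_WRITE|VB_WAIT))
                    continue;
            /* softdep and a read-only downgrade don't mix */
            mp->mnt_flag &= ~MNT_SOFTDEP;
            mp->mnt_flag |= MNT_UPDATE | MNT_RDONLY;
            (void)VFS_MOUNT(mp, mp->mnt_stat.f_mntonname,
                NULL, NULL, p);
            vfs_unbusy(mp);
    }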
|
|
SB_KNOTE remains the only bit set on `sb_flagsintr' as it is set/unset in
contexts related to kqueue(2) where we'd like to avoid grabbing solock().
While here add some KERNEL_LOCK()/UNLOCK() dances around selwakeup() and
csignal() to mark which remaining functions need to be addressed in the
socket layer.
ok visa@, bluhm@
|
|
|
|
extend ddb(4) "ps /o" output to print which CPU is currently holding the
KERNEL_LOCK().
Tested by dhill@, ok visa@
|
|
ok visa@
|
|
with the kernel lock.
Fixes a deadlock seen by Hrvoje Popovski and dhill@.
OK mpi@, dhill@
|
|
WITNESS checking as (our) witness code isn't smart enough to let that by.
ok visa@
|
|
|
|
sufficiently and at least one horrific security hole was the result.
ok deraadt@ beck@
|
|
|
|
if you're trying to free something that a timeout is using, you
have to wait for that timeout to finish running before doing the
free. timeout_del can stop a timeout from running in the future,
but it doesn't know if a timeout has finished being scheduled and
is now running.
previously you could know that timeouts are not running by simply
masking softclock interrupts on the cpu running the kernel. however,
code is now running outside the kernel lock, and timeouts can run
in a thread instead of softclock.
timeout_barrier solves the first problem by taking the kernel lock
and then masking softclock interrupts. that is enough to ensure
that any further timeout processing is waiting for those resources
to run again.
the second problem is solved by having timeout_barrier insert work
into the thread. when that work runs, that means all previous work
running in that thread has completed.
fixes and ok visa@, who thinks this will be useful for his work
too.
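the resulting teardown idiom, sketched with a hypothetical softc and
timeout member:

    timeout_del(&sc->sc_tmo);       /* no new scheduling */
    timeout_barrier(&sc->sc_tmo);   /* wait out a current run */
    free(sc, M_DEVBUF, sizeof(*sc)); /* now safe to free */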
|
|
mp-safe.
ok bluhm@, visa@
|
|
KERNEL_LOCK(), so change asserts accordingly.
This is now possible since sblock()/sbunlock() are always called with
the socket lock held.
ok bluhm@, visa@
|
|
OK deraadt@
|
|
In particular, this allows SIOCGIF* requests to run in parallel.
lots of help & ok mpi, ok visa, sashan
|
|
so that statically initialized locks get properly enrolled
to the validator.
OK mpi@
|
|
the code has rotted, and obviously hasn't been used for ages. it is
also hard to make mpsafe. if we need something like this again it
would be better to do it from scratch.
ok tedu@ visa@
|
|
taskq_barrier guarantees that any task that was running on the taskq
has finished by the time taskq_barrier returns. it is similar to
intr_barrier.
this is needed for use in ifq_barrier as part of an upcoming change.
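a sketch of the intended use, with a hypothetical softc holding a
taskq and task:

    task_del(sc->sc_tq, &sc->sc_task);  /* stop future runs */
    taskq_barrier(sc->sc_tq);           /* drain a current run */
    free(sc, M_DEVBUF, sizeof(*sc));    /* nothing can touch sc now */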
|
|
|
|
This is a requirement to use a sleeping lock inside kqueue filters.
It is now possible, but not recommended, to sleep inside ``f_event''.
Threads iterating over the list of pending events are now recognizing
and skipping other threads' markers. knote_acquire() and knote_release()
must be used to "own" a knote to make sure no other thread is sleeping
with a reference on it.
Acquire and marker logic taken from DragonFly but the KERNEL_LOCK()
is still serializing the execution of the kqueue code.
This also enables the NET_LOCK() in socket filters.
Tested by abieber@ & juanfra@, run by naddy@ in a bulk, ok visa@, bluhm@
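Conceptually the scan now looks like the sketch below; the member and
flag names are assumptions made for illustration, not the exact kernel
code, and real code restarts the scan after sleeping:

    struct knote *kn;

    TAILQ_FOREACH(kn, &kq->kq_head, kn_tqe) {
            if (kn->kn_filter == EVFILT_MARKER)
                    continue;   /* skip another thread's marker */
            if (!knote_acquire(kn))
                    continue;   /* slept: entry may be stale, rescan */
            /* f_event may now sleep safely on this knote */
            knote_release(kn);
    }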
|
|
functions to pave the way for more fine-grained locking.
Suggested by, comments & OK mpi
|
|
reporting an error in a scenario like the following:
1. mtx_enter(&tqa->tq_mtx);
2. IRQ
3. mtx_enter(&tqb->tq_mtx);
Found by Hrvoje Popovski, OK mpi@
|
|
Direction suggested by mpi
OK mpi, visa
|
|
Micro-optimization useful to x86 archs where the cmpxchg{q,l} instruction
used by rw_enter(9) and rw_exit(9) already include an implicit memory
barrier.
From Mateusz Guzik, ok visa@, mikeb@, kettenis@
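The pattern, sketched with the membar_enter_after_atomic() and
membar_exit_before_atomic() flavours from atomic(9); the lock member
and owner tag are illustrative:

    unsigned long self = (unsigned long)curcpu(); /* owner tag */

    /* acquire: on x86 the cas already orders memory, so the
     * membar can compile to nothing there and to a real barrier
     * on weaker architectures */
    while (atomic_cas_ulong(&rwl->rwl_owner, 0, self) != 0)
            ;       /* spin */
    membar_enter_after_atomic();

    /* release */
    membar_exit_before_atomic();
    atomic_swap_ulong(&rwl->rwl_owner, 0);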
|
|
This reduces code duplication and makes it easier to instrument
lock primitives.
The MI mplock uses the ticket lock code that has been in use
on amd64, i386 and sparc64. These are the architectures that now
switch to the MI code.
The lock_machdep.c files are unhooked from the build but not
removed yet, in case something goes wrong.
OK mpi@, kettenis@
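The ticket discipline itself is simple; a generic userland sketch (not
the kernel's code) using C11 atomics:

    #include <stdatomic.h>

    struct ticket_lock {
            atomic_uint next;   /* next ticket to hand out */
            atomic_uint owner;  /* ticket currently being served */
    };

    void
    lock(struct ticket_lock *l)
    {
            unsigned int t = atomic_fetch_add(&l->next, 1);

            while (atomic_load(&l->owner) != t)
                    ;   /* spin; acquisition is FIFO by ticket */
    }

    void
    unlock(struct ticket_lock *l)
    {
            atomic_fetch_add(&l->owner, 1); /* serve the next ticket */
    }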
|
|
via sendsyslog(2) along with the corresponding errno.
Helps when troubleshooting which program is triggering an error, like
an overflow.
ok bluhm@
|
|
- control operations: trace_me, attach, detach, step, kill, continue.
Manipulate process relation/state or send a signal
- kernel-state get/set: thread list, event mask, trace state.
About the process and don't require target to be stopped, need copyin/out
- user-state get/set: memory, register, window cookie.
Often thread-specific, require target to be stopped, need copyin/out
sys_ptrace() changes to handle request checking, copyin/out to
kernel buffers with size check and zeroing, and dispatching to the
routines above for the real work. This simplifies the permission checks
and copyin/out handling and will simplify lock handling in the future.
Inspired in part by FreeBSD.
ok mpi@ visa@
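The dispatch shape this describes, as a sketch; the wrapper and the
three helper names are hypothetical stand-ins for the real routines:

    int
    ptrace_dispatch(struct proc *p, int req, pid_t pid,
        caddr_t addr, int data)
    {
            switch (req) {
            case PT_TRACE_ME:
            case PT_ATTACH:
            case PT_DETACH:
            case PT_CONTINUE:
            case PT_KILL:
                    /* manipulate process relation/state */
                    return ptrace_ctrl(p, req, pid, data);
            case PT_GET_EVENT_MASK:
            case PT_SET_EVENT_MASK:
                    /* fixed-size get/set; target need not be stopped */
                    return ptrace_kstate(p, req, pid, addr);
            default:
                    /* memory/registers: target must be stopped */
                    return ptrace_ustate(p, req, pid, addr, data);
            }
    }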
|
|
This should make it easier to figure out what is going on. Note
that the pledgecode it shows is only a guess at which pledge(2)
promise might help.
OK deraadt@ semarie@
|
|
volatile member of the struct.
Not forcing a memory read on every access (three in this function)
might reduce cache traffic in some cases.
Micro-optimization and diff provided by Mateusz Guzik.
ok visa@
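The pattern, as a generic sketch: snapshot the volatile member into a
local so the compiler emits one load instead of several. Member and
helper names are illustrative.

    /* before: each mention of the volatile member is a separate load */
    if (mtx->mtx_owner != NULL && mtx->mtx_owner != ci)
            report(mtx->mtx_owner);

    /* after: one load, then reuse the local copy */
    void *owner = mtx->mtx_owner;

    if (owner != NULL && owner != ci)
            report(owner);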
|
|
ok visa@, bluhm@, deraadt@
|
|
insert/remove operation.
No functional change for the moment. However this helps to make this
code mp-safe.
Note that markers are still not, and won't be, counted.
ok visa@, jsing@, bluhm@
|
|
It makes this set of events per-thread without having to lock anything.
From Dragonfly 10f6680a4f6684751aaae0965abfe140f19e9231
ok kettenis@, visa@, bluhm@
|
|
Exposes per-CPU counters to real parallelism.
ok visa@, bluhm@, jca@
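A hedged sketch of the counters(9) usage this refers to; the counter
count and index are made up:

    #define NCOUNTERS 4     /* hypothetical */

    struct cpumem *cm;
    struct counters_ref ref;
    uint64_t *ctr, out[NCOUNTERS];

    cm = counters_alloc(NCOUNTERS);

    /* increment touches only this cpu's slice: no lock, no sharing */
    ctr = counters_enter(&ref, cm);
    ctr[0]++;
    counters_leave(&ref, cm);

    /* reading sums a consistent snapshot across every cpu's slice */
    counters_read(cm, out, NCOUNTERS);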
|
|
ok visa@
|
|
|
|
|
|
Send an uncatchable SIGABRT to the process specified by the pid
argument. Useful in case of CPU exhaustion to kill the DoSing
process and generate a core for later inspection.
ok phessler@, visa@, kettenis@, miod@
|
|
|
|
addresses will cause a fault on load by the kernel.
Problem observed by Maxime Villard
ok kettenis@ deraadt@
|
|
okay bluhm@, deraadt@
|
|
top->m_pkthdr.len was accessed without check. See CID 1452933.
In fact top cannot be NULL there and the condition was always false.
m_getuio() never reserved space for the header. The correct
check is m == top to find the first mbuf.
OK visa@
|