|
Read-only access to local `clkinfo' filled with immutable data.
ok bluhm
|
|
Instead of the KERNEL_LOCK use the ps_mtx for most operations.
If the ps_klist is modified, an additional global rwlock (kqueue_ps_list_lock)
is required. This includes the knotes with NOTE_FORK and NOTE_EXIT since
in either case a ps_klist is changed. In the NOTE_FORK | NOTE_TRACK case
the call to kqueue_register() can sleep, which is why a global rwlock is used.
Adjust the reaper() to call knote_processexit() without KERNEL_LOCK.
Double lock idea from visa@
OK mvs@
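A minimal sketch of the double-lock rule, assuming the ps_klist/ps_mtx
fields described above (the example_* functions are hypothetical;
mtx_enter(9), rw_enter(9), knote_locked() and klist_insert_locked() are the
real primitives):

struct rwlock kqueue_ps_list_lock = RWLOCK_INITIALIZER("kqpsl");

void
example_ps_klist_activate(struct process *ps, long hint)
{
	/* Activating knotes on ps_klist only needs ps_mtx. */
	mtx_enter(&ps->ps_mtx);
	knote_locked(&ps->ps_klist, hint);
	mtx_leave(&ps->ps_mtx);
}

void
example_ps_klist_insert(struct process *ps, struct knote *kn)
{
	/*
	 * Modifying ps_klist (attach/detach, NOTE_FORK, NOTE_EXIT)
	 * additionally takes the global rwlock; in the NOTE_FORK |
	 * NOTE_TRACK case kqueue_register() may sleep, so a sleepable
	 * lock is needed here.
	 */
	rw_enter_write(&kqueue_ps_list_lock);
	mtx_enter(&ps->ps_mtx);
	klist_insert_locked(&ps->ps_klist, kn);
	mtx_leave(&ps->ps_mtx);
	rw_exit_write(&kqueue_ps_list_lock);
}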
|
|
you can only fit a couple of seconds' worth of nanoseconds into an
int, which limited the usefulness of the api. worse, if a large nsec
value was passed in it could be cast to a negative int value, which
tripped a KASSERT at the top of the timeout_add() that ends up being
called. avoid this footgun by working in the bigger type and doing
the same range checks/fixes for the other timeout_add wrappers.
ok claudio@ mvs@
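Roughly, the fix looks like this hypothetical wrapper (a sketch, not the
actual kern_timeout.c code):

/*
 * An int holds at most INT_MAX (about 2.1 seconds worth of)
 * nanoseconds, and casting a larger 64-bit value to int can come out
 * negative and trip the KASSERT in timeout_add().  Convert and clamp
 * in the wide type first.
 */
void
example_timeout_add_nsec(struct timeout *to, uint64_t nsecs)
{
	uint64_t to_ticks;

	to_ticks = nsecs / (1000000000ULL / hz);
	if (to_ticks > INT_MAX)
		to_ticks = INT_MAX;
	if (to_ticks == 0)
		to_ticks = 1;

	timeout_add(to, (int)to_ticks);
}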
|
|
microboottime() and the underlying binboottime() are mp-safe and `mb' is
local data.
ok bluhm
|
|
Add corresponding cases to the kern_sysctl() switch and unlock read-only
variables from `kern_vars'. Unlock KERN_SOMAXCONN and KERN_SOMINCONN,
which are only accessed, read-only and atomically, from solisten().
ok kettenis
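A hedged illustration of such an unlocked, atomic read (the example_*
names are hypothetical stand-ins; the real reader is solisten()):

/* Stand-in for a tunable that is only ever written atomically. */
unsigned int example_somaxconn = 128;

int
example_check_backlog(int backlog)
{
	/* A single atomic load is enough for an unlocked reader. */
	unsigned int maxconn = atomic_load_int(&example_somaxconn);

	if (backlog < 0 || (unsigned int)backlog > maxconn)
		backlog = maxconn;
	return (backlog);
}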
|
|
ok bluhm
|
|
|
|
Unlock a few obviously immutable or read-only variables in the "kern.*" and
"hw.*" paths. Keep the remaining variables locked as before, including the
page wiring. Use the new sysctl_vs{,un}lock() functions introduced for that
purpose.
In kern.* path:
- KERN_OSTYPE, KERN_OSRELEASE, KERN_OSVERSION, KERN_VERSION -
immutable;
- KERN_NUMVNODES - read-only access to integer;
- KERN_MBSTAT - read-only access to per-CPU counters;
In hw.* path:
- HW_MACHINE, HW_MODEL, HW_NCPUONLINE, HW_PHYSMEM, HW_VENDOR,
HW_PRODUCT, HW_VERSION, HW_SERIALNO, HW_UUID, HW_PHYSMEM64 -
immutable;
- HW_USERMEM and HW_USERMEM64 - `physmem' is immutable, uvmexp.wired
is mutable but an integer; read-only access to the locally stored
difference between `physmem' and uvmexp.wired;
- `hw_vars' - read-only access to integers; some of them like
HW_BYTEORDER and HW_PAGESIZE are immutable;
ok bluhm kettenis
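For the immutable entries the handler reduces to a plain read-only copy; a
hedged sketch (example_hw_sysctl_ro() is hypothetical, sysctl_rdstring()
and sysctl_rdint() are the existing read-only helpers):

int
example_hw_sysctl_ro(int name, void *oldp, size_t *oldlenp, void *newp)
{
	switch (name) {
	case HW_MACHINE:
		/* Immutable string: nothing can change under the copyout. */
		return (sysctl_rdstring(oldp, oldlenp, newp, machine));
	case HW_PAGESIZE:
		/* Immutable integer. */
		return (sysctl_rdint(oldp, oldlenp, newp, PAGE_SIZE));
	default:
		return (EOPNOTSUPP);
	}
}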
|
|
OK mvs@
|
|
Since proc and signal filters share the same klist it makes sense
to keep them together.
OK mvs@
|
|
ok bluhm
|
|
dowait6() can only look at per process state so switch this over.
Right now SIGCONT handling in ptsignal is recursive and not quite
right but this is a step in the right direction. It fixes dowait6()
handling for multithreaded processes where the main thread exited.
OK mpi@
|
|
ok deraadt@ claudio@
|
|
Requested by kettenis@ and guenther@
|
|
is always true. Also consistently wrap all flag checks in parentheses.
OK kettenis@ guenther@
|
|
knote_locked() will call wakeup() and with it take the SCHED_LOCK, which
means log_mtx is no longer a leaf lock. By using a dedicated lock for the
klist we can keep log_mtx a leaf lock and with that printf(9) can be used in
most contexts again.
OK mvs@
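A small sketch of the idea: back the klist with its own mutex so log_mtx
stays a leaf lock (names here are illustrative; klist_init_mutex() and
knote() are the real interfaces):

struct mutex log_klist_mtx = MUTEX_INITIALIZER(IPL_MPFLOOR);
struct klist log_klist;

void
example_log_klist_init(void)
{
	/* The klist is protected by its own mutex, not by log_mtx. */
	klist_init_mutex(&log_klist, &log_klist_mtx);
}

void
example_log_activate(void)
{
	/*
	 * knote() may end up in wakeup() and take SCHED_LOCK, but
	 * log_mtx is not held here, so it remains a leaf lock and
	 * printf(9) stays usable in most contexts.
	 */
	knote(&log_klist, 0);
}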
|
|
Use atomic operations to reference count VM spaces.
Tested by claudio@, bluhm@, sthen@, jca@
ok jca@, claudio@
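A minimal sketch of the pattern, assuming an unsigned vm_refcnt field
(illustrative only, not the actual uvm code):

/* Take a reference without holding any lock. */
void
example_uvmspace_addref(struct vmspace *vm)
{
	atomic_inc_int(&vm->vm_refcnt);
}

/* Drop a reference; only the thread dropping the last one frees it. */
void
example_uvmspace_rele(struct vmspace *vm)
{
	if (atomic_dec_int_nv(&vm->vm_refcnt) == 0)
		uvmspace_free(vm);
}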
|
|
again in sleep_signal_check().
OK dlg@
|
|
that a process has been stopped so make room for that.
OK kettenis@
|
|
ps_mainproc. dowait6() needs to stop using ps_mainproc and this is the
first step.
OK guenther@
|
|
Socket splicing belongs to socket buffers. udp(4) sockets are fully
switched to fine-grained buffer locks, so use them instead of the
exclusive solock().
Always schedule the somove() thread to run, as we do in the tcp(4) case.
This adds some delay to packet processing, but it is comparable with the
non-splicing case where soreceive() threads are always scheduled.
So, spliced udp(4) sockets now rely on sb_lock() of the `so_rcv' buffer
together with the `sb_mtx' mutexes of both buffers. The shared solock() is
only required around the pru_send() call, so most of the somove() thread
runs simultaneously with the network stack.
Also document 'sosplice' structure locking.
Feedback, tests and OK from bluhm.
|
|
If a large mbuf in the source socket buffer does not fit into the
drain buffer, split the mbuf. But if the drain buffer still has
some data in it, stop moving data and try again later. This skips
a potentially expensive mbuf operation.
When looking at which socket buffer has to be locked, I found that the
length of the source send buffer was checked. Change it to the drain buffer.
As this is a performance optimization for a special corner case,
no one noticed the bug.
OK sashan@
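Roughly the logic described above, as a hedged sketch (not the actual
somove() code; m_split(9) and sbused() are the real primitives, the
example_* names are hypothetical):

/*
 * Decide what to do when the next mbuf may not fit into the space
 * left in the drain buffer.  Returns 1 if m (possibly shortened by
 * m_split()) can be moved now, 0 if the move should be retried later.
 */
int
example_fit_mbuf(struct socket *sosp, struct mbuf *m, long space)
{
	struct mbuf *tail;

	if (m->m_len <= space)
		return (1);

	/*
	 * The drain buffer still holds data: defer instead of doing a
	 * potentially expensive m_split() right away.
	 */
	if (sbused(&sosp->so_snd) > 0)
		return (0);

	/* Drain buffer is empty: split the mbuf so the head fits now. */
	tail = m_split(m, space, M_DONTWAIT);
	if (tail == NULL)
		return (0);
	/* The caller would re-queue the tail for the next pass (omitted). */
	return (1);
}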
|
|
Double checked by kettenis@
Sorry for the time window with breakage visible on arm64 and riscv64. :-/
|
|
Erroneously dropped from the last elf_aux_info(3) diff I sent on tech@.
Lack of this chunk would affect arm64 and riscv64 as they're the two
architectures providing hwcap*.
Should have been ok kettenis@
|
|
builder and avoids the ufs_inactive problems, bluhm hits panics on
shutdown and filesystem unmount on the regress testers.
We'll have to try the other approach of detecting the corrupted
vnode perhaps.
|
|
All inpcb locking has been converted to the socket receive buffer mutex.
Per PCB mutex inp_mtx is not needed anymore. Also delete PRU related
locking functions. A flag PR_MPSOCKET indicates whether protocol
functions support parallel access with per socket rw-lock.
TCP is the only protocol that is not MP capable from the socket
layer and needs exclusive netlock.
OK mvs@
|
|
udp_send() and the following udp{,6}_output() do not append packets to
the `so_snd' socket buffer. This means the sosend() and sosplice() sending
paths boil down to a pru_send() call and there is no problem running them
simultaneously on the same socket.
Push the shared solock() deep down into sosend() and take it only around
pru_send(), but keep somove() running under the exclusive solock(). Since
sosend() doesn't modify `so_snd', the unlocked `so_snd' space checks
within somove() are safe. Corresponding `sb_state' and `sb_flags'
modifications are protected by the `sb_mtx' mutex(9).
Tested and OK bluhm.
|
|
This was noticed by syzkiller and analyzed in isolation by mbuhl@
and visa@ two years ago. As the kernel has become more unlocked it
has started to appear more and was being hit regularly by jsing@
on the Go builder.
The problem was that during reclaim of an inode the corresponding vnode
could be picked up by a vget() from another thread while the inode was
being cleared out in the ufs_inactive routine and the thread running
ufs_inactive slept for i/o. When raced, the vnode would then not have a
zero use count and would not be cleared out on exit from ufs_inactive,
leaving a dead/invalid vnode in use.
While this could get "fixed" by checking for the race happening and
trying again in the inactive routine, or by adding "yet another visible
vnode locking flag", we chose to add a vdoom() api for the moment that
allows the caller to block future attempts to grab this vnode until it
is fully cleared out with vclean.
Tested by jsing@ on the Go builder and it seems to solve the issue.
ok kettenis@, claudio@
|
|
In sysctl_int_bounded() use atomic operations to load, store, or
swap integer values. By using volatile pointers this will result
in a single assembly instruction, no matter how over-optimizing
compilers become. Note that this does not solve data dependency
problems, nor MP problems in the kernel code using these integers.
For full MP safety additional considerations, memory barriers, or
locks will be needed where the values are used. But for simple
integer in- and output volatile is enough. If new and old value
pointers are given to sysctl, atomic swapping guarantees that
userland sees the same old value only once. There are more
sysctl_int() functions that have to be adapted.
OK deraadt@ kettenis@
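A hedged sketch of the load/store/swap pattern (example_sysctl_int() is
hypothetical; atomic_load_int(9), atomic_store_int(9) and
atomic_swap_uint(9) are the real primitives):

int
example_sysctl_int(void *oldp, size_t *oldlenp, void *newp, size_t newlen,
    int *valp, int minimum, int maximum)
{
	int error, newval, oldval;

	if (newp != NULL) {
		if (newlen != sizeof(newval))
			return (EINVAL);
		if ((error = copyin(newp, &newval, sizeof(newval))))
			return (error);
		if (newval < minimum || newval > maximum)
			return (EINVAL);
		if (oldp != NULL) {
			/* Swap so userland sees each old value only once. */
			oldval = atomic_swap_uint((unsigned int *)valp,
			    newval);
		} else {
			atomic_store_int((unsigned int *)valp, newval);
			return (0);
		}
	} else {
		oldval = atomic_load_int((unsigned int *)valp);
	}

	/* Copy the (possibly swapped-out) old value back to userland. */
	return (sysctl_rdint(oldp, oldlenp, NULL, oldval));
}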
|
|
OK mpi@
|
|
Remove #if notyet/#endif chunk that references the never-defined STATFS_SOFTUPD.
ok jsg@
|
|
Noticed by bluhm@ on octeon
|
|
in the order needed for future changes. No functional change.
OK mpi@
|
|
sched_toidle() is called by cpu_hatch() to start APs and then curproc
may be NULL.
OK mpi@
|
|
|
|
This fixes rebooting a GENERIC.MP kernel on SP machines because the unpeg
is outside the loop in smr_thread().
|
|
ok kettenis@, mlarkin@, miod@, claudio@
|
|
For procs (threads) the accounting now happens locklessly by curproc using
a generation counter. Callers need to use tu_enter() and tu_leave() for this.
To read a proc's p_tu struct, tuagg_get_proc() should be used. It ensures
that the values read are consistent.
For processes only the time of exited threads is accumulated in ps_tu and
to get the proper process time usage tuagg_get_process() needs to be called.
tuagg_get_process() will sum up all procs p_tu plus the ps_tu.
This removes another SCHED_LOCK() dependency. Adjust the code in
exit1() and exit2() to correctly account for the full run time.
For this adjust sched_exit() to do the runtime accounting like it is done
in mi_switch().
OK jca@ dlg@
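A rough, seqlock-style sketch of the generation counter handshake; the
tu_gen field and the example_* names are assumptions for illustration,
not the actual kern_resource.c code:

/* Writer side: only curproc updates its own p_tu between these calls. */
void
example_tu_enter(struct tusage *tu)
{
	tu->tu_gen++;			/* odd: update in progress */
	membar_producer();
}

void
example_tu_leave(struct tusage *tu)
{
	membar_producer();
	tu->tu_gen++;			/* even again: consistent */
}

/* Reader side: retry until an even, unchanged generation is observed. */
void
example_tuagg_get(struct tusage *dst, const struct tusage *tu)
{
	uint64_t gen;

	do {
		while ((gen = tu->tu_gen) & 1)
			CPU_BUSY_CYCLE();
		membar_consumer();
		*dst = *tu;
		membar_consumer();
	} while (gen != tu->tu_gen);
}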
|
|
The note can be removed, but add a comment that, since this is called from
the idle process, exit2() is not allowed to sleep.
OK jca@
|
|
ok kn@
|
|
ok claudio@
|
|
path changed in rev 1.206. At least acme-client(1) is not happy with
this change.
Reported by claudio. Tests and ok by bluhm.
|
|
The only reason to re-lock the dying `so' is the lock order with the
vnode(9) lock, thus the `unp_gc_lock' rwlock(9) can be taken after solock().
ok bluhm
|
|
ok mglocker@
|
|
|
|
At sockets layer only mark buffers as SB_MTXLOCK. At PCB layer only
protect `so_rcv' with corresponding `sb_mtx' mutex(9).
SS_ISCONNECTED and SS_CANTRCVMORE bits are redundant for AF_ROUTE
sockets. Since SS_CANTRCVMORE modifications are performed with both solock()
and `sb_mtx' held, the 'unlocked' SS_CANTRCVMORE check in
rtm_senddesync() is safe.
ok bluhm
|
|
Speeds up resuming from hibernate.
Testing florian@ stsp@
ok mlarkin@ stsp@
|
|
testing by florian@ mglocker@ mlarkin@
ok deraadt@ mglocker@ mlarkin@
|
|
entry in enum lock_class_index was removed in sys/_lock.h
You get fireworks if the lock_classes array and enum lock_class_index
get out of sync.
|
|
The SPL level is not tracked by the mutex and we no longer need to track
this in the callers.
OK miod@ mlarkin@ tb@ jca@
|