Age | Commit message | Author |
|
Although the period is specified in seconds, convert to milliseconds so
uneven periods are not truncated after integer division by two.
Input and OK cheloha
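
The truncation described above can be illustrated with a small sketch. The helper names are hypothetical and are not taken from the changed code; they only contrast halving a period kept in seconds with halving it after conversion to milliseconds.

```c
#include <stdint.h>

/* Hypothetical helper: halve a period kept in whole seconds, result in ms. */
static uint32_t
half_period_from_sec(uint32_t period_sec)
{
	/* Integer division drops the half second of an odd period. */
	return (period_sec / 2) * 1000;
}

/* Hypothetical helper: convert to milliseconds first, then halve. */
static uint32_t
half_period_from_msec(uint32_t period_sec)
{
	/* Assumes period_sec is small enough that the product fits. */
	return (period_sec * 1000) / 2;
}
```

With a period of 5 seconds the first helper yields 2000 ms and the second 2500 ms; for even periods both agree.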
|
|
When send buffer space in the drain socket becomes available, a
task is added to move data, and userland was also informed.
The latter is not useful, as it would mix a kernel and a user stream,
so programs do not wait for this event. Avoid calling sowakeup()
from sowwakeup(); this also reduces grabbing the kernel lock. Instead,
inform userland about the write event when the splicing is
dissolved in sounsplice().
OK claudio@
|
|
Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.
For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.
To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.
Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.
Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.
Partly inspired by FreeBSD r247787.
positive feedback from deraadt@, ok mpi@
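
A minimal sketch of what such conversion helpers might look like, written as userland C. All names here are hypothetical stand-ins, not the functions added to sys/time.h, and the saturate-on-overflow rule is an assumption made for the sketch.

```c
#include <stdint.h>

/* "No timeout" marker, UINT64_MAX as stated in the message above. */
#define SKETCH_INFSLP		UINT64_MAX
#define SKETCH_NSEC_PER_SEC	1000000000ULL
#define SKETCH_NSEC_PER_MSEC	1000000ULL

/* Hypothetical: seconds to nanoseconds, saturating instead of wrapping. */
static inline uint64_t
sketch_sec_to_nsec(uint64_t sec)
{
	if (sec > UINT64_MAX / SKETCH_NSEC_PER_SEC)
		return UINT64_MAX;
	return sec * SKETCH_NSEC_PER_SEC;
}

/* Hypothetical: milliseconds to nanoseconds, same saturation rule. */
static inline uint64_t
sketch_msec_to_nsec(uint64_t msec)
{
	if (msec > UINT64_MAX / SKETCH_NSEC_PER_MSEC)
		return UINT64_MAX;
	return msec * SKETCH_NSEC_PER_MSEC;
}
```

Under that assumption a timeout too large to represent ends up equal to SKETCH_INFSLP, so it degrades to "no timeout" rather than wrapping to a short sleep.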
|
|
table. This should prevent a race with kevent when unlocked code
closes file descriptors that are fully set up.
OK mpi@
|
|
these values are used as the backpressure thresholds in the interface
rx q processing code. they're being exposed as tunables to userland
while we are figuring out what the best values for them are.
ok visa@ deraadt@
|
|
|
|
As with nanosleep(2), poll(2), and select(2), here we can chip away at
the timespec until it's empty. This lets us support the full range of
the timespec regardless of the kernel's HZ.
Update the manpage accordingly.
ok visa@
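
The "chip away" approach can be sketched roughly as follows. This is a userland illustration, not the kernel code: the per-sleep cap SKETCH_MAXSLP_SEC and both function names are hypothetical, and the actual sleep is left out.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical cap on how long a single sleep may last, in seconds. */
#define SKETCH_MAXSLP_SEC	100

/* Remove at most SKETCH_MAXSLP_SEC seconds from *ts; return the chunk in ns. */
static uint64_t
sketch_take_chunk(struct timespec *ts)
{
	uint64_t nsecs;

	if (ts->tv_sec >= SKETCH_MAXSLP_SEC) {
		ts->tv_sec -= SKETCH_MAXSLP_SEC;
		return (uint64_t)SKETCH_MAXSLP_SEC * 1000000000ULL;
	}
	/* Final chunk: whatever is left, after which the timespec is empty. */
	nsecs = (uint64_t)ts->tv_sec * 1000000000ULL + (uint64_t)ts->tv_nsec;
	ts->tv_sec = 0;
	ts->tv_nsec = 0;
	return nsecs;
}

/* Loop skeleton: returns how many sleeps are needed to drain ts. */
static unsigned int
sketch_drain(struct timespec ts)
{
	unsigned int rounds = 0;

	while (ts.tv_sec > 0 || ts.tv_nsec > 0) {
		uint64_t chunk = sketch_take_chunk(&ts);

		(void)chunk;	/* a real caller would sleep for chunk ns here */
		rounds++;
	}
	return rounds;
}
```

Because each round only ever sleeps for a bounded chunk, the total timeout is limited by the timespec itself rather than by what one sleep call can express.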
|
|
This removes a system-wide serialization point, which might help
finding timing-related bugs.
OK deraadt@ anton@
|
|
ok deraadt@
|
|
Matches the recent F_SETLK change, POSIX and the man page.
|
|
This behavior matches POSIX and our own fcntl(2) man page.
OK anton@ deraadt@
|
|
|
|
OK semarie@ mpi@ deraadt@ anton@
|
|
for unlocking.
OK semarie@ mpi@ deraadt@ anton@
|
|
ok visa@
|
|
unlocks read(2) and write(2) syscall families, and pushes the KERNEL_LOCK
deeper in the code path. KERNEL_LOCK is managed per file type in fileops
handlers (fo_read, fo_write, and fo_close). read(2) and write(2) on
sockets are KERNEL_LOCK-free.
initial work from mpi@ and ians@
ok mpi@ kettenis@ visa@ ians@
|
|
unlocks read(2) and write(2) syscall families, and pushes the KERNEL_LOCK
deeper in the code path. KERNEL_LOCK is managed per file type in fileops
handlers (fo_read, fo_write, and fo_close). read(2) and write(2) on
sockets are KERNEL_LOCK-free.
initial work from mpi@ and ians@
ok mpi@ kettenis@ visa@ ians@
|
|
of resource limit structs has been done between processes. By applying
copy-on-write also between threads, threads can read rlimits in
a nearly lock-free manner.
Inspired by code in DragonFly BSD and FreeBSD.
OK mpi@, agreement from jmatthew@ and anton@
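
The copy-on-write idea can be sketched in userland C as below. The struct, its fields, and the function names are hypothetical, and the plain (non-atomic) reference count is a simplification; the sketch is not the kernel implementation.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_NLIMITS	4

/* Hypothetical stand-in for a shared resource limit struct. */
struct sketch_plimit {
	unsigned int	refcnt;
	unsigned long	limit[SKETCH_NLIMITS];
};

static struct sketch_plimit *
sketch_plimit_new(void)
{
	struct sketch_plimit *pl = calloc(1, sizeof(*pl));

	if (pl != NULL)
		pl->refcnt = 1;
	return pl;
}

/* Sharing is cheap: bump the count and hand out the same pointer. */
static struct sketch_plimit *
sketch_plimit_share(struct sketch_plimit *pl)
{
	pl->refcnt++;
	return pl;
}

/* Before writing, obtain a private copy unless we are the sole owner. */
static struct sketch_plimit *
sketch_plimit_writable(struct sketch_plimit *pl)
{
	struct sketch_plimit *copy;

	if (pl->refcnt == 1)
		return pl;
	copy = malloc(sizeof(*copy));
	if (copy == NULL)
		return NULL;
	memcpy(copy->limit, pl->limit, sizeof(copy->limit));
	copy->refcnt = 1;
	pl->refcnt--;	/* drop our reference to the shared struct */
	return copy;
}
```

Readers never modify a shared struct, so they can use their pointer without coordinating with writers; only a writer pays for the copy.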
|
|
kernel. kubsan reports findings using printf() and assuming that calling
printf() is safe in all contexts can be problematic. Instead, defer
reporting of findings to the systq task queue.
Storage for findings is allocated early in the boot process in order to
catch potential UB during boot. The same findings are reported once the
task queue subsystem has been initialized.
Feedback from kettenis@ and ok mpi@
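
A rough userland sketch of the deferral pattern: findings are recorded into storage that exists from the start and are printed later from a context where printing is safe. The names and the fixed-size static array are assumptions for illustration, not kubsan's actual data structures, and the sketch ignores concurrency.

```c
#include <stddef.h>
#include <stdio.h>

#define SKETCH_NFINDINGS	8

struct sketch_finding {
	const char	*file;
	int		 line;
	const char	*what;
};

/* Static storage, usable before any allocator or task queue exists. */
static struct sketch_finding sketch_findings[SKETCH_NFINDINGS];
static size_t sketch_nfindings;

/* Record a finding without printing; returns -1 if storage is full. */
static int
sketch_record(const char *file, int line, const char *what)
{
	struct sketch_finding *f;

	if (sketch_nfindings == SKETCH_NFINDINGS)
		return -1;
	f = &sketch_findings[sketch_nfindings++];
	f->file = file;
	f->line = line;
	f->what = what;
	return 0;
}

/* Run later, where printing is safe; returns the number reported. */
static size_t
sketch_report(void)
{
	size_t i, n = sketch_nfindings;

	for (i = 0; i < n; i++)
		printf("%s:%d: %s\n", sketch_findings[i].file,
		    sketch_findings[i].line, sketch_findings[i].what);
	sketch_nfindings = 0;
	return n;
}
```

Anything recorded before the reporting context exists simply waits in the array until the first call to sketch_report().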
|
|
waiters, just set a flag in logwakeup(). The flag is later noted through
periodic polling. This lets the wakeup code run with sufficient locking.
logwakeup() is a very tricky place to take locks because the function
can be called in many different contexts. Not requiring locks in
the routine helps to keep printf(9) as usable as possible.
OK mpi@
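
The flag-and-poll scheme might be sketched like this in userland C11. The function and variable names are hypothetical, and the periodic caller that would invoke the poll routine is not shown.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool sketch_wakeup_pending;
static unsigned int sketch_wakeups_done;

/* Callable from any context: takes no locks, only sets a flag. */
static void
sketch_logwakeup(void)
{
	atomic_store(&sketch_wakeup_pending, true);
}

/* Called periodically from a context where taking locks is fine. */
static bool
sketch_poll(void)
{
	/* Atomically read and clear the flag so no request is counted twice. */
	if (!atomic_exchange(&sketch_wakeup_pending, false))
		return false;
	/* A real implementation would take its locks and wake waiters here. */
	sketch_wakeups_done++;
	return true;
}
```

Several requests that arrive between two polls collapse into a single wakeup, which is the price paid for keeping the request path lock-free.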
|
|
it actually isn't reached...
|
|
This is necessary when invoking sleep_finish_timeout() without the
kernel lock. If not cancelled properly, an already running endtsleep()
might cause a spurious wakeup on the thread if the thread re-enters
a sleep queue very quickly before the handler completes.
The flag P_TIMEOUT should stay cleared across the timeout cancellation.
Add an assertion for that.
OK mpi@
|
|
use copyin() on. While here: just put the struct iovec for ktrace on the
stack instead of mallocing and freeing it.
problem debugged by patrick@
ok deraadt@ mpi@
|
|
feature bits checked in namei()
|
|
was already gone.
OK mpi@
|
|
the need to do this in libc.
btw, it is unfortunate posix went this way, because converting a clearly
illegal condition to not be fatal but instead return an error which is
potentially not checked in the caller, is sadly a large component of the
runaway-train model that makes exploitation of software easy. illegal
software should crash hard.
ok beck
|
|
When the main thread of a MT process dies, it doesn't matter at which
priority it is awakened to do the last cleanups. Not using PUSER makes
it easier to understand the existing scheduler logic.
ok visa@
|
|
could crash due to missing inp_ppcb. This happened when fstat(1)
was called often and TCP was aborted with reset. Protect the sysctl
path with the net lock.
OK mpi@
|
|
if the packet has the M_TIMESTAMP csum_flag, ph_timestamp is added
to the boottime clock, otherwise it just uses microtime().
|
|
variable that tracks when to send next SIGXCPU. This eases MP work and
prevents accidental alteration of shared resource limit structs.
OK mpi@ semarie@
|
|
mlarkin@ noticed we would freeze while removing enormous files because
of the amount of work done to invalidate buffers on unlink. This adds
a temporary workaround to ensure we give up the lock and yield while
doing this.
The longer term answer will be to move these buffers to another list
and not do the work here.
ok deraadt@
|
|
|
|
missing from the SP variant of mtx_enter() and mtx_enter_try().
mtx_leave() was correct already.
Prompted by and OK patrick@
|
|
|
|
Basically just make all the bintime routines look and behave more like
the timeradd(3) macros.
Switch to three-argument forms for structure math, introduce and use
bintimecmp(9), and rename the structure conversion routines to resemble
e.g. TIMEVAL_TO_TIMESPEC(3).
Document all of this in a new bintimeadd.9 page.
Code input from mpi@, manpage input from schwarze@.
code ok mpi@, docs ok schwarze@, docs probably still ok jmc@
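
To show what a three-argument form in the style of timeradd(3) looks like, here is a self-contained sketch. The struct and names are hypothetical stand-ins defined for this example, not the kernel's definitions; the fraction is assumed to count units of 2^-64 seconds.

```c
#include <stdint.h>

/* Hypothetical stand-in: whole seconds plus a 64-bit binary fraction. */
struct sketch_bintime {
	int64_t		sec;
	uint64_t	frac;
};

/* dt = bt + ct; safe even when dt aliases bt or ct. */
static void
sketch_bintimeadd(const struct sketch_bintime *bt,
    const struct sketch_bintime *ct, struct sketch_bintime *dt)
{
	uint64_t frac = bt->frac + ct->frac;	/* wraps modulo 2^64 */
	int64_t sec = bt->sec + ct->sec;

	if (frac < bt->frac)
		sec++;		/* carry out of the fraction */
	dt->sec = sec;
	dt->frac = frac;
}

/* Comparison with the operator passed in, as timercmp(3) does. */
#define sketch_bintimecmp(a, b, cmp)				\
	(((a)->sec == (b)->sec) ?				\
	    ((a)->frac cmp (b)->frac) :				\
	    ((a)->sec cmp (b)->sec))
```

Computing into locals before storing keeps the result correct when the destination is also one of the operands.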
|
|
function is also a proper place for setting up the plimit pool.
While here, raise the IPL of the plimit pool to IPL_MPFLOOR, needed
in upcoming MP work.
OK claudio@
|
|
It currently creates a lock ordering problem because SCHED_LOCK() is taken
by hardclock(). That means the "priorities" of a thread should be moved
out of the SCHED_LOCK() first in order to make progress.
Reported-by: syzbot+8e4863b3dde88eb706dc@syzkaller.appspotmail.com
via anton@ as well as by kettenis@
|
|
Note that hardclock(9) still increments p_{u,s,i}ticks without holding a
lock.
ok visa@, cheloha@
|
|
with the fields of struct proc. Make pl_refcnt unsigned for upcoming
atomic updating.
OK deraadt@ guenther@
|
|
It is bad style to make a pointer point outside the object
so correct this to simply point to the last byte up front.
ok deraadt@
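
As a generic illustration of the style point (this is not the code touched by the commit), a routine that fills a buffer from the back can start its cursor on the last byte of the object instead of on the address past it. The function name is hypothetical.

```c
#include <stddef.h>

/*
 * Write the decimal form of val at the end of buf.
 * Returns a pointer to the first digit, or NULL if it does not fit.
 */
static char *
sketch_utoa_backwards(char *buf, size_t len, unsigned int val)
{
	char *p;

	if (len < 2)
		return NULL;
	p = &buf[len - 1];	/* last byte of the object, not one past it */
	*p = '\0';
	do {
		if (p == buf)
			return NULL;	/* out of room */
		*--p = (char)('0' + val % 10);
		val /= 10;
	} while (val != 0);
	return p;
}
```

The cursor stays inside the buffer for the whole run, so every value it holds can be dereferenced.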
|
|
realpath(2) have output filenames. Generate additional KTR_NAMEI
records upon success.
ok millert beck
|
|
|
|
I borrowed an example usage from __getcwd poorly to begin with
and then there was some other strangeness in there.
diagnosed with deraadt.
ok deraadt@
|
|
was used to return the length of the path, when the actual return value is 0.
This would cause confusing results in ktrace.
Diagnosed with beck since __realpath() picked up the same odd behaviour
|
|
tick boundary of schedlock().
This reduces the contention on the SCHED_LOCK() when the current thread
is already spinning.
Prompted by deraadt@, ok visa@
|
|
|
|
|
|
long. Instead, use everything after the first /sys/ segment as the path.
|
|
__attribute__((nonnull)); which the undefined behavior sanitizer in
clang is aware of. A new handler is therefore needed in order to compile
a kernel with kubsan enabled.
ok visa@
|
|
socketpair. Do not wakeup receiver if there is no data available.
OK claudio@ anton@
|