Age | Commit message | Author |
|
Since sys/kern/kern_timeout.c r1.84, timeout_barrier() has used sleeping
with soft-interrupt-driven timeouts. Adjust the sleep machinery so that
the timeout clearing can block in sleep_finish().
This adds one step of recursion inside sleep_finish(). However, the
sleep queue handling does not recurse because sleep_finish() completes
it before calling timeout_del_barrier().
This fixes the following panic:
panic: kernel diagnostic assertion "(p->p_flag & P_TIMEOUT) == 0" failed: file "sys/kern/kern_synch.c", line 373
Stopped at db_enter+0x10: popq %rbp
db_enter() at db_enter+0x10
panic() at panic+0xbf
__assert() at __assert+0x25
sleep_setup() at sleep_setup+0x1d8
cond_wait() at cond_wait+0x46
timeout_barrier() at timeout_barrier+0x109
timeout_del_barrier() at timeout_del_barrier+0xa2
sleep_finish() at sleep_finish+0x16d
tsleep() at tsleep+0xb2
sys_nanosleep() at sys_nanosleep+0x12d
syscall() at syscall+0x374
OK mpi@ dlg@
|
|
While one thread is running kqueue_scan(), another thread can begin
scanning the same kqueue, observe that the event queue is empty, and
go to sleep. If the first thread re-inserts a knote for re-processing,
the second thread can miss the newly pending event. Wake up the kqueue
after a re-insert to correct this.
This fixes a Go test hang that jsing@ tracked down to kqueue.
Tested in snaps for a week.
OK jsing@ mpi@
|
|
for the lock operation and to pass a value to the unlock operation.
sofree() still needs an extra flag to know if sounlock() should be called
or not. But sofree() is called less often and mostly without keeping the lock.
OK mpi@ mvs@
|
|
This code has already been exercised quite extensively by syzkaller and
got decent test coverage.
|
|
When the user requests a lock range that ends at LLONG_MAX, replace
the end point with the special EOF value -1. This avoids ambiguity
with lf_end in lf_split(). The ambiguity could result in a broken
data structure.
This change is visible to userspace in a corner case. When a lock range
has been requested with an end point at absolute position LLONG_MAX,
fcntl(F_GETLK) returns l_len == 0, instead of a positive value, for that
range. This seems consistent with FreeBSD and Linux.
OK anton@
Reported-by: syzbot+c93afea6c27a3fa3af39@syzkaller.appspotmail.com
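The end-point normalization described above can be sketched in a few lines of C. This is an illustration only, not the kernel code: `LF_EOF`, `struct range`, `range_set()` and `range_len()` are hypothetical names. Only the special EOF value -1, the LLONG_MAX end point, and the resulting l_len == 0 are taken from the commit message.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical name for the special "up to end of file" marker (-1). */
#define LF_EOF	(-1LL)

/* Hypothetical byte range with an inclusive end point. */
struct range {
	long long start;
	long long end;		/* last byte covered, or LF_EOF */
};

/*
 * Build a range from a start offset and a length.
 * A length of 0 means "until end of file".
 */
static struct range
range_set(long long start, long long len)
{
	struct range r;

	assert(start >= 0 && len >= 0);
	r.start = start;
	if (len == 0) {
		r.end = LF_EOF;
		return r;
	}
	/* The caller must not request a range that overflows. */
	assert(len - 1 <= LLONG_MAX - start);
	r.end = start + (len - 1);
	/* An end point at LLONG_MAX is folded into the EOF marker. */
	if (r.end == LLONG_MAX)
		r.end = LF_EOF;
	return r;
}

/* Length as it would be reported back; 0 means "until end of file". */
static long long
range_len(struct range r)
{
	if (r.end == LF_EOF)
		return 0;
	return r.end - r.start + 1;
}
```

In this sketch, a request for the last 10 bytes before LLONG_MAX is stored with the EOF marker and reads back with length 0, which mirrors the userspace-visible corner case noted in the commit.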
|
|
OK anton@
|
|
Recommit the reverted change selectively so that only pipes are
affected. Leave sockets untouched for now.
|
|
reason why)
discussed with many, ok millert
|
|
This refactors the common parts of sys_truncate() and sys_ftruncate()
into dotruncate(). If the new size of the file is larger than the
RLIMIT_FSIZE limit _and_ the file is being extended, not truncated,
return EFBIG. Adapted from a diff by Piotr Durlej.
With help from and OK by deraadt@ guenther@.
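The size check described above amounts to a two-part condition. The following is a minimal sketch under stated assumptions: `truncate_check()` and its parameters are hypothetical and are not the signature of dotruncate(); only the rule (the new size exceeds the RLIMIT_FSIZE limit and the file is being extended) and the EFBIG result come from the commit message.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper: decide whether changing a file from cursize to
 * newsize is allowed under the given file size limit.
 * Returns 0 if allowed, EFBIG otherwise.
 */
static int
truncate_check(long long cursize, long long newsize, long long fsize_limit)
{
	assert(cursize >= 0 && newsize >= 0);

	/*
	 * Only refuse when the file would grow past the limit.
	 * Shrinking a file that is already above the limit stays allowed.
	 */
	if (newsize > fsize_limit && newsize > cursize)
		return EFBIG;
	return 0;
}
```

Testing both conditions together means a file that already exceeds the limit can still be truncated to a smaller size.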
|
|
modification is already protected by `fd_lock' rwlock(9).
ok bluhm@
|
|
With this, cursig(), postsig() and trapsignal() become safe to be called
without KERNEL_LOCK. As a side effect, sleep with PCATCH no longer needs
the KERNEL_LOCK either. Since sending a signal can happen from interrupt
context, raise the ps_mtx IPL to high.
Feedback from mpi@ and kettenis@
OK kettenis@
|
|
by checking that it is a single threaded process or that ps_single is set.
OK mpi@
|
|
Always fetch the knlist array pointer at the start of every iteration
in knote_remove(). This prevents the use of a stale pointer after
another thread has simultaneously reallocated the kq_knlist array.
Reported and tested by and OK jsing@
|
|
In the pre-change behavior, once the CPU frequency is raised, it stays up
for a minimum of 5 cycles (one cycle runs every 100ms).
With this change, the time to keep the frequency raised is incremented at
each cycle, up to 5. This means that a short load triggering the frequency
increase keeps it raised for less than the current minimum of 500ms.
This only affects the automatic mode when on battery, extending battery
life for most interactive use scenarios and idling loads.
tested by many with good results
ok kettenis@
|
|
multiple IP forwarding threads were processing packets and holding
the shared net lock, the exclusive net lock was blocked permanently.
This could result in ping times well above 10 seconds.
Add the RWLOCK_WRWANT bit to the check mask of readers. Then they
cannot grab the lock if a writer is also waiting. This logic was
already present in revision 1.3, but got lost during refactoring.
When exiting the lock, there exists a race when the RWLOCK_WRWANT
bit gets deleted. Add a comment that was present until revision
1.8 to document it. The race itself is not easy to fix and had no
impact during testing.
OK sashan@
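The reader-side check described above can be illustrated with C11 atomics. This is a sketch under assumptions, not the kernel rwlock code: `struct lk`, `lk_try_read()` and the `LK_*` bit values are hypothetical. Only the idea of adding the writer-waiting bit to the mask that readers test is taken from the commit message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit layout of the lock word. */
#define LK_WRLOCK	0x1UL	/* a writer holds the lock */
#define LK_WRWANT	0x2UL	/* a writer is waiting for the lock */
#define LK_READ_INCR	0x4UL	/* one reader reference */

struct lk {
	atomic_ulong owner;
};

/*
 * Try to take the lock shared. Readers back off not only when a
 * writer holds the lock, but also when a writer is waiting, so a
 * steady stream of readers cannot starve the writer.
 */
static bool
lk_try_read(struct lk *l)
{
	unsigned long o;

	assert(l != NULL);
	o = atomic_load(&l->owner);
	while ((o & (LK_WRLOCK | LK_WRWANT)) == 0) {
		/* On failure, o is reloaded and the mask is rechecked. */
		if (atomic_compare_exchange_weak(&l->owner, &o,
		    o + LK_READ_INCR))
			return true;
	}
	return false;
}
```

With only `LK_WRLOCK` in the mask, new readers could keep entering while a writer waits, which matches the starvation described in the commit.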
|
|
The commit caused hangs with NFS.
Reported by ajacoutot@ and naddy@
|
|
The deferred activation can now run in an MP-safe task queue.
|
|
OK mpi@
|
|
witness. Make ratecheck mutex global.
Reported-by: syzbot+9864ba1338526d0e8aca@syzkaller.appspotmail.com
|
|
mutex with spl high for all function calls is used for now. It
protects the lasttime and curpps parameter. This solution is MP
safe for the usual use case, allows progress, and can be optimized
later. Remove a useless #if 1 while there.
OK claudio@
|
|
ok cheloha millert miod
|
|
Make refcnt_rele() and refcnt_finalize() order memory operations so that
preceding loads and stores happen before the 1->0 transition. Also ensure
that loads and stores that depend on the transition really begin only
after the transition has occurred. Otherwise the object destructor might
not see the object's latest state.
OK bluhm@
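The ordering requirement described above has a well-known analogue in portable C11 atomics. The sketch below is not the kernel implementation and does not use the kernel's barrier primitives: `struct ref`, `ref_init()`, `ref_take()` and `ref_rele()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference counter. */
struct ref {
	atomic_uint refs;
};

static void
ref_init(struct ref *r)
{
	atomic_init(&r->refs, 1);
}

static void
ref_take(struct ref *r)
{
	atomic_fetch_add_explicit(&r->refs, 1, memory_order_relaxed);
}

/* Drop one reference. Returns true if it was the last one. */
static bool
ref_rele(struct ref *r)
{
	unsigned int old;

	/*
	 * Release ordering: this thread's earlier loads and stores to
	 * the object are ordered before the decrement.
	 */
	old = atomic_fetch_sub_explicit(&r->refs, 1, memory_order_release);
	assert(old != 0);
	if (old == 1) {
		/*
		 * Acquire fence: accesses made by the destructor begin
		 * only after the 1->0 transition has been observed.
		 */
		atomic_thread_fence(memory_order_acquire);
		return true;
	}
	return false;
}
```

The release on the decrement paired with the acquire fence on the final drop is what lets the thread that frees the object see every other holder's last writes.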
|
|
Preventing a use after free discovered by syzkaller.
ok visa@
Reported-by: syzbot+a2649c1d77e9d2463f33@syzkaller.appspotmail.com
Reported-by: syzbot+182df9087f5f182daa44@syzkaller.appspotmail.com
Reported-by: syzbot+46d03139d7ed5e81ed2f@syzkaller.appspotmail.com
Reported-by: syzbot+892e886a6113db341da1@syzkaller.appspotmail.com
|
|
other call in vop_generic_revoke().
OK semarie@
|
|
The previous value set years ago was causing amd64 kernels to spin
out when run with MP_LOCKDEBUG during boot.
ok kettenis@
|
|
This prevents descriptors from being closed concurrently on the receiver side.
ok bluhm@ claudio@
|
|
Make the cf_attach member of struct cfdata const and sprinkle a few
const into subr_autoconf.c to make this work. Fixes the compilation
of sys/dev/rd.c with newly const rd_ca.
ok miod (who had a similar diff)
|
|
The old comment only mentioned that tty_nmea was used for time, but
subsequently position data was added to this line discipline.
|
|
This fixes a problem where NOTE_EXIT could be received before
the process was officially a zombie and thus not immediately
waitable. OK deraadt@ visa@
|
|
vnode_hold_list and vnode_free_list aren't used outside kern/vfs_subr.c
move `struct freelst` where used in kern/vfs_subr.c
no intended behaviour changes. survived a release(8) build.
ok millert@
|
|
libcrypto can access this sysctl on arm64 without restrictions to determine
cpu features
ok deraadt@, kettenis@
|
|
for PCB tables. It does not break userland build anymore.
pf_socket_lookup() calls in_pcbhashlookup() in the PCB layer. To
run pf in parallel, make parts of the stack MP safe. Protect the
list and hashes in the PCB tables with a mutex.
Note that the protocol notify functions may call pf via tcp_output().
As the pf lock is a sleeping rw_lock, we must not hold a mutex. To
solve this for now, collect these PCBs in inp_notify list and protect
it with exclusive netlock.
OK sashan@
|
|
code similar in non DIAGNOSTIC case. Rename refcnt variable to
refs for consistency with r_refs. Add KASSERT() in refcnt_finalize().
OK visa@
|
|
OK bluhm@ dlg@
|
|
OK bluhm@
|
|
OK dlg@ bluhm@
|
|
refcnt_shared() checks whether the object has multiple references.
When refcnt_shared() returns zero, the caller is the only reference
holder.
refcnt_read() returns a snapshot of the counter value.
refcnt_shared() suggested by dlg@.
OK dlg@ mvs@
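The semantics of the two helpers described above can be sketched with C11 atomics. The names `struct cnt`, `cnt_read()` and `cnt_shared()` are hypothetical stand-ins and the bodies are an assumption about one reasonable way to implement the stated behaviour, not the kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference counter. */
struct cnt {
	atomic_uint refs;
};

/* Return a snapshot of the counter; it may be stale once returned. */
static unsigned int
cnt_read(struct cnt *c)
{
	return atomic_load_explicit(&c->refs, memory_order_relaxed);
}

/*
 * Report whether anyone besides the caller holds a reference.
 * The caller must hold a reference itself, so the count is at least 1.
 */
static bool
cnt_shared(struct cnt *c)
{
	unsigned int n = cnt_read(c);

	assert(n >= 1);
	return n > 1;
}
```

A false result is stable only as long as the caller does not hand its own reference to anyone else, since no other holder exists to take a new one.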
|
|
This reverts the commit protecting the list and hashes in the PCB tables
with a mutex since the build of sysctl(8) breaks, as found by kettenis.
ok sthen
|
|
run pf in parallel, make parts of the stack MP safe. Protect the
list and hashes in the PCB tables with a mutex.
Note that the protocol notify functions may call pf via tcp_output().
As the pf lock is a sleeping rw_lock, we must not hold a mutex. To
solve this for now, collect these PCBs in inp_notify list and protect
it with exclusive netlock.
OK sashan@
|
|
to the debugger can cause a loop between the debugger and cursig()
if the signal is masked. cursig() has no way to know which signal
was already delivered to the debugger and so it delivers the same
signal over and over again.
Instead, handle traps to masked signals directly in trapsignal. This
is what rev 1.293 was mostly about. If SIGTRAP was masked by the
process, breakpoints no longer worked since the signal delivery to
the debugger did not happen. Handling this case in trapsignal solves
both the problem with the loop and the delivery of masked traps.
Problem reported and fix tested by matthieu@
OK kettenis@ mpi@
|
|
variables. Although not necessary everywhere, using atomic functions
exclusively for variables marked as atomic is clearer.
OK mvs@ visa@
|
|
Revert the pr_usrreqs move: syzkaller found a NULL pointer deref
and I won't be available to monitor for followup issues for a bit
|
|
ok deraadt
|
|
then be shared among protosw structures, following the same basic
direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the
proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
|
|