Age | Commit message (Collapse) | Author |
|
|
|
ok mpi@
|
|
media length check to allow EFI GPT partitions to be smaller that
the entire disk.
Consistently use GPTSECTOR instead of randomly tossing in some
literal '1's.
ok kettenis@
|
|
Use the pool cache to reduce the overhead of memory management in
function kqueue_register().
When EV_ADD is given, kqueue_register() pre-allocates a knote to avoid
potential sleeping in the middle of the critical section that spans
from knote lookup to insertion. However, the pre-allocation is useless
if the lookup finds a matching knote.
The cost of knote allocation will become significant with kqueue-based
poll(2) and select(2) because the frequency of allocation will increase.
Most of the cost appears to come from the locking inside the pool.
The pool cache amortizes it by using CPU-local caches of free knotes
as buffers.
OK dlg@ mpi@
|
|
|
|
consistent.
ok mpi@
|
|
Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each
platform for use by panic(9). The first panic on a given CPU writes
its message to this buffer. Subsequent panics on a given CPU print
the panic message to the console but do not modify the buffer. This
aids debugging in two cases:
- If 2+ CPUs panic simultaneously there is no risk of garbled messages
in the panic buffer.
- If a CPU panics and then the operator causes a second panic while
using ddb(4), the operator can still recall the first failure on
a particular CPU.
Misc. changes to support this bigger change:
- Set panicstr atomically to identify the first CPU to reach panic().
- Tweak db_show_panic_cmd() to print all panic messages across all
CPUs. Prefix the first panic with an asterisk ('*').
- Prefer db_printf() to printf() during a panic if we have it.
Apparently it disturbs less global state.
- On amd64, tweak fault() to write the local panic buffer. This needs
more work.
Prompted by bluhm@ and deraadt@. Mostly written by deraadt@.
Discussed with bluhm@, deraadt@ and kettenis@.
Borne from a discussion on tech@ about making panic(9) more MP-safe:
https://marc.info/?l=openbsd-tech&m=162086462316143&w=2
ok kettenis@, visa@, bluhm@, deraadt@
|
|
of the usable LBA range defined by the GPT header. And then shrink
them to the bounds of the first OpenBSD partition if one is found.
While here simplify the logic, eliminate some superfluous variables
and reduce use of magic numbers.
Improvement suggested by sobrado@ ok kettenis@
|
|
Change the definition of ADJFREQ_MIN so that it does not shift
a negative value. Such shifting is undefined in standard C.
This came up when cross-compiling the kernel using ports clang.
The shifting becomes defined when compiling with option -fwrapv.
Base clang enables this option by default.
OK naddy@ cheloha@
|
|
OK mvs@
|
|
This makes kqueue use the extended callback interface with socket event
filters. Now one level of nested kernel locking is avoided, and the
callbacks run without splhigh().
The filterops no longer check NOTE_SUBMIT, and use a fixed locking
pattern instead. The f_event routines are always called with solock(),
whereas f_modify and f_process are always called without the lock.
OK mpi@
|
|
EFI GPT partition (0xEE) in the protective MBR to be smaller that the
actual size of the media.
This allows GPT disk images dd'ed onto larger physical media to be
recognized by fdisk(8) and the kernel.
Feedback from kettenis@ on various earlier versions.
|
|
operation. System calls should return -1 and set errno when they fail.
They should not return an errno value directly. This matches how
the Linux version of futex(2) behaves and what Mesa expects. This fixes
a bug in Mesa where a timeout wouldn't be reported properly.
Technically this is an ABI break. But libc and libpthread were changed
to be compatible with both the old and new ABI, and code outside of base
almost certainly expects Linux compatible behaviour. If you have not
rebuilt libc and the last few days, upgrade using a snap.
Mesa issue discovered by jsg@
ok mpi@, deraadt@
|
|
done because we have no cases where one thread should lock two sockets
simultaneously.
tested by yasuoka@
ok bluhm@ markus@
|
|
are constant. Having more const makes MP review easier. More
pointers are mapped read-only in the kernel image.
OK deraadt@ mvs@
|
|
resident set size. This replicates what the sysctl code does and fixes
a kernel crash reported by robert@
ok deraadt@
|
|
ok mpi@
|
|
Additionally make the values tuneable via sysctl.
OK deraadt@ mvs@
|
|
|
|
No intentional functional change.
|
|
0xEE (DOSPTYP_EFI) partition, or -1 no usable such partition is found.
Adopt a consistent idiom to capture the index for future use.
Clean up the gpt_chk_mbr() logic to make it clearer what constraints
are being applied when looking for the DOSTYP_EFI partition.
No intentional functional change.
|
|
solock() should be taken before sblock(). soreceive() grabs solock() and
then locks `so_rcv'. But later it releases solock() before call uimove(9).
So concurrent thread which performs soshutdown() could break sorecive()
loop. But `so_rcv' is still locked by sblock() so this soshutdown()
thread will sleep in sorflush() at sblock() call. soshutdown() thread
doesn't release solock() after sblock() call so it has no matter where to
release `so_rcv' - is will be locked until the solock() release.
That's why this strange looking code works fine. This sbunlock() movement
just after `so_rcv' cleanup affects nothing but makes the code
consistent and clean to understand.
ok mpi@
|
|
ok millert@
|
|
pointer.
ok deraadt@
|
|
vn_lock(9) and fdplock().
Reported-by: syzbot+2300a1bedc425f6f851e@syzkaller.appspotmail.com
ok visa@
|
|
|
|
At least one of them cause a deadlock involving `unplock' and mbuf allocations
('mbufpl') as reported by millert@.
|
|
In timeout_barrier(9) we take/release the kernel lock to ensure that the
given timeout has finished running (if it had been running at all).
This approach is inefficient. If we put a barrier timeout on the
queue and wait for it to run in cond_wait(9) we can block instead of
spinning for the kernel lock. We already do this for process-context
timeouts in timeout_barrier(9) anyway.
Discussed with dlg@, visa@, and mpi@.
ok dlg@
|
|
|
|
ok mpi@
|
|
|
|
ok mpi@
|
|
|
|
ok mpi@
|
|
|
|
ok mpi@
|
|
|
|
"IdontNeedEFI", a.k.a. "BIOS boot") as FS_BOOT. Often used to contain the second
stage boot loader binary on disk images.
Makes it easier to recognize/overwrite/remove the contents.
Not yet supported in fdisk(8).
Example image provided by mlarkin@
|
|
|
|
In August 2019 I tried to unlock lseek which failed since the vnode lock
could not be acquired without holding the kernel lock back then.
claudio@ recently made it possible to acquire a vnode lock without
holding the kernel lock. The kernel lock is still required around
VOP_GETATTR() as the underlying file system implementations are not
MP-safe.
ok claudio@
|
|
- Move the "hack" involving P_SINTR to avoid grabbing the SCHED_LOCK()
recursively closer to where it is necessary, in proc_stop()
- Introduce proc_unstop(), the symmetric routine to proc_stop(), which
manipulates `ps_xsig' and use it whenever a SSTOPed thread needs to be
awaken.
- Manipulate `ps_xsig' only in proc_stop/unstop()
ok kettenis@
|
|
comment 'As above...' makes sense again. Improve comments for
sysctl_int_bounded() and sysctl_bounded_arr().
OK gnezdo@ mvs@
|
|
remove it. This also fixes a defective check of the dynamic boundary
in sysctl_sysvshm().
OK mvs@ gnezdo@
|
|
Regen after unlocking getitimer(2) and setitimer(2).
ok anton@, mpi@
|
|
With the changes in kern_time.c v1.150, neither getitimer(2) nor
setitimer(2) need the kernel lock anymore.
ok anton@, mpi@
|
|
OK mvs@
|
|
(PF_ROUTE) sockets. This can be done because we have no cases where one
thread should lock two sockets simultaneously.
Against the previous version rtm_senddesync_timer() execution was moved
to process context.
Also this time `so_lock' used for routing sockets only but in the future
it will be used to other socket types too.
tested by claudio@
ok claudio@ bluhm@
|
|
OK deraadt@
|
|
functions are sysctl_int() and sysctl_rdint(). This brings us back
the 4.4BSD implementation. Then sysctl_int_bounded() builds the
magic for range checks on top. sysctl_bounded_arr() is a wrapper
around it to support multiple variables.
Introduce macros that describe the meaning of the magic boundary
values. Use these macros in obvious places.
input and OK gnezdo@ mvs@
|
|
(delayed action) sigabort() and disabling all syscalls for this process
(ie. all threads). This resulted in multiple-threads crashing over top
of themselves, and a poor debugging experience. We keep using sigabort()
rather than sigexit(), to keep the debugging process good.
Diagnosed from a report from brynet, and followup discussion with many.
|