|
ok deraadt@
|
|
The kernel is not quite ready for timeout_in_nsec(). Remove it and
kclock_nanotime(). Both are unused.
Prompted by jsg@.
ok kn@
|
|
|
|
allow us to turn off the screen on Apple Silicon laptops until we have a
proper display controller driver.
ok kettenis@ patrick@
|
|
|
|
Move up the comment explaining the different locks to account for all structs.
OK millert mvs
|
|
ok patrick@
|
|
During resume, it isn't necessarily a problem if the UTC time we get
from inittodr(9) lags behind the system UTC clock. In particular, if
the active timecounter's frequency is low enough, tc_delta() might not
overflow across a brief suspend.
Remove the misleading warning message. The code is behaving as
intended, just not in a way I anticipated when I added the warning
message a few years ago.
Discovered by kettenis@. Root cause isolated with kettenis@.
Link: https://marc.info/?l=openbsd-tech&m=166790845619897&w=2
ok mlarkin@ kettenis@
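For a rough sense of scale: with an illustratively low counter frequency
and a full 32-bit counter mask, tc_delta() takes many hours to wrap, so
a brief suspend cannot overflow it. A small worked example (the numbers
are assumptions, not tied to any specific timecounter):

	#include <stdint.h>
	#include <stdio.h>

	int
	main(void)
	{
		uint64_t freq = 32768;			/* counter ticks per second */
		uint64_t wrap = (1ULL << 32) / freq;	/* seconds until a 32-bit delta wraps */

		/* 2^32 / 32768 = 131072 seconds, i.e. about 36 hours. */
		printf("delta wraps after %llu seconds (~%.1f hours)\n",
		    (unsigned long long)wrap, wrap / 3600.0);
		return 0;
	}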
|
|
Removes a lock around an atomic write; this lock was causing slowdowns
since the lock being requested is nearly always unavailable because it
is held while the VM is running.
Noticed by claudio@, help from mpi@, dlg@ and claudio@.
ok dv
|
|
ifconfig(8) -C is the only user in base and the if_clone_attach() comment
explains how this list is being built during autoconf(9).
After that it is only ever read. Multiple threads may traverse the list in
parallel and reading the `int' count is atomic.
OK mvs
|
|
After this mechanical move, I can unlock the individual SIOCG* in there.
OK mvs
|
|
Switch arm64 to the clockintr(9) subsystem.
- Remove the custom per-CPU clock interrupt schedule from agtimer(4).
- Remove the custom randomized statclock() pieces from agtimer(4).
- Add agtimer_rearm(), agtimer_trigger(), and wire up agtimer_intrclock.
There is one wart:
- The AArch64 spec says that a value written to CNTV_TVAL_EL0 is
"treated as a signed 32-bit integer" [1]. kettenis@ doesn't know
what to make of this. I'm capping the value at INT32_MAX for
now. It's possible I am misreading this, though.
Tested by kettenis@ on his Apple M1 mini. Tested by me on my
Raspberry Pi 4B.
Link: https://marc.info/?l=openbsd-tech&m=166776342503304&w=2
[1] "Arm Architecture Reference Manual for A-profile architecture"
issue I.a, section D17.11.27 ("CNTV_TVAL_EL0").
ok kettenis@
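A minimal sketch of the capping described above; the helper name and the
nanosecond-to-cycle conversion are illustrative, not the exact agtimer(4)
code:

	#include <stdint.h>

	/* Clamp a one-shot interval to what CNTV_TVAL_EL0 can represent. */
	static inline uint64_t
	agtimer_tval_clamp(uint64_t nsecs, uint64_t freq_hz)
	{
		/* assumes nsecs is small enough that the product fits in 64 bits */
		uint64_t cycles = nsecs * freq_hz / 1000000000ULL;

		if (cycles > INT32_MAX)
			cycles = INT32_MAX;	/* TVAL is a signed 32-bit downcount */
		return cycles;
	}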
|
|
|
|
Switch amd64 to the clockintr(9) subsystem. There are lots of little
changes, but the big ones are listed here.
When using the local apic timer:
- Run the timer in one-shot mode.
- lapic_delay() is gone. We can't use it to delay(9) when running
the timer in one-shot mode.
- Add a randomized statclock(); stathz = hz.
- Add support for switching to profhz when profiling is enabled;
profhz = stathz * 10.
When using the i8254/mc146818:
- i8254's clockintr() no longer has a monopoly on hardclock().
- mc146818's rtcintr() no longer has a monopoly on statclock().
- In profiling mode, the statclock() will drift very slightly
because (profhz = 1024) does not divide evenly into one billion.
We could avoid this by setting (profhz = 512) instead and
programming the RTC to run at that rate.
Early revisions reviewed by mlarkin@. Extensively tested by mlarkin@
on a variety of physical and virtual hardware. Additional testing
from dv@ and jmc@.
Link: https://marc.info/?l=openbsd-tech&m=166776339203279&w=2
ok kettenis@ mlarkin@
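The drift arithmetic is easy to check: one second is 10^9 ns = 2^9 * 5^9,
so 512 divides it evenly while 1024 leaves a remainder. A tiny standalone
check (not kernel code):

	#include <stdio.h>

	int
	main(void)
	{
		/* 10^9 / 1024 leaves a remainder, so a 1024 Hz statclock drifts. */
		printf("profhz=1024: %u ns/tick, remainder %u\n",
		    1000000000U / 1024, 1000000000U % 1024);
		/* 512 = 2^9 divides 10^9 = 2^9 * 5^9 evenly: no drift. */
		printf("profhz=512:  %u ns/tick, remainder %u\n",
		    1000000000U / 512, 1000000000U % 512);
		return 0;
	}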
|
|
parking CPUs in a WFE/WFI loop.
ok deraadt@, mlarkin@
|
|
found in pfsync_insert_state(). It is caused by two packets which happen
to belong to the same session. Think of a UDP stream or two TCP SYN
packets transmitted almost simultaneously. The first such packet wins
the state lock and inserts the state into the table. The second packet
waits for the state lock as a reader. As soon as the first packet is
done with state creation it drops the lock and is about to send an
S_INS message to its peer via pfsync. The second packet meanwhile
obtains the state lock as a reader. It finds the state created by the
first packet. Later the second packet also finds out the state needs to
be updated, because sync_state is still set to PFSYNC_S_NONE. The
second packet puts the state on the snapshot list, marking it as S_UPD.
All this happens before the first packet has a chance to make progress.
Think of the first packet losing the CPU after dropping the write lock.
Once the first packet gets running again it trips the KASSERT() because
sync_state is set to S_UPD.
tested by hrvoje@
OK dlg@
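To make the interleaving concrete, here is a contrived userland
reconstruction of the race as described; the rwlock, the sync_state
values and the thread bodies are stand-ins for pf's locking, not the
actual pfsync code:

	#include <assert.h>
	#include <pthread.h>
	#include <unistd.h>

	enum { S_NONE, S_INS, S_UPD };		/* stand-ins for PFSYNC_S_* */

	static pthread_rwlock_t state_lock = PTHREAD_RWLOCK_INITIALIZER;
	static int state_exists;
	static int sync_state = S_NONE;

	static void *
	first_packet(void *arg)
	{
		(void)arg;
		pthread_rwlock_wrlock(&state_lock);
		state_exists = 1;		/* create and insert the state */
		pthread_rwlock_unlock(&state_lock);

		usleep(10000);			/* simulate losing the CPU here */

		/* pfsync_insert_state(): queue an S_INS message for the peer. */
		assert(sync_state == S_NONE);	/* trips: the reader already set S_UPD */
		sync_state = S_INS;
		return NULL;
	}

	static void *
	second_packet(void *arg)
	{
		(void)arg;
		pthread_rwlock_rdlock(&state_lock);
		if (state_exists && sync_state == S_NONE)
			sync_state = S_UPD;	/* queue a state update */
		pthread_rwlock_unlock(&state_lock);
		return NULL;
	}

	int
	main(void)
	{
		pthread_t a, b;

		pthread_create(&a, NULL, first_packet, NULL);
		usleep(1000);			/* let the writer insert first */
		pthread_create(&b, NULL, second_packet, NULL);
		pthread_join(b, NULL);
		pthread_join(a, NULL);		/* the assert aborts before this */
		return 0;
	}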
|
|
Another mechanical diff without semantic changes to avoid churn in actual
unlocking diffs.
OK mpi
|
|
We can't use the HPET to delay(9) after we halt it during suspend.
Disable acpihpet_delay() before we halt the HPET and reenable it after
we restart the HPET during resume.
ok mlarkin@
|
|
Not all of the clocks with a delay(9) implementation necessarily keep
ticking across suspend/resume. We need a clean way to reverse
delay_init() during suspend when those clocks stop ticking.
Hence, delay_fini(). delay_fini() resets delay_func() to
i8254_delay() if the given function pointer is the active delay(9)
implementation.
ok mlarkin@
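A minimal userland model of the described semantics; delay_func,
delay_init() and i8254_delay() here are stand-ins for the kernel's
(the real delay_init() also takes a quality argument):

	#include <stdio.h>

	static void i8254_delay(int usec) { printf("i8254 delay %d\n", usec); }
	static void hpet_delay(int usec)  { printf("hpet delay %d\n", usec); }

	static void (*delay_func)(int) = i8254_delay;

	static void
	delay_init(void (*fn)(int))
	{
		delay_func = fn;
	}

	static void
	delay_fini(void (*fn)(int))
	{
		/* Only fall back if fn is the active implementation. */
		if (delay_func == fn)
			delay_func = i8254_delay;
	}

	int
	main(void)
	{
		delay_init(hpet_delay);		/* boot: a better clock shows up */
		delay_fini(hpet_delay);		/* suspend: the HPET is halted */
		delay_func(10);			/* safely back on i8254_delay */
		return 0;
	}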
|
|
Not all of the clocks with a delay(9) implementation necessarily keep
ticking across suspend/resume. We need a clean way to reverse
delay_init() during suspend when those clocks stop ticking.
Hence, delay_fini(). delay_fini() resets delay_func() to
i8254_delay() if the given function pointer is the active delay(9)
implementation.
ok mlarkin@
|
|
ok mpi@, jsg@, phessler@, patrick@
|
|
mostly been there, it only needed to be hooked up to our infrastructure.
With this I can e.g. correctly see the lid state on the x13s.
ok kettenis@
|
|
|
|
This is a mechanical diff without semantic changes, locking ioctls
individually inside ifioctl() rather than all of them around it.
This allows us to unlock ioctls one by one.
OK mpi
|
|
|
|
Accesses to data structures used by these syscalls are serialized by the
VM map lock with the exception of file mappings which are still protected
by the KERNEL_LOCK().
Unlocking this set of syscalls improves most userland workloads.
Tested by many including robert@ (since 2 years), mlarkin@, kn@, sdk@,
jca@, aoyama@, naddy@, Scott Bennett and others. Thanks to all!
Joint work with kn@.
ok robert@, aja@, kettenis@, kn@, deraadt@, beck@
|
|
any, don't try and print it, and especially don't error out.
Tested on Lenovo x13s (myself) and Pinebook Pro (kn@)
ok kn@
|
|
UEFI already initializes those, so we can simply make use of that.
That said, the ctrl/dbi region isn't the first in the register list, so
instead try and look it up first and use it if available. Furthermore,
the ATU region isn't part of the ctrl/dbi region, so if we are able to
retrieve a separate reg for the ATU, use that instead. Some reshuffling
is necessary to make that work.
Tested on my Lenovo x13s and the MacchiatoBin
ok kettenis@
|
|
cell is used as a mask for SMR to match a number of IDs. So far we have
asserted that it's always 1, so loosen the restriction and pass both cells
instead of only the sid.
ok kettenis@
|
|
ok patrick@
|
|
hrvoje popovski showed me pfsync blowing up with this. im backing
it out quickly in case something else at the hackathon makes it
harder to do later.
kn@ agrees
|
|
to monitor state changes of the kernel device tree
input from and ok dlg@, deraadt@
|
|
this also avoids holding NET_LOCK too long.
the main change is done by running the purge tasks in systqmp instead
of systq. the pf state list was recently reworked so iteration over
the state can be done without blocking insertions.
however, scanning a lot of states can still take a lot of time, so
this also makes the state list scanner yield if it has spent too
much time running.
the other purge tasks for source nodes, rules, and fragments have
been moved to their own timeout/task pair to simplify the time
accounting.
in my environment, before this change pf purges often took 10 to
50ms. the softclock thread it runs next to often took a similar
amount of time, presumably because they ended up spinning waiting
for each other. after this change the pf_purges are more like 6 to
12ms, and dont block softclock. most of the variability in the runs
now seems to come from contention on the net lock.
tested by me sthen@ chris@
ok sashan@ kn@ claudio@
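the yielding scan might look roughly like this; an illustrative
standalone sketch with invented names, not the actual pf_purge code:

	#include <stdbool.h>
	#include <stddef.h>
	#include <stdint.h>
	#include <time.h>

	#define PURGE_BUDGET_NSEC	1000000ULL	/* ~1ms per run */

	struct state { struct state *next; /* ... */ };

	static struct state *purge_cursor;	/* resume point between runs */

	static uint64_t
	now_nsec(void)
	{
		struct timespec ts;

		clock_gettime(CLOCK_MONOTONIC, &ts);
		return (uint64_t)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
	}

	/* Returns true if the scan finished, false if it yielded early. */
	static bool
	purge_scan(struct state *head)
	{
		uint64_t deadline = now_nsec() + PURGE_BUDGET_NSEC;
		struct state *s = purge_cursor != NULL ? purge_cursor : head;

		for (; s != NULL; s = s->next) {
			/* ... expire s if its timeout has passed ... */
			if (now_nsec() > deadline) {
				purge_cursor = s;	/* pick up here next run */
				return false;		/* caller reschedules the task */
			}
		}
		purge_cursor = NULL;
		return true;
	}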
|
|
The read/write register routines for SVM didn't acknowledge RAX in
the VMCB as the de facto RAX state. When writing gprs, vmm should
update RAX in the VMCB. When reading, it should set the guest
regs state based on the VMCB.
Needed for proper mmio emulation in userland.
ok mlarkin@
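A hedged sketch of the described fix; the structure and register names
imitate vmm(4) conventions but are assumptions, not the actual diff:

	#include <stdint.h>

	struct vmcb { uint64_t v_rax; /* ... */ };	/* stand-in for the real VMCB */
	enum { VCPU_REGS_RAX = 0 };

	static void
	vcpu_writeregs_svm(struct vmcb *vmcb, const uint64_t *gprs)
	{
		/* RAX's canonical home is the VMCB, not the gpr save area. */
		vmcb->v_rax = gprs[VCPU_REGS_RAX];
	}

	static void
	vcpu_readregs_svm(const struct vmcb *vmcb, uint64_t *gprs)
	{
		gprs[VCPU_REGS_RAX] = vmcb->v_rax;
	}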
|
|
(SRTT) instead of the timestamp option. Since the timestamp option is
disabled on some OSes (e.g. Windows) or dropped by some
firewalls/routers, in such cases the window size had been fixed at
16KB, which keeps throughput very low on high-latency networks.
Also switch "tcp_now" from a 2HZ tick counter to binuptime in
milliseconds to calculate the SRTT more accurately.
tested by krw matthieu jmatthew dlg djm stu stsp
ok claudio
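A sketch of the binuptime-to-milliseconds conversion described above;
struct bintime here mirrors the kernel's layout, and the helper name is
invented:

	#include <stdint.h>

	struct bintime {
		int64_t sec;
		uint64_t frac;		/* 64-bit binary fraction of a second */
	};

	static inline uint64_t
	bintime_to_ms(const struct bintime *bt)
	{
		/* frac * 1000 / 2^64, computed without 128-bit arithmetic */
		return (uint64_t)bt->sec * 1000 +
		    (((bt->frac >> 32) * 1000) >> 32);
	}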
|
|
Added in 2017 to
Reduce contention on the NET_LOCK() by moving the nd6 address expiration
task to the `softnettq`.
This should no longer be needed thanks to sys/net/if.c r1.652 in 2022:
Activate parallel IP forwarding. Start 4 softnet tasks. Limit the
usage to the number of CPUs.
Nothing in nd6_expire() or nd6_expire_timer_update() requires protection by
the kernel lock.
The interface list and per-interface address lists remain protected by the
net lock.
Tests by Hrvoje
OK mpi
|
|
hidden uses.
|
|
|
|
Based on a diff from gerhard@, ok kettenis@
|
|
of permitted addresses, done via .nofault* sections that end up in
the linked kernel's rodata.
ok deraadt@ kettenis@
|
|
|
|
it wraps pf_state_export and has the same arguments and return type.
pfsync can just call pf_state_export instead.
ok clang
|
|
Mischa Peters reported a performance regression in 7.2 when hosting
numerous guests under vmm(4). While iterating through the list of
vms to service an ioctl, vmm was triggering excessive wakeup calls
due to the refcnt hitting zero.
Much guidance from dlg@ and testing from Mischa. OK mlarkin@.
|
|
for em 82575, 82576, i350, and i210.
Additional testing by Hrvoje Popovski
OK dlg@
|
|
this is straightening the deck chairs. the state import and export
code are used by both the pf ioctls and pfsync, but the export code
is in pf.c and the import code is in if_pfsync. if pfsync was
disabled then the ioctl stuff wouldnt link.
moving the import code to pf.c makes it more symmetrical(?) and
robust.
tweaks and ok from kn@ sashan@
|
|
ok kettenis@
|
|
ok kettenis@
|
|
this provides a 1:1 relationship of pfopen() calls to pfclose()
calls. in turn, this makes it a lot easier to track stuff allocated
by a process and then clean it up if that process goes away
unexpectedly. the unique dev_t provided by the cloning machinery
gives us a good identifier to track this state with too.
discussed with h2k22
ok sashan@
deraadt@ agrees this is a good time to put this in
|
|
Tested on LUNA-88K2 with 4bpp/8bpp framebuffer by me.
|
|
on ACPI.
ok kettenis@
|