Age | Commit message (Collapse) | Author |
|
the callback was called, and sometimes both. So the caller of that
API could not release resources correctly.
A bunch of errors cannot or should not happen; replace them with an
assert. Remove redundant checks. crypto_invoke() should not return
the error, but pass it via callback.
Some old hardware drivers keep part of their inconsistency as I
cannot test them.
OK mpi@
|
|
data from struct process anymore. This changes how siginfo and onstack
are accessed and makes sendsig() more MP friendly.
With and OK semarie@ OK kettenis@
|
|
while walking the page tables.
ok mpi@, deraadt@
|
|
ok deraadt
|
|
After fixing previous syzbot issues related to lock contention, the reproducer code managed to hit an issue where it can exhaust kernel memory by allocating vcpus. Since each vcpu (regardless of whether it's SVM or VMX-capable) requires wiring some number of pages of memory, it was possible to starve other parts of the kernel.
This change limits the total number of vcpus to 512, a conservative number given vmm(4) only supports single vcpu guests at the moment.
ok mlarkin@
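A minimal sketch of the kind of global cap described above, not the vmm(4) code itself. The names VCPU_TOTAL_MAX, vcpu_total, vcpu_reserve() and vcpu_release() are hypothetical; only the limit of 512 comes from the commit message. Locking is omitted, so this assumes a single caller at a time.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical cap; the value 512 is the limit named in the commit message. */
#define VCPU_TOTAL_MAX 512

/* Hypothetical global count of vcpus currently allocated. */
static size_t vcpu_total;

/*
 * Reserve n vcpu slots, or fail with ENOMEM and leave the count unchanged.
 * The subtraction cannot underflow because vcpu_total never exceeds the cap.
 * A real kernel would protect vcpu_total with a lock or atomic operations.
 */
int
vcpu_reserve(size_t n)
{
	if (n > VCPU_TOTAL_MAX - vcpu_total)
		return ENOMEM;
	vcpu_total += n;
	return 0;
}

/* Give back n previously reserved slots. */
void
vcpu_release(size_t n)
{
	assert(n <= vcpu_total);
	vcpu_total -= n;
}
```

Checking the cap before wiring any memory means a rejected request leaves nothing to undo.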
|
|
are ongoing.
|
|
This prevents possible corruption due to a concurrent access between
pmap_growkernel() & pmap_create/pmap_destroy().
Discussed with and ok kettenis@
|
|
Similar to the recent change by mpi in revision 1.288, commitid:
A4zhVhOoHAIpRGBJ, raise the ipl level of the vm_pool to IPL_MPFLOOR
to prevent lock ordering issues.
ok mpi@
|
|
Syzbot found 3 issues related to the new vcpu lock. This diff adds
a write lock to vm_rwregs (needed on VMX as vmread instructions
require taking ownership of the vcpu to load the VMCS) and prevents
locking the vcpu in vm_run if we fail the cas operation for toggling
vcpu state.
In the future, we can push the locking in vm_rwregs on AMD SVM
systems.
The panics in question:
panic: rw_enter: vcpulock locking against myself
panic: lock (rwlock) vcpulock not locked
panic: vcpulock: lock not held
Reported-by: syzbot+1dab11e14aa7a159cadf@syzkaller.appspotmail.com
Reported-by: syzbot+36244e105daffa1a81b6@syzkaller.appspotmail.com
Reported-by: syzbot+c78b5644c7dc3d9b689a@syzkaller.appspotmail.com
ok mlarkin@
|
|
Some pmaps (x86, hppa) and the buffer cache rely on UVM objects to allocate
and manipulate pages. These objects should not be manipulated by uvm_fault()
and do not currently require the same locking enforcement.
Use the dummy pagers to explicitly document which UVM functions are meant to
manipulate UVM objects (uobj) that do not need the upcoming `vmobjlock' and
instead still rely on the KERNEL_LOCK().
Tested by many as part of a larger diff.
ok kettenis@, beck@
|
|
IBRS feature need an lfence instruction after every near ret. Place
them after all functions in the kernel which are implemented in
assembler. Change the retguard macro so that the end of the lfence
instruction is 16-byte aligned now. This prevents the ret
instruction from ending up at the end of a 32-byte boundary. The latter would
cause a performance impact on certain Intel processors which have
a microcode update to mitigate the jump conditional code erratum.
See software techniques for managing speculation on AMD processors
revision 9.17.20 mitigation G-5.
See Intel mitigations for jump conditional code erratum revision
1.0 November 2019 2.4 software guidance and optimization methods.
OK deraadt@ mortimer@
|
|
since amd64 is compiled with -msave-args we have all arguments available to print and
there's no reason to limit this to six.
discussed with kettenis@
|
|
this allows us to dynamically trace function boundaries with btrace by patching
prologues and epilogues with a breakpoint upon which the handler records the data
and sends it back to userland for btrace to consume.
currently it's hidden behind DDBPROF, and there is still a lot to clean up and
improve, but basic scripts that observe return codes from a probed function
work.
from Tom Rollet, with various changes by me
feedback and ok mpi@
|
|
We need the kernel lock before calling some uvm functions. Fixes a
panic reported by syzbot.
Reported-by: syzbot+dd7a70eaf794705db27e@syzkaller.appspotmail.com
ok mlarkin@
|
|
|
|
Adds support for Aquantia AQC1xx family of PCIe ethernet adapters. This
driver supports 1Gbps through 10Gbps modes of operation based on the
hardware and media/switch capabilities.
The initial code was ported from NetBSD, with jmatthew@ finishing up
the Tx/Rx ring support and interrupt handler routine.
The driver only supports devices using firmware V2.
This diff enables aq(4) on riscv64 and amd64, the only platforms where
I have tested the driver, but it likely works on other architectures
as well.
|
|
Syzbot might complain about "new" panics, but to help debug a recent
report it helps to have unique rw lock names.
"sounds good to me" @mlarkin
|
|
Reported-by: syzbot+c8905496cd61610f77e2@syzkaller.appspotmail.com
ok mlarkin@
|
|
to stop speculation. This seems to be necessary when the branch
predictor hits the ret for the first time. In their white paper
to mitigate speculation attacks, AMD's retpoline example has an
explicit lfence. Adjust our retpoline assembly macro in the kernel.
OK guenther@ mortimer@ deraadt@
|
|
On Intel VMX hosts, when a guest migrates cpus, VMCS state needs
to be flushed to physical memory before being reloaded on the new
cpu. This diff adds a new ipi to allow a guest resuming on a new
cpu to signal to the old that it needs to vmclear.
To better surface the potential race conditions, unlock the kernel
after handling the ioctl to vmm and simplify the run loops for both
vmx and svm. This requires a new vcpu lock.
Tested by some on tech@. "go for it" @mlarkin
|
|
delay func. Otherwise simply delay for a second to calibrate the LAPIC.
Install the lapic delay func only if we were using the i8254 before as
delay func.
Discussed with the hackroom
ok kettenis@
|
|
a working delay func ready before the first occurrence of delay(). This is
necessary on Hyper-V Gen 2 VMs where we don't use the TSC.
Discussed with the hackroom
ok kettenis@
|
|
the TSC for delays even if there is a skew between the TSCs of the cores
as this doesn't matter for delay(9).
Gets rid of the unreasonable clock speed reports on Intel Tiger Lake CPUs
where the i8254 behaves in weird ways.
ok patrick@, deraadt@, mlarkin@
|
|
From Alex Wilson, Thanks!
|
|
back in 2019.
ok mpi@
|
|
keyboard is a pseudo device which is used to expose audio and
application launch keys. My prime motivation is to get the volume mute,
increment and decrement keys to just work on my keyboard without the
need to use usbhidaction(1).
Looks reasonable to kettenis@ mpi@ and ok jcs@
|
|
ok jsg
|
|
The printfs complaining about unknown FSB_FREQ values didn't end with
a newline. jsg points out that this is because the original i386 code
then prints MSR_EBL_CR_POWERON, which was omitted when the code was
adapted for amd64.
ok jsg
|
|
This makes modifying hw.setperf and apmd -A work on robert's laptop.
Previously, it would sometimes be impossible to set hw.setperf to any
value on this machine.
Keep a delay loop that waits for the MSR write to take effect before
setting hw.cpuspeed to the new value since this is apparently needed
for some pre-ryzen processors.
Debugging, initial diff & test by robert
ok brynet
|
|
frequencies on intel processors. This way, the default hw.setperf=99
corresponds to the maximum ordinary speed while setting it to 100
enables turbo mode.
Tested in snaps for a week, positive feedback from several.
|
|
|
|
those options are incompatible with the kernel anymore. Set DYNAMIC_CRC_TABLE
and BUILDFIXED for these bootblocks, to save space on the media
ok tb mlarkin
|
|
constant. Then they are mapped as read only.
OK deraadt@ dlg@
|
|
hardware support changes include
inteldrm: better support for tiger lake
amdgpu: support for navi12, navi21 "sienna_cichlid", arcturus
amdgpu: support for cezanne "green sardine" ryzen 5000 apu
Thanks to the OpenBSD Foundation for sponsoring this work,
patrick@ for helping adapt rockchip drm, kettenis@ and mpi@
for uvm discussions and various testers.
|
|
waiting on CPUs that didn't spin up. This will allow us to spin down
CPUs in the future to save power as well.
ok mpi@
|
|
|
|
need to be invalidated. Instead of keeping a bitset of CPUs in
each pmap, have each cpu_info track which pmap it has loaded: replace
pmap->pm_cpus with cpu_info->ci_proc_pmap. This reduces the atomic
operations (and cache thrashing) and simplifies cpu_switchto()
Also, fix a defect in cpu_switchto()'s "am I loading the same cr3?"
test: ignore the CR3_REUSE_PCID bit when checking that. This makes
switching between kernel threads slightly less costly.
over a week in snaps with no complaints
looks ok to mlarkin@ kettenis@ mpi@
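The "am I loading the same cr3?" test can be sketched in C as follows. This is only an illustration: the helper cr3_same() is hypothetical, and the bit position is the architectural no-flush bit (bit 63) of a value written to %cr3 when PCIDs are enabled, assumed here to be what CR3_REUSE_PCID stands for.

```c
#include <assert.h>	/* not used below; lets callers assert on the result */
#include <stdbool.h>
#include <stdint.h>

/* Bit 63 of a value written to %cr3: keep the TLB entries for this PCID. */
#define CR3_REUSE_PCID	(1ULL << 63)

/*
 * Hypothetical helper: true if both values name the same page table
 * base and PCID, ignoring whether either one has the reuse bit set.
 */
bool
cr3_same(uint64_t a, uint64_t b)
{
	return ((a ^ b) & ~CR3_REUSE_PCID) == 0;
}
```

Without the mask, two values that differ only in the reuse bit would compare unequal and trigger a reload that is not needed.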
|
|
For example uvm_objinit() becomes uvm_obj_init(). Reduce differences
between the trees and help porting new functions needed for UVM object
locking.
No functional change.
|
|
AMD errata 400
"APIC Timer Interrupt Does Not Occur in Processor C-States"
is only mentioned in the revision guides for family 0fh and 10h
but we were checking for and disabling C1E on >= family 0fh.
Since family 16h all the bits of the Interrupt Pending MSR the
workaround uses are documented as read as zero. So this didn't cause
any problems on real hardware but did on EPYC based AWS t3a instances
according to Ilya Voronin who sent an initial patch to not attempt the
workaround on family 17h.
Tested on non-virtualised EPYC 7702P 17-31-00 by Hrvoje Popovski and
Ryzen 5 2600X 17-08-02 by myself.
ok mlarkin@
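One way to read the change above is as a narrower family test. The sketch below assumes the workaround is limited to exactly the two families the revision guides mention; the committed check may be written differently, and amd_errata400_applies() is a hypothetical name.

```c
#include <assert.h>	/* not used below; lets callers assert on the result */
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether to attempt the erratum 400
 * workaround. family is the CPU family as reported by CPUID
 * (base plus extended family), e.g. 0x17 for the EPYC and Ryzen
 * parts named in the commit message.
 */
bool
amd_errata400_applies(unsigned int family)
{
	/* Assumption: only families 0fh and 10h, per the revision guides. */
	return family == 0x0f || family == 0x10;
}
```

Under the old ">= family 0fh" test, family 16h and 17h parts would also have gone through the Interrupt Pending MSR path.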
|
|
arm64 efid_io().
ok kettenis@
|
|
The size of kernel fonts in RAMDISKs had long been a problem on systems
with large screen resolutions booting via EFI, as previously only the 8x16
font was built into RAMDISKs. As those systems are becoming more common,
this should make the installation and update process more comfortable.
OK deraadt@, jcs@
|
|
mlarkin: "sure"
|
|
BS->AllocatePages() and BS->FreePages() as in all the other
efid_io() versions.
Don't leak the pages on success.
Bump boot version to 3.59.
ok yasuoka@
|
|
To aid in development and debugging, this adds a tracepoint prior
to vm entry and after vm exit. It captures the vcpu and run params
plus the exit code, but dt(4)/btrace(8) will need some future work
to leverage those args.
The location of the tracepoint might change in the future, but for
now this solves my issues trying to use printf's to debug vmcs state
corruption.
ok mpi@
|
|
|
|
the expansion 'func(params)'.
Allows upcoming removal of eficall.h.
|
|
|
|
prodded by jsg@
|
|
'__attribute((ms_abi))', removing the need for the EFI_CALL
abstraction.
Nuke the amd64 EFI_CALL dance from all copies of eficall.h,
remove eficall.S from the build.
ok kettenis@ yasuoka@
|
|
media length check to allow EFI GPT partitions to be smaller than
the entire disk.
Consistently use GPTSECTOR instead of randomly tossing in some
literal '1's.
ok kettenis@
|