path: root/sys/arch/amd64
Age Commit message Author
2020-09-24Only perform uvm_map_inentry() checks for PROC_SP for userland pagefaults.Theo de Raadt
This should be sufficient for identifying pivoted ROP. Doing so for other traps is at best opportunistic for finding a straight-running ROP chain, but the added (and rare) sleeping point has proven to be dangerous. Discussed at length with kettenis and mortimer. ok mortimer kettenis mpi
2020-09-15abl(4) is a new driver to control the backlight brightness on Intel basedMarcus Glocker
Apple machines. The driver attaches through acpi(4) when the HID 'APP0002' is found. Thanks to kettenis@ for helping me sort out the PCI bits. ok kettenis@
2020-09-14The uvm_map_inentry() check may sleep to grab the lock of the map.Mark Kettenis
The fault address is read from cr2 in pageflttrap() which gets called after this check and if the check sleeps, cr2 is likely to be clobbered by a page fault in another process. Fix this by reading cr2 early and pass it to pageflttrap(). ok mpi@, semarie@, deraadt@
2020-09-13change pmap wbinvd use to wbinvd_on_all_cpusJonathan Gray
with this we can revert the recent coherency workaround in mesa ok deraadt@ kettenis@
2020-09-13add an ipi for wbinvd and a linux style wbinvd_on_all_cpus() functionJonathan Gray
ok kettenis@ deraadt@
2020-09-13add SRBDS cpuid bitsJonathan Gray
2020-09-12asmc0 -> asmc*Marcus Glocker
Now that asmc(4) attaches through acpi(4), other than with isa(4), acpi(4) could attach multiple SMC chips in theory, even though in practice there will be only one SMC chip per machine. Suggested and ok kettenis@
2020-09-12Make asmc(4) attach through acpi(4) instead of isa(4).Marcus Glocker
This e.g. makes the driver also work on iMac11,2. ok kettenis@, jung@
2020-09-11Include <sys/systm.h> directly instead of relying on hidden UVM includes.Martin Pieuchot
The header is being pulled via db_machdep.h -> uvm_extern.h -> uvm_map.h
2020-09-10Introduce a helper to find a VCPU.Martin Pieuchot
from jordan@
2020-09-06amd64: add tsc_delay(), a delay(9) implementation based on the TSCcheloha
In preparation for running the lapic timer in oneshot mode on amd64 we need a replacement for lapic_delay(). Using the lapic timer itself to implement delay(9) when the timer is not running in periodic mode is complicated if not outright impossible. Meanwhile, the i8254 provides our only other amd64 delay(9) implementation and it is an extremely slow clock. On my 2GHz machine, gettick() takes ~20 microseconds to complete *without* mutex contention. On a VM it is even slower, as you must exit the VM for each inb() and outb(). So, add tsc_delay() and use it when we have a constant/invariant TSC. The TSC is a 64-bit "up-counter" so the implementation is simple. Given how slow the i8254 is on modern machines, we may want to add an HPET delay(9) implementation as a fallback for machines where the TSC drifts. The HPET itself is pretty slow, but not as slow as the i8254. Discussed with kettenis@, Mike Larkin, and naddy@. Tweaked by kettenis@. ok kettenis@
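The busy-wait on a 64-bit up-counter described above can be sketched roughly as follows. This is only an illustration, not the kernel's tsc_delay(): the names counter_delay, counter_fn, and fake_counter are hypothetical, and the counter is passed in as a callback so the loop can be exercised without reading the real TSC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback standing in for a raw read of an up-counter. */
typedef uint64_t (*counter_fn)(void);

/*
 * Spin until at least `usecs` microseconds' worth of ticks have passed
 * on a counter running at `freq_hz`. Returns the ticks actually observed.
 * Note: usecs * freq_hz can overflow for very long delays; a real
 * implementation would have to guard against that.
 */
static uint64_t
counter_delay(counter_fn read_counter, uint64_t freq_hz, uint64_t usecs)
{
	uint64_t start, target, elapsed;

	assert(read_counter != NULL && freq_hz > 0);
	target = usecs * freq_hz / 1000000;	/* ticks to wait, rounded down */
	start = read_counter();
	do {
		/* Unsigned subtraction stays correct if the counter wraps. */
		elapsed = read_counter() - start;
	} while (elapsed < target);
	return elapsed;
}

/* Fake counter for demonstration: advances 500 ticks per read. */
static uint64_t fake_ticks;

static uint64_t
fake_counter(void)
{
	fake_ticks += 500;
	return fake_ticks;
}
```

With the fake counter and an assumed 2 GHz frequency, a 1 microsecond delay corresponds to 2000 ticks, so the loop exits on the read that first shows 2000 elapsed ticks.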
2020-09-03amd64: lapic: refactor timer programmingcheloha
We reprogram the lapic timer by hand in three separate places. This is error-prone and difficult to read. To clean things up, introduce routines for reprogramming the lapic timer in a given mode. lapic_timer_oneshot() starts a oneshot countdown. lapic_timer_periodic() starts a repeating countdown. Both of these routines call lapic_timer_start(), wherein we actually write the lapic registers. With input from dlg@. Earlier version eyeballed by mlarkin@. Suspend/resume tested by gnezdo@.
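The shape of this refactoring, two thin mode-specific wrappers over one routine that performs the register writes, might look like the sketch below. The struct fake_lapic, its fields, and the TIMER_MODE_* values are hypothetical stand-ins; the real lapic register layout and the kernel's lapic_timer_*() signatures are not reproduced here.

```c
#include <stdint.h>

/* Hypothetical stand-in for the timer-related lapic registers. */
struct fake_lapic {
	uint32_t lvt_timer;	/* timer mode */
	uint32_t initial_count;	/* countdown start value */
	int	 writes;	/* register writes so far, for demonstration */
};

#define TIMER_MODE_ONESHOT	0u	/* illustrative values only */
#define TIMER_MODE_PERIODIC	1u

/* The single place where the registers are written. */
static void
timer_start(struct fake_lapic *l, uint32_t mode, uint32_t cycles)
{
	l->lvt_timer = mode;
	l->initial_count = cycles;
	l->writes += 2;
}

/* Start a single countdown. */
static void
timer_oneshot(struct fake_lapic *l, uint32_t cycles)
{
	timer_start(l, TIMER_MODE_ONESHOT, cycles);
}

/* Start a repeating countdown. */
static void
timer_periodic(struct fake_lapic *l, uint32_t cycles)
{
	timer_start(l, TIMER_MODE_PERIODIC, cycles);
}
```

Keeping the writes in one routine means that a change to the programming sequence only has to be made once.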
2020-09-01Fix write un-protecting of kernel memory. p was used uninitializedPatrick Wildt
at the beginning of the loop. We need to use cr3 at the start of each iteration for the top level page directory. From and ok sf@
2020-08-27Improve write un-protecting of kernel memory. For the Computrace modulePatrick Wildt
on the HP EliteBook 830 G6 we added a workaround which tries to re-map the pages where we want to place the kernel read-write. On some machines though this workaround causes a regression. Fix those by changing a few things:
* Only set the writeable bit if it isn't set yet.
* Un-protect write-protected page directories.
* Skip lower levels if large-page is set, since the next level is already a page.
* Don't do anything at all if paging is disabled.
From Christian Ehrhardt ok bluhm@ tobhe@
2020-08-26Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.Visa Hankala
OK deraadt@, mpi@
2020-08-23amd64: TSC timecounter: prefix RDTSC with LFENCEcheloha
Regarding RDTSC, the Intel ISA reference says (Vol 2B. 4-545):

> The RDTSC instruction is not a serializing instruction.
>
> It does not necessarily wait until all previous instructions
> have been executed before reading the counter.
>
> Similarly, subsequent instructions may begin execution before the
> read operation is performed.
>
> If software requires RDTSC to be executed only after all previous
> instructions have completed locally, it can either use RDTSCP (if
> the processor supports that instruction) or execute the sequence
> LFENCE;RDTSC.

To mitigate this problem, Linux and DragonFly use LFENCE. FreeBSD and NetBSD take a more complex route: they selectively use MFENCE, LFENCE, or CPUID depending on whether the CPU is AMD, Intel, VIA or something else.

Let's start with just LFENCE. We only use the TSC as a timecounter on SSE2 systems so there is no need to conditionally compile the LFENCE. We can explore conditionally using MFENCE later.

Microbenchmarking on my machine (Core i7-8650) suggests a penalty of about 7-10% over a "naked" RDTSC. This is acceptable. It's a bit of a moot point though: the alternative is a considerably weaker monotonicity guarantee when comparing timestamps between threads, which is not acceptable.

It's worth noting that kernel timecounting is not *exactly* like userspace timecounting. However, they are similar enough that we can use userspace benchmarks to make conjectures about possible impacts on kernel performance.

Concerns about kernel performance, in particular the network stack, were the blocking issue for this patch. Regarding networking performance, claudio@ says a 10% slower nanotime(9) or nanouptime(9) is acceptable and that shaving off "tens of cycles" is a micro-optimization. There are bigger optimizations to chase down before such a difference would matter.

There is additional work to be done here. We could experiment with conditionally using MFENCE. Also, the userspace TSC timecounter doesn't have access to the adjustment skews available to the kernel timecounter. pirofti@ has suggested a scheme involving RDTSCP and an array of skews mapped into user memory. deraadt@ has suggested a scheme where the skew would be kept in the TCB. However it is done, access to the skews will improve monotonicity, which remains a problem with the TSC.

First proposed by kettenis@ and pirofti@. With input from pirofti@, deraadt@, guenther@, naddy@, kettenis@, and claudio@. Based on similar changes in Linux, FreeBSD, NetBSD, and DragonFlyBSD. ok deraadt@ pirofti@ kettenis@ naddy@ claudio@
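The LFENCE;RDTSC sequence discussed in this commit can be written with GCC/Clang inline assembly roughly as below. This is a minimal sketch rather than the code that was committed: the function name ordered_timestamp is hypothetical, and the non-x86 branch exists only so the example compiles elsewhere; it has nothing to do with the TSC.

```c
#include <stdint.h>
#include <time.h>

/*
 * Read a timestamp that is ordered after all earlier instructions.
 * On amd64 this is LFENCE followed by RDTSC; other architectures fall
 * back to the C11 wall clock purely for portability of the sketch.
 */
static inline uint64_t
ordered_timestamp(void)
{
#if defined(__x86_64__)
	uint32_t lo, hi;

	/*
	 * LFENCE waits until earlier instructions have completed locally,
	 * so RDTSC cannot be executed ahead of them. RDTSC returns the
	 * counter in EDX:EAX.
	 */
	__asm__ volatile("lfence\n\trdtsc"
	    : "=a" (lo), "=d" (hi) : : "memory");
	return ((uint64_t)hi << 32) | lo;
#else
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
#endif
}
```

Two back-to-back calls on the same thread would normally return non-decreasing values. The cross-thread comparison the commit is concerned with additionally depends on the per-CPU skews mentioned above.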
2020-08-20Fix build without NPCKBC and NUKBDkn
The "error" variable is used in one case only, so move it into scope under #ifdef. OK deraadt gnezdo
2020-08-19Use sysctl_bounded_args for simple cases in cpu_sysctl on amd64gnezdo
deraadt@: fine
2020-08-19Push KERNEL_LOCK/UNLOCK() dance inside trapsignal().Martin Pieuchot
ok kettenis@, visa@
2020-08-02additional files from libkern will be needed by clang10Theo de Raadt
from mortimer
2020-07-29atapiscsi is not needed here. (well maybe it changes the behaviour ofTheo de Raadt
the pciide subsystem a tiny bit at attach-time, but we don't have the downstream cd(4) device to attach, so let's try without)
2020-07-21acpi can use IPL_BIO (a low interrupt) since it only enqueues operations forTheo de Raadt
later processing. The use of a high interrupt will predate suspend/resume efforts; we had to redesign acpi to be non-reentrant, obviously.
discussed with kettenis, in snaps for more than a week
2020-07-08Use CPU_IS_PRIMARY macro in identifycpu() on amd64.Frederic Cambus
OK deraadt@
2020-07-08Clean up the amd64 userland timecounter implementation a bit:Mark Kettenis
* We don't need TC_LAST
* Make internal functions static to avoid namespace pollution in libc.a
* Use a switch statement to harmonize with architectures providing multiple timecounters
ok deraadt@, pirofti@
2020-07-07Get rid of some rasops callbacks in efifb that only call rasopsJoshua Stein
functions in them and let rasops call them directly. From John Carmack ok kettenis
2020-07-06Add support for timecounting in userland.Paul Irofti
This diff exposes parts of clock_gettime(2) and gettimeofday(2) to userland via libc, liberating processes from the need for a context switch every time they want to count the passage of time. If a timecounter clock can be exposed to userland then it needs to set its tc_user member to a non-zero value. Tested with one or multiple counters per architecture. The timing data is shared through a pointer found in the new ELF auxiliary vector AUX_openbsd_timekeep containing timehands information that is frequently updated by the kernel. Timing differences between the last kernel update and the current time are adjusted in userland by the tc_get_timecount() function inside the MD usertc.c file. This permits a much more responsive environment, quite visible in browsers, office programs and gaming (apparently one is able to fly in Minecraft now). Tested by robert@, sthen@, naddy@, kmos@, phessler@, and many others! OK from at least kettenis@, cheloha@, naddy@, sthen@
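The userland adjustment described here, taking the kernel's last snapshot and adding the counter ticks that have elapsed since, could be sketched as follows. Every name in this block (struct time_snapshot, its fields, snapshot_now_ns) is hypothetical; the real timehands layout and the libc code are not shown in this log, and a real reader would also have to cope with the kernel updating the snapshot concurrently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of what the kernel last published. */
struct time_snapshot {
	uint64_t base_ns;	/* time at the last kernel update, in ns */
	uint32_t offset_count;	/* counter value at that update */
	uint32_t counter_mask;	/* valid bits of the hardware counter */
	uint64_t freq_hz;	/* counter frequency */
};

/*
 * Compute the current time from the snapshot plus a fresh counter
 * reading taken in userland, without entering the kernel.
 */
static uint64_t
snapshot_now_ns(const struct time_snapshot *s, uint32_t count)
{
	uint32_t delta;

	assert(s->freq_hz != 0);
	/* Masked unsigned subtraction handles counter wrap-around. */
	delta = (count - s->offset_count) & s->counter_mask;
	/* delta < 2^32, so delta * 10^9 fits in 64 bits. */
	return s->base_ns + (uint64_t)delta * 1000000000ULL / s->freq_hz;
}
```

For example, with a 1 GHz counter that wrapped from 0xfffffff0 to 0x10 since the last update, 32 ticks have elapsed, so the result is 32 ns past the snapshot's base time.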
2020-07-06wire up kstat(4)David Gwynne
"looks right" deraadt@
2020-07-03Use an LFENCE instruction everywhere where we use RDTSC when we areMark Kettenis
doing some sort of time measurement. This is necessary since RDTSC is not a serializing instruction. We can use LFENCE as the serializing instruction instead of CPUID since all amd64 machines have SSE. This considerably reduces the jitter in TSC skew measurements. ok deraadt@, cheloha@, phessler@
2020-06-30Remove obsolete <machine/stdarg.h> header. Nowadays the varargVisa Hankala
functionality is provided by <sys/stdarg.h> using compiler builtins. Tested in a ports bulk build on amd64 by naddy@ OK naddy@ mpi@
2020-06-22Change tsc_get_timecount return from uint to u_int per sys/timetc.h.Paul Irofti
First brought up by naddy@ in the usertc thread, OK kettenis@.
2020-06-19fold the TSC value in fewer operations, same result; ok deraadt@Christian Weisgerber
2020-06-17pci_intr_establish_cpu() for establishing an interrupt on a specific cpu.David Gwynne
the cpu is specified by a struct cpu_info *, which should generally come from an intrmap. this is adapted from a diff that patrick@ sent round a few years ago for a pci_intr_map_msix_cpuid, where you asked for an msi vector on a specific cpu, and then called pci_intr_establish with the handle you get. kettenis pointed out that it's hard on some archs to carry cpu on a pci interrupt handle, so i tweaked it to turn it into a pci_intr_establish_cpu instead. jmatthew@ and i (but mostly jmatthew@ to be honest) have been experimenting with this api on multiple archs and it is working out well. i'm putting this diff in now on amd64 so people can kick the tyres a bit. tested with hacked up vmx(4), ix(4), and mcx(4)
2020-06-16make intr_barrier run sched_barrier on the cpu the interrupt pinned to.David Gwynne
intr_barrier passed NULL to sched_barrier before this, which ends up being the primary cpu. that's been mostly right until this point, but is set to change.
2020-06-15Check rdrand for success and try up to ten times, as recommended by Intel.Christian Weisgerber
Do the same for rdseed. ok deraadt@
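The retry policy in this commit, checking each attempt for success and giving up after ten tries, can be illustrated with the sketch below. The hardware step is abstracted behind a hypothetical callback (rng_step_fn) so the loop runs on any machine; on amd64 that callback would wrap the rdrand or rdseed instruction and report its carry flag. All names here are assumptions, not the kernel's.

```c
#include <stddef.h>
#include <stdint.h>

#define RNG_RETRIES	10	/* retry limit named in the commit message */

/* One attempt: returns 1 and stores a value on success, 0 on failure. */
typedef int (*rng_step_fn)(uint64_t *);

/* Returns 0 on success, -1 if every attempt failed. */
static int
rng_retry(rng_step_fn step, uint64_t *out)
{
	int i;

	for (i = 0; i < RNG_RETRIES; i++) {
		if (step(out))
			return 0;
	}
	return -1;
}

/* Fake step for demonstration: fails a set number of times first. */
static int fake_failures;

static int
fake_step(uint64_t *out)
{
	if (fake_failures > 0) {
		fake_failures--;
		return 0;
	}
	*out = 0x1234;
	return 1;
}
```

With three simulated failures the fourth attempt succeeds; with ten, the loop gives up and reports -1 without writing a value.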
2020-06-14crank version numberTheo de Raadt
2020-06-14asm versions of mdrandom() no longer neededTheo de Raadt
2020-06-14rewrite mdrandom() in C. previously this XOR'd against rdrand if available,Theo de Raadt
and alternatively XOR'd against TSC. now always run both sequences, and also support rdseed as a third procedure. ok kettenis naddy
2020-06-08update drm to linux 5.7Jonathan Gray
adds kernel support for amdgpu: vega20, raven2, renoir, navi10, navi14 inteldrm: icelake, tigerlake Thanks to the OpenBSD Foundation for sponsoring this work, kettenis@ for helping, patrick@ for helping adapt rockchip drm and many developers for testing.
2020-06-03let the random subsystem read the tsc for event "timestamps".David Gwynne
2020-06-02add acpihid(4) for ACPI HID event and 5-button array devicesJoshua Stein
ok kettenis
2020-05-31add umstc(4) for Microsoft Surface Type Cover keyboardsJoshua Stein
2020-05-31introduce "cpu_rnd_messybits" for use instead of nanotime in dev/rnd.c.David Gwynne
rnd.c uses nanotime to get access to some bits that change quickly between events that it can mix into the entropy pool. it doesn't use nanotime to get a monotonically increasing set of ordered and accurate timestamps, it just wants something with bits that change. there's been discussions for years about letting rnd use a clock that's super fast to read, but not necessarily accurate, but it wasn't until recently that i figured out it wasn't interested in time at all, so things like keeping a fast clock coherent between cpu cores or correct according to ntp is unnecessary. this means we can just let rnd read the cycle counters on cpus and things will be fine. cpus with cycle counters that vary in their speed and aren't kept consistent between cores may even be desirable in this context. so this is the first step in converting rnd.c to reading cycle counter. it copies the nanotime backend to each arch, and they can replace it with something MD as a second step later on. djm@ suggested rnd_messybytes, but we landed on cpu_rnd_messybits. thanks to visa for his eyes. ok deraadt@ visa@ deraadt@ says he will help handle any MD fallout that occurs.
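A clock-based "messy bits" source of the kind this commit starts from might look like the sketch below. The mixing expression (nanoseconds XORed with shifted seconds) is an assumption about what such a backend could do, not a quote of rnd.c, and timespec_get() stands in for the kernel's nanotime(9). The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/*
 * Fold a timestamp into one word of fast-changing bits. The result
 * is not a usable time value; only the fact that it changes matters.
 */
static unsigned int
mix_timespec(const struct timespec *ts)
{
	assert(ts != NULL);
	return (unsigned int)((uint64_t)ts->tv_nsec ^
	    ((uint64_t)ts->tv_sec << 20));
}

/* Userland stand-in for a nanotime-backed messy-bits source. */
static unsigned int
messybits_from_clock(void)
{
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return mix_timespec(&ts);
}
```

Because the consumer only wants bits that change, an architecture could later swap the clock read for a raw cycle counter without affecting callers.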
2020-05-29dev/rndvar.h no longer has statistical interfaces (removed during variousTheo de Raadt
conversion steps). it only contains kernel prototypes for 4 interfaces, all of which legitimately belong in sys/systm.h, which are already included by all enqueue_randomness() users.
2020-05-28When calling rasops_init() in efifb_cnremap() and efifb_attach(), passFrederic Cambus
EFIFB_HEIGHT and EFIFB_WIDTH instead of efifb_std_descr.n{rows,cols}. Because the efifb resolution doesn't change, this ensures 'ri_emuwidth' and 'ri_emuheight' will always get the same value when we remap and later when we attach, so the text area is always displayed at the same position. This fixes display glitches happening on smaller screens or with larger fonts, which caused the content previously displayed in the area that was becoming margins when remapping to remain there. OK jsg@
2020-05-28Call cninit() after parsing boot parameter to make cninit() possibleYASUOKA Masahiko
to select the VGA or the EFI framebuffer properly. Previously VGA was initialized unconditionally, which caused serious problems such as video distortion. As a downside of this commit, some early panic or debug messages will not be displayed. Tested by Andrew Daugherity and jsg. ok jsg kettenis
2020-05-27raise max columns and rows in efifb to 160Jonathan Gray
This is the same change made in rev 1.21 to match the drm drivers. It was reverted as Lucas Raab reported problems with inteldrm taking over the fb with a 4k display. Lucas confirmed that this is no longer an issue. Prompted by a similar patch from John Carmack to raise the limits. ok kettenis@
2020-05-27don't limit clflush to Intel CPUsJonathan Gray
discussed with deraadt@
2020-05-26increment version numbers, due to recent RB_GOODSEED and fchmod +T changesTheo de Raadt
2020-05-25Adjust mdrandom() to also return 0 for success, -1 for failureTheo de Raadt
2020-05-25Adjust fwrandom() to return 0 for success, -1 for failureTheo de Raadt