path: root/sys/arch/arm64/include
Age | Author | Commit message
2024-11-18 | Jonathan Gray | move bus space extern to bus.h; ok mpi@
2024-11-08 | Jonathan Gray | remove ID register values, cpu.c uses different defines
2024-11-08 | Jonathan Gray | remove PCI MMIO defines, matches recent amd64 change
2024-10-22 | Jonathan Gray | correct name of define for ISS data abort S1PTW bit
2024-10-15 | Jonathan Gray | remove unneeded pte.h include
2024-10-14 | Jonathan Gray | remove unneeded vmparam.h include from pte.h
include vmparam.h in process_machdep for USER_SPACE_BITS
2024-10-14 | Jonathan Gray | remove unneeded device.h include
2024-09-29 | Jonathan Gray | remove unused cruft; ok kettenis@
2024-07-30 | Mark Kettenis | Populate most of the remaining hwcap and hwcap2 flags based on the detected
CPU features. ok naddy@
2024-07-24 | Mark Kettenis | If the CPU cores implement FEAT_IDST, emulate access to the CPU ID
registers from userland and set HWCAP_CPUID. This will allow detection of features to be introduced into the architecture in the future without allocating new HWCAP_xxx or HWCAP2_xxx bits. We provide the same sanitized view of the CPU ID registers as is currently available through sysctl(2). Note that this introduces an unconditional read of ID_AA64MMFR2_EL1. This is known to cause problems on older versions of QEMU. If this turns out to be a problem in cases where updating QEMU is not an option, we'll have to implement a workaround. Also note that since we don't emulate the CPU ID registers on older cores, this means that microarchitectural optimizations keyed off reads of MIDR_EL1 are not possible on OpenBSD. I don't think that is a real problem. ok jca@
2024-07-17 | Mark Kettenis | Clean up the cpu_id_aa64xxx variables at the end of autoconf such that
sysctl(2) and ID register access emulation can share the variables. ok jca@
2024-07-14 | Jeremie Courreges-Anglas | Add elf_aux_info(3)
Designed to let userland peek at AT_HWCAP and AT_HWCAP2 using an already existing interface coming from FreeBSD. Headers bits were snatched from there. Input & ok kettenis@ libc bump and sets sync will follow soon
2024-07-10 | Mark Kettenis | Implement support for deeper idle states offered by PSCI. Reduces the
idle power usage of the Vivobook S15 by almost 50%. ok patrick@
2024-07-10 | Mark Kettenis | Hook up the Qualcomm UEFI Secure Application that handles EFI variables to
efi(4) such that we can access EFI variables through ioctls on /dev/efi. ok patrick@
2024-07-10 | Dave Voutila | Missed some files in previous commit to split vmd into mi/md.
Forgot `cvs add` and sys/dev/vmm/vmm.h changes.
2024-06-23 | Mark Kettenis | Enable EPAN if it is available.
ok patrick@
2024-06-12 | Jonathan Gray | remove BMAJ and CMAJ defines only used by arm64; ok deraadt@
2024-05-27 | Mark Kettenis | Decode remaining ID_AA64ISAR1_EL1 features.
ok jsg@
2024-05-22 | Jonathan Gray | remove prototypes with no matching function and externs with no var
2024-05-07 | Christian Weisgerber | drop the MD byte-swap micro-optimizations on clang architectures
The compiler already translates the generic code into arithmetic byte-swap instructions or byte-swapping memory load and store instructions if available on an architecture. ok deraadt@ guenther@
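As an illustration of the kind of generic code this message refers to, a portable 32-bit byte swap can be written with shifts and masks alone. The name `swap32_generic` is hypothetical and is not taken from the OpenBSD tree; per the commit message, the compiler is expected to turn such code into a dedicated byte-swap instruction where the architecture has one.

```c
#include <stdint.h>

/*
 * Hypothetical helper: portable shift-and-mask byte swap.
 * No machine-dependent assembly is involved.
 */
static inline uint32_t
swap32_generic(uint32_t x)
{
	return ((x & 0x000000ffU) << 24) |
	    ((x & 0x0000ff00U) << 8) |
	    ((x & 0x00ff0000U) >> 8) |
	    ((x & 0xff000000U) >> 24);
}
```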
2024-05-01 | Martin Pieuchot | Add per-CPU caches to the pmemrange allocator.
The caches are used primarily to reduce contention on uvm_lock_fpageq() during concurrent page faults. For the moment only uvm_pagealloc() tries to get a page from the current CPU's cache. So on some architectures the caches are also used by the pmap layer. Each cache is composed of two magazines, design is borrowed from jeff bonwick vmem's paper and the implementation is similar to the one of pool_cache from dlg@. However there is no depot layer and magazines are refilled directly by the pmemrange allocator. This version includes splvm()/splx() dances because the buffer cache flips buffers in interrupt context. So we have to prevent recursive accesses to per-CPU magazines. Tested by naddy@, solene@, krw@, robert@, claudio@ and Laurence Tratt. ok claudio@, kettenis@
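The two-magazine idea described above can be sketched roughly as follows. This is a simplified illustration, not the uvm implementation: the structure names, the `MAG_SIZE` capacity, and both functions are assumptions, and the locking and splvm()/splx() protection mentioned in the message are omitted.

```c
#include <stddef.h>

#define MAG_SIZE	8	/* hypothetical magazine capacity */

struct magazine {
	void	*items[MAG_SIZE];
	int	 count;
};

/* Hypothetical per-CPU cache: two magazines, one of them active. */
struct cpu_cache {
	struct magazine	mags[2];
	int		cur;	/* index of the active magazine */
};

/*
 * Take an item from the cache. NULL means both magazines are empty
 * and the caller must refill from the backing allocator.
 */
static void *
cache_get(struct cpu_cache *cc)
{
	struct magazine *m = &cc->mags[cc->cur];

	if (m->count == 0) {
		/* Active magazine is empty: switch to the other one. */
		cc->cur = !cc->cur;
		m = &cc->mags[cc->cur];
		if (m->count == 0)
			return NULL;
	}
	return m->items[--m->count];
}

/*
 * Return an item to the cache. 0 means both magazines are full
 * and the caller must hand the item back to the backing allocator.
 */
static int
cache_put(struct cpu_cache *cc, void *item)
{
	struct magazine *m = &cc->mags[cc->cur];

	if (m->count == MAG_SIZE) {
		/* Active magazine is full: switch to the other one. */
		cc->cur = !cc->cur;
		m = &cc->mags[cc->cur];
		if (m->count == MAG_SIZE)
			return 0;
	}
	m->items[m->count++] = item;
	return 1;
}
```

Because there is no depot layer in this sketch either, a miss in both magazines goes straight to the backing allocator, which mirrors the design choice stated in the message.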
2024-04-29 | Jonathan Gray | remove prototypes for removed functions
2024-04-19 | Martin Pieuchot | Revert per-CPU caches: a double-free has been found by naddy@.
2024-04-17 | Martin Pieuchot | Add per-CPU caches to the pmemrange allocator.
The caches are used primarily to reduce contention on uvm_lock_fpageq() during concurrent page faults. For the moment only uvm_pagealloc() tries to get a page from the current CPU's cache. So on some architectures the caches are also used by the pmap layer. Each cache is composed of two magazines, design is borrowed from jeff bonwick vmem's paper and the implementation is similar to the one of pool_cache from dlg@. However there is no depot layer and magazines are refilled directly by the pmemrange allocator. Tested by robert@, claudio@ and Laurence Tratt. ok kettenis@
2024-03-18 | Mark Kettenis | Add support for the new layout of the CCSIDR_EL1 register that was
introduced in Armv8.3 when the CCIDX feature is advertised. This makes us properly detect the cache size on newer CPU cores like Neoverse N2, at least when emulated by QEMU. ok jsg@
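A minimal sketch of how the two CCSIDR_EL1 layouts differ when computing a cache size, assuming the field positions from the Arm architecture reference: NumSets in bits [27:13] and Associativity in bits [12:3] for the original layout, NumSets in bits [55:32] and Associativity in bits [23:3] when CCIDX is implemented. The function name and the `has_ccidx` parameter are hypothetical and are not the names used in the kernel.

```c
#include <stdint.h>

/*
 * Hypothetical helper: cache size in bytes from a CCSIDR_EL1 value.
 * has_ccidx selects the layout used when the CCIDX feature is advertised.
 */
static uint64_t
ccsidr_cache_size(uint64_t ccsidr, int has_ccidx)
{
	uint64_t linesize, ways, sets;

	/* LineSize, bits [2:0], encodes log2(bytes per line) - 4. */
	linesize = 1ULL << ((ccsidr & 0x7) + 4);

	if (has_ccidx) {
		ways = ((ccsidr >> 3) & 0x1fffff) + 1;	/* bits [23:3] */
		sets = ((ccsidr >> 32) & 0xffffff) + 1;	/* bits [55:32] */
	} else {
		ways = ((ccsidr >> 3) & 0x3ff) + 1;	/* bits [12:3] */
		sets = ((ccsidr >> 13) & 0x7fff) + 1;	/* bits [27:13] */
	}

	return linesize * ways * sets;
}
```

Decoding a CCIDX-format value with the original layout reads the set count from the wrong bits, which is one way a cache size could be misdetected on newer cores.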
2024-03-17 | Mark Kettenis | The feature is called SSBS instead of SBSS.
2024-03-16 | Mark Kettenis | Set the HCR_API and HCR_APK bits in the HCR_EL2 when CPUs boot in EL2.
Otherwise using PAC instructions in EL1 will trigger a trap into EL2 that we don't handle. ok jsg@, deraadt@
2024-03-05 | Mark Kettenis | Tighten up BTCFI by flipping the bits that make PACIASP and PACIBSP
behave like BTI c instead of BTI jc. ok deraadt@, tobhe@
2024-02-25 | Scott Soule Cheloha | clockintr: rename "struct clockintr_queue" to "struct clockqueue"
The code has outgrown the original name for this struct. Both the external and internal APIs have used the "clockqueue" namespace for some time when operating on it, and that name is eyeball-consistent with "clockintr" and "clockrequest", so "clockqueue" it is.
2024-02-03 | Mark Kettenis | Implement Multiple Message MSI support on arm64. As on amd64 this is
experimental code to assist qwx(4) development. Currently this only works on systems that use agintcmsi(4) as the MSI controller combined with the dwpcie(4) Host/PCIe bridge. ok patrick@
2024-01-24 | Scott Soule Cheloha | clockintr: switch from callee- to caller-allocated clockintr structs
Currently, clockintr_establish() calls malloc(9) to allocate a clockintr struct on behalf of the caller. mpi@ says this behavior is incompatible with dt(4). In particular, calling malloc(9) during the initialization of a PCB outside of dt_pcb_alloc() is (a) awkward and (b) may conflict with future changes/optimizations to PCB allocation. To side-step the problem, this patch changes the clockintr subsystem to use caller-allocated clockintr structs instead of callee-allocated structs. clockintr_establish() is named after softintr_establish(), which uses malloc(9) internally to create softintr objects. The clockintr subsystem is no longer using malloc(9), so the "establish" naming is no longer apt. To avoid confusion, this patch also renames "clockintr_establish" to "clockintr_bind". Requested by mpi@. Tweaked by mpi@. Thread: https://marc.info/?l=openbsd-tech&m=170597126103504&w=2 ok claudio@ mlarkin@ mpi@
2024-01-20 | Mark Kettenis | There are several DART variants; print some more details such that we can
distinguish between them. Pay attention to the apple,dma-range property that tells us the desired DVA window. Add support for a new BUS_DMA_FIXED that allows use of bus_dmamap_load_raw(9) to map things at a pre-determined DVA. This last change is needed for the upcoming Apple KMS driver. Hopefully that is the only driver that will need this, so don't attempt to turn this into an MI feature. ok patrick@
2024-01-15 | Mark Kettenis | We can't call kstat_create(9) when bringing up the secondary CPUs as it
uses an rwlock and curproc isn't initialized yet for these CPUs at this point. As a result we hit a "locking against myself" panic if there is any lock contention. Fix this by adding a new ci_midr member to struct cpu_info which gets initialized when we identify the CPUs and use that to attach the kstat stuff. ok tobhe@, dlg@
2023-12-26 | Mark Kettenis | Improve handling of SError interrupts. Print some useful information and
allow additional information to be printed for specific CPU types. Use this to print the L2C registers on Apple CPUs which can be very useful in tracking down the source of certain SError interrupts. ok miod@, dlg@
2023-12-14 | Claudio Jeker | NKMEMPAGES_MAX_DEFAULT is no longer used. Remove it from param.h.
OK miod@
2023-12-11 | Mark Kettenis | Implement per-CPU caching for the page table page (vp) pool and the PTE
descriptor (pted) pool in the arm64 pmap implementation. This significantly reduces the side-effects of lock contention on the kernel map lock that is (incorrectly) translated into excessive page daemon wakeups. This is not a perfect solution but it does lead to significant speedups on machines with many CPU cores. This requires adding a new pmap_init_percpu() function that gets called at the point where kernel is ready to set up the per-CPU pool caches. Dummy implementations of this function are added for all non-arm64 architectures. Some other architectures can probably benefit from providing an actual implementation that sets up per-CPU caches for pmap pools as well. ok phessler@, claudio@, miod@, patrick@
2023-12-05 | Jonathan Gray | boot_file was removed in arm64 machdep.c rev 1.55
2023-11-29 | Mark Kettenis | Fix unwanted sign-extension of ID register masks. Sign-extension of the
GPI feature mask caused misdetection of the GPI feature when some other feature was present that was advertised in the upper 32 bits of the same ID register. Resulting in a crash as soon as the pmap code tried to set the PAC keys. Fix suggested by Marc Zyngier who found and debugged the problem. ok jsg@, deraadt@
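The class of bug described here can be reproduced in a few lines of C. This is a sketch under assumptions: the macro and function names are hypothetical, the field is placed in bits [31:28] purely for illustration, and the cast to `int` stands in for a mask constant whose type is a signed 32-bit integer.

```c
#include <stdint.h>

/*
 * Hypothetical masks for a 4-bit field in bits [31:28] of a 64-bit
 * ID register. The first has type int and is negative on typical
 * targets; the second is an unsigned 64-bit constant.
 */
#define FIELD_MASK_INT	((int)(0xfU << 28))
#define FIELD_MASK_U64	(0xfULL << 28)

/*
 * The int mask is converted to 64 bits by sign-extension, so it
 * becomes 0xfffffffff0000000 and also matches bits 32..63.
 */
static int
field_present_buggy(uint64_t reg)
{
	return (reg & FIELD_MASK_INT) != 0;
}

/* The 64-bit mask only matches bits [31:28]. */
static int
field_present_fixed(uint64_t reg)
{
	return (reg & FIELD_MASK_U64) != 0;
}
```

With a register value that only has a bit set in the upper 32 bits, the buggy test reports the field as present while the fixed test does not, which matches the misdetection the message describes.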
2023-09-22 | Jonathan Gray | move simplebusvar.h so it can be used without ifdef
ok kettenis@ phessler@
2023-09-12 | Jonathan Matthew | Store ITS ID in struct interrupt_controller so it can be used to look up
the right ITS to use when establishing interrupts. ok kettenis@ patrick@
2023-08-23 | Scott Soule Cheloha | all platforms: separate cpu_initclocks() from cpu_startclock()
To give the primary CPU an opportunity to perform clock interrupt preparation in a machine-independent manner we need to separate the "initialization" parts of cpu_initclocks() from the "start the clock interrupt" parts. Currently, cpu_initclocks() does everything all at once, so there is no space for this MI setup.
Many platforms have more-or-less already done this separation by implementing a separate routine named "cpu_startclock()". This patch promotes cpu_startclock() from de facto standard to mandatory API.
- Prototype cpu_startclock() in sys/systm.h alongside cpu_initclocks(). The separation of responsibility between the two routines is a bit fuzzy but the basic guidelines are as follows:
  + cpu_initclocks() must initialize hz, stathz, and profhz, and call clockintr_init().
  + cpu_startclock() must call clockintr_cpu_init() and start the clock interrupt cycle on the calling CPU.
  These guidelines will shift in the future, but that's the way things stand as of *this* commit.
- In initclocks(): first call cpu_initclocks(), then do MI setup, and last call cpu_startclock().
- On platforms where cpu_startclock() already exists: don't call cpu_startclock() from cpu_initclocks() anymore.
- On platforms where cpu_startclock() doesn't yet exist: implement it. Usually this is as simple as dividing cpu_initclocks() in two.
Tested on amd64 (i8254, lapic), arm64, i386 (i8254, lapic), macppc, mips64/octeon, and sparc64. Tested on arm/armv7 (agtimer(4)) by phessler@ and jmatthew@. Tested on m88k/luna88k by aoyama@. Tested on powerpc64 by gkoehler@ and mlarkin@. Tested on riscv64 by jmatthew@.
Thread: https://marc.info/?l=openbsd-tech&m=169195251322149&w=2
2023-07-25 | Scott Soule Cheloha | statclock: move profil(2), GPROF code to profclock(), gmonclock()
This patch isolates profil(2) and GPROF from statclock(). Currently, statclock() implements both profil(2) and GPROF through a complex mechanism involving both platform code (setstatclockrate) and the scheduler (pscnt, psdiv, and psratio). We have a machine-independent interface to the clock interrupt hardware now, so we no longer need to do it this way.
- Move profil(2)-specific code from statclock() to a new clock interrupt callback, profclock(), in subr_prof.c. Each schedstate_percpu has its own profclock handle. The profclock is enabled/disabled for a given CPU when it is needed by the running thread during mi_switch() and sched_exit().
- Move GPROF-specific code from statclock() to a new clock interrupt callback, gmonclock(), in subr_prof.c. Where available, each cpu_info has its own gmonclock handle. The gmonclock is enabled/disabled for a given CPU via sysctl(2) in prof_state_toggle().
- Both profclock() and gmonclock() have a fixed period, profclock_period, that is initialized during initclocks().
- Export clockintr_advance(), clockintr_cancel(), clockintr_establish(), and clockintr_stagger() via <sys/clockintr.h>. They have external callers now.
- Delete pscnt, psdiv, psratio. From schedstate_percpu, also delete spc_pscnt and spc_psdiv. The statclock frequency is not dynamic anymore so these variables are now useless.
- Delete code/state related to the dynamic statclock frequency from kern_clockintr.c. The statclock frequency can still be pseudo-random, so move the contents of clockintr_statvar_init() into clockintr_init().
With input from miod@, deraadt@, and claudio@. Early revisions cleaned up by claudio. Early revisions tested by claudio@. Tested by cheloha@ on amd64, arm64, macppc, octeon, and sparc64 (sun4v). Compile- and boot- tested on i386 by mlarkin@. riscv64 compilation bugs found by mlarkin@. Tested on riscv64 by jca@. Tested on powerpc64 by gkoehler@.
2023-07-13 | Mark Kettenis | Use the deep idle state available on Apple M1/M2 cores in the idle loop and
for suspend. This state makes the CPU lose some of its register state so we need to save these registers before putting the core to sleep and restore them when we wake up. This deep idle state has a higher wakeup latency than the normal WFI idle state. Use similar logic as acpicpu(4) to decide which idle state to pick. If some cores of a cluster are in this deep idle state, turbo states become available to the cores that remain active. So stop skipping these states. This improves single-core performance a little bit. The main win is in power savings when running in a state with a high clock frequency. My M2 Pro mini goes from 14W to 6.5W when idle at the maximum clock frequency. But even at the lowest clock frequency there are small but significant power savings. ok deraadt@, tobhe@
2023-07-02 | Scott Soule Cheloha | all platforms, kernel: remove __HAVE_CLOCKINTR symbol
Every platform made the clockintr switch at least six months ago. The __HAVE_CLOCKINTR symbol is now redundant. Remove it. Prompted by claudio@. Link: https://marc.info/?l=openbsd-tech&m=168826181015032&w=2 "makes sense" mlarkin@
2023-06-10 | Mark Kettenis | Implement support for pointer authentication (PAC) in userland. With PAC
it is possible to "sign" pointers with a hidden key. The signature is placed in unused bits of the pointer and can be checked later. This can be used to provide "tail CFI" that is similar to what retguard provides. Debuggers need to be aware of the fact that pointers can be signed. For this purpose a new PT_PACMASK ptrace(2) request is introduced that returns a mask that indicates the bits used for the signature. Separate masks are provided for code and data pointers even though the masks are identical in the current implementation. These masks are also written into a special note section in the core dump. ok patrick@
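To show how a debugger might use such a mask, here is a minimal sketch that clears the signature bits from a userland pointer. The function name is hypothetical, the way the mask is obtained through ptrace(2) is not shown, and the sketch assumes a userland address whose unsigned upper bits are all zero.

```c
#include <stdint.h>

/*
 * Hypothetical helper: remove the PAC signature from a userland
 * pointer, given a mask of the bits that hold the signature.
 */
static uint64_t
strip_pac(uint64_t ptr, uint64_t pacmask)
{
	return ptr & ~pacmask;
}
```

A debugger would apply the code mask to return addresses and the data mask to data pointers before using them as addresses.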
2023-04-28 | Robert Nagy | bump MAXDSIZ to 128G on amd64 and 64G on arm64
discussed with kettenis@, ok deraadt@
2023-04-16 | Mark Kettenis | Make enabling the BTI feature a per-pmap thing by storing the ATTR_GP bit
in a new pm_guarded member of struct pmap and using this member to add the bits to the PTEs ok deraadt@
2023-04-16 | Mark Kettenis | Clear BTYPE bits when setting up a signal handler and when handling a
PT_CONTINUE ptrace(2) request. Otherwise we would trap if userland was interrupted at a point where it is doing an indirect branch that has set the bits but before it has executed the BTI instruction at the branch target. The PT_SETREGS request may need similar treatment, at least when the PC is changed. But Linux doesn't do this and debuggers might want full control over the BTYPE bits. So leave this alone for now. ok guenther@
2023-04-11 | Jonathan Gray | fix double words in comments
feedback and ok jmc@ miod, ok millert@
2023-03-27 | Mark Kettenis | Implement branch target protection using the branch target identification
feature introduced in Armv8.5. This provides "head-CFI" to complement the "tail-CFI" provided by retguard. Unfortunately most arm64 machines don't support this feature yet. But Apple M2 does support it and it seems to work there. ok deraadt@