include vmparam.h in process_machdep for USER_SPACE_BITS
|
CPU features.
ok naddy@
|
registers from userland and set HWCAP_CPUID. This will allow detection
of features to be introduced into the architecture in the future without
allocating new HWCAP_xxx or HWCAP2_xxx bits. We provide the same
sanitized view of the CPU ID registers as is currently available through
sysctl(2).
Note that this introduces an unconditional read of ID_AA64MMFR2_EL1. This
is known to cause problems on older versions of QEMU. If this turns out
to be a problem in cases where updating QEMU is not an option, we'll have
to implement a workaround.
Also note that since we don't emulate the CPU ID registers on older cores,
microarchitectural optimizations keyed off reads of MIDR_EL1 are not
possible on OpenBSD. I don't think that is a real problem.
ok jca@
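For illustration, a minimal userland sketch of what this enables. The
HWCAP_CPUID spelling follows the Linux/FreeBSD convention and the header
locations are assumptions, not something this log specifies:

    #include <sys/auxv.h>       /* elf_aux_info(), AT_HWCAP */
    #include <stdint.h>
    #include <stdio.h>

    int
    main(void)
    {
        unsigned long hwcap = 0;
        uint64_t isar0;

        elf_aux_info(AT_HWCAP, &hwcap, sizeof(hwcap));
        if (hwcap & HWCAP_CPUID) {
            /* trapped by the kernel, which supplies the sanitized value */
            __asm volatile("mrs %0, id_aa64isar0_el1" : "=r"(isar0));
            printf("ID_AA64ISAR0_EL1 = 0x%016llx\n",
                (unsigned long long)isar0);
        }
        return 0;
    }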
|
sysctl(2) and ID register access emulation can share the variables.
ok jca@
|
Designed to let userland peek at AT_HWCAP and AT_HWCAP2 using an already
existing interface coming from FreeBSD. Header bits were snatched from
there. Input & ok kettenis@
libc bump and sets sync will follow soon
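The interface is used roughly like this (a sketch; elf_aux_info(3)
returns 0 on success, an errno value otherwise):

    #include <sys/auxv.h>

    unsigned long hwcap = 0, hwcap2 = 0;

    if (elf_aux_info(AT_HWCAP, &hwcap, sizeof(hwcap)) != 0)
        hwcap = 0;      /* not provided by this kernel */
    if (elf_aux_info(AT_HWCAP2, &hwcap2, sizeof(hwcap2)) != 0)
        hwcap2 = 0;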
|
idle power usage of the Vivobook S15 by almost 50%.
ok patrick@
|
efi(4) such that we can access EFI variables through ioctls on /dev/efi.
ok patrick@
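A sketch of the resulting userland usage, assuming the FreeBSD-derived
names (struct efi_var_ioc, EFIIOC_VAR_GET) and header path came along
with the interface; treat all of them as assumptions:

    #include <sys/ioctl.h>
    #include <dev/efi/efiio.h>  /* header location assumed */
    #include <err.h>
    #include <fcntl.h>
    #include <string.h>

    struct efi_var_ioc ev;
    int fd;

    if ((fd = open("/dev/efi", O_RDWR)) == -1)
        err(1, "/dev/efi");
    memset(&ev, 0, sizeof(ev));
    /* set ev.name (UCS-2 variable name), ev.namesize, ev.vendor here */
    if (ioctl(fd, EFIIOC_VAR_GET, &ev) == -1)
        err(1, "EFIIOC_VAR_GET");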
|
Forgot `cvs add` and sys/dev/vmm/vmm.h changes.
|
ok patrick@
|
ok jsg@
|
The compiler already translates the generic code into arithmetic
byte-swap instructions or byte-swapping memory load and store
instructions if available on an architecture.
ok deraadt@ guenther@
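For example, a portable swap like the following compiles to a single rev
instruction on arm64 (bswap on amd64) with both gcc and clang:

    static inline uint32_t
    swap32(uint32_t x)
    {
        return (x >> 24) | ((x >> 8) & 0xff00U) |
            ((x << 8) & 0xff0000U) | (x << 24);
    }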
|
The caches are used primarily to reduce contention on uvm_lock_fpageq() during
concurrent page faults. For the moment only uvm_pagealloc() tries to get a
page from the current CPU's cache. So on some architectures the caches are
also used by the pmap layer.
Each cache is composed of two magazines. The design is borrowed from Jeff
Bonwick's vmem paper and the implementation is similar to dlg@'s
pool_cache. However there is no depot layer and magazines are refilled
directly by the pmemrange allocator.
This version includes splvm()/splx() dances because the buffer cache flips
buffers in interrupt context. So we have to prevent recursive accesses to
per-CPU magazines.
Tested by naddy@, solene@, krw@, robert@, claudio@ and Laurence Tratt.
ok claudio@, kettenis@
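A rough sketch of the two-magazine scheme described above; the names and
the magazine size are illustrative, not the actual uvm code:

    #define PCACHE_MAGSZ    8               /* illustrative */

    struct pcpu_pcache {
        struct vm_page  *pc_mag[2][PCACHE_MAGSZ];
        int              pc_nitems[2];
        int              pc_cur;            /* magazine in use */
    };

    struct vm_page *
    pcache_get(struct pcpu_pcache *pc)
    {
        struct vm_page *pg = NULL;
        int s = splvm();        /* the splvm()/splx() dance from above */

        if (pc->pc_nitems[pc->pc_cur] == 0 &&
            pc->pc_nitems[!pc->pc_cur] != 0)
            pc->pc_cur = !pc->pc_cur;       /* swap magazines */
        if (pc->pc_nitems[pc->pc_cur] > 0)
            pg = pc->pc_mag[pc->pc_cur][--pc->pc_nitems[pc->pc_cur]];
        /* on a miss the caller refills from the pmemrange allocator */
        splx(s);
        return pg;
    }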
|
The caches are used primarily to reduce contention on uvm_lock_fpageq() during
concurrent page faults. For the moment only uvm_pagealloc() tries to get a
page from the current CPU's cache. So on some architectures the caches are
also used by the pmap layer.
Each cache is composed of two magazines. The design is borrowed from Jeff
Bonwick's vmem paper and the implementation is similar to dlg@'s
pool_cache. However there is no depot layer and magazines are refilled
directly by the pmemrange allocator.
Tested by robert@, claudio@ and Laurence Tratt.
ok kettenis@
|
introduced in Armv8.3 when the CCIDX feature is advertised. This
makes us properly detect the cache size on newer CPU cores like
Neoverse N2, at least when emulated by QEMU.
ok jsg@
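Concretely, the CCSIDR_EL1 layout changes when CCIDX is implemented; a
decoding sketch (field positions per the Arm ARM, have_ccidx standing in
for the ID_AA64MMFR2_EL1.CCIDX check):

    /* after selecting the cache level of interest via CSSELR_EL1 */
    uint64_t ccsidr = READ_SPECIALREG(ccsidr_el1);
    uint64_t linesize, assoc, sets, cachesize;

    linesize = 1ULL << ((ccsidr & 0x7) + 4);        /* bytes */
    if (have_ccidx) {
        assoc = ((ccsidr >> 3) & 0x1fffff) + 1;     /* bits [23:3] */
        sets = ((ccsidr >> 32) & 0xffffff) + 1;     /* bits [55:32] */
    } else {
        assoc = ((ccsidr >> 3) & 0x3ff) + 1;        /* bits [12:3] */
        sets = ((ccsidr >> 13) & 0x7fff) + 1;       /* bits [27:13] */
    }
    cachesize = sets * assoc * linesize;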
|
Otherwise using PAC instructions in EL1 will trigger a trap into EL2
that we don't handle.
ok jsg@, deraadt@
|
behave like BTI c instead of BTI jc.
ok deraadt@, tobhe@
|
The code has outgrown the original name for this struct. Both the
external and internal APIs have used the "clockqueue" namespace for
some time when operating on it, and that name is eyeball-consistent
with "clockintr" and "clockrequest", so "clockqueue" it is.
|
experimental code to assist qwx(4) development. Currently this only works
on systems that use agintcmsi(4) as the MSI controller combined with the
dwpcie(4) Host/PCIe bridge.
ok patrick@
|
Currently, clockintr_establish() calls malloc(9) to allocate a
clockintr struct on behalf of the caller. mpi@ says this behavior is
incompatible with dt(4). In particular, calling malloc(9) during the
initialization of a PCB outside of dt_pcb_alloc() is (a) awkward and
(b) may conflict with future changes/optimizations to PCB allocation.
To side-step the problem, this patch changes the clockintr subsystem
to use caller-allocated clockintr structs instead of callee-allocated
structs.
clockintr_establish() is named after softintr_establish(), which uses
malloc(9) internally to create softintr objects. The clockintr subsystem
is no longer using malloc(9), so the "establish" naming is no longer apt.
To avoid confusion, this patch also renames "clockintr_establish" to
"clockintr_bind".
Requested by mpi@. Tweaked by mpi@.
Thread: https://marc.info/?l=openbsd-tech&m=170597126103504&w=2
ok claudio@ mlarkin@ mpi@
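The shape of the change, roughly (argument lists abbreviated; see the
thread for the exact signatures):

    /* before: the subsystem allocated on the caller's behalf */
    struct clockintr *cl = clockintr_establish(ci, func, arg);

    /* after: the caller owns the storage, e.g. embedded in a dt(4) PCB */
    struct dt_pcb {
        struct clockintr dp_clockintr;      /* caller-allocated */
        /* ... */
    };
    clockintr_bind(&pcb->dp_clockintr, ci, func, arg);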
|
distinguish between them. Pay attention to the apple,dma-range property
that tells us the desired DVA window. Add support for a new BUS_DMA_FIXED
that allows use of bus_dmamap_load_raw(9) to map things at a pre-determined
DVA. This last change is needed for the upcoming Apple KMS driver.
Hopefully that is the only driver that will need this, so don't attempt to
turn this into an MI feature.
ok patrick@
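A usage sketch for the new flag. How the pre-determined DVA is conveyed
is an assumption here (via the segment address), so check the driver for
the real plumbing:

    bus_dma_segment_t seg;

    seg.ds_addr = KMS_FB_DVA;       /* hypothetical fixed DVA */
    seg.ds_len = size;
    error = bus_dmamap_load_raw(sc->sc_dmat, map, &seg, 1, size,
        BUS_DMA_WAITOK | BUS_DMA_FIXED);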
|
uses an rwlock and curproc isn't initialized yet for these CPUs at this
point. As a result we hit a "locking against myself" panic if there is
any lock contention.
Fix this by adding a new ci_midr member to struct cpu_info which gets
initialized when we identify the CPUs and use that to attach the kstat
stuff.
ok tobhe@, dlg@
|
allow additional information to be printed for specific CPU types. Use
this to print the L2C registers on Apple CPUs which can be very useful
in tracking down the source of certain SError interrupts.
ok miod@, dlg@
|
OK miod@
|
descriptor (pted) pool in the arm64 pmap implementation. This
significantly reduces the side-effects of lock contention on the kernel
map lock that is (incorrectly) translated into excessive page daemon
wakeups. This is not a perfect solution but it does lead to significant
speedups on machines with many CPU cores.
This requires adding a new pmap_init_percpu() function that gets called
at the point where the kernel is ready to set up the per-CPU pool caches.
Dummy implementations of this function are added for all non-arm64
architectures. Some other architectures can probably benefit from
providing an actual implementation that sets up per-CPU caches for
pmap pools as well.
ok phessler@, claudio@, miod@, patrick@
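On architectures without per-CPU pmap pool caches the new hook is just an
empty function; on arm64 it can enable the caches along these lines (pool
name assumed):

    /* MD hook, called once per-CPU pool caches can be set up */
    void
    pmap_init_percpu(void)
    {
        pool_cache_init(&pmap_pted_pool);
    }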
|
GPI feature mask caused misdetection of the GPI feature when some other
feature was present that was advertised in the upper 32 bits of the same
ID register, resulting in a crash as soon as the pmap code tried to set
the PAC keys.
Fix suggested by Marc Zyngier who found and debugged the problem.
ok jsg@, deraadt@
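The bug class, illustrated: with 64-bit ID registers, test the extracted
field rather than a mask built from a 32-bit constant (enable_pac_keys()
is a stand-in):

    uint64_t isar1 = READ_SPECIALREG(id_aa64isar1_el1);

    /* GPI lives in ID_AA64ISAR1_EL1 bits [31:28] */
    u_int gpi = (isar1 >> 28) & 0xf;
    if (gpi != 0)
        enable_pac_keys();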
|
ok kettenis@ phessler@
|
the right ITS to use when establishing interrupts.
ok kettenis@ patrick@
|
To give the primary CPU an opportunity to perform clock interrupt
preparation in a machine-independent manner we need to separate the
"initialization" parts of cpu_initclocks() from the "start the clock
interrupt" parts. Currently, cpu_initclocks() does everything all at
once, so there is no space for this MI setup.
Many platforms have more-or-less already done this separation by
implementing a separate routine named "cpu_startclock()". This patch
promotes cpu_startclock() from de facto standard to mandatory API.
- Prototype cpu_startclock() in sys/systm.h alongside cpu_initclocks().
The separation of responsibility between the two routines is a bit
fuzzy but the basic guidelines are as follows:
+ cpu_initclocks() must initialize hz, stathz, and profhz, and call
clockintr_init().
+ cpu_startclock() must call clockintr_cpu_init() and start the clock
interrupt cycle on the calling CPU.
These guidelines will shift in the future, but that's the way things
stand as of *this* commit.
- In initclocks(): first call cpu_initclocks(), then do MI setup, and
last call cpu_startclock().
- On platforms where cpu_startclock() already exists: don't call
cpu_startclock() from cpu_initclocks() anymore.
- On platforms where cpu_startclock() doesn't yet exist: implement it.
Usually this is as simple as dividing cpu_initclocks() in two.
Tested on amd64 (i8254, lapic), arm64, i386 (i8254, lapic), macppc,
mips64/octeon, and sparc64. Tested on arm/armv7 (agtimer(4)) by
phessler@ and jmatthew@. Tested on m88k/luna88k by aoyama@. Tested
on powerpc64 by gkoehler@ and mlarkin@. Tested on riscv64 by
jmatthew@.
Thread: https://marc.info/?l=openbsd-tech&m=169195251322149&w=2
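A sketch of the resulting split on a hypothetical platform; the timer
helper and the clockintr_init() flags are illustrative:

    void
    cpu_initclocks(void)
    {
        hz = 100;
        stathz = hz;
        profhz = stathz * 10;
        clockintr_init(0);              /* flags illustrative */
    }

    void
    cpu_startclock(void)
    {
        clockintr_cpu_init(NULL);       /* or &my_intrclock */
        my_timer_arm();                 /* start the interrupt cycle */
    }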
|
This patch isolates profil(2) and GPROF from statclock(). Currently,
statclock() implements both profil(2) and GPROF through a complex
mechanism involving both platform code (setstatclockrate) and the
scheduler (pscnt, psdiv, and psratio). We have a machine-independent
interface to the clock interrupt hardware now, so we no longer need to
do it this way.
- Move profil(2)-specific code from statclock() to a new clock
interrupt callback, profclock(), in subr_prof.c. Each
schedstate_percpu has its own profclock handle. The profclock is
enabled/disabled for a given CPU when it is needed by the running
thread during mi_switch() and sched_exit().
- Move GPROF-specific code from statclock() to a new clock interrupt
callback, gmonclock(), in subr_prof.c. Where available, each cpu_info
has its own gmonclock handle. The gmonclock is enabled/disabled for
a given CPU via sysctl(2) in prof_state_toggle().
- Both profclock() and gmonclock() have a fixed period, profclock_period,
that is initialized during initclocks().
- Export clockintr_advance(), clockintr_cancel(), clockintr_establish(),
and clockintr_stagger() via <sys/clockintr.h>. They have external
callers now.
- Delete pscnt, psdiv, psratio. From schedstate_percpu, also delete
spc_pscnt and spc_psdiv. The statclock frequency is not dynamic
anymore so these variables are now useless.
- Delete code/state related to the dynamic statclock frequency from
kern_clockintr.c. The statclock frequency can still be pseudo-random,
so move the contents of clockintr_statvar_init() into clockintr_init().
With input from miod@, deraadt@, and claudio@. Early revisions
cleaned up by claudio. Early revisions tested by claudio@. Tested by
cheloha@ on amd64, arm64, macppc, octeon, and sparc64 (sun4v).
Compile- and boot- tested on i386 by mlarkin@. riscv64 compilation
bugs found by mlarkin@. Tested on riscv64 by jca@. Tested on
powerpc64 by gkoehler@.
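For instance, arming the per-CPU profclock when a profiled thread is
switched in looks roughly like this (a sketch, not the literal diff):

    /* mi_switch(): start the profiling clock for the incoming thread */
    if (p->p_p->ps_flags & PS_PROFIL)
        clockintr_advance(spc->spc_profclock, profclock_period);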
|
for suspend. This state makes the CPU lose some of its register state so
we need to save these registers before putting the core to sleep and
restore them when we wake up. This deep idle state has a higher wakeup
latency than the normal WFI idle state. Use similar logic as acpucpu(4) to
decide which idle state to pick.
If some cores of a cluster are in this deep idle state, turbo states become
available to the cores that remain active. So stop skipping these states.
This improves single-core performance a little bit.
The main win is in power savings when running in a state with a high clock
frequency. My M2 Pro mini goes from 14W to 6.5W when idle at the maximum
clock frequency. But even at the lowest clock frequency there are small
but significant power savings.
ok deraadt@, tobhe@
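The state selection follows the usual residency rule, roughly (all names
invented):

    /* prefer deep idle only when the expected sleep covers its latency */
    if (expected_sleep_ns > deep_idle_exit_latency_ns)
        cpu_deep_idle();    /* saves/restores lost state, slow wakeup */
    else
        cpu_wfi();          /* plain WFI, cheap wakeup */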
|
Every platform made the clockintr switch at least six months ago.
The __HAVE_CLOCKINTR symbol is now redundant. Remove it.
Prompted by claudio@.
Link: https://marc.info/?l=openbsd-tech&m=168826181015032&w=2
"makes sense" mlarkin@
|
it is possible to "sign" pointers with a hidden key. The signature is
placed in unused bits of the pointer and can be checked later. This can
be used to provide "tail CFI" that is similar to what retguard provides.
Debuggers need to be aware of the fact that pointers can be signed. For
this purpose a new PT_PACMASK ptrace(2) request is introduced that returns
a mask that indicates the bits used for the signature. Separate masks
are provided for code and data pointers even though the masks are identical
in the current implementation. These masks are also written into a special
note section in the core dump.
ok patrick@
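A debugger-side sketch; the struct layout here is a placeholder for
whatever the commit actually defines:

    #include <sys/types.h>
    #include <sys/ptrace.h>
    #include <err.h>
    #include <stdint.h>

    struct pac_mask {                   /* placeholder layout */
        uint64_t code_mask;
        uint64_t data_mask;
    } pm;

    if (ptrace(PT_PACMASK, pid, (caddr_t)&pm, sizeof(pm)) == -1)
        err(1, "PT_PACMASK");
    /* strip the signature bits to recover a canonical code pointer */
    uint64_t canon = raw_pc & ~pm.code_mask;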
|
discussed with kettenis@, ok deraadt@
|
in a new pm_guarded member of struct pmap and using this member to add
the bits to the PTEs.
ok deraadt@
|
PT_CONTINUE ptrace(2) request. Otherwise we would trap if userland was
interrupted at a point where it was doing an indirect branch that had set
the bits but had not yet executed the BTI instruction at the branch
target.
The PT_SETREGS request may need similar treatment, at least when the
PC is changed. But Linux doesn't do this and debuggers might want full
control over the BTYPE bits. So leave this alone for now.
ok guenther@
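The fix amounts to clearing the BTYPE field in the saved PSTATE, along
these lines (macro name assumed):

    /* PSTATE.BTYPE is SPSR bits [11:10] */
    tf->tf_spsr &= ~PSR_BTYPE;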
|
feedback and ok jmc@ miod, ok millert@
|
feature introduced in Armv8.5. This provides "head-CFI" to complement
the "tail-CFI" provided by retguard. Unfortunately most arm64 machines
don't support this feature yet. But Apple M2 does support it and it
seems to work there.
ok deraadt@
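For reference, compilers emit the landing pads when building with branch
protection; on cores without FEAT_BTI they execute as NOPs:

    /* built with: cc -mbranch-protection=bti (or =standard, which also
     * enables PAC return-address signing) */
    void
    handler(void)           /* entry gets a "bti c" landing pad */
    {
    }

    void (*fp)(void) = handler;     /* indirect calls go through BLR */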
|