src - OpenBSD base system

Age	Commit message	Author
2021-02-03	Adding a hard-trap instruction after the __threxit syscall instruction	Kurt Miller
	broke pthreads on hppa. Reverting. Ok deraadt@
2020-12-13	Geode CPU does not support SSE, so MXCSR does not exist there. As	Alexander Bluhm
	our i386 compiler does not generate SSE instructions by default, it is not strictly necessary to save MXCSR content between setjmp(3) and longjmp(3). We do not want to end supporting such old processors now. Remove the stmxcsr and ldmxcsr instructions from libc. reported by Johan Huldtgren; OK jsg@ kettenis@
2020-12-06	On i386 setjmp(3) should store the FPU state and longjmp(3) restore	Alexander Bluhm
	it. There is enough space in jmp_buf to save MXCSR and CW register. Idea taken from amd64. This fixes regress/lib/libc/setjmp-fpu . OK kettenis@
2020-12-06	Introduce constants to access the setjmp(3) jmp_buf fields from	Alexander Bluhm
	i386 libc. The assembler code is more readable than with magic numbers. This brings i386 in line with amd64. No change in object file. OK kettenis@
2020-11-28	Add retguard to macppc kernel locore.S, ofwreal.S, setjmp.S	gkoehler
	This changes RETGUARD_SETUP(ffs) to RETGUARD_SETUP(ffs, %r11, %r12) and RETGUARD_CHECK(ffs) to RETGUARD_CHECK(ffs, %r11, %r12) to show that r11 and r12 are in use between setup and check, and to pick registers other than r11 and r12 in some kernel functions. ok mortimer@ deraadt@
2020-11-07	Actually m88k assembler can not handle 'nop' mnemonic, use a macro instead.	Kenji Aoyama
	ok deraadt@
2020-10-26	Retguard asm macros for powerpc libc, ld.so	gkoehler
	Add retguard to some, but not all, asm functions in libc. Edit SYS.h in libc to remove the PREFIX macros and add SYSENTRY (more like aarch64 and powerpc64), so we can insert RETGUARD_SETUP after SYSENTRY. Some .S files in this commit don't get retguard, but do stop using the old prefix macros. Tested by deraadt@, who put this diff in a macppc snap.
2020-10-21	Save and restore the MXCSR register and the FPU control word such that	Mark Kettenis
	floating-point control modes are properly restored by longjmp(3). ok guenther@
2020-10-20	Use a trap instruction that unconditionally terminates the process.	Visa Hankala
	OK deraadt@
2020-10-19	Retguard sigsetjmp on powerpc64.	mortimer
	ok deraadt@
2020-10-19	replace ad-hoc illegal instruction with the architecturally defined one	Christian Weisgerber
	("permanently undefined") ok deraadt@ kettenis@
2020-10-19	add retguard prologue/epilogue	Theo de Raadt
	ok mortimer
2020-10-19	Save and restore the FPCR register such that floating-point control modes	Mark Kettenis
	are properly restored by longjmp(3).
2020-10-18	Add powerpc64 retguard macros for setjmp / longjmp.	mortimer
	ok deraadt@
2020-10-18	SYS___threxit cannot fail, but this integration looks like a gadget.	Theo de Raadt
	Put a hard-trap instruction after the syscall instruction. ok kettenis mortimer
2020-10-16	Adapt SYS.h to use retguard macros from asm.h, so that generated system	Theo de Raadt
	calls are guarded. Adapt the first few hand-written functions to this model (a few remain) ok kettenis mortimer
2020-10-01	Mark top-level frame for new thread in both CFI and with zero	Philip Guenther
	framepointer, so gdb knows to stop. Inspired by glibc ok kettenis@
2020-08-23	amd64: TSC timecounter: prefix RDTSC with LFENCE	cheloha
	Regarding RDTSC, the Intel ISA reference says (Vol 2B. 4-545):
	> The RDTSC instruction is not a serializing instruction.
	>
	> It does not necessarily wait until all previous instructions
	> have been executed before reading the counter.
	>
	> Similarly, subsequent instructions may begin execution before the
	> read operation is performed.
	>
	> If software requires RDTSC to be executed only after all previous
	> instructions have completed locally, it can either use RDTSCP (if
	> the processor supports that instruction) or execute the sequence
	> LFENCE;RDTSC.
	To mitigate this problem, Linux and DragonFly use LFENCE. FreeBSD and NetBSD take a more complex route: they selectively use MFENCE, LFENCE, or CPUID depending on whether the CPU is AMD, Intel, VIA or something else.
	Let's start with just LFENCE. We only use the TSC as a timecounter on SSE2 systems so there is no need to conditionally compile the LFENCE. We can explore conditionally using MFENCE later.
	Microbenchmarking on my machine (Core i7-8650) suggests a penalty of about 7-10% over a "naked" RDTSC. This is acceptable. It's a bit of a moot point though: the alternative is a considerably weaker monotonicity guarantee when comparing timestamps between threads, which is not acceptable.
	It's worth noting that kernel timecounting is not exactly like userspace timecounting. However, they are similar enough that we can use userspace benchmarks to make conjectures about possible impacts on kernel performance.
	Concerns about kernel performance, in particular the network stack, were the blocking issue for this patch. Regarding networking performance, claudio@ says a 10% slower nanotime(9) or nanouptime(9) is acceptable and that shaving off "tens of cycles" is a micro-optimization. There are bigger optimizations to chase down before such a difference would matter.
	There is additional work to be done here. We could experiment with conditionally using MFENCE. Also, the userspace TSC timecounter doesn't have access to the adjustment skews available to the kernel timecounter. pirofti@ has suggested a scheme involving RDTSCP and an array of skews mapped into user memory. deraadt@ has suggested a scheme where the skew would be kept in the TCB. However it is done, access to the skews will improve monotonicity, which remains a problem with the TSC.
	First proposed by kettenis@ and pirofti@. With input from pirofti@, deraadt@, guenther@, naddy@, kettenis@, and claudio@. Based on similar changes in Linux, FreeBSD, NetBSD, and DragonFlyBSD. ok deraadt@ pirofti@ kettenis@ naddy@ claudio@
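	For illustration only, the LFENCE;RDTSC sequence discussed in the commit above might look roughly like this in C. This is a sketch, not the committed code: the function name rdtsc_lfence_sketch is hypothetical, and it assumes a GCC or Clang toolchain that provides _mm_lfence() and __rdtsc() through <x86intrin.h>. On non-amd64 builds it simply returns 0 so that the file still compiles.

```c
#include <stdint.h>

#if defined(__x86_64__)
#include <x86intrin.h>	/* GCC/Clang: _mm_lfence(), __rdtsc() */
#endif

/*
 * Hypothetical sketch: issue LFENCE before RDTSC so that all earlier
 * instructions have completed locally before the counter is sampled.
 * Not the OpenBSD implementation; returns 0 when not built for amd64.
 */
uint64_t
rdtsc_lfence_sketch(void)
{
#if defined(__x86_64__)
	_mm_lfence();		/* wait for prior instructions to complete */
	return __rdtsc();	/* then read the time stamp counter */
#else
	return 0;		/* no TSC on this architecture */
#endif
}
```

	The fence only orders the read against earlier instructions on the same CPU. It does nothing about per-CPU skew, which the commit lists as remaining work.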
2020-07-27	Fix two cases where we should compare/store 64-bit values instead of	Mark Kettenis
	32-bit values. ok gkoehler@, drahn@
2020-07-27	Fix powerpc64's sbrk()	gkoehler
	Initialize __curbrk = &_end. It's a 64-bit pointer, so use ld/std instead of lwz/stw. ok drahn@
2020-07-18	Userland timecounter implementation for octeon	Visa Hankala
	OK naddy@; no objections from kettenis@
2020-07-17	Userland timecounter for macppc	gkoehler
	Tested by cwen@ and myself. Thanks to pirofti@ for creating the userland timecounter feature. ok kettenis@ pirofti@ deraadt@ cheloha@
2020-07-15	Userland timecounter implementation for arm64.	Mark Kettenis
	ok naddy@
2020-07-14	Fix TIB/TCB on powerpc64. Some bright soul decided that the TCB should	Mark Kettenis
	be 8 bytes in the 64-bit ABI just like in the 32-bit ABI. But that means there is no "spare" word in the TCB that we can use to store a pointer to our struct pthread. So we have to treat powerpc64 special. Also recognize that the thread pointer points 0x7000 bytes after the TCB. Since the TCB is 8 bytes this means that TCB_OFFSET should be 0x7008. Pointed out by guenther@; ok deraadt@
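	The offset arithmetic in the commit above can be written out as a small C sketch. It assumes "0x7000 bytes after the TCB" means after the end of the 8-byte TCB, which is the reading that yields 0x7008. All names ending in _SKETCH or _sketch are hypothetical and are not the identifiers used in the tree.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants mirroring the numbers in the commit message. */
#define TP_BIAS_SKETCH		0x7000	/* thread pointer bias past the TCB */
#define TCB_SIZE_SKETCH		8	/* TCB size in the 64-bit ABI */
#define TCB_OFFSET_SKETCH	(TP_BIAS_SKETCH + TCB_SIZE_SKETCH)

/*
 * Given a thread pointer value, compute where the TCB starts under the
 * assumption stated above (TCB end + 0x7000 == thread pointer).
 */
uintptr_t
tcb_start_from_tp_sketch(uintptr_t tp)
{
	assert(tp >= TCB_OFFSET_SKETCH);	/* avoid wrapping below zero */
	return tp - TCB_OFFSET_SKETCH;
}
```

	For example, a thread pointer of 0x10007008 would correspond to a TCB starting at 0x10000000.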
2020-07-11	Add usertc.c.	Mark Kettenis

2020-07-11	Add missing usertc.c file.	Mark Kettenis

2020-07-08	Userland timecounter implementation for sparc64.	Mark Kettenis
	ok deraadt@, pirofti@
2020-07-08	Clean up the amd64 userland timecounter implementation a bit:	Mark Kettenis
	* We don't need TC_LAST * Make internal functions static to avoid namespace pollution in libc.a * Use a switch statement to harmonize with architectures providing multiple timecounters ok deraadt@, pirofti@
2020-07-06	Add support for timecounting in userland.	Paul Irofti
	This diff exposes parts of clock_gettime(2) and gettimeofday(2) to userland via libc, liberating processes from the need for a context switch every time they want to count the passage of time. If a timecounter clock can be exposed to userland then it needs to set its tc_user member to a non-zero value. Tested with one or multiple counters per architecture. The timing data is shared through a pointer found in the new ELF auxiliary vector AUX_openbsd_timekeep containing timehands information that is frequently updated by the kernel. Timing differences between the last kernel update and the current time are adjusted in userland by the tc_get_timecount() function inside the MD usertc.c file. This permits a much more responsive environment, quite visible in browsers, office programs and gaming (apparently one is able to fly in Minecraft now). Tested by robert@, sthen@, naddy@, kmos@, phessler@, and many others! OK from at least kettenis@, cheloha@, naddy@, sthen@
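	As a rough sketch of the adjustment step described above, the ticks elapsed since the last kernel update can be taken as a masked difference of counter readings and then scaled to nanoseconds. Every name here is hypothetical, and the mask and frequency parameters are assumptions. The real code reads its values from the shared timekeep data and must also cope with concurrent kernel updates, which this single-threaded sketch ignores.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: ticks elapsed between the counter value recorded
 * at the last kernel update (offset_count) and the value read just now.
 * Unsigned subtraction plus the mask handles counter wraparound.
 */
uint32_t
tc_delta_sketch(uint32_t now, uint32_t offset_count, uint32_t counter_mask)
{
	return (now - offset_count) & counter_mask;
}

/*
 * Hypothetical helper: convert a tick delta to nanoseconds for a counter
 * running at freq_hz. A 32-bit delta times 1e9 fits in 64 bits.
 */
uint64_t
ticks_to_ns_sketch(uint32_t delta, uint64_t freq_hz)
{
	assert(freq_hz != 0);
	return (uint64_t)delta * 1000000000ULL / freq_hz;
}
```

	With a 32-bit counter that was at 0xfffffffe during the last update and now reads 5, the delta is 7 ticks despite the wrap. Userland would add the scaled delta to the timestamp published by the kernel.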
2020-07-02	Use a relative branch to jump from setjmp(3) into _setjmp(4).	Mark Kettenis
	Use correct register to reference the location where we store CR.
2020-06-30	Add missing comparison instruction. Load %r12 with the indirect branch	Mark Kettenis
	address to load the correct TOC address.
2020-06-29	Use C versions of bcopy(3) and memmove(3) for now as the assembly version	Mark Kettenis
	of bcopy(9) doesn't work in its current state. ok deraadt@
2020-06-28	Use std instead of stw to store CR since we use std in sigsetjmp(3) and	Mark Kettenis
	we use ld to load it again in longjmp(3).
2020-06-28	The 2nd and 3rd argument are pointers, so use the appropriate doubleword	Mark Kettenis
	instructions. ok drahn@
2020-06-27	Add missing label.	Mark Kettenis

2020-06-26	Provide an optimized implementation of ffs(3) in libc on	Christian Weisgerber
	aarch64/powerpc/powerpc64, making use of the count leading zeros instruction. Also add a brief regression test. ok deraadt@ kettenis@
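	The idea behind the commit above can be sketched in C: isolate the lowest set bit, then derive its 1-based position from a count of leading zeros. This is an illustration, not the committed assembly. The name ffs_clz_sketch is hypothetical, and the code assumes a GCC or Clang compiler providing __builtin_clz.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical sketch of ffs(3) via count-leading-zeros.
 * Returns the 1-based index of the least significant set bit, or 0 if
 * no bit is set. Relies on the GCC/Clang builtin __builtin_clz.
 */
int
ffs_clz_sketch(int mask)
{
	unsigned int v = (unsigned int)mask;
	unsigned int lowest;
	int width = (int)(sizeof(v) * CHAR_BIT);
	int result;

	if (v == 0)
		return 0;		/* __builtin_clz(0) is undefined */

	lowest = v & (0u - v);		/* keep only the lowest set bit */
	result = width - __builtin_clz(lowest);
	assert(result >= 1 && result <= width);
	return result;
}
```

	For instance, ffs_clz_sketch(0x30) isolates 0x10 and, with a 32-bit int, computes 32 - 27 = 5.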
2020-06-26	Fix TCB_OFFSET_ERRNO. Adjust comments to reflect that powerpc64 uses %r13	Mark Kettenis
	as the per-thread register. ok patrick@, drahn@
2020-06-26	Avoid "bare" register numbers.	Mark Kettenis

2020-06-25	PowerPC64 libc powerpc sys files	Dale Rahn
	Initial attempt to port powerpc code to powerpc64 Expects TOC loading in ENTRY(), ok kettenis@ (some cleanup required)
2020-06-25	PowerPC64 libc string/net files	Dale Rahn
	Initial attempt to port powerpc code to powerpc64 Expects TOC loading in ENTRY(), memmove.S is the powerpc 32 bit, optimization is possible for 64 bit and handle len of > 32 bits.
2020-06-25	* empty log message *	Dale Rahn

2020-06-25	PowerPC64 libc/arch/powerpc/gdtoa files	Dale Rahn
	This is almost a direct copy from powerpc with 64 bit mods, with two additions present in 64 arch. NOTE: long double 128 is not supported currently.
2020-06-25	Committed wrong version of file, atomic_lock is 32 bit.	Dale Rahn

2020-06-25	PowerPC64 libc gen files	Dale Rahn
	Initial attempt to port powerpc code to powerpc64 Expects TOC loading in ENTRY(), ok kettenis@
2020-06-25	PowerPC64 libc (libc powerpc top)	Dale Rahn
	Expects ELFv2 TOC loading in ENTRY(), build with -gdwarf-4 Split SYS.h into SYS.h and DEFS.h fix tabs after #define
2020-03-13	Anthony Steinhauser reports that 32-bit arm cpus have the same speculation	Theo de Raadt
	problems as 64-bit models. To resolve the syscall speculation, as a first step "nop; nop" was added after all occurrences of the syscall ("swi 0") instruction. Then the kernel was changed to jump over the 2 extra instructions. In this final step, that pair of nops is converted into the speculation-blocking sequence ("dsb nsh; isb"). Don't try to build through these multiple steps, use a snapshot instead. Packages matching the new ABI will be out in a while... ok kettenis
2020-03-11	Anthony Steinhauser reports that 32-bit arm cpus have the same speculation	Theo de Raadt
	problems as 64-bit models. For the syscall instruction issue, add nop;nop after swi 0, in preparation for jumping over a speculation barrier here later. ok kettenis
2020-02-18	Now that the kernel skips the two instructions immediately following	Mark Kettenis
	a syscall, replace the double nop with a dsb nsh; isb; sequence which stops the CPU from speculating any further. This fix was suggested by Anthony Steinhauser. ok deraadt@
2020-01-26	Insert two nop instructions after each svc #0 instruction in userland.	Mark Kettenis
	They will be replaced by a speculation barrier as soon as we teach the kernel to skip over these two instructions when returning from a system call. ok patrick@, deraadt@
2019-11-10	Mark as 'protected' all the routines from the quad/ and softfloat/ subdirs,	Philip Guenther
	as well as those in arch/arm/gen/divsi3.S. This cleans up the PLTs on the 32bit archs. luna88k testing by aoyama@ "looks good" kettenis@, testing and ok deraadt@