path: root/lib/libc/arch
Age  Commit message  (Author)
2021-02-03  Adding a hard-trap instruction after the __threxit syscall instruction broke pthreads on hppa. Reverting. Ok deraadt@  (Kurt Miller)
2020-12-13  Geode CPU does not support SSE, so MXCSR does not exist there. As our i386 compiler does not generate SSE instructions by default, it is not strictly necessary to save MXCSR content between setjmp(3) and longjmp(3). We do not want to stop supporting such old processors now. Remove the stmxcsr and ldmxcsr instructions from libc. Reported by Johan Huldtgren; OK jsg@ kettenis@  (Alexander Bluhm)
2020-12-06  On i386 setjmp(3) should store the FPU state and longjmp(3) restore it. There is enough space in jmp_buf to save MXCSR and CW register. Idea taken from amd64. This fixes regress/lib/libc/setjmp-fpu. OK kettenis@  (Alexander Bluhm)
2020-12-06  Introduce constants to access the setjmp(3) jmp_buf fields from i386 libc. The assembler code is more readable than with magic numbers. This brings i386 in line with amd64. No change in object file. OK kettenis@  (Alexander Bluhm)
2020-11-28  Add retguard to macppc kernel locore.S, ofwreal.S, setjmp.S  (gkoehler)
This changes RETGUARD_SETUP(ffs) to RETGUARD_SETUP(ffs, %r11, %r12) and RETGUARD_CHECK(ffs) to RETGUARD_CHECK(ffs, %r11, %r12) to show that r11 and r12 are in use between setup and check, and to pick registers other than r11 and r12 in some kernel functions. ok mortimer@ deraadt@
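Note: a rough C model of what a setup/check pair guards against, assuming retguard works by combining a per-function random cookie with the return address on entry and re-deriving the cookie before return. The function names, the cookie value, and the XOR formulation are illustrative assumptions, not the macros from this commit; the real implementation is assembler and traps on a mismatch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-function random cookie. */
static const uintptr_t demo_ffs_cookie = (uintptr_t)0x9e3779b9u;

/* Model of the setup step: mix the cookie with the return address. */
static uintptr_t
demo_retguard_setup(uintptr_t cookie, uintptr_t return_addr)
{
	return cookie ^ return_addr;
}

/*
 * Model of the check step: returns 1 when the return address seen at
 * function exit still matches the one seen at entry, 0 otherwise.
 */
static int
demo_retguard_check(uintptr_t cookie, uintptr_t guard, uintptr_t return_addr)
{
	return (guard ^ return_addr) == cookie;
}
```

The guard value has to live somewhere between setup and check, which is why the macros in this commit name the two registers they occupy.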
2020-11-07  Actually, the m88k assembler cannot handle the 'nop' mnemonic; use a macro instead. ok deraadt@  (Kenji Aoyama)
2020-10-26  Retguard asm macros for powerpc libc, ld.so  (gkoehler)
Add retguard to some, but not all, asm functions in libc. Edit SYS.h in libc to remove the PREFIX macros and add SYSENTRY (more like aarch64 and powerpc64), so we can insert RETGUARD_SETUP after SYSENTRY. Some .S files in this commit don't get retguard, but do stop using the old prefix macros. Tested by deraadt@, who put this diff in a macppc snap.
2020-10-21  Save and restore the MXCSR register and the FPU control word such that floating-point control modes are properly restored by longjmp(3). ok guenther@  (Mark Kettenis)
2020-10-20  Use a trap instruction that unconditionally terminates the process. OK deraadt@  (Visa Hankala)
2020-10-19  Retguard sigsetjmp on powerpc64. ok deraadt@  (mortimer)
2020-10-19  replace ad-hoc illegal instruction with the architecturally defined one ("permanently undefined"). ok deraadt@ kettenis@  (Christian Weisgerber)
2020-10-19  add retguard prologue/epilogue. ok mortimer  (Theo de Raadt)
2020-10-19  Save and restore the FPCR register such that floating-point control modes are properly restored by longjmp(3).  (Mark Kettenis)
2020-10-18  Add powerpc64 retguard macros for setjmp / longjmp. ok deraadt@  (mortimer)
2020-10-18  SYS___threxit cannot fail, but this integration looks like a gadget. Put a hard-trap instruction after the syscall instruction. ok kettenis mortimer  (Theo de Raadt)
2020-10-16  Adapt SYS.h to use retguard macros from asm.h, so that generated system calls are guarded. Adapt the first few hand-written functions to this model (a few remain). ok kettenis mortimer  (Theo de Raadt)
2020-10-01  Mark top-level frame for new thread in both CFI and with zero framepointer, so gdb knows to stop. Inspired by glibc. ok kettenis@  (Philip Guenther)
2020-08-23  amd64: TSC timecounter: prefix RDTSC with LFENCE  (cheloha)

Regarding RDTSC, the Intel ISA reference says (Vol 2B. 4-545):

> The RDTSC instruction is not a serializing instruction.
>
> It does not necessarily wait until all previous instructions
> have been executed before reading the counter.
>
> Similarly, subsequent instructions may begin execution before the
> read operation is performed.
>
> If software requires RDTSC to be executed only after all previous
> instructions have completed locally, it can either use RDTSCP (if
> the processor supports that instruction) or execute the sequence
> LFENCE;RDTSC.

To mitigate this problem, Linux and DragonFly use LFENCE. FreeBSD and NetBSD take a more complex route: they selectively use MFENCE, LFENCE, or CPUID depending on whether the CPU is AMD, Intel, VIA or something else.

Let's start with just LFENCE. We only use the TSC as a timecounter on SSE2 systems so there is no need to conditionally compile the LFENCE. We can explore conditionally using MFENCE later.

Microbenchmarking on my machine (Core i7-8650) suggests a penalty of about 7-10% over a "naked" RDTSC. This is acceptable. It's a bit of a moot point though: the alternative is a considerably weaker monotonicity guarantee when comparing timestamps between threads, which is not acceptable.

It's worth noting that kernel timecounting is not *exactly* like userspace timecounting. However, they are similar enough that we can use userspace benchmarks to make conjectures about possible impacts on kernel performance. Concerns about kernel performance, in particular the network stack, were the blocking issue for this patch.

Regarding networking performance, claudio@ says a 10% slower nanotime(9) or nanouptime(9) is acceptable and that shaving off "tens of cycles" is a micro-optimization. There are bigger optimizations to chase down before such a difference would matter.

There is additional work to be done here. We could experiment with conditionally using MFENCE. Also, the userspace TSC timecounter doesn't have access to the adjustment skews available to the kernel timecounter. pirofti@ has suggested a scheme involving RDTSCP and an array of skews mapped into user memory. deraadt@ has suggested a scheme where the skew would be kept in the TCB. However it is done, access to the skews will improve monotonicity, which remains a problem with the TSC.

First proposed by kettenis@ and pirofti@. With input from pirofti@, deraadt@, guenther@, naddy@, kettenis@, and claudio@. Based on similar changes in Linux, FreeBSD, NetBSD, and DragonFlyBSD.

ok deraadt@ pirofti@ kettenis@ naddy@ claudio@
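Note: a minimal sketch of the LFENCE;RDTSC sequence described in this commit, written with GCC/Clang inline assembly. The function name read_tsc_ordered is hypothetical and this is not the libc code from the commit; the non-x86 branch is only a stand-in so the sketch compiles elsewhere.

```c
#include <stdint.h>
#include <time.h>

static inline uint64_t
read_tsc_ordered(void)
{
#if defined(__x86_64__) || defined(__i386__)
	uint32_t lo, hi;

	/*
	 * LFENCE makes earlier instructions complete locally before
	 * RDTSC reads the counter into EDX:EAX.
	 */
	__asm__ volatile("lfence\n\trdtsc" : "=a"(lo), "=d"(hi) : : "memory");
	return ((uint64_t)hi << 32) | lo;
#else
	/* Stand-in for non-x86 builds (assumption: POSIX clock_gettime). */
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
#endif
}
```

Two back-to-back calls on the same thread should return non-decreasing values; comparing values taken on different CPUs is where the skew problem mentioned in the commit message comes in.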
2020-07-27  Fix two cases where we should compare/store 64-bit values instead of 32-bit values. ok gkoehler@, drahn@  (Mark Kettenis)
2020-07-27  Fix powerpc64's sbrk()  (gkoehler)
Initialize __curbrk = &_end. It's a 64-bit pointer, so use ld/std instead of lwz/stw. ok drahn@
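Note: a small C model of why word-sized lwz/stw accesses are wrong for a 64-bit pointer: a 32-bit access carries only the low half of the value. The helper names are hypothetical and the example address is made up.

```c
#include <stdint.h>

/* Model of a word-sized (32-bit) store followed by a load. */
static uint64_t
roundtrip_word(uint64_t value)
{
	uint32_t word = (uint32_t)value;	/* upper 32 bits are dropped */

	return word;
}

/* Model of a doubleword (64-bit) store followed by a load. */
static uint64_t
roundtrip_doubleword(uint64_t value)
{
	return value;
}
```

With an address above 4 GB such as 0x0000000123456000, roundtrip_word() yields 0x23456000 while roundtrip_doubleword() preserves the full value.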
2020-07-18  Userland timecounter implementation for octeon  (Visa Hankala)
OK naddy@; no objections from kettenis@
2020-07-17  Userland timecounter for macppc  (gkoehler)
Tested by cwen@ and myself. Thanks to pirofti@ for creating the userland timecounter feature. ok kettenis@ pirofti@ deraadt@ cheloha@
2020-07-15  Userland timecounter implementation for arm64. ok naddy@  (Mark Kettenis)
2020-07-14  Fix TIB/TCB on powerpc64. Some bright soul decided that the TCB should be 8 bytes in the 64-bit ABI just like in the 32-bit ABI. But that means there is no "spare" word in the TCB that we can use to store a pointer to our struct pthread. So we have to treat powerpc64 specially. Also recognize that the thread pointer points 0x7000 bytes after the TCB. Since the TCB is 8 bytes this means that TCB_OFFSET should be 0x7008. Pointed out by guenther@; ok deraadt@  (Mark Kettenis)
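Note: the offset arithmetic from this commit, spelled out in C. TCB_OFFSET and the values 8 and 0x7000 come from the commit message; the names DEMO_TCB_SIZE and DEMO_TP_BIAS and the two helper functions are hypothetical.

```c
#include <stdint.h>

#define DEMO_TCB_SIZE	8	/* TCB size in the 64-bit ABI */
#define DEMO_TP_BIAS	0x7000	/* thread pointer sits this far past the TCB */
#define TCB_OFFSET	(DEMO_TP_BIAS + DEMO_TCB_SIZE)	/* 0x7008 */

/* Recover the start of the TCB from the thread pointer. */
static uintptr_t
demo_tp_to_tcb(uintptr_t thread_pointer)
{
	return thread_pointer - TCB_OFFSET;
}

/* Compute the thread pointer for a TCB at the given address. */
static uintptr_t
demo_tcb_to_tp(uintptr_t tcb)
{
	return tcb + TCB_OFFSET;
}
```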
2020-07-11  Add usertc.c.  (Mark Kettenis)
2020-07-11  Add missing usertc.c file.  (Mark Kettenis)
2020-07-08  Userland timecounter implementation for sparc64. ok deraadt@, pirofti@  (Mark Kettenis)
2020-07-08  Clean up the amd64 userland timecounter implementation a bit:  (Mark Kettenis)
* We don't need TC_LAST
* Make internal functions static to avoid namespace pollution in libc.a
* Use a switch statement to harmonize with architectures providing multiple timecounters
ok deraadt@, pirofti@
2020-07-06  Add support for timecounting in userland.  (Paul Irofti)
This diff exposes parts of clock_gettime(2) and gettimeofday(2) to userland via libc, liberating processes from the need for a context switch every time they want to count the passage of time. If a timecounter clock can be exposed to userland, then it needs to set its tc_user member to a non-zero value. Tested with one or multiple counters per architecture.

The timing data is shared through a pointer found in the new ELF auxiliary vector AUX_openbsd_timekeep containing timehands information that is frequently updated by the kernel. Timing differences between the last kernel update and the current time are adjusted in userland by the tc_get_timecount() function inside the MD usertc.c file.

This permits a much more responsive environment, quite visible in browsers, office programs and gaming (apparently one is able to fly in Minecraft now).

Tested by robert@, sthen@, naddy@, kmos@, phessler@, and many others! OK from at least kettenis@, cheloha@, naddy@, sthen@
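Note: a simplified sketch of the adjustment step described in this commit, where userland adds the counter ticks elapsed since the last kernel update to a kernel-published base time. The struct, its field names, and the integer nanoseconds-per-tick scale are all hypothetical; they are not the layout shared through the auxiliary vector, and the sketch leaves out whatever synchronization a real reader needs against concurrent kernel updates.

```c
#include <stdint.h>

/* Hypothetical snapshot of the data the kernel publishes. */
struct demo_timekeep {
	uint32_t offset_count;	/* counter value at the last kernel update */
	uint32_t counter_mask;	/* valid bits of the hardware counter */
	uint64_t base_ns;	/* time at the last kernel update, in ns */
	uint64_t ns_per_tick;	/* simplified scale: whole ns per tick */
};

/*
 * Compute the current time from a fresh counter reading. The masked
 * unsigned subtraction also handles a counter that wrapped since the
 * last update.
 */
static uint64_t
demo_now_ns(const struct demo_timekeep *tk, uint32_t current_count)
{
	uint32_t delta = (current_count - tk->offset_count) & tk->counter_mask;

	return tk->base_ns + (uint64_t)delta * tk->ns_per_tick;
}
```

In this model the fresh counter reading would come from the machine-dependent counter read, which is the part the commit places in usertc.c.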
2020-07-02  Use a relative branch to jump from setjmp(3) into _setjmp(3). Use correct register to reference the location where we store CR.  (Mark Kettenis)
2020-06-30  Add missing comparison instruction. Load %r12 with the indirect branch address to load the correct TOC address.  (Mark Kettenis)
2020-06-29  Use C versions of bcopy(3) and memmove(3) for now as the assembly version of bcopy(9) doesn't work in its current state. ok deraadt@  (Mark Kettenis)
2020-06-28  Use std instead of stw to store CR since we use std in sigsetjmp(3) and we use ld to load it again in longjmp(3).  (Mark Kettenis)
2020-06-28  The 2nd and 3rd argument are pointers, so use the appropriate doubleword instructions. ok drahn@  (Mark Kettenis)
2020-06-27  Add missing label.  (Mark Kettenis)
2020-06-26  Provide an optimized implementation of ffs(3) in libc on aarch64/powerpc/powerpc64, making use of the count leading zeros instruction. Also add a brief regression test. ok deraadt@ kettenis@  (Christian Weisgerber)
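Note: one way to build ffs(3) on top of a count-leading-zeros operation, shown in C with the GCC/Clang builtin __builtin_clz. This is an illustrative sketch, not the assembly from the commit, and the name ffs_clz is hypothetical.

```c
#include <limits.h>

/*
 * Return the 1-based index of the least significant set bit,
 * or 0 if no bit is set, matching the ffs(3) contract.
 */
static int
ffs_clz(int mask)
{
	unsigned int v = (unsigned int)mask;

	if (v == 0)
		return 0;	/* __builtin_clz(0) is undefined */
	v &= -v;		/* isolate the lowest set bit */
	return (int)(sizeof(v) * CHAR_BIT) - __builtin_clz(v);
}
```

Isolating the lowest set bit first turns a "find first set" question into a "find the only set bit" question, which count leading zeros answers directly.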
2020-06-26  Fix TCB_OFFSET_ERRNO. Adjust comments to reflect that powerpc64 uses %r13 as the per-thread register. ok patrick@, drahn@  (Mark Kettenis)
2020-06-26  Avoid "bare" register numbers.  (Mark Kettenis)
2020-06-25  PowerPC64 libc powerpc sys files  (Dale Rahn)
Initial attempt to port powerpc code to powerpc64. Expects TOC loading in ENTRY(). ok kettenis@ (some cleanup required)
2020-06-25  PowerPC64 libc string/net files  (Dale Rahn)
Initial attempt to port powerpc code to powerpc64. Expects TOC loading in ENTRY(). memmove.S is the 32-bit powerpc version; it could be optimized for 64 bit and made to handle a len of > 32 bits.
2020-06-25  *** empty log message ***  (Dale Rahn)
2020-06-25  PowerPC64 libc/arch/powerpc/gdtoa files  (Dale Rahn)
This is almost a direct copy from powerpc with 64 bit mods, with two additions present in 64 arch. NOTE: long double 128 is not supported currently.
2020-06-25  Committed wrong version of file, atomic_lock is 32 bit.  (Dale Rahn)
2020-06-25  PowerPC64 libc gen files  (Dale Rahn)
Initial attempt to port powerpc code to powerpc64. Expects TOC loading in ENTRY(). ok kettenis@
2020-06-25  PowerPC64 libc (libc powerpc top)  (Dale Rahn)
Expects ELFv2 TOC loading in ENTRY(), build with -gdwarf-4. Split SYS.h into SYS.h and DEFS.h. Fix tabs after #define.
2020-03-13  Anthony Steinhauser reports that 32-bit arm cpus have the same speculation problems as 64-bit models. To resolve the syscall speculation, as a first step "nop; nop" was added after all occurrences of the syscall ("swi 0") instruction. Then the kernel was changed to jump over the 2 extra instructions. In this final step, those pairs of nops are converted into the speculation-blocking sequence ("dsb nsh; isb"). Don't try to build through these multiple steps, use a snapshot instead. Packages matching the new ABI will be out in a while... ok kettenis  (Theo de Raadt)
2020-03-11  Anthony Steinhauser reports that 32-bit arm cpus have the same speculation problems as 64-bit models. For the syscall instruction issue, add nop;nop after swi 0, in preparation for jumping over a speculation barrier here later. ok kettenis  (Theo de Raadt)
2020-02-18  Now that the kernel skips the two instructions immediately following a syscall, replace the double nop with a dsb nsh; isb; sequence which stops the CPU from speculating any further. This fix was suggested by Anthony Steinhauser. ok deraadt@  (Mark Kettenis)
2020-01-26  Insert two nop instructions after each svc #0 instruction in userland. They will be replaced by a speculation barrier as soon as we teach the kernel to skip over these two instructions when returning from a system call. ok patrick@, deraadt@  (Mark Kettenis)
2019-11-10  Mark as 'protected' all the routines from the quad/ and softfloat/ subdirs, as well as those in arch/arm/gen/divsi3.S. This cleans up the PLTs on the 32bit archs. luna88k testing by aoyama@; "looks good" kettenis@; testing and ok deraadt@  (Philip Guenther)