Age | Commit message | Author |
|
ok jca@
|
|
|
|
ok kettenis@
|
|
ok kettenis@
|
|
Truncate the character arguments of strchr() and strrchr() to eight bits
so that the implied char conversion would work correctly. Otherwise the
functions would always return NULL when the character argument is
negative.
OK miod@
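
The effect of the truncation can be sketched in portable C. This is only an illustrative model, not the assembly that the commit touches; the name sketch_strchr is hypothetical. The point is that the int argument is reduced to eight bits before it is compared against the bytes of the string, so a negative argument such as -23 matches the byte 0xe9.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical C model of strchr(); the real libc routine discussed
 * above is written in assembly and is not reproduced here.
 */
static char *
sketch_strchr(const char *s, int c)
{
	/* Truncate to eight bits, mirroring the implied conversion to char. */
	const unsigned char want = (unsigned char)c;

	assert(s != NULL);
	for (;; s++) {
		/* Compare bytes as unsigned so that sign extension cannot differ. */
		if ((unsigned char)*s == want)
			return (char *)s;
		if (*s == '\0')
			return NULL;
	}
}
```

Comparing the untruncated int against a byte loaded from the string would never succeed for a negative argument, which is the "always return NULL" failure the message describes.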
|
|
siglongjmp(3) to decide whether we need to restore the signal mask.
ok deraadt@, drahn@
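
For context, the caller-visible behaviour behind that decision can be sketched with the POSIX interfaces alone. The beginning of the message is cut off, so this shows only the documented contract (the mask is restored only when sigsetjmp(3) was called with a non-zero savemask), not how the libc code implements it. The helper name blocked_after_jump is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/*
 * Hypothetical helper: block SIGUSR1 after sigsetjmp(), jump back, and
 * report whether SIGUSR1 is still blocked (1) or not (0).
 */
static int
blocked_after_jump(int savemask)
{
	sigjmp_buf env;
	sigset_t set, orig, cur;
	int blocked;

	sigemptyset(&set);
	sigaddset(&set, SIGUSR1);
	/* Start from a known state with SIGUSR1 unblocked. */
	if (sigprocmask(SIG_UNBLOCK, &set, &orig) == -1)
		return -1;

	if (sigsetjmp(env, savemask) == 0) {
		/* Change the mask, then jump back to the sigsetjmp() call. */
		sigprocmask(SIG_BLOCK, &set, NULL);
		siglongjmp(env, 1);
	}

	sigprocmask(SIG_SETMASK, NULL, &cur);
	blocked = sigismember(&cur, SIGUSR1);
	assert(blocked != -1);

	/* Put the caller's original mask back. */
	sigprocmask(SIG_SETMASK, &orig, NULL);
	return blocked;
}
```

With savemask set to 1 the jump restores the mask and SIGUSR1 ends up unblocked; with savemask set to 0 the mask is left as it was at the time of the jump.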
|
|
turns out same as a diff drahn didn't commit
ok kettenis
|
|
|
|
it causes restored stack to be incorrect.
|
|
Simplify integer loading, use 'li <dest>, <value>' instead of x0/zero register
Adjust _JB_SIGMASK to not collide with saved registers.
|
|
we do on other architectures.
ok mpi@
|
|
Directly update cerror as offset of thread pointer, with
optimizations on error branching
ok kettenis@
|
|
based off a combination of aarch64/powerpc64
ok kettenis@
|
|
the CERROR handling code had a gross mistake in that it didn't
continue processing the code after the macro if no error occurred.
ok kettenis@
|
|
ok kettenis@
|
|
asm defines, copied from aarch64.
|
|
Makefile.inc was missed in previous commit
ok kettenis@
|
|
largely derived from aarch64 code.
usertc.c taken from hppa
with cleanup to Symbols.list and tfork_thread.S
Further cleanup and enhancement will be performed in-tree.
ok kettenis@
|
|
hardware. Implement fp[gs]etround(3) and fp[gs]etsticky(3) and tweak
the fp[gs]etmask(3) implementation to provide the right weak symbols.
This implementation deliberately ignores the additional
"round to nearest, away from zero" as this interface is derived from
i386-specific code and the i387 FPU doesn't implement such a rounding
mode. This is a legacy API and code should use <fenv.h> instead.
ok drahn@
|
|
implements a variation on the traditional "to nearest" rounding mode that
rounds away from zero when tied. The upcoming C2x includes support for that
and LLVM already implements this so provide an implementation that matches
our system compiler.
ok drahn@
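
The difference between the two tie-breaking rules can be sketched with a small integer model. This is an illustration only and is unrelated to the libc or <fenv.h> code; the function names and the choice to represent a value as a count of halves (so 5 means 2.5) are assumptions made for the example.

```c
/*
 * Hypothetical model: the input is a value expressed in halves, so
 * halves == 5 stands for 2.5 and halves == -3 stands for -1.5.
 * C division truncates toward zero and the remainder takes the sign
 * of the dividend, so r is -1, 0 or 1.
 */

/* Traditional "to nearest": a tie goes to the even neighbour. */
static long
round_ties_even(long halves)
{
	long q = halves / 2;
	long r = halves % 2;

	if (r == 0)
		return q;		/* already an integer */
	return (q % 2 == 0) ? q : q + r;
}

/* The variation: a tie goes to the neighbour further from zero. */
static long
round_ties_away(long halves)
{
	long q = halves / 2;
	long r = halves % 2;

	return q + r;
}
```

For 2.5 the first rule gives 2 and the second gives 3; for 3.5 both give 4, since the even neighbour is also the one further from zero.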
|
|
Based on arm64 versions
this implementation is missing jmpxor security enhancement.
Good enough deraadt@
|
|
ok drahn@
|
|
the same dummy fpgetmask(3) and fpsetmask(3) implementation as arm64.
ok drahn@
|
|
direct copy from aarch64
constants were rechecked using the qnan.c program.
|
|
adopted from aarch64, no native ffs() for now, use C version.
after corrections from kettenis@
|
|
ok kettenis.
|
|
broke pthreads on hppa. Reverting. Ok deraadt@
|
|
our i386 compiler does not generate SSE instructions by default,
it is not strictly necessary to save MXCSR content between setjmp(3)
and longjmp(3). We do not want to stop supporting such old processors
now. Remove the stmxcsr and ldmxcsr instructions from libc.
reported by Johan Huldtgren; OK jsg@ kettenis@
|
|
it. There is enough space in jmp_buf to save MXCSR and CW register.
Idea taken from amd64. This fixes regress/lib/libc/setjmp-fpu .
OK kettenis@
|
|
i386 libc. The assembler code is more readable than with magic
numbers. This brings i386 in line with amd64. No change in object
file.
OK kettenis@
|
|
This changes RETGUARD_SETUP(ffs) to RETGUARD_SETUP(ffs, %r11, %r12)
and RETGUARD_CHECK(ffs) to RETGUARD_CHECK(ffs, %r11, %r12)
to show that r11 and r12 are in use between setup and check, and to
pick registers other than r11 and r12 in some kernel functions.
ok mortimer@ deraadt@
|
|
ok deraadt@
|
|
Add retguard to some, but not all, asm functions in libc. Edit SYS.h
in libc to remove the PREFIX macros and add SYSENTRY (more like
aarch64 and powerpc64), so we can insert RETGUARD_SETUP after
SYSENTRY. Some .S files in this commit don't get retguard, but do
stop using the old prefix macros.
Tested by deraadt@, who put this diff in a macppc snap.
|
|
floating-point control modes are properly restored by longjmp(3).
ok guenther@
|
|
OK deraadt@
|
|
ok deraadt@
|
|
("permanently undefined")
ok deraadt@ kettenis@
|
|
ok mortimer
|
|
are properly restored by longjmp(3).
|
|
ok deraadt@
|
|
Put a hard-trap instruction after the syscall instruction.
ok kettenis mortimer
|
|
calls are guarded. Adapt the first few hand-written functions to this
model (a few remain)
ok kettenis mortimer
|
|
framepointer, so gdb knows to stop. Inspired by glibc
ok kettenis@
|
|
Regarding RDTSC, the Intel ISA reference says (Vol 2B. 4-545):
> The RDTSC instruction is not a serializing instruction.
>
> It does not necessarily wait until all previous instructions
> have been executed before reading the counter.
>
> Similarly, subsequent instructions may begin execution before the
> read operation is performed.
>
> If software requires RDTSC to be executed only after all previous
> instructions have completed locally, it can either use RDTSCP (if
> the processor supports that instruction) or execute the sequence
> LFENCE;RDTSC.
To mitigate this problem, Linux and DragonFly use LFENCE. FreeBSD and
NetBSD take a more complex route: they selectively use MFENCE, LFENCE,
or CPUID depending on whether the CPU is AMD, Intel, VIA or something
else.
Let's start with just LFENCE. We only use the TSC as a timecounter on
SSE2 systems so there is no need to conditionally compile the LFENCE.
We can explore conditionally using MFENCE later.
Microbenchmarking on my machine (Core i7-8650) suggests a penalty of
about 7-10% over a "naked" RDTSC. This is acceptable. It's a bit of
a moot point though: the alternative is a considerably weaker
monotonicity guarantee when comparing timestamps between threads,
which is not acceptable.
It's worth noting that kernel timecounting is not *exactly* like
userspace timecounting. However, they are similar enough that we can
use userspace benchmarks to make conjectures about possible impacts on
kernel performance.
Concerns about kernel performance, in particular the network stack,
were the blocking issue for this patch. Regarding networking
performance, claudio@ says a 10% slower nanotime(9) or nanouptime(9)
is acceptable and that shaving off "tens of cycles" is a
micro-optimization. There are bigger optimizations to chase down
before such a difference would matter.
There is additional work to be done here. We could experiment with
conditionally using MFENCE. Also, the userspace TSC timecounter
doesn't have access to the adjustment skews available to the kernel
timecounter. pirofti@ has suggested a scheme involving RDTSCP and an
array of skews mapped into user memory. deraadt@ has suggested a
scheme where the skew would be kept in the TCB. However it is done,
access to the skews will improve monotonicity, which remains a problem
with the TSC.
First proposed by kettenis@ and pirofti@. With input from pirofti@,
deraadt@, guenther@, naddy@, kettenis@, and claudio@. Based on
similar changes in Linux, FreeBSD, NetBSD, and DragonFlyBSD.
ok deraadt@ pirofti@ kettenis@ naddy@ claudio@
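
A minimal sketch of the LFENCE;RDTSC sequence, assuming GCC or Clang inline assembly on i386 or amd64. The function name fenced_rdtsc is hypothetical and this is not the committed code; on other architectures the sketch falls back to a plain counter purely so that it still compiles.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
/*
 * Hypothetical helper: LFENCE makes RDTSC wait until all previous
 * instructions have completed locally, as the quoted ISA text suggests.
 */
static inline uint64_t
fenced_rdtsc(void)
{
	uint32_t lo, hi;

	/* RDTSC returns the low half in EAX and the high half in EDX. */
	__asm__ volatile("lfence\n\trdtsc"
	    : "=a" (lo), "=d" (hi)
	    :
	    : "memory");
	return ((uint64_t)hi << 32) | lo;
}
#else
/* Placeholder for non-x86 builds; this is not a real timecounter. */
static inline uint64_t
fenced_rdtsc(void)
{
	static uint64_t fake;

	return ++fake;
}
#endif
```

The fence does nothing about skew between CPUs, so timestamps taken on different cores can still disagree, which is the remaining monotonicity problem the message describes.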
|
|
32-bit values.
ok gkoehler@, drahn@
|
|
Initialize __curbrk = &_end.
It's a 64-bit pointer, so use ld/std instead of lwz/stw.
ok drahn@
|
|
OK naddy@; no objections from kettenis@
|
|
Tested by cwen@ and myself. Thanks to pirofti@ for creating the
userland timecounter feature.
ok kettenis@ pirofti@ deraadt@ cheloha@
|
|
ok naddy@
|
|
be 8 bytes in the 64-bit ABI just like in the 32-bit ABI. But that means
there is no "spare" word in the TCB that we can use to store a pointer
to our struct pthread. So we have to treat powerpc64 specially.
Also recognize that the thread pointer points 0x7000 bytes after the TCB.
Since the TCB is 8 bytes this means that TCB_OFFSET should be 0x7008.
Pointed out by guenther@; ok deraadt@
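
The offset arithmetic can be written out as a short sketch. The macro and function names below are hypothetical and only restate the numbers given in the message: the thread pointer sits 0x7000 bytes past the end of an 8-byte TCB, so the start of the TCB is 0x7008 bytes below the thread pointer.

```c
#include <stdint.h>

/* Hypothetical names; only the values come from the commit message. */
#define SKETCH_TP_BIAS		0x7000	/* thread pointer is this far past the TCB */
#define SKETCH_TCB_SIZE		8	/* TCB size in the 64-bit ABI */
#define SKETCH_TCB_OFFSET	(SKETCH_TP_BIAS + SKETCH_TCB_SIZE)

/* Start of the TCB for a given thread pointer value. */
static inline uintptr_t
tcb_from_tp(uintptr_t tp)
{
	return tp - SKETCH_TCB_OFFSET;
}

/* Thread pointer value for a TCB that starts at the given address. */
static inline uintptr_t
tp_from_tcb(uintptr_t tcb)
{
	return tcb + SKETCH_TCB_OFFSET;
}
```

For a TCB at address 0x10000, the thread pointer under these assumptions would be 0x17008.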
|