Age | Commit message | Author |
|
ok jsg@, aoyama@
|
|
ok claudio@
|
|
|
|
Designed to let userland peek at AT_HWCAP and AT_HWCAP2 using an already
existing interface coming from FreeBSD. Header bits were snatched from
there. Input & ok kettenis@
libc bump and sets sync will follow soon
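A minimal userland sketch, assuming the FreeBSD-derived interface in question is elf_aux_info(3); the error handling shown is illustrative only:

    #include <sys/auxv.h>

    unsigned long hwcap = 0;

    /* Ask libc for the AT_HWCAP value the kernel placed in the aux vector. */
    if (elf_aux_info(AT_HWCAP, &hwcap, sizeof(hwcap)) != 0)
            hwcap = 0;      /* interface or flag not available */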
|
|
ok mglocker@
|
|
It survives 3.5 days of "make build" and is about 1.5% faster on a 3 CPU
machine :-)
ok miod@ phessler@ dlg@
|
|
Having differences between architectures is asking for problems. And
adding a barrier here just makes sense in most cases. This is also what
cpu_relax() provides in Linux land.
ok kettenis@ claudio@
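As a rough illustration (not the committed MD code), a spin-wait hint that doubles as a compiler barrier can be as simple as:

    /*
     * Hypothetical sketch, similar in spirit to Linux's cpu_relax():
     * an empty asm with a "memory" clobber forces the compiler to
     * re-read memory on every iteration of a busy-wait loop.
     */
    #define CPU_BUSY_CYCLE()        __asm volatile("" ::: "memory")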
|
|
This removes one of the SCHED_LOCK usages in arch.
OK miod@
|
|
in theory these are safe to use in code that runs under the kernel lock,
but they are nasty trips when converting code to run without the kernel lock.
ok mpi@, claudio@
|
|
|
|
The code has outgrown the original name for this struct. Both the
external and internal APIs have used the "clockqueue" namespace for
some time when operating on it, and that name is eyeball-consistent
with "clockintr" and "clockrequest", so "clockqueue" it is.
|
|
time this file was introduced close to 30 years ago.
|
|
|
|
Currently, clockintr_establish() calls malloc(9) to allocate a
clockintr struct on behalf of the caller. mpi@ says this behavior is
incompatible with dt(4). In particular, calling malloc(9) during the
initialization of a PCB outside of dt_pcb_alloc() is (a) awkward and
(b) may conflict with future changes/optimizations to PCB allocation.
To side-step the problem, this patch changes the clockintr subsystem
to use caller-allocated clockintr structs instead of callee-allocated
structs.
clockintr_establish() is named after softintr_establish(), which uses
malloc(9) internally to create softintr objects. The clockintr subsystem
is no longer using malloc(9), so the "establish" naming is no longer apt.
To avoid confusion, this patch also renames "clockintr_establish" to
"clockintr_bind".
Requested by mpi@. Tweaked by mpi@.
Thread: https://marc.info/?l=openbsd-tech&m=170597126103504&w=2
ok claudio@ mlarkin@ mpi@
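A hedged sketch of the caller-allocated pattern; the real clockintr_bind() prototype lives in <sys/clockintr.h> and may differ from the shape assumed here:

    /* The caller owns the storage, e.g. embedded in its own per-CPU state. */
    struct clockintr my_clockintr;

    /*
     * Assumed call shape, for illustration only: bind the handle to a CPU
     * and a callback instead of having the subsystem malloc(9) one for us.
     */
    clockintr_bind(&my_clockintr, ci, my_callback, my_arg);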
|
|
OK miod@
|
|
descriptor (pted) pool in the arm64 pmap implementation. This
significantly reduces the side-effects of lock contention on the kernel
map lock that is (incorrectly) translated into excessive page daemon
wakeups. This is not a perfect solution but it does lead to significant
speedups on machines with many CPU cores.
This requires adding a new pmap_init_percpu() function that gets called
at the point where the kernel is ready to set up the per-CPU pool caches.
Dummy implementations of this function are added for all non-arm64
architectures. Some other architectures can probably benefit from
providing an actual implementation that sets up per-CPU caches for
pmap pools as well.
ok phessler@, claudio@, miod@, patrick@
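A minimal sketch of the two flavours of the new hook; the pool name used here is assumed, not quoted from the tree:

    /* Dummy hook for architectures without per-CPU pmap pool caches. */
    void
    pmap_init_percpu(void)
    {
    }

    /* arm64-style sketch: turn on per-CPU caching for the pted pool. */
    void
    pmap_init_percpu(void)
    {
            pool_cache_init(&pmap_pted_pool);
    }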
|
|
This patch isolates profil(2) and GPROF from statclock(). Currently,
statclock() implements both profil(2) and GPROF through a complex
mechanism involving both platform code (setstatclockrate) and the
scheduler (pscnt, psdiv, and psratio). We have a machine-independent
interface to the clock interrupt hardware now, so we no longer need to
do it this way.
- Move profil(2)-specific code from statclock() to a new clock
interrupt callback, profclock(), in subr_prof.c. Each
schedstate_percpu has its own profclock handle. The profclock is
enabled/disabled for a given CPU when it is needed by the running
thread during mi_switch() and sched_exit().
- Move GPROF-specific code from statclock() to a new clock interrupt
callback, gmonclock(), in subr_prof.c. Where available, each cpu_info
has its own gmonclock handle. The gmonclock is enabled/disabled for
a given CPU via sysctl(2) in prof_state_toggle().
- Both profclock() and gmonclock() have a fixed period, profclock_period,
that is initialized during initclocks().
- Export clockintr_advance(), clockintr_cancel(), clockintr_establish(),
and clockintr_stagger() via <sys/clockintr.h>. They have external
callers now.
- Delete pscnt, psdiv, psratio. From schedstate_percpu, also delete
spc_pscnt and spc_psdiv. The statclock frequency is not dynamic
anymore so these variables are now useless.
- Delete code/state related to the dynamic statclock frequency from
kern_clockintr.c. The statclock frequency can still be pseudo-random,
so move the contents of clockintr_statvar_init() into clockintr_init().
With input from miod@, deraadt@, and claudio@. Early revisions
cleaned up by claudio@. Early revisions tested by claudio@. Tested by
cheloha@ on amd64, arm64, macppc, octeon, and sparc64 (sun4v).
Compile- and boot- tested on i386 by mlarkin@. riscv64 compilation
bugs found by mlarkin@. Tested on riscv64 by jca@. Tested on
powerpc64 by gkoehler@.
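A rough sketch of the fixed-period callback shape described above; the callback signature and helper usage are approximations, not the committed code:

    /* Runs from the clock interrupt on each CPU that has it enabled. */
    void
    profclock(struct clockintr *cl, void *frame)
    {
            uint64_t count;

            /*
             * Re-arm with the fixed period chosen in initclocks(); the
             * return value is how many periods expired since the last run.
             */
            count = clockintr_advance(cl, profclock_period);

            /* ... sample the interrupted PC "count" times for profil(2) ... */
    }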
|
|
Every platform made the clockintr switch at least six months ago.
The __HAVE_CLOCKINTR symbol is now redundant. Remove it.
Prompted by claudio@.
Link: https://marc.info/?l=openbsd-tech&m=168826181015032&w=2
"makes sense" mlarkin@
|
|
ever ran on, and it's unlikely to ever be implemented, so remove it.
ok jsg@
|
|
"this makes sense" miod@
|
|
against classic BROP with a range-checking wrapper in front of copyin() and
copyinstr() which ensures the userland source doesn't overlap the main program
text, ld.so text, signal tramp text (its mapping is hard to distinguish
so it comes along for the ride), or libc.so text. ld.so tells the kernel
the libc.so text range with msyscall(2). The range checking for 2-4 elements is
done without locking (because all 4 ranges are immutable!) and is inexpensive.
write(sock, &open, 400) now fails with EFAULT. No programs have been
discovered which require reading their own text segments with a system call.
On a machine without mmu enforcement, a test program reports the following:
                 userland     kernel
ld.so            readable     unreadable
mmap xz          unreadable   unreadable
mmap x           readable     readable
mmap nrx         readable     readable
mmap nwx         readable     readable
mmap xnwx        readable     readable
main             readable     unreadable
libc unmapped?   readable     unreadable
libc mapped      readable     unreadable
ok kettenis, additional help from miod
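A hedged sketch of the check itself; the struct fields and helper shown here are hypothetical, standing in for the wrapper placed in front of copyin()/copyinstr():

    /* Return 0 if the userland range overlaps any immutable text range. */
    int
    copyin_range_ok(struct process *pr, vaddr_t uaddr, size_t len)
    {
            int i;

            /* 2-4 ranges, all immutable, so no locking is needed. */
            for (i = 0; i < pr->ps_ntextranges; i++) {
                    if (uaddr + len > pr->ps_textranges[i].tr_start &&
                        uaddr < pr->ps_textranges[i].tr_end)
                            return 0;       /* caller returns EFAULT */
            }
            return 1;
    }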
|
|
is ELF" world. Eliminate use of them in m88k code.
ok aoyama@
|
|
- Initialize tick_nsec during cpu_initclocks()
We have no control over the interrupt clock on luna88k, so this switch
is trivial.
Bringup help and testing from aoyama@ and miod@.
Link: https://marc.info/?l=openbsd-tech&m=166776371203450&w=2
ok aoyama@ mlarkin@
|
|
Use that define to shunt uvm_swapout_threads(), which is a noop when
pmap_collect() does nothing.
ok mpi@
|
|
ok miod@ guenther@
|
|
back in 2019.
ok mpi@
|
|
waiting on CPUs that didn't spin up. This will allow us to spin down
CPUs in the future to save power as well.
ok mpi@
|
|
Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each
platform for use by panic(9). The first panic on a given CPU writes
its message to this buffer. Subsequent panics on a given CPU print
the panic message to the console but do not modify the buffer. This
aids debugging in two cases:
- If 2+ CPUs panic simultaneously there is no risk of garbled messages
in the panic buffer.
- If a CPU panics and then the operator causes a second panic while
using ddb(4), the operator can still recall the first failure on
a particular CPU.
Misc. changes to support this bigger change:
- Set panicstr atomically to identify the first CPU to reach panic().
- Tweak db_show_panic_cmd() to print all panic messages across all
CPUs. Prefix the first panic with an asterisk ('*').
- Prefer db_printf() to printf() during a panic if we have it.
Apparently it disturbs less global state.
- On amd64, tweak fault() to write the local panic buffer. This needs
more work.
Prompted by bluhm@ and deraadt@. Mostly written by deraadt@.
Discussed with bluhm@, deraadt@ and kettenis@.
Borne from a discussion on tech@ about making panic(9) more MP-safe:
https://marc.info/?l=openbsd-tech&m=162086462316143&w=2
ok kettenis@, visa@, bluhm@, deraadt@
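A sketch of the first supporting change, claiming panicstr with an atomic compare-and-swap so only the first CPU to panic formats into its buffer (illustrative, not the committed code):

    /* Only the first caller swaps panicstr from NULL to its own buffer. */
    if (atomic_cas_ptr(&panicstr, NULL, ci->ci_panicbuf) == NULL)
            vsnprintf(ci->ci_panicbuf, sizeof(ci->ci_panicbuf), fmt, ap);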
|
|
This fixes compile errors (actually warnings) on m88k introduced by the
sys/net/pf.c revision 1.1116 changes.
Diff from Miod Vallat, tested on GENERIC and GENERIC.MP by me.
|
|
The "snowflake" uniqueness of every MD trap impl often gets in the way
of precisely & correctly interfacing to MI layers. The differences
also complicate review, and cause new MI requirements to be
incorrectly written. Thus an architecture will fall behind, not just
because it is slow or rare, but because the code behaviour becomes
increasingly incorrect. It is sad.
|
|
|
|
This diff exposes parts of clock_gettime(2) and gettimeofday(2) to
userland via libc, liberating processes from the need for a context
switch every time they want to count the passage of time.
If a timecounter clock can be exposed to userland then it needs to set
its tc_user member to a non-zero value. Tested with one or multiple
counters per architecture.
The timing data is shared through a pointer found in the new ELF
auxiliary vector AUX_openbsd_timekeep containing timehands information
that is frequently updated by the kernel.
Timing differences between the last kernel update and the current time
are adjusted in userland by the tc_get_timecount() function inside the
MD usertc.c file.
This permits a much more responsive environment, quite visible in
browsers, office programs and gaming (apparently one is able to fly
in Minecraft now).
Tested by robert@, sthen@, naddy@, kmos@, phessler@, and many others!
OK from at least kettenis@, cheloha@, naddy@, sthen@
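A heavily simplified sketch of the userland side: libc locates the shared page through the AUX_openbsd_timekeep aux vector entry and combines the kernel-maintained snapshot with a fresh counter read. The struct layout and helper names below are made up for illustration and do not match the real usertc.c/libc internals:

    /* Hypothetical shared-page layout. */
    struct timekeep_sketch {
            volatile unsigned int   tk_generation;  /* bumped by the kernel */
            struct timespec         tk_time;        /* time at last update */
            uint64_t                tk_counter;     /* counter at last update */
            uint64_t                tk_nsec_per_count;
    };

    int
    sketch_clock_gettime(struct timekeep_sketch *tk, struct timespec *tp)
    {
            struct timespec base;
            uint64_t delta;
            unsigned int gen;

            do {
                    gen = tk->tk_generation;
                    membar_consumer();
                    base = tk->tk_time;
                    delta = md_get_timecount() - tk->tk_counter; /* MD read */
                    membar_consumer();
            } while (gen == 0 || gen != tk->tk_generation);

            tp->tv_sec = base.tv_sec;
            tp->tv_nsec = base.tv_nsec + delta * tk->tk_nsec_per_count;
            while (tp->tv_nsec >= 1000000000) {
                    tp->tv_nsec -= 1000000000;
                    tp->tv_sec++;
            }
            return 0;
    }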
|
|
functionality is provided by <sys/stdarg.h> using compiler builtins.
Tested in a ports bulk build on amd64 by naddy@
OK naddy@ mpi@
|
|
rnd.c uses nanotime to get access to some bits that change quickly
between events that it can mix into the entropy pool. it doesn't
use nanotime to get a monotonically increasing set of ordered and
accurate timestamps, it just wants something with bits that change.
there's been discussions for years about letting rnd use a clock
that's super fast to read, but not necessarily accurate, but it
wasn't until recently that i figured out it wasn't interested in
time at all, so things like keeping a fast clock coherent between
cpu cores or correct according to ntp is unnecessary. this means we
can just let rnd read the cycle counters on cpus and things will
be fine. cpus with cycle counters that vary in their speed and
aren't kept consistent between cores may even be desirable in this
context.
so this is the first step in converting rnd.c to reading a cycle
counter. it copies the nanotime backend to each arch, and they can
replace it with something MD as a second step later on.
djm@ suggested rnd_messybytes, but we landed on cpu_rnd_messybits.
thanks to visa for his eyes.
ok deraadt@ visa@
deraadt@ says he will help handle any MD fallout that occurs.
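the first-step copy of the nanotime backend is tiny; roughly this shape (sketch, the exact bit folding may vary per arch):

    /* first step: just fold the bits of a high-resolution timestamp. */
    unsigned int
    cpu_rnd_messybits(void)
    {
            struct timespec ts;

            nanotime(&ts);
            return (ts.tv_nsec ^ (ts.tv_sec << 20) ^ (ts.tv_sec >> 12));
    }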
|
|
Nothing uses the header anymore.
OK deraadt@ mpi@
|
|
This will make mutex spinning time visible in top(1), and also might
improve stability.
The major change is that the old assembly code acquired mutexes with an
atomic exchange operation but released them with a regular store,
whereas the new code always uses atomic exchange operations.
The mutex.h macro changes conform to <sys/mutex.h> so that the system
can be reset while in ddb.
Suggested by Miod Vallat, tested by me. Stability under heavy load is
greatly improved in my case.
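A hedged sketch of the acquire/release idea using the generic atomics from <sys/atomic.h>; the acquire side is shown with a compare-and-swap so a losing CPU cannot clobber the owner field, and SPL handling is omitted (not the actual m88k code):

    void
    sketch_mtx_enter(struct mutex *mtx)
    {
            struct cpu_info *ci = curcpu();

            /* Spin until we atomically install ourselves as owner. */
            while (atomic_cas_ptr(&mtx->mtx_owner, NULL, ci) != NULL)
                    CPU_BUSY_CYCLE();
    }

    void
    sketch_mtx_leave(struct mutex *mtx)
    {
            /* Release with an atomic exchange rather than a plain store. */
            atomic_swap_ptr(&mtx->mtx_owner, NULL);
    }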
|
|
resetting it in child_return() and update the comment in tcb.h to reflect
reality
ok miod@ aoyama@
|
|
ci_mp_atomic_{begin,end} are the 6th and 7th elements of the cpu_info
structure. Actually that is a dummy structure used in the early boot
stage, but for consistency, move the ci_srp_hazards position in cpu_info.
ok mpi@
|
|
including cpu.h, machine/intr.h, etc. without first including param.h when
MULTIPROCESSOR is defined.
ok visa@
|
|
4MB which is far too low especially when the platform is able to run MP.
New limits are: amd64 = 256M; arm64, mips64, sparc64 = 64M; alpha, arm,
hppa, i386, powerpc = 32M; m88k, sh = 8M.
Still rather conservative numbers but much better than before. At least
some hangs of arm64 build boxes were caused by this.
OK kettenis@, visa@
|
|
The src/lib/libc/thread/rthread.c 1.8 change adds #include
<sys/atomic.h> in userland code.
The current m88k atomic.h contents are inside a #if defined(_KERNEL)
guard, so nothing is defined for userland programs. We therefore need
to add some defines to compile it on m88k.
The original diff was suggested by Miod Vallat and modified following
advice from mpi@ and kettenis@.
ok kettenis@
|
|
OK deraadt@ mpi@
|
|
ok visa@
|
|
needs (looking at you sgi, but others required this before). This is for
the circumstances where we need the page size known at compile time, not via
getpagesize() at runtime. Use it for malloc storage sizes, for shm, and to set
pthread stack default sizes. The stack sizes were a mess, and pushing them
towards page-aligned is a healthy move (which will also be needed by the
coming stack register checker).
ok guenther kettenis, discussion with stefan
|
|
ok kettenis@, visa@
|
|
Remove `mtx_lock' from i386, add volatile before `mtx_owner' where it
was missing.
Inputs from kettenis@, ok visa@
|
|
extend ddb(4) "ps /o" output to print which CPU is currently holding the
KERNEL_LOCK().
Tested by dhill@, ok visa@
|
|
pthread_exit from libpthread to libc, along with low-level bits to
support them. Major bump to both libc and libpthread.
Requested by libressl team. Ports testing by naddy@
ok kettenis@
|
|
|
|
in struct mdproc. With that, all archs have those and the __HAVE_MD_TCB
macro can be unifdef'ed as always defined.
ok kettenis@ visa@ jsing@
|