path: root/sys/kern/init_main.c
Age  Commit message  Author
2017-04-28  Add futex(2) syscall based on a sane subset of its Linux equivalent.  Martin Pieuchot
The syscall is marked NOLOCK and only FUTEX_WAIT grabs the KERNEL_LOCK() because of PCATCH and the signal nightmare. Serialization of threads is currently done with a global & exclusive rwlock. Note that the current implementation still uses copyin(9), which is not guaranteed to be atomic. Committing now so that remaining issues can be addressed in-tree. With inputs from guenther@, kettenis@ and visa@. ok deraadt@, visa@
2017-04-20  Add a port of witness(4) lock validation tool from FreeBSD.  Visa Hankala
Go-ahead from kettenis@, guenther@, deraadt@
2017-03-06  domaininit() doesn't need splnet().  Martin Pieuchot
At this stage the scheduler isn't set up, which means the 'softnet' isn't running yet, so input packets aren't processed. Prodded by a question from guenther@, ok bluhm@
2017-02-12  Split up fork1():  Philip Guenther
- FORK_THREAD handling is a totally separate function, thread_fork(), that is only used by sys___tfork() and which loses the flags, func, arg, and newprocp parameters and gains tcb parameter to guarantee the new thread's TCB is set before the creating thread returns
- fork1() loses its stack and tidptr parameters

Common bits factor out:
- struct proc allocation and initialization moves to thread_new()
- maxthread handling moves to fork_check_maxthread()
- setting the new thread running moves to fork_thread_start()

The MD cpu_fork() function swaps its unused stacksize parameter for a tcb parameter.

luna88k testing by aoyama@, alpha testing by dlg@
ok mpi@
2017-01-21  p_comm is the process's command and isn't per thread, so move it from struct proc to struct process.  Philip Guenther
ok deraadt@ kettenis@
2017-01-01  copyright++;  Jonathan Gray
2016-11-14  Automatically create a default lo(4) interface per rdomain.  Martin Pieuchot
In order to stop abusing lo0 for all rdomains, a new loopback interface will be created every time a rdomain is created. The unit number will be the same as the rdomain, i.e. lo1 will be attached to rdomain 1. If this loopback interface is already in use it won't be possible to create the corresponding rdomain. In order to know which lo(4) interface is attached to a rdomain, its index is stored in the rtable/rdomain map. This is long overdue since the introduction of rtable/rdomain. It also fixes a recent regression due to resetting the rdomain of an incoming packet reported by semarie@, Andreas Bartelt and Nils Frohberg. ok claudio@
2016-11-07  Split PID from TID, giving processes a PID unrelated to the TID of their initial thread  Philip Guenther
ok jsing@ kettenis@
2016-10-24  move the mbstat structure to percpu counters  David Gwynne
each cpu's counters still have to be protected by splnet, but this is better than a single set of counters protected by a global mutex. ok bluhm@
2016-10-21  add generalised access to per cpu data structures and counters.  David Gwynne
both the cpumem and counters apis simply allocate memory for each cpu in the system that can be used for arbitrary per cpu data (via cpumem), or a versioned set of counters per cpu (counters). there is an alternate backend for uniprocessor systems that basically turns the percpu data access into an immediate access to a single allocation. there is also support for percpu data structures that are available at boot time by providing an allocation for the boot cpu. after autoconf, these allocations have to be resized to provide for all cpus that were enumerated by boot. ok mpi@
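A rough userland sketch of what "a versioned set of counters per cpu" could look like is given here. It is not the kernel's counters api: every name prefixed with `sketch_` and both constants are made up for illustration, the cpu index is passed in explicitly, and the memory barriers a real multiprocessor implementation needs are left out, so the code only shows the layout and the read protocol when run single-threaded.

```c
#include <stdint.h>

#define SKETCH_NCPUS     4	/* assumed cpu count, for illustration */
#define SKETCH_NCOUNTERS 2	/* assumed number of counters per cpu */

/* One slot per cpu: a generation number plus that cpu's counters. */
struct sketch_cpu_counters {
	uint64_t gen;			/* odd while an update is in progress */
	uint64_t c[SKETCH_NCOUNTERS];
};

static struct sketch_cpu_counters sketch_percpu[SKETCH_NCPUS];

/* Writer side: only ever touches the slot of the cpu it runs on. */
void
sketch_counter_add(unsigned int cpu, unsigned int idx, uint64_t v)
{
	struct sketch_cpu_counters *cc = &sketch_percpu[cpu];

	cc->gen++;		/* generation becomes odd: update started */
	cc->c[idx] += v;
	cc->gen++;		/* generation becomes even: update finished */
}

/* Reader side: take a consistent snapshot of each slot and sum them. */
void
sketch_counters_read(uint64_t out[SKETCH_NCOUNTERS])
{
	unsigned int cpu, i;

	for (i = 0; i < SKETCH_NCOUNTERS; i++)
		out[i] = 0;

	for (cpu = 0; cpu < SKETCH_NCPUS; cpu++) {
		struct sketch_cpu_counters *cc = &sketch_percpu[cpu];
		uint64_t snap[SKETCH_NCOUNTERS];
		uint64_t gen;

		do {
			gen = cc->gen;
			for (i = 0; i < SKETCH_NCOUNTERS; i++)
				snap[i] = cc->c[i];
			/* retry if a writer was active or finished meanwhile */
		} while ((gen & 1) || gen != cc->gen);

		for (i = 0; i < SKETCH_NCOUNTERS; i++)
			out[i] += snap[i];
	}
}
```

The point of the layout is that writers never share a cache line or a lock with other cpus; the cost of combining the values is pushed onto the (rare) reader.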
2016-09-22  Introduce a new 'softclock' thread that will be used to execute timeout callbacks needing a process context.  Martin Pieuchot
The function timeout_set_proc(9) has to be used instead of timeout_set(9) when a timeout callback needs a process context. Note that if such a timeout is waiting (that is, sleeping) for a non-negligible amount of time, it might delay other timeouts needing a process context. dlg@ agrees with this as a temporary solution. Manpage tweaks from jmc@ ok kettenis@, bluhm@, mikeb@
2016-09-18  add missing call to db_ctf_init().  Jasper Lievisse Adriaanse
this was part of the larger diff that was ok guenther@ mpi@, somehow I forgot to commit this particular piece.
2016-09-04  Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel profiling framework.  Martin Pieuchot
Code patching is used to enable probes when entering functions. The probes will call a mcount()-like function to match the behavior of a GPROF kernel. Currently only available on amd64 and guarded under DDBPROF. Support for other archs will follow soon. A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0 to be able to use this feature. Inputs and ok guenther@
2016-09-03  Write the system time back to the RTC every 30 minutes.  Christian Weisgerber
This fixes the problem that long-running machines which were not shut down properly would reboot with a badly offset system time. hints and ok kettenis@
2016-09-03  Do not reinitialize __guard_local if it is 0. This cannot be done anymore, since it is now RO.  Theo de Raadt
It is the bootloader's job to initialize it correctly. If the bootloader fails to perform that, you silently lose. The road to building an always-available rng is served by us depending on it :)
2016-09-02  move links from http to https://www.openbsd.org/  Theo Buehler
ok beck
2016-05-17  Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.  Alexander Bluhm
Permanently holding /dev/console open in the kernel works only until init(8) calls revoke(2). After that the console device vnode cannot be used anymore. It still resulted in a hanging init(8) if it tried to syslog(3) something. With the backout also dmesg -s works again.
2016-05-10  If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been started and before init(8) has opened the console, the kernel could crash as the console device has not been initialized.  Alexander Bluhm
Open /dev/console in the kernel before starting init(8) and keep it open. This way sendsyslog(2) can be called early in the system. OK beck@ deraadt@
2016-05-10  SROP mitigation. sendsig() stores a (per-process ^ &sigcontext) cookie inside the sigcontext.  Theo de Raadt
sigreturn(2) checks syscall entry was from the exact PC addr in the (per-process ASLR) sigtramp, verifies the cookie, and clears it to prevent sigcontext reuse. not yet tested on landisk, sparc, *88k, socppc. ok kettenis
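The cookie part of this scheme can be sketched in a few lines of plain C. This is only an illustration under stated assumptions: `struct sketch_sigcontext`, `sketch_sendsig()` and `sketch_sigreturn()` are hypothetical stand-ins rather than the kernel's structures or functions, the per-process secret is passed in as a parameter, and the check that the syscall came from the exact sigtramp PC is left out entirely.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, stripped-down signal context: only the cookie matters here. */
struct sketch_sigcontext {
	uintptr_t sc_cookie;
	/* saved registers would follow in a real sigcontext */
};

/* Signal delivery: bind the context to this process and to its own address. */
void
sketch_sendsig(struct sketch_sigcontext *sc, uintptr_t proc_secret)
{
	sc->sc_cookie = proc_secret ^ (uintptr_t)sc;
}

/*
 * Signal return: accept the context only if the cookie matches, then
 * clear it so the same context cannot be replayed.
 */
bool
sketch_sigreturn(struct sketch_sigcontext *sc, uintptr_t proc_secret)
{
	if (sc->sc_cookie != (proc_secret ^ (uintptr_t)sc))
		return false;	/* forged, copied elsewhere, or already used */
	sc->sc_cookie = 0;
	return true;
}
```

Because the address of the context is mixed into the cookie, a context copied to another location fails the check even if the attacker copies the cookie along with it, and clearing the cookie on success makes each context single-use.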
2016-05-03  Stop using a soft-interrupt context to process incoming network packets.  Martin Pieuchot
Use a new task that runs holding the KERNEL_LOCK to execute mp-unsafe code. Our current goal is to progressively move input functions to the unlocked task. This gives a small performance boost confirmed by Hrvoje Popovski's IPv4 forwarding measurement:

        before:                  after:
  send      receive        send      receive
  400kpps   400kpps        400kpps   400kpps
  500kpps   500kpps        500kpps   500kpps
  600kpps   600kpps        600kpps   600kpps
  650kpps   650kpps        650kpps   640kpps
  700kpps   700kpps        700kpps   700kpps
  720kpps   640kpps        720kpps   710kpps
  800kpps   640kpps        800kpps   650kpps
  1.4Mpps   570kpps        1.4Mpps   590kpps
  14Mpps    570kpps        14Mpps    590kpps

ok kettenis@, bluhm@, dlg@
2016-03-19  Remove the unused flags argument from VOP_UNLOCK().  natano
torture tested on amd64, i386 and macppc ok beck mpi stefan "the change looks right" deraadt
2016-01-03  copyright++;  Jonathan Gray
2015-12-11  Replace mountroothook_establish(9) by config_mountroot(9), a narrower API similar to config_defer(9).  Martin Pieuchot
ok mikeb@, deraadt@
2015-11-08  keep all the setperf timeout(9) handling in one place; ok tedu@  Christian Weisgerber
2015-10-07  Initialize the routing table before domains.  Martin Pieuchot
The routing table is not an optional component of the network stack and initializing it inside the "routing domain" requires some ugly introspection in the domain interface.

This puts the rtable* layer at the same level as the if* layer. These two subsystems are organized around the two global data structures used in the network stack:
- the global &ifnet list, to be used in process context only, and
- the routing table which can be read in interrupt context.

This change makes the rtable_* layer domain-aware and extends the "struct domain" such that INET, INET6 and MPLS can specify the length of the binary key used in lookups. This allows us to keep, or move towards, AF-free route and rtable layers.

While here stop the madness and pass the size of the maximum key length in *byte* to rn_inithead0().

ok claudio@, mikeb@
2015-08-30  Use a global table for domains instead of building a list at run time.  Martin Pieuchot
As a side effect there's no need to run if_attachdomain() after the list of domains has been built. ok claudio@, reyk@
2015-07-09  Disable pool_gc on m88k if MULTIPROCESSOR; we don't have enough volunteers for human sacrifices to get this fixed in a reasonably near future, and the tree must build.  Miod Vallat
2015-07-02  introduce srp, which according to the manpage i wrote is short for "shared reference pointers".  David Gwynne
srp allows concurrent access to a data structure by multiple cpus while avoiding interlocking cpu opcodes. it manages its own reference counts and the garbage collection of those data structures to avoid use after frees.

internally srp is a twisted version of hazard pointers, which are a relative of RCU. jmatthew wrote the bulk of a hazard pointer implementation and changed bpf to use it to allow mpsafe access to bpfilters. however, at s2k15 we were trying to apply it to other data structures but the memory overhead of every hazard pointer would have blown out significantly in several use cases. a bulk of our time at s2k15 was spent reworking hazard pointers into srp.

this diff adds the srp api and adds the necessary metadata to struct cpuinfo on our MP architectures. srp on uniprocessor platforms has alternate code that is optimised because it knows there'll be no concurrent access to data by multiple cpus. srp is made available to the system via param.h, so it should be available everywhere in the kernel.

the docs likely need improvement cos im too close to the implementation.

ok mpi@
2015-06-24  reenable the pool gc task.  David Gwynne
the problems it tickled by working outside the biglock on archs with mutex and clock interaction have been fixed, as evidenced by the softnet taskq. ok deraadt@
2015-05-18  Reenable the page zeroing thread on MP m88k kernels.  Miod Vallat
2015-05-05  emul_native is only used for kernel threads which can't dump core, so delete coredump_trad(), uvm_coredump(), cpu_coredump(), struct md_coredump, and various #includes that are superfluous.  Philip Guenther
This leaves compat_linux processes without a coredump callback. If that ability is desired, someone should update it to use coredump_elf32() and verify the results... ok kettenis@
2015-05-01  reenable page zeroing thread on SMP mips kernels.  Miod Vallat
2015-04-12  disable the pool gc. there are reports of strange lockups on various mp archs and this is the only interesting diff in the window.  David Gwynne
2015-04-07  introduce a garbage collector for (very) idle pool pages.  David Gwynne
now that idle pool pages are timestamped we can tell how long they've been idle. this adds a task that runs every second that iterates over all the pools looking for pages that have been idle for 8 seconds so it can free them. this idea probably came from a conversation with tedu@ months ago. ok tedu@ kettenis@
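One pass of such a collector might look roughly like the sketch below. It is a userland illustration, not the pool(9) code: `struct sketch_page` and `sketch_pool_gc()` are hypothetical, the pass covers the pages of a single pool rather than every pool, time is a plain count of seconds, and treating "idle for 8 seconds" as "at least 8" is an assumption.

```c
#include <stddef.h>

#define SKETCH_IDLE_SECS 8	/* idle threshold named in the commit message */

/* Hypothetical stand-in for a pool page header; not the kernel's struct. */
struct sketch_page {
	long idle_since;	/* timestamp (seconds) when the page became idle */
	int  nitems_in_use;	/* 0 means every item on the page is free */
	int  freed;		/* set once the page has been given back */
};

/* One pass of the collector; returns how many pages it released. */
size_t
sketch_pool_gc(struct sketch_page *pages, size_t npages, long now)
{
	size_t i, nfreed = 0;

	for (i = 0; i < npages; i++) {
		struct sketch_page *pg = &pages[i];

		if (pg->freed || pg->nitems_in_use != 0)
			continue;	/* already gone, or still busy */
		if (now - pg->idle_since < SKETCH_IDLE_SECS)
			continue;	/* idle, but not for long enough */
		pg->freed = 1;
		nfreed++;
	}
	return nfreed;
}
```

In the commit's design this pass would be driven by a task that fires once a second; here the caller supplies `now` so the behaviour can be checked without a clock.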
2015-02-10  Factor out the common bits of process_new() and main()'s code for setting up process0, 'cause I'm sick of forgetting to update main() when touching process_new()  Philip Guenther
ok blambert@ miod@
2015-02-09  Stop using USRSTACK as the edge of the stack, but rather use the vmspace vm_minsaddr or vm_maxsaddr, depending upon the direction the stack goes in.  Miod Vallat
This should have no effect on the existing behaviourrr. ok kettenis@ deraadt@
2015-01-19  unnecessary cmask variable; ok guenther  Theo de Raadt
2015-01-13  Many architectures call initmsgbuf() really really early, before uvm is initialized.  Mark Kettenis
Calling malloc(9) at that point is not a good idea. So initialize consbuf later. Fixes dmesg -s on sparc64 (and probably a few other architectures). ok miod@, deraadt@
2014-12-31  copyright_year=$(date +%Y)  Joel Sing
2014-12-28  The greatest happiness is to scatter inferiour APIs, to drive them before you, to see their files reduced to ashes, to see those who love them shrouded in tears, and to gather into your API all their invocations.  Kenneth R Westerback
In other words, workq is no more. There is only taskq. ok kettenis@ dlg@ (creator of taskq) jmc@
2014-12-17  Prefer MADV_* over POSIX_MADV_* in kernel for consistency: the latter doesn't have all the values and therefore can't be used everywhere.  Philip Guenther
ok deraadt@ kettenis@
2014-12-16  primary change: move uvm_vnode out of vnode, keeping only a pointer.  Ted Unangst
objective: vnode.h doesn't include uvm_extern.h anymore. followup changes: include uvm_extern.h or lock.h where necessary. ok and help from deraadt
2014-12-15  Use MAP_INHERIT_* for the 'inh' argument to the UVM_MAPFLAG() macro, eliminating the must-be-kept-in-sync UVM_INH_* macros  Philip Guenther
ok deraadt@ tedu@
2014-12-10  convert bcopy to memcpy. ok millert  Ted Unangst
2014-11-18  Disable the page zeroing thread on MULTIPROCESSOR mips64 kernels as well.  Miod Vallat
Regression spotted by tobiasu@.
XXX I wonder if the page zeroing thread shouldn't perform explicit
XXX pmap_update(pmap_kernel()) calls after each page zeroing... but that
XXX might not be enough.
2014-11-16  Replace a plethora of historical protection options with just PROT_NONE, PROT_READ, PROT_WRITE, and PROT_EXEC from mman.h.  Theo de Raadt
PROT_MASK is introduced as the one true way of extracting those bits. Remove UVM_ADV_* wrapper, using the standard names. ok doug guenther kettenis
2014-10-25  Do not launch the page zeroing thread on MULTIPROCESSOR m88k systems.  Miod Vallat
This causes a deadlock between reaper and zerothread I am currently investigating.
2014-10-17  redo the performance throttling in the kernel.  Ted Unangst
introduce a new sysctl, hw.perfpolicy, that governs the policy. when set to anything other than manual, hw.setperf then becomes read only. phessler was heading in this direction, but this is slightly different. :)
2014-10-13  disable pagezero thread on hppa, until failure gets diagnosed, ok miod kettenis  Theo de Raadt
2014-10-11  back out; does not even compile  Theo de Raadt