path: root/sys/kern
Age | Commit message | Author
2016-12-19 | Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts | Martin Pieuchot
of the network stack that are not yet ready to be executed in parallel or where new sleeping points are not possible. This first pass replaces all the entry points leading to ip_output(). This is done to avoid introducing new sleeping points when trying to acquire ART's write lock, needed when a new L2 entry is created via RT_RESOLVE. Inputs from and ok bluhm@, ok dlg@
2016-12-18 | Include sys/proc.h when compiled with SYSCALL_DEBUG to get access to | Patrick Wildt
struct proc. Also bump the printf of "code" to %ld and remove a few casts to long as register_t is always long. ok kettenis@
2016-11-29 | m_free() and m_freem() test for NULL. Simplify callers which had their own | Jonathan Gray
NULL tests. ok mpi@
2016-11-28 | Remove NULL checks before m_free{m,}(). | Martin Pieuchot
ok reyk@, rzalamena@
2016-11-23 | Some socket splicing tests on loopback hang with large mbufs and | Alexander Bluhm
reduced buffer size. If the send buffer size is less than the size of a single mbuf, it will never fit. So if the send buffer is empty, split the large mbuf and move only a part. OK claudio@
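The rule described in this commit can be sketched as a small decision function. This is only an illustration: `splice_move_len` and its parameters are hypothetical names, and the assumption that the moved part equals the free space in the send buffer is not stated in the commit message.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not the kernel's code: decide how many bytes of
 * the next mbuf to move into the send buffer.
 *   mlen     - bytes in the mbuf at the head of the receive buffer
 *   sb_space - free space in the send buffer
 *   sb_used  - bytes already queued in the send buffer
 */
static size_t
splice_move_len(size_t mlen, size_t sb_space, size_t sb_used)
{
	if (mlen <= sb_space)
		return mlen;		/* the whole mbuf fits */
	if (sb_used == 0)
		return sb_space;	/* empty buffer: split, move a part */
	return 0;			/* wait for the buffer to drain */
}
```

Without the middle case, an mbuf larger than the whole send buffer would wait forever, which matches the hang the commit describes.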
2016-11-22 | Enforce that ifioctl() is called at IPL_SOFTNET. | Martin Pieuchot
This will allow us to keep locking simple as soon as we trade splsoftnet() for a rwlock. ok bluhm@
2016-11-22 | Enforce that pr_ctlinput, pr_slowtimo and pr_fasttimo are called | Martin Pieuchot
at IPL_SOFTNET. This will allow us to keep locking simple as soon as we trade splsoftnet() for a rwlock. ok bluhm@
2016-11-22 | Enforce that pr_ctloutput is called at IPL_SOFTNET. | Martin Pieuchot
This will allow us to keep locking simple as soon as we trade splsoftnet() for a rwlock. ok bluhm@
2016-11-21 | Kill rtioctl() stub, returning EOPNOTSUPP since tree import. | Martin Pieuchot
ok jsg@
2016-11-21 | Enforce that pr_usrreq functions are called at IPL_SOFTNET. | Martin Pieuchot
This will allow us to keep locking simple as soon as we trade splsoftnet() for a rwlock. ok bluhm@, claudio@
2016-11-21 | let pool page allocators advertise what sizes they can provide. | David Gwynne
to keep things concise i let the multi page allocators provide multiple sizes of pages, but this feature was implicit inside pool_init and only usable if the caller of pool_init did not specify a page allocator. callers of pool_init can now supply a page allocator that provides multiple page sizes. pool_init will try to fit 8 items onto a page still, but will scale its page size down until it fits into what the allocator provides. supported page sizes are specified as a bit field in the pa_pagesz member of a pool_allocator. setting the low bit in that word indicates that the pages can be aligned to their size.
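A minimal sketch of such a bit field, assuming power-of-two page sizes where each set bit advertises the size equal to its value and bit 0 is the alignment flag. The names `PGSZ_ALIGNED`, `PGSZ_MIN`, `pgsz_range` and `pick_pagesz` are made up for this example and are not the kernel's identifiers; the 4096-byte starting size is also an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define PGSZ_ALIGNED	0x1UL	/* hypothetical: pages aligned to their size */
#define PGSZ_MIN	4096UL	/* assumed starting page size */

/* Advertise every power-of-two size from min to max inclusive. */
static unsigned long
pgsz_range(unsigned long min, unsigned long max, unsigned long flags)
{
	return (((max | (max - 1)) & ~(min - 1)) | flags);
}

/*
 * Grow the page size until 8 items fit, then scale it down until the
 * allocator advertises it. Returns 0 if no advertised size is found.
 */
static unsigned long
pick_pagesz(unsigned long supported, size_t itemsize)
{
	unsigned long pgsz = PGSZ_MIN;

	assert(itemsize > 0);
	while (itemsize * 8 > pgsz)
		pgsz <<= 1;
	while (pgsz > 1 && (supported & pgsz) == 0)
		pgsz >>= 1;
	return (pgsz > 1) ? pgsz : 0;	/* bit 0 is a flag, not a size */
}
```

For example, an allocator advertising 4k through 64k pages would give a pool of 16k items a 64k page, even though 8 items would need 128k.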
2016-11-15 | Bring back the SB_LOCK and SB_WANT flags to lock the socket buffers | Alexander Bluhm
in process context. The read/write lock introduced in rev 1.64 would create lock ordering problems with the upcoming SOCKET_LOCK() mechanism. The current tsleep() in sblock() must be replaced with rwsleep(&socketlock) later. The sb_flags are protected by KERNEL_LOCK(). They must not be accessed from interrupt context, but nowadays softnet() is not an interrupt anyway. OK mpi@
2016-11-14 | Automatically create a default lo(4) interface per rdomain. | Martin Pieuchot
In order to stop abusing lo0 for all rdomains, a new loopback interface will be created every time a rdomain is created. The unit number will be the same as the rdomain, i.e. lo1 will be attached to rdomain 1. If this loopback interface is already in use it won't be possible to create the corresponding rdomain. In order to know which lo(4) interface is attached to a rdomain, its index is stored in the rtable/rdomain map. This is long overdue since the introduction of rtable/rdomain. It also fixes a recent regression due to resetting the rdomain of an incoming packet reported by semarie@, Andreas Bartelt and Nils Frohberg. ok claudio@
2016-11-14 | Remove splnet() from socket kqueue code. | Martin Pieuchot
splnet() was necessary when link state changes were executed from hardware interrupt handlers. Nowadays all the changes are serialized by the KERNEL_LOCK(), so assert that it is held instead. ok mikeb@
2016-11-13 | Fix typo in comment: it's vm.loadavg, not kern.loadavg. | Theo Buehler
From patrick keshishian
2016-11-11 | Export p_cpuid via sysctl for all processes; ok guenther | Mike Belopuhov
2016-11-09 | Do not call splsoftnet() recursively, this won't work with a lock. | Martin Pieuchot
closef() on a socket will call soclose() which calls splsoftnet(). So make sure we release the IPL level first in error paths. Found by Nils Frohberg while testing another diff. ok mikeb@, bluhm@
2016-11-09 | Do not dereference a variable without initializing it beforehand. | Martin Pieuchot
Fix a typo introduced in m_pullup(9) refactoring and found the hard way by semarie@ while testing another diff. ok mikeb@, dlg@
2016-11-07 | rename some types and functions to make the code easier to read. | David Gwynne
pool_item_header is now pool_page_header. the more useful change is pool_list is now pool_cache_item. that's what items going into the per cpu pool caches are cast to, and they get linked together to make a list. the functions operating on what is now pool_cache_items have been renamed to make it more obvious what they manipulate.
2016-11-07 | Split PID from TID, giving processes a PID unrelated to the TID of their | Philip Guenther
initial thread ok jsing@ kettenis@
2016-11-02 | poison the TAILQ_ENTRY in items in the per cpu pool cache. | David Gwynne
2016-11-02 | add poisoning of items on the per cpu caches. | David Gwynne
it copies the existing pool code, except it works on pool_list structures instead of pool_item structures. after this i'd like to poison the words used by the TAILQ_ENTRY in the pool_list struct that aren't used until a list of items is moved into the global depot.
2016-11-02 | use a TAILQ to maintain the list of item lists used by the percpu code. | David Gwynne
it makes it more readable, and fixes a bug in pool_list_put where it was returning the next item in the current list rather than the next list to be freed.
2016-11-02 | add per cpu caches for free pool items. | David Gwynne
this is modelled on what's described in the "Magazines and Vmem: Extending the Slab Allocator to Many CPUs and Arbitrary Resources" paper by Jeff Bonwick and Jonathan Adams. the main semantic borrowed from the paper is the use of two lists of free pool items on each cpu, and only moving one of the lists in and out of a global depot of free lists to mitigate against a cpu thrashing against that global depot. unlike slabs, pools do not maintain or cache constructed items, which allows us to use the items themselves to build the free list rather than having to allocate arrays to point at constructed pool items. the per cpu caches are built on top of the cpumem api. this has been kicked a bit by hrvoje popovski and simon mages (thank you). i'm putting it in now so it is easier to work on and test. ok jmatthew@
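The two-list scheme can be sketched roughly as follows. This is a simplified, single-threaded illustration and not the kernel's implementation: every name here (`struct item`, `struct depot`, `struct cpu_cache`, `cache_get`, `cache_put`, `LIST_MAX`) is hypothetical, and the locking a real depot would need is left out.

```c
#include <stddef.h>

#define LIST_MAX 4	/* items per list; arbitrary for this sketch */

/* A free item doubles as a list node, so no side arrays are needed. */
struct item {
	struct item *next;	/* next free item in the same list */
	struct item *nextlist;	/* next list in the depot (head item only) */
	unsigned int nitems;	/* list length (head item only) */
};

struct depot { struct item *lists; };		/* global; needs a lock */
struct cpu_cache { struct item *actv, *prev; };	/* one per cpu */

static struct item *
cache_get(struct cpu_cache *cc, struct depot *d)
{
	struct item *it = cc->actv;

	if (it == NULL && cc->prev != NULL) {
		it = cc->prev;		/* fall back to the previous list */
		cc->prev = NULL;
	}
	if (it == NULL && d->lists != NULL) {
		it = d->lists;		/* take one whole list from the depot */
		d->lists = it->nextlist;
	}
	if (it == NULL)
		return NULL;		/* caller uses the regular allocator */

	if (it->next != NULL)		/* the next item becomes the head */
		it->next->nitems = it->nitems - 1;
	cc->actv = it->next;
	return it;
}

static void
cache_put(struct cpu_cache *cc, struct depot *d, struct item *it)
{
	unsigned int n = (cc->actv != NULL) ? cc->actv->nitems : 0;

	if (n >= LIST_MAX) {
		if (cc->prev != NULL) {	/* retire prev to the depot */
			cc->prev->nextlist = d->lists;
			d->lists = cc->prev;
		}
		cc->prev = cc->actv;	/* rotate the full list out */
		cc->actv = NULL;
		n = 0;
	}
	it->next = cc->actv;
	it->nitems = n + 1;
	it->nextlist = NULL;
	cc->actv = it;
}
```

Because a cpu only touches the depot when both of its lists are empty or its active list is full, a workload that alternates gets and puts around a list boundary stays on the local lists.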
2016-10-27 | For consistency, allow symlinkat(2) in the same way as symlink(2); | Ingo Schwarze
no need to wait until the first program using it breaks... "could make sense" semarie@ (and thanks for the cluestick) OK deraadt@
2016-10-27 | use ncpusfound to size the percpu allocations. | David Gwynne
ncpus is used on half the architectures to indicate the number of cpus that have been hatched, and is used on them in things like ddb to figure out how many cpus to shut down again. ncpusfound is incremented during autoconf on MP machines to show how big ncpus will probably become. percpu is initted after autoconf but before cpus are hatched, so this works well.
2016-10-27 | refactor m_pullup a bit. | David Gwynne
the most important change is that if the requested data is already in the first mbuf in the chain, return quickly. if that isn't true, the code will try to use the first mbuf to fit the requested data. if that isn't true, it will prepend an mbuf, and maybe a cluster, to fit the requested data. m_pullup will now try to maintain the alignment of the original payload, even when prepending a new mbuf for it. ok mikeb@
2016-10-27 | add a new pool for 2k + 2 byte (mcl2k2) clusters. | David Gwynne
a certain vendor likes to make chips that specify the rx buffer sizes in kilobyte increments. unfortunately it places the ethernet header on the start of the rx buffer, which means if you give it a mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos mcl2k clusters are always allocated on 2k boundaries (cos they pack into pages well). that in turn means the ip header won't be aligned correctly. the current workaround on these chips has been to let non-strict alignment archs just use the normal 2k cluster, but use whatever cluster can fit 2k + 2 on strict archs. that turns out to be the 4k cluster, meaning we waste nearly 2k of space on every packet. properly aligning the ethernet header and ip headers gives a performance boost, even on non-strict archs.
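The alignment arithmetic behind the extra 2 bytes can be checked with a short sketch. The 14-byte ethernet header and the 2-byte ETHER_ALIGN pad are the usual values; `ip_hdr_aligned` and `cluster_waste` are hypothetical helpers written for this example, and the cluster base is assumed to be at least 4-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

#define ETHER_HDR_LEN	14	/* dst + src + ethertype */
#define ETHER_ALIGN	2	/* pad that puts the ip header on a 4 byte boundary */

/* Is the ip header 4 byte aligned when the frame starts at base + pad? */
static int
ip_hdr_aligned(uintptr_t base, size_t pad)
{
	return ((base + pad + ETHER_HDR_LEN) % 4) == 0;
}

/* Bytes left unused when a 2k rx buffer plus pad sits in a cluster. */
static size_t
cluster_waste(size_t cluster_size, size_t pad)
{
	return cluster_size - (2048 + pad);
}
```

With a 2k-aligned base and no pad the ip header lands at offset 14, which is not a multiple of 4. With the 2-byte pad it lands at offset 16, and the 2050 bytes needed leave 2046 bytes unused in a 4k cluster but none in a 2k + 2 cluster.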
2016-10-24 | avoid using realloc in the name of things that don't work like realloc. | David Gwynne
cpumem_realloc and counters_realloc actually allocated new per cpu data for new cpus, they didn't resize the existing allocation. specifically, this renames cpumem_realloc to cpumem_malloc_ncpus, and counters_realloc to counters_alloc_ncpus. ok (and with some fixes by) bluhm@
2016-10-24 | move the mbstat structure to percpu counters | David Gwynne
each cpu's counters still have to be protected by splnet, but this is better than a single set of counters protected by a global mutex. ok bluhm@
2016-10-24 | non-MP vs MP codepaths were confusingly split between the .c and .h file. | Theo de Raadt
Unify these by placing #ifdef MULTIPROCESSOR inside the functions, then collapse further to reduce _KERNEL blocks ok dlg
2016-10-23 | unbreak by fixing obvious pastos | Christian Weisgerber
2016-10-23 | handle non-INET6 kernels in some way | Theo de Raadt
2016-10-23 | dns hijacking must be af specific. move it into the port check function, | Ted Unangst
and redirect inet6 sockets to the ::1 flavor of localhost.
2016-10-22 | Factor out pr->ps_vmspace into a local variable for fill_kproc() | Philip Guenther
ok jsing@ kettenis@
2016-10-22 | Adjust allocpid() to take into account lastpid | Philip Guenther
ok jsing@ kettenis@
2016-10-22 | Delete dead copy of pr->ps_vmspace; uvmspace_exec() can change it anyway | Philip Guenther
ok kettenis@ jsing@
2016-10-21 | pledge changes needed to support pledging vmd(8) on i386, forthcoming. | Mike Larkin
ok deraadt@
2016-10-21 | consistently zero the allocated memory in both the MP and UP cases. | David Gwynne
from markus@
2016-10-21 | add generalised access to per cpu data structures and counters. | David Gwynne
both the cpumem and counters api simply allocates memory for each cpu in the system that can be used for arbitrary per cpu data (via cpumem), or a versioned set of counters per cpu (counters). there is an alternate backend for uniprocessor systems that basically turns the percpu data access into an immediate access to a single allocation. there is also support for percpu data structures that are available at boot time by providing an allocation for the boot cpu. after autoconf, these allocations have to be resized to provide for all cpus that were enumerated by boot. ok mpi@
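The general idea of per cpu counters can be illustrated with a small userland sketch. It is not the kernel API: `struct pcpu_counters` and the `pcpu_counters_*` functions are hypothetical, the cpu index is passed explicitly, and the versioning that lets a reader detect a concurrent update is omitted.

```c
#include <stdint.h>
#include <stdlib.h>

struct pcpu_counters {
	unsigned int ncpus;
	unsigned int ncounters;
	uint64_t *vals;		/* ncpus rows of ncounters values */
};

/* Allocate one zeroed row of counters for each cpu. */
static struct pcpu_counters *
pcpu_counters_alloc(unsigned int ncpus, unsigned int ncounters)
{
	struct pcpu_counters *pc = malloc(sizeof(*pc));

	if (pc == NULL)
		return NULL;
	pc->vals = calloc((size_t)ncpus * ncounters, sizeof(*pc->vals));
	if (pc->vals == NULL) {
		free(pc);
		return NULL;
	}
	pc->ncpus = ncpus;
	pc->ncounters = ncounters;
	return pc;
}

/* Each cpu only ever updates its own row. */
static void
pcpu_counters_inc(struct pcpu_counters *pc, unsigned int cpu, unsigned int idx)
{
	pc->vals[(size_t)cpu * pc->ncounters + idx]++;
}

/* A reader sums the rows of all cpus into out[0..ncounters-1]. */
static void
pcpu_counters_read(const struct pcpu_counters *pc, uint64_t *out)
{
	unsigned int cpu, i;

	for (i = 0; i < pc->ncounters; i++)
		out[i] = 0;
	for (cpu = 0; cpu < pc->ncpus; cpu++)
		for (i = 0; i < pc->ncounters; i++)
			out[i] += pc->vals[(size_t)cpu * pc->ncounters + i];
}
```

On a uniprocessor build the same interface could collapse to a single row, which is the kind of alternate backend the commit message mentions.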
2016-10-19 | Change process_{domem,auxv_offset}() to take a process instead of a proc. | Philip Guenther
Make process_auxv_offset() take and release a reference of the vmspace like process_domem() does. ok kettenis@
2016-10-19 | Change pmap_proc_iflush() to take a process instead of a proc | Philip Guenther
powerpc: rename second argument of pmap_proc_iflush() to match other archs ok kettenis@
2016-10-15 | Process groups can't be removed if a zombie process is in them, so | Philip Guenther
ispidtaken() can rely on pgfind() for all pgrp checks and can simply use zombiefind() for the zombie check ok jca@
2016-10-10 | white space fixes. | David Gwynne
no functional change
2016-10-10 | copy the offset of data inside mbufs in m_copym(). | David Gwynne
this is cheap since it is basic math. it also means that payloads which have been aligned carefully will also be aligned in their copy. ok yasuoka@ claudio@
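The "basic math" is an offset computation, sketched below on a toy buffer type. `struct buf` and `buf_copy` are hypothetical stand-ins and not the mbuf structures; the only point is that the copy's data pointer starts at the same offset from its storage as the source's.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a buffer whose payload may start past the storage base. */
struct buf {
	char	 store[256];
	char	*data;		/* points somewhere inside store */
	size_t	 len;		/* bytes of payload at data */
};

/* Copy the payload and keep its offset, so its alignment carries over. */
static void
buf_copy(struct buf *dst, const struct buf *src)
{
	size_t off = (size_t)(src->data - src->store);

	dst->data = dst->store + off;
	memcpy(dst->data, src->data, src->len);
	dst->len = src->len;
}
```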
2016-10-09 | With systrace and procfs gone, process_checkioperm() and process_domem() | Philip Guenther
are for option PTRACE only ok kettenis@
2016-10-09 | sowakeup() is only called from sorwakeup() and sowwakeup(). Both | Alexander Bluhm
have an splsoftassert(IPL_SOFTNET) now, so sowakeup() does not need to call splsoftnet() anymore. From mpi@'s netlock diff; OK mikeb@
2016-10-08 | upon further review, port numbers go all the way up to ushort max | Ted Unangst
2016-10-08 | initialize the port variable before sysctl, since it's also read out. | Ted Unangst
2016-10-08 | Add ktracing of the fds returned by pipe() and socketpair() | Philip Guenther
ok deraadt@