it copies the existing pool code, except it works on pool_list
structures instead of pool_item structures.
after this i'd like to poison the words used by the TAILQ_ENTRY in
the pool_list struct that aren't used until a list of items is moved
into the global depot.

it makes it more readable, and fixes a bug in pool_list_put where it
was returning the next item in the current list rather than the next
list to be freed.

this is modelled on what's described in the "Magazines and Vmem:
Extending the Slab Allocator to Many CPUs and Arbitrary Resources"
paper by Jeff Bonwick and Jonathan Adams.
the main semantic borrowed from the paper is the use of two lists
of free pool items on each cpu, and only moving one of the lists
in and out of a global depot of free lists to mitigate against a
cpu thrashing against that global depot.
unlike slabs, pools do not maintain or cache constructed items,
which allows us to use the items themselves to build the free list
rather than having to allocate arrays to point at constructed pool
items.
the per cpu caches are built on top of the cpumem api.
this has been kicked a bit by hrvoje popovski and simon mages (thank you).
i'm putting it in now so it is easier to work on and test.
ok jmatthew@
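the put side of the two-list scheme looks roughly like this (a
minimal sketch; the struct fields and the pool_list_*/pool_depot_*
names are illustrative, not the kernel's actual interface):

struct pool;
struct pool_list;

int	pool_list_full(struct pool_list *);
void	pool_list_insert(struct pool_list *, void *);
struct pool_list *pool_list_alloc(struct pool *);
void	pool_depot_put(struct pool *, struct pool_list *);

struct pool_cache {
	struct pool_list *pc_actv;	/* gets and puts hit this list */
	struct pool_list *pc_prev;	/* the previously used list */
};

void
pool_cache_put(struct pool *pp, struct pool_cache *pc, void *item)
{
	struct pool_list *pl;

	if (pool_list_full(pc->pc_actv)) {
		if (pool_list_full(pc->pc_prev)) {
			/*
			 * both lists are full: push the previous list
			 * into the global depot and start a fresh one.
			 * only one of the two lists ever moves, so a
			 * cpu alternating between gets and puts does
			 * not thrash the depot.
			 */
			pool_depot_put(pp, pc->pc_prev);
			pc->pc_prev = pool_list_alloc(pp);
		}
		/* rotate so this put lands on a list with room */
		pl = pc->pc_actv;
		pc->pc_actv = pc->pc_prev;
		pc->pc_prev = pl;
	}
	pool_list_insert(pc->pc_actv, item);
}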

no need to wait until the first program using it breaks...
"could make sense" semarie@ (and thanks for the cluestick)
OK deraadt@

ncpus is used on half the architectures to indicate the number of
cpus that have been hatched, and is used there in things like ddb
to figure out how many cpus to shut down again.
ncpusfound is incremented during autoconf on MP machines to show
how big ncpus will probably become. percpu is initialised after
autoconf but before cpus are hatched, so this works well.

the most important change is that if the requested data is already
in the first mbuf in the chain, return quickly.
if that isn't true, the code will try to use the first mbuf to fit
the requested data.
if that isn't true, it will prepend an mbuf, and maybe a cluster,
to fit the requested data.
m_pullup will now try to maintain the alignment of the original
payload, even when prepending a new mbuf for it.
ok mikeb@
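the resulting flow, sketched (not the committed code; m_pullup_slow()
is a hypothetical helper standing in for the fit-or-prepend logic):

struct mbuf *
m_pullup(struct mbuf *m, int len)
{
	/* fast path: the requested data is already contiguous
	 * in the first mbuf */
	if (m->m_len >= len)
		return (m);

	/*
	 * slow path: try to fit the data into the first mbuf, or
	 * prepend an mbuf (and maybe a cluster), maintaining the
	 * alignment of the original payload.
	 */
	return (m_pullup_slow(m, len));
}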

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header at the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed because
mcl2k clusters are always allocated on 2k boundaries (because they
pack into pages well). that in turn means the ip header won't be
aligned correctly.
the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.
properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.
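the arithmetic, as a sketch (ETHER_HDR_LEN is 14 and ETHER_ALIGN
is 2; the function and variable names are illustrative):

void
example(char *cluster)
{
	char *buf, *ip;

	buf = cluster;			/* mcl2k: always 2k aligned */
	ip = buf + ETHER_HDR_LEN;	/* offset 14: ip header misaligned */

	buf = cluster + ETHER_ALIGN;	/* 2k + 2 cluster: hand the chip
					   a buffer starting 2 bytes in */
	ip = buf + ETHER_HDR_LEN;	/* offset 16: 4-byte aligned */
}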

cpumem_realloc and counters_realloc actually allocated new per cpu
data for new cpus; they didn't resize the existing allocation.
specifically, this renames cpumem_realloc to cpumem_malloc_ncpus, and
counters_realloc to counters_alloc_ncpus.
ok (and with some fixes by) bluhm@

each cpu's counters still have to be protected by splnet, but this
is better than a single set of counters protected by a global mutex.
ok bluhm@

Unify these by placing #ifdef MULTIPROCESSOR inside the functions, then
collapse further to reduce _KERNEL blocks
ok dlg
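the shape of the change, with a made-up function:

/* before: two copies of each function */
#ifdef MULTIPROCESSOR
void foo(void) { /* MP version */ }
#else
void foo(void) { /* UP version */ }
#endif

/* after: one copy, with the #ifdef inside */
void
foo(void)
{
#ifdef MULTIPROCESSOR
	/* MP-only work */
#endif
	/* common work */
}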

and redirect inet6 sockets to the ::1 flavor of localhost.

ok jsing@ kettenis@

ok jsing@ kettenis@

ok kettenis@ jsing@

ok deraadt@

from markus@

both the cpumem and counters apis simply allocate memory for each cpu
in the system that can be used for arbitrary per cpu data (via cpumem),
or a versioned set of counters per cpu (counters).
there is an alternate backend for uniprocessor systems that basically
turns the percpu data access into an immediate access to a single
allocation.
there is also support for percpu data structures that are available at
boot time by providing an allocation for the boot cpu. after autoconf,
these allocations have to be resized to provide for all cpus that were
enumerated by boot.
ok mpi@
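counters usage looks roughly like this (a sketch only; see percpu(9)
for the authoritative interface, and note the MYSTAT_* counter set
and the mystat_* functions are made up):

enum mystat { MYSTAT_PACKETS, MYSTAT_ERRORS, MYSTAT_NCOUNTERS };

static struct cpumem *mystats;

void
mystat_init(void)
{
	mystats = counters_alloc(MYSTAT_NCOUNTERS);
}

void
mystat_packet(void)
{
	struct counters_ref ref;
	uint64_t *c;

	/* get this cpu's version of the counters and bump one */
	c = counters_enter(&ref, mystats);
	c[MYSTAT_PACKETS]++;
	counters_leave(&ref, mystats);
}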

Make process_auxv_offset() take and release a reference to the vmspace
like process_domem() does.
ok kettenis@

powerpc: rename second argument of pmap_proc_iflush() to match other archs
ok kettenis@

ispidtaken() can rely on pgfind() for all pgrp checks and can simply
use zombiefind() for the zombie check
ok jca@

no functional change

this is cheap since it is basic math. it also means that payloads
which have been aligned carefully will also be aligned in their
copy.
ok yasuoka@ claudio@
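the math amounts to offsetting the new mbuf's data pointer by the
source payload's misalignment, something like this (illustrative
only, not the committed code; n is the new mbuf, m the original):

/* give the copy the same offset within its storage as the
 * original payload, so carefully aligned data stays aligned */
n->m_data += mtod(m, unsigned long) & (sizeof(long) - 1);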

are for option PTRACE only
ok kettenis@

have an splsoftassert(IPL_SOFTNET) now, so sowakeup() does not need
to call splsoftnet() anymore.
From mpi@'s netlock diff; OK mikeb@

ok deraadt@

all dns socket connections will be redirected to localhost:port.
this could be a sockopt on the listening socket, but sysctl is
an easier interface to work with right now.
ok deraadt

splsoftnet() if the function does a splsoftassert(IPL_SOFTNET)
anyway.

From mpi@'s netlock diff; OK mikeb@

set variables that will later be used as the size argument to free(9)
calls that may be reached with a NULL address. This should be harmless,
as free(9) returns early if the address is NULL without checking the
size. Initialise these variables before the call to ensure they are
never passed to another function uninitialised.
ok tedu@ millert@ deraadt@
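the pattern being fixed, as a hypothetical example (some_check() is
made up; free(9) in the kernel takes the allocation's size):

int
example(size_t n)
{
	char *buf = NULL;
	size_t len = 0;		/* initialised before it can reach free(9) */
	int error;

	error = some_check();
	if (error != 0)
		goto fail;

	len = n * sizeof(*buf);
	buf = malloc(len, M_TEMP, M_WAITOK);
	/* ... work that can also fail and jump to fail ... */
fail:
	free(buf, M_TEMP, len);	/* safe: NULL address, initialised size */
	return (error);
}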

Use a local variable struct process *pr to simplify expressions
ok deraadt@

ok mpi@ mikeb@

as noted by haesbaert, this is necessary to avoid deadlocks because
the scheduler can call back into the timeout subsystem while it's
holding its own locks.
this happened in two places. firstly, in softclock() it would take
timeout_mutex to find pending work. if that pending work needs a
process context, it would queue the work for the thread and call
wakeup, which enters the scheduler locks. if another cpu is trying
to tsleep (or msleep) with a timeout specified, the sleep code would
be holding the sched lock and call timeout_add, which takes
timeout_mutex.
this is solved by deferring the wakeup to after timeout_mutex is
left. this also has the benefit of reducing the number of wakeups
done per softclock tick.
secondly, the timeout worker thread takes timeout_mutex and calls
msleep when there's no work to do (ie, the queue is empty). msleep
will take the sched locks. again, if another cpu does a tsleep
with a timeout, you get a deadlock.
to solve this i'm using sleep_setup and sleep_finish to sleep on an
empty queue, which is safe to do outside the lock as it is comparisons
of the queue head pointers, not derefs of the contents of the queue.
as long as the sleeps and wakeups are ordered correctly with the
enqueue and dequeue operations under the mutex, this all works.
you can think of the queue as a single descriptor ring, and the
wakeup as an interrupt.
the second deadlock was identified by guenther@
ok tedu@ mpi@
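the softclock() side of the fix, as a rough sketch (the names only
approximate kern_timeout.c, and timeout_dequeue()/timeout_run()
stand in for the real queue handling):

void
softclock(void)
{
	struct timeout *to;
	int needsproc = 0;

	mtx_enter(&timeout_mutex);
	while ((to = timeout_dequeue()) != NULL) {
		if (to->to_flags & TIMEOUT_NEEDPROCCTX) {
			/* defer to the worker thread */
			CIRCQ_INSERT(&to->to_list, &timeout_proc);
			needsproc = 1;
		} else
			timeout_run(to);
	}
	mtx_leave(&timeout_mutex);

	/*
	 * wake the worker only after timeout_mutex is released, so
	 * the scheduler locks are never entered while holding it.
	 * this also means at most one wakeup per softclock tick.
	 */
	if (needsproc)
		wakeup(&timeout_proc);
}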

from Sebastien Marie

Years ago Theo made read(2) return 0 on directories, instead of dumping
the directory content. Another behavior is allowed as an extension by
POSIX, returning an EISDIR error, as used on a few other systems. This
behavior is deemed more useful as it helps spot errors, though it
might break some setups.
Ports bulk builds by ajacoutot@ and naddy@, ok millert@ bluhm@ naddy@
deraadt@
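a small userland demonstration of the new behaviour:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	char buf[1];
	int fd = open("/etc", O_RDONLY);

	if (fd == -1)
		return (1);
	/* previously returned 0; now fails with EISDIR */
	if (read(fd, buf, sizeof(buf)) == -1 && errno == EISDIR)
		printf("read(2) on a directory: EISDIR\n");
	close(fd);
	return (0);
}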

the comparison is always true.
ok jca@, tedu@

Add sysctl kern.allowkmem (default 0) which controls the ability to open
/dev/mem or /dev/kmem at securelevel > 0. Over 15 years we converted 99%
of utilities in the tree to operate on sysctl nodes (either by themselves
or via code hiding in the guts of -lkvm).
pstat -d and -v & procmap are affected and continued use of them will
require kern.allowkmem=1 in /etc/sysctl.conf. acpidump (and its
buddy sendbug) are affected, but we'll work out a solution soon.
There will be some impact in ports.
ok kettenis guenther

ok guenther

ok guenther

callbacks needing a process context.
The function timeout_set_proc(9) has to be used instead of timeout_set(9)
when a timeout callback needs a process context.
Note that if such a timeout is waiting, that is, sleeping, for a
non-negligible amount of time, it might delay other timeouts needing
a process context.
dlg@ agrees with this as a temporary solution.
Manpage tweaks from jmc@
ok kettenis@, bluhm@, mikeb@
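usage, sketched (struct mything and its functions are made up; see
timeout_set_proc(9)):

struct mything {
	struct timeout mt_tick;
};

void
mything_work(void *arg)
{
	/* runs in a process context, so it is allowed to sleep */
}

void
mything_start(struct mything *m)
{
	timeout_set_proc(&m->mt_tick, mything_work, m);
	timeout_add_sec(&m->mt_tick, 1);
}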

paths of libevent). This interface was the first generation of what
eventually became getentropy(2) and arc4random(3) -- june 1997!
Ports scan by sthen, general agreement guenther

we enter networking code. Fixes an splassert() found by David Hill.
OK mikeb@