Age | Commit message | Author |
|
coordinate with other mbufs so you can add all the pointers without
taking the extref lock.
looks good deraadt@
|
|
snuck in.
someone who knows how cpp/cc works can explain to me why this
compiled.
|
|
of pools mpsafe too.
this calls pool_setipl(IPL_NET) against the mbuf and cluster pools,
and removes the use of splnet().
the other locking done in the mbuf layer is for external cluster
references. again, they relied on splnet to serialise these operations.
because there is no shared memory associated with external clusters
(except the cluster itself, which is completely dedicated to data
payload, not meta info like a refcount or lock), this has been
replaced with a single mutex that all reference ops are serialised
with.
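as a hedged sketch (the function name is hypothetical; the real code
is the mbuf ext ref macros), every reference op funnels through that
one mutex while the list pointers live in the mbufs themselves:

    #include <sys/mutex.h>
    #include <sys/mbuf.h>

    struct mutex extref_mtx = MUTEX_INITIALIZER(IPL_NET);

    void
    m_extref_add(struct mbuf *m, struct mbuf *n)
    {
            /* serialise every ext ref op with the single mutex */
            mtx_enter(&extref_mtx);
            n->m_ext.ext_nextref = m->m_ext.ext_nextref;
            n->m_ext.ext_prevref = m;
            m->m_ext.ext_nextref->m_ext.ext_prevref = n;
            m->m_ext.ext_nextref = n;
            mtx_leave(&extref_mtx);
    }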
tested by me, jmatthew@, bcook@, and phessler@
|
|
|
|
and km_free(9) calls.
ok tedu@, mlarkin@
|
|
ok dlg
|
|
|
|
deep down in the suspend path, where it is really hard to recover from
allocation failure. So allocate the piglet early on in the suspend path.
Also change the piglet and pig allocation functions to use km_alloc(9)
instead of doing pmemrange magic. This removes a bunch of code which, in the
case of the piglet allocation, is broken since it results in a NULL pointer
dereference. Also switch the piglet allocation to not wait. If we can't
allocate 16MB of phys contig memory on a halfway modern machine we're almost
certainly under a lot of memory pressure and we're better off not trying to
hibernate anyway.
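The allocation then looks roughly like this (a sketch; the size and the
exact km_alloc(9) modes are illustrative, not the real hibernate code):

    #include <uvm/uvm_extern.h>

    void *piglet_va;

    /* grab phys contig pages up front, and don't sleep for them */
    piglet_va = km_alloc(16 * 1024 * 1024, &kv_any, &kp_dma_contig,
        &kd_nowait);
    if (piglet_va == NULL)
            return (ENOMEM);        /* bail out of suspend early */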
ok mlarkin@
|
|
in pool_setlowat.
this was stopping arm things from getting spare items into their
pmap entry pools, so things that really needed them in a delicate
part of boot were failing.
reported by rapha@
co-debugging with miod@
|
|
and SPCF_HALTED; these flags only make sense on secondary CPUs, which are
unlikely to be present on an SP kernel.
ok kettenis@
|
|
for subr_poison.c will not get compiled at all on !DIAGNOSTIC kernels.
Found the hard way by deraadt@
|
|
least). after this i am confident that pools are mpsafe, ie, can
be called without the kernel biglock being held.
the page allocation and setup code has been split into four parts:
pool_p_alloc is called without any locks held to ask the pool_allocator
backend to get a page and page header and set up the item list.
pool_p_insert is called with the pool lock held to insert the newly
minted page on the pool's internal free page list and update its
internal accounting.
once the pool has finished with a page it calls the following:
pool_p_remove is called with the pool lock held to take the now
unnecessary page off the free page list and uncount it.
pool_p_free is called without the pool lock and does a bunch of
checks to verify that the items aren't corrupted and have all been
returned to the page before giving it back to the pool_allocator
to be freed.
instead of pool_do_get doing all the work for pool_get, it is now
only responsible for doing a single item allocation. if for any
reason it can't get an item, it just returns NULL. pool_get is now
responsible for checking if the allocation is allowed (according
to high watermarks etc), and for potentially sleeping waiting for
resources if required.
sleeping for resources is now built on top of pool_requests, which
are modelled on how the scsi midlayer schedules access to scsibus
resources.
the pool code now calls pool_allocator backends inside its own
calls to KERNEL_LOCK and KERNEL_UNLOCK, so users of pools don't
have to hold biglock to call pool_get or pool_put.
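the shape of the split is roughly this (a sketch, not the exact code):

    void *
    pool_get(struct pool *pp, int flags)
    {
            void *v;

            /* policy (watermarks etc) is checked here, not below */

            mtx_enter(&pp->pr_mtx);
            v = pool_do_get(pp, flags);     /* one attempt, may be NULL */
            mtx_leave(&pp->pr_mtx);

            if (v == NULL && ISSET(flags, PR_WAITOK)) {
                    /*
                     * queue a pool_request and sleep until items come
                     * back, like the scsi midlayer does for scsibus
                     * resources.
                     */
            }

            return (v);
    }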
tested by krw@ (who found a SMALL_KERNEL issue, thank you)
no one objected
|
|
|
|
on all relevant device hierarchies in the appropriate order. For now this
means mpath(4) and mainbus(4), doing mpath(4) before mainbus(4) when
suspending or powering down and doing mpath(4) after mainbus(4) when
resuming such that mpath(4) can rely on the underlying hardware being
in a functional state.
Fixes problems with unflushed disk caches on machines where mpath(4) takes
control of some of your disks.
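The resulting order is roughly this (a sketch; the function name is
assumed and NULL checks on the device handles are elided):

    void
    config_suspend_all(int act)
    {
            switch (act) {
            case DVACT_QUIESCE:
            case DVACT_SUSPEND:
            case DVACT_POWERDOWN:
                    config_suspend(device_mpath(), act);   /* flush first */
                    config_suspend(device_mainbus(), act);
                    break;
            case DVACT_RESUME:
                    config_suspend(device_mainbus(), act); /* hw up first */
                    config_suspend(device_mpath(), act);
                    break;
            }
    }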
ok dlg@
|
|
|
|
OK guenther@
|
|
|
|
No functional change as pid_t is defined as int32_t.
OK miod@
|
|
of EINVAL like other sysctl things do.
|
|
some pool users (eg, mbufs and mbuf clusters) protect calls to pools
with their own locks that operate at high spl levels, rather than
using pool_setipl() to have pools protect themselves.
this means the pool's mtx_enter doesn't necessarily prevent interrupts
that will use a pool, so we get code paths that try to mtx_enter
twice, which blows up.
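for reference, the pool_setipl() approach looks like this (a sketch,
using the mbuf pool as the example):

    pool_init(&mbpool, MSIZE, 0, 0, 0, "mbufpl", NULL);
    pool_setipl(&mbpool, IPL_NET);  /* pool's mutex now blocks net intrs */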
reported by vlado at bsdbg dot net and matt bettinger
diagnosed by kettenis@
|
|
|
|
poked by kspillner@
ok miod@
|
|
related to disk statistics for almost 17 years, and the remaining
userland-visible defines duplicate those found in <sys/sched.h>.
Move the remaining _KERNEL defines to <sys/tty.h> where they belong, and
update all users to cope with this.
ok kettenis@
|
|
ok mpi@ kspillner@
|
|
CIRCLEQ_* is deprecated and no longer used in the tree. The other queue types
have *_END macros which were added for symmetry with CIRCLEQ_END. They are
defined as NULL. There's no reason to keep the other *_END macro calls.
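For example, since TAILQ_END(head) is defined as NULL, a loop condition
like this:

    for (np = TAILQ_FIRST(&head); np != TAILQ_END(&head);
        np = TAILQ_NEXT(np, entries))

simply becomes:

    for (np = TAILQ_FIRST(&head); np != NULL;
        np = TAILQ_NEXT(np, entries))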
ok millert@
|
|
yield() if the cpu is marked SHOULDYIELD.
ok miod@ tedu@ phessler@
|
|
confirmation: it was only used for netiso, which was deleted a *decade* ago
ok mpi@ claudio@ ports scan by sthen@
|
|
no functional change.
|
|
ok miod@ mpi@
|
|
months that I broke it before the 5.5 release.
confirmed as not being required by ports by sthen@, ajacoutot@, dcoppa@
|
|
of pr_phoffset.
ok doug@ guenther@
|
|
|
|
cpu_info.
|
|
this moves the size of the pool page (not arch page) out of the
pool allocator into struct pool. this lets us create only two pools
for the automatically determined large page allocations instead of
256 of them.
while here support using slack space in large pages for the
pool_item_header by requiring km_alloc to provide pool page aligned
memory.
lastly, instead of doing incorrect math to figure out how many arch
pages to use for large pool pages, just use powers of two.
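the sizing then reduces to something like this (a sketch with a
hypothetical helper name):

    size_t
    pool_large_pgsz(size_t sz)
    {
            size_t pgsz;

            /* large pool pages are power of two multiples of PAGE_SIZE */
            for (pgsz = PAGE_SIZE; pgsz < sz; pgsz <<= 1)
                    continue;

            return (pgsz);
    }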
ok mikeb@
|
|
|
|
discussion, help and ok guenther@
|
|
provide the magic.
ok matthew@ dlg@
|
|
this should provide a degree of scan resistance, and also serves as a
midway point for further development of multi queue algorithms.
i've tried to minimize the risk and degree of regressions.
probably ok beck
|
|
|
|
|
|
when creating them: pipe2(), dup3(), accept4(), MSG_CMSG_CLOEXEC,
SOCK_CLOEXEC. Includes SOCK_NONBLOCK support.
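Typical use of the new interfaces (an illustrative snippet; lfd is an
assumed listening socket):

    #include <sys/socket.h>
    #include <fcntl.h>
    #include <unistd.h>

    int fds[2], nfd;

    /* descriptors are close-on-exec from birth, no fcntl() race */
    pipe2(fds, O_CLOEXEC);
    nfd = accept4(lfd, NULL, NULL, SOCK_CLOEXEC | SOCK_NONBLOCK);
    dup3(nfd, 10, O_CLOEXEC);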
ok matthew@
|
|
for the protective ones when creating a fake label, but do for the
system ones, so that we may eventually copy boot code to them.
From Markus Mueller
|
|
|
|
cut it out of the code to simplify things.
ok mikeb@
|
|
The interface has been disabled by default for about 4 years and
currently there's not much value in having it around at all.
ok deraadt
|
|
add an explicit rwlock around the global state (the pool list and serial
number) rather than rely on implicit process exclusion, splhigh and splvm.
the only things touching the global state come from process context so we
can get away with an rwlock instead of a mutex. thankfully.
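the shape of it (a sketch; list head and field names are hedged):

    struct rwlock pool_lock = RWLOCK_INITIALIZER("pools");

    /* eg, walking the global pool list, always from process context */
    struct pool *pp;

    rw_enter_read(&pool_lock);
    SIMPLEQ_FOREACH(pp, &pool_head, pr_poollist) {
            /* read pool state */
    }
    rw_exit_read(&pool_lock);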
ok matthew@
|
|
ok miod@, who has offered to help with any MD fallout
ok guenther@
|
|
and a count of the mbufs.
struct mbuf_list and the ml_foo() apis can be used to build lists of
mbufs where you don't need locking (eg, on the stack).
struct mbuf_queue and mq_foo() wrap mbuf_lists with a mutex, and
limit the number of mbufs that can be queued. they can be useful
for moving mbufs between contexts/subsystems.
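usage looks roughly like this (a sketch; rx_packet() and input() are
hypothetical producer/consumer helpers):

    #include <sys/mbuf.h>

    struct mbuf_list ml = MBUF_LIST_INITIALIZER();
    struct mbuf *m;

    /* build the list without any locks, eg on the stack */
    while ((m = rx_packet()) != NULL)
            ml_enqueue(&ml, m);

    while ((m = ml_dequeue(&ml)) != NULL)
            input(m);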
with help from jmc@ for the manpage bits
mpi@ is keen
|
|
containing an item when it's returned to the pool. this means you
need to do an inexact comparison between an item's address and the
page address, because a pool page can contain many items.
previously this used RB_FIND with a compare function that would do math
on every node comparison to see if one node (the key) was within the other
node (the tree element).
this cuts it over to using RB_NFIND to find the closest tree node
instead of the exact tree node. the node compares turn into simple
< and > operations, which inline very nicely with the RB_NFIND. the
constraint (an item must be within a page) is then checked only
once after the NFIND call.
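the lookup ends up shaped like this (a sketch; the tree ordering
details and field names are hedged, v is the item address):

    struct pool_item_header *ph, key;

    key.ph_page = v;
    ph = RB_NFIND(phtree, &pp->pr_phtree, &key);

    /* the containment constraint is checked once, after the nfind */
    if (ph == NULL || v < ph->ph_page ||
        v >= ph->ph_page + pp->pr_pgsize)
            panic("%s: item %p not on a page", __func__, v);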
feedback from matthew@ and tedu@
|
|
made it so struct pool was only visible to _KERNEL. tedu broke it
too when he added the size argument to the kernel free
functions.
this fixes both issues. the main change is to provide a local version of
struct pool with just the bit (pr_size) needed for extent to run.
if extents take advantage of more malloc/pool features (eg, {M,PR}_ZERO)
then this will need to be updated again.
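ie, something like this local stand-in (a sketch):

    /*
     * extent only calls pool_get/pool_put and only needs the item
     * size, so mirror just that field of the kernel's struct pool.
     */
    struct pool {
            size_t  pr_size;
    };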
found by and based on a diff from Theo Buehler
ok mpi@
|