path: root/sys/kern/uipc_mbuf.c
Age | Commit message | Author
2017-02-07 | enable per cpu caches on the mbuf pools. | David Gwynne
this didnt make sense previously since the mbuf pools had item limits that meant the cpus had to coordinate via a single counter to make sure the limit wasnt exceeded. mbufs are now limited by how much memory can be allocated for pages from the system. individual pool items are no longer counted and therefore do not have to be coordinated. ok bluhm@ as part of a larger diff.
2017-02-07 | move the mbuf pools to m_pool_init and a single global memory limit | David Gwynne
this replaces individual calls to pool_init, pool_set_constraints, and pool_sethardlimit with calls to m_pool_init. m_pool_init inits the mbuf pools with the mbuf pool allocator, and because of that doesnt set per pool limits. ok bluhm@ as part of a larger diff
2017-02-07 | add m_pool_init(), a wrapper around pool_init for mbuf clusters. | David Gwynne
m_pool_init is basically a call to pool_init with everything except the size and alignment specified, and a call to pool_set_constraints so the memory is always dma reachable. it also wires up the memory with the custom mbuf pool allocator. ok bluhm@ as part of a larger diff
2017-02-07 | provide a custom pool page allocator for mbufs, but dont use it yet. | David Gwynne
the custom allocator is basically a wrapper around the multi page pool allocator, but it has a single global memory limit managed by the wrapper. currently each of the mbuf pools has their own memory limit (or none in the case of the myx pool) independent of the other pools. this means each pool can allocate up to nmbclust worth of mbufs, rather than all of them sharing the one limit. wrapping the allocator like this means we can move to a single memory limit for all mbufs in the system. ok bluhm@ as part of a larger diff
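The idea of wrapping a page allocator with one shared limit can be sketched in userland C. This is only an illustration under stated assumptions: `malloc` stands in for the multi page pool allocator, and the names `limit_set`, `limited_page_alloc`, `limited_page_free`, `mem_limit`, and `mem_inuse` are hypothetical, not the kernel's. It also has no locking, which a real kernel allocator would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names for illustration; not the kernel's allocator. */
static size_t mem_limit;	/* one budget shared by every pool, in bytes */
static size_t mem_inuse;	/* bytes currently handed out to all pools */

void
limit_set(size_t bytes)
{
	mem_limit = bytes;
}

/*
 * Wrap the underlying page allocator (malloc here, as a stand-in)
 * and refuse any request that would exceed the shared budget.
 */
void *
limited_page_alloc(size_t pgsize)
{
	void *v;

	if (mem_inuse > mem_limit || pgsize > mem_limit - mem_inuse)
		return (NULL);

	v = malloc(pgsize);
	if (v != NULL)
		mem_inuse += pgsize;
	return (v);
}

/* Return a page to the system and give its bytes back to the budget. */
void
limited_page_free(void *v, size_t pgsize)
{
	if (v == NULL)
		return;
	assert(pgsize <= mem_inuse);
	free(v);
	mem_inuse -= pgsize;
}
```

Because every pool would call the same wrapper, the pools draw on one budget instead of each being allowed its own full allotment.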
2017-02-05 | Always allocate counters memory using type M_COUNTERS. | Jeremie Courreges-Anglas
This makes the API simpler, and is probably more useful than spreading counters memory over several types, making it harder to track. Prodded by mpi, ok mpi@ stsp@
2017-01-25 | Clear the reference of the original mbuf chain after m_split()'ing | Martin Pieuchot
a mbuf and properly initialize m_len. From FreeBSD via Imre Vadasz. ok bluhm@
2016-11-29 | m_free() and m_freem() test for NULL. Simplify callers which had their own | Jonathan Gray
NULL tests. ok mpi@
2016-11-09 | Do not dereference a variable without initializing it beforehand. | Martin Pieuchot
Fix a typo introduced in m_pullup(9) refactoring and found the hard way by semarie@ while testing another diff. ok mikeb@, dlg@
2016-10-27 | refactor m_pullup a bit. | David Gwynne
the most important change is that if the requested data is already in the first mbuf in the chain, return quickly. if that isnt true, the code will try to use the first mbuf to fit the requested data. if that isnt true, it will prepend an mbuf, and maybe a cluster, to fit the requested data. m_pullup will now try to maintain the alignment of the original payload, even when prepending a new mbuf for it. ok mikeb@
2016-10-27 | add a new pool for 2k + 2 byte (mcl2k2) clusters. | David Gwynne
a certain vendor likes to make chips that specify the rx buffer sizes in kilobyte increments. unfortunately it places the ethernet header on the start of the rx buffer, which means if you give it a mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos mcl2k clusters are always allocated on 2k boundaries (cos they pack into pages well). that in turn means the ip header wont be aligned correctly. the current workaround on these chips has been to let non-strict alignment archs just use the normal 2k cluster, but use whatever cluster can fit 2k + 2 on strict archs. that turns out to be the 4k cluster, meaning we waste nearly 2k of space on every packet. properly aligning the ethernet header and ip headers gives a performance boost, even on non-strict archs.
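The alignment arithmetic behind the 2k + 2 size can be checked with a small sketch. It assumes the usual 14 byte ethernet header and a 2 byte ETHER_ALIGN pad; the helper names `ip_hdr_offset`, `is_aligned4`, and `cluster_size_needed` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define ETHER_HDR_LEN	14	/* dst mac + src mac + ethertype */
#define ETHER_ALIGN	2	/* pad that puts the ip header on a 4 byte boundary */
#define MCL2K		2048	/* rx buffer size the chip is told about */

/* Offset of the ip header from the start of the cluster. */
size_t
ip_hdr_offset(size_t frame_off)
{
	return (frame_off + ETHER_HDR_LEN);
}

/* Non-zero if the offset falls on a 4 byte boundary. */
int
is_aligned4(size_t off)
{
	return ((off % 4) == 0);
}

/* Cluster bytes needed so a buflen-sized rx buffer can start after the pad. */
size_t
cluster_size_needed(size_t buflen)
{
	return (buflen + ETHER_ALIGN);
}
```

With the frame at offset 0 the ip header lands at byte 14, which is not 4 byte aligned; starting the frame at offset 2 moves it to byte 16, and the cluster then has to hold 2048 + 2 bytes.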
2016-10-24 | avoid using realloc in the name of things that dont work like realloc. | David Gwynne
cpumem_realloc and counters_realloc actually allocated new per cpu data for new cpus, they didnt resize the existing allocation. specifically, this renames cpumem_realloc to cpumem_malloc_ncpus, and counters_realloc to counters_alloc_ncpus. ok (and with some fixes by) bluhm@
2016-10-24 | move the mbstat structure to percpu counters | David Gwynne
each cpus counters still have to be protected by splnet, but this is better than a single set of counters protected by a global mutex. ok bluhm@
2016-10-10 | white space fixes. | David Gwynne
no functional change
2016-10-10 | copy the offset of data inside mbufs in m_copym(). | David Gwynne
this is cheap since it is basic math. it also means that payloads which have been aligned carefully will also be aligned in their copy. ok yasuoka@ claudio@
2016-09-15 | all pools have their ipl set via pool_setipl, so fold it into pool_init. | David Gwynne
the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl. most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand. the manpage and subr_pool.c bits i did myself. ok tedu@ jmatthew@

@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);
2016-09-15 | we dont need m_copym0 with m_copym as a single wrapper, so merge them. | David Gwynne
cos m_copym only does shallow copies, we can make the code do them unconditionally. for millert@
2016-09-15 | remove m_copym2 as its use has been replaced by m_dup_pkt | David Gwynne
ok millert@ mpi@ henning@ claudio@ markus@
2016-09-13 | avoid extensive mbuf allocation for IPsec by replacing m_inject(4) | Markus Friedl
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@
2016-09-03 | Limit all mbuf cluster pools to the same memory size. Having limits | Alexander Bluhm
by number would allow the large clusters using too much memory. Set size of mclsizes array explicitly to keep it in sync with mclpools. OK claudio@
2016-06-13 | On localhost a user program may create a socket splicing loop. | Alexander Bluhm
After writing data into this loop, it was spinning forever causing a kernel hang. Detect the loop by counting how often the same mbuf is spliced. If that happens 128 times, assume that there is a loop and abort the splicing with ELOOP. Bug found by tedu@; OK tedu@ millert@ benno@
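The loop detection described above amounts to a per-packet counter with a cutoff. The following is a minimal sketch, assuming a hypothetical `struct pkt` with a `loopcnt` field and a hypothetical `splice_check()` helper; only the threshold of 128 and the ELOOP error come from the commit message, and the exact comparison the kernel uses may differ.

```c
#include <assert.h>
#include <errno.h>

#define SPLICE_LOOP_MAX	128	/* threshold named in the commit message */

/* Hypothetical stand-in for the per-packet state carried by an mbuf. */
struct pkt {
	unsigned int loopcnt;	/* how often this packet has been spliced */
};

/*
 * Count one more splice of this packet.  Returns 0 while the packet
 * may still be spliced, or ELOOP once it has been seen
 * SPLICE_LOOP_MAX times, which is taken as evidence of a loop.
 */
int
splice_check(struct pkt *p)
{
	if (++p->loopcnt >= SPLICE_LOOP_MAX)
		return (ELOOP);
	return (0);
}
```

A caller would abort the splice and report the error to the socket as soon as `splice_check()` returns non-zero, instead of spinning on the same data forever.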
2016-05-23 | remove the function pointer from mbufs. this memory is shared with data | Ted Unangst
via unions, and we don't want to make it easy to control the target. instead an integer index into an array of acceptable functions is used. drivers using custom functions must register them to receive an index. ok deraadt
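Replacing a stored function pointer with an index into a table of registered functions can be sketched as follows. All names here (`free_fn`, `free_fn_register`, `free_fn_call`, `example_free`, `MAX_FREE_FNS`) are hypothetical and chosen for illustration; the kernel's actual registration interface is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FREE_FNS	4	/* hypothetical table size */

typedef void (*free_fn)(void *buf, size_t len, void *arg);

static free_fn free_fns[MAX_FREE_FNS];	/* the only callable targets */
static unsigned int num_free_fns;

/* Register fn and return its index, or -1 if the table is full. */
int
free_fn_register(free_fn fn)
{
	if (fn == NULL || num_free_fns >= MAX_FREE_FNS)
		return (-1);
	free_fns[num_free_fns] = fn;
	return ((int)num_free_fns++);
}

/*
 * Call a registered function by index.  The bounds check means a
 * corrupted index can at worst select another registered function,
 * never an arbitrary address.
 */
int
free_fn_call(unsigned int idx, void *buf, size_t len, void *arg)
{
	if (idx >= num_free_fns)
		return (-1);
	free_fns[idx](buf, len, arg);
	return (0);
}

/* Example callback: counts invocations through arg. */
void
example_free(void *buf, size_t len, void *arg)
{
	(void)buf;
	(void)len;
	assert(arg != NULL);
	(*(unsigned int *)arg)++;
}
```

A driver would register its function once at attach time and store only the returned index alongside the buffer.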
2016-04-15 | remove ml_filter, mq_filter, niq_filter. | David Gwynne
theyre currently unused, so no functional change.
2016-04-08 | add m_purge for freeing a list of mbufs linked via m_nextpkt | David Gwynne
this tweaks m_freem so it returns the m_nextpkt from the mbuf it freed, like how m_free returns the m_next from the mbuf it frees. ok mpi@
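The pattern of a free function returning the next element, so a purge becomes a plain loop, can be sketched in userland C. The `struct buf` type and the `buf_new`, `buf_free`, `chain_free`, and `list_purge` names are hypothetical stand-ins for the mbuf structure, m_free, m_freem, and m_purge; this is not the kernel code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an mbuf. */
struct buf {
	struct buf *next;	/* next buffer in the same packet */
	struct buf *nextpkt;	/* first buffer of the next packet */
};

static unsigned int nfreed;	/* counts frees, for demonstration only */

struct buf *
buf_new(struct buf *next, struct buf *nextpkt)
{
	struct buf *b = malloc(sizeof(*b));

	assert(b != NULL);
	b->next = next;
	b->nextpkt = nextpkt;
	return (b);
}

/* Free one buffer and return the buffer that followed it. */
struct buf *
buf_free(struct buf *b)
{
	struct buf *n;

	if (b == NULL)
		return (NULL);
	n = b->next;
	free(b);
	nfreed++;
	return (n);
}

/* Free a whole packet and return the packet that followed it. */
struct buf *
chain_free(struct buf *b)
{
	struct buf *n;

	if (b == NULL)
		return (NULL);
	n = b->nextpkt;
	do {
		b = buf_free(b);
	} while (b != NULL);
	return (n);
}

/* Free every packet on a list linked via nextpkt. */
void
list_purge(struct buf *b)
{
	while (b != NULL)
		b = chain_free(b);
}
```

The next-packet pointer is read before the head buffer is freed, which is what lets the purge loop advance without touching freed memory.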
2016-04-06 | correct the order of arguments to m_get in m_dup_pkt | David Gwynne
2016-03-29 | - packet must keep reference to statekey | Alexandr Nedvedicky
this is the second attempt to get it in, the first attempt got backed out on Jan 31 2016. the change also contains fixes contributed by Stefan Kempf in earlier iteration. OK sthen@
2016-03-22 | dont mix up the len and flags argument to MCLGETI in m_dup_pkt | David Gwynne
2016-02-23 | provide m_dup_pkt() for doing fast deep mbuf copies with a specified alignment | David Gwynne
if a physical interface receives a multicast/broadcast packet and has carp interfaces on it, that packet needs to be copied for reception by each of those carp interfaces. previously it was using m_copym2, but that doesn't respect the alignment of the source packet. this meant the ip header in the copies were aligned incorrectly for the network stack, which breaks strict alignment archs. m_dup_pkt lets carp specify that the payload needs an ETHER_ALIGN adjustment, so the ip header inside will be aligned correctly.
reported and tested by anthony eden who hit this on armv7
i reproduced the problem on sparc64 and verified the fix on amd64 and sparc64
ok mpi@ mikeb@ deraadt@
2016-01-31 | - m_pkthdr.pf.statekey changes are not ready for 5.9, I must back them out | Alexandr Nedvedicky
OK sthen@
2016-01-07 | - retrying to commit earlier change, which got backed out | Alexandr Nedvedicky
- yet another tiny step towards MP PF. This time we need to make sure statekey attached to packet stays around, while accepted packet is routed through IP stack. this time I'm also bringing fix contributed by Stefan Kempf. Stefan's fix makes sure we grab reference in m_dup_pkthdr() OK bluhm@
2015-12-23 | revert previous: | Jasper Lievisse Adriaanse
----------------------------------------------------------------------
revision 1.961
date: 2015/12/22 13:33:26; author: sashan; state: Exp; lines: +153 -44; commitid: oBRhtWcDV0ThviVT;
- yet another tiny step towards MP PF. This time we need to make sure statekey attached to packet stays around, while accepted packet is routed through IP stack.
OK mpi@, henning@
----------------------------------------------------------------------

there have been multiple reports of KASSERT(!pf_state_key_isvalid(sk)) being triggered without much effort, so back this out for now.
2015-12-22 | - yet another tiny step towards MP PF. This time we need to make sure | Alexandr Nedvedicky
statekey attached to packet stays around, while accepted packet is routed through IP stack. OK mpi@, henning@
2015-11-21 | Retire ml_requeue(9) and mq_requeue(9). | Martin Pieuchot
As Kenjiro Cho pointed out, it is very hard to cancel a dequeue operation for some queueing disciplines when such a discipline keeps some internal state. As you can see, APIs can also Live Fast & Die Young. ok dlg@
2015-11-13 | Use ph_ prefix for tag-related fields. | Martin Pieuchot
ok dlg@
2015-11-12 | Prefix flowid with ph_ and print it in m_print(). | Martin Pieuchot
ok dlg@
2015-11-02 | provide ml_purge and mq_purge. | David Gwynne
these are modelled on IF_PURGE or IFQ_PURGE. they m_freem all the mbufs on an mbuf list or queue. ok jmatthew@ mpi@
2015-10-30 | Let m_resethdr() clear the whole mbuf packet header, not only the | Alexander Bluhm
pf part. This allows reusing this function in socket splicing. Reset the mbuf flags that are related to the packet header, but preserve the data flags. pair(4) tested by reyk@; sosplice(9) tested by bluhm@; OK mikeb@ reyk@
2015-10-30 | Add m_resethdr() to clear any state (pf, tags, flags) of an mbuf packet. | Reyk Floeter
Start using it in pair(4) to clear state on the receiving interface; m_resethdr() will also be used in other parts of the stack. OK bluhm@ mikeb@
2015-10-22 | rename ml_join to ml_enlist and expose it to the rest of the kernel. | David Gwynne
2015-08-14 | provide ml_requeue and mq_requeue for prepending mbufs on lists/queues | David Gwynne
ok mpi@ claudio@
2015-07-15 | m_free() can now accept NULL, as a normal free() function. This makes | Theo de Raadt
calling code simpler. ok stsp mpi
2015-06-16 | Store a unique ID, an interface index, rather than a pointer to the | Martin Pieuchot
receiving interface in the packet header of every mbuf. The interface pointer should now be retrieved when necessary with if_get(). If a NULL pointer is returned by if_get(), the interface has probably been destroyed/removed and the mbuf should be freed. Such a mechanism will simplify garbage collection of mbufs and limit problems with dangling ifp pointers. Tested by jmatthew@ and krw@, discussed with many. ok mikeb@, bluhm@, dlg@
2015-05-31 | If the first list was empty, ml_join() did not clear the second | Alexander Bluhm
list after transferring all elements away. Reorder the conditionals to make sure that ml_init() is always called for a non empty second list. This makes all cases consistent and is less surprising. OK dlg@
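One way to order the conditionals so the second list is always reset can be sketched as below. The `struct item`, `struct list`, `list_init`, and `list_join` names are hypothetical stand-ins for the mbuf list types and ml_init/ml_join; the actual kernel code may be structured differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for an mbuf and an mbuf list. */
struct item {
	struct item *next;
};

struct list {
	struct item *head;
	struct item *tail;
	unsigned int len;
};

void
list_init(struct list *l)
{
	l->head = l->tail = NULL;
	l->len = 0;
}

/*
 * Move every element of b onto the end of a.  The empty-b case is
 * handled first; after that, both the "a is empty" and "a is not
 * empty" branches fall through to the same list_init(b), so b is
 * reset no matter which branch ran.
 */
void
list_join(struct list *a, struct list *b)
{
	if (b->head == NULL)
		return;			/* nothing to move */

	if (a->head == NULL)
		a->head = b->head;
	else
		a->tail->next = b->head;

	a->tail = b->tail;
	a->len += b->len;

	list_init(b);
}
```

Checking the second list first leaves a single exit path for the non-empty case, which is what keeps the reset from being skipped.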
2015-04-13 | Now that if_input() sets the receiving interface pointer on mbufs for us | Martin Pieuchot
there's no need to do it in m_devget(9). Not passing an ``ifp'' will help with the upcoming interface pointer -> index conversion. While here remove unused ``ifp'' argument from m_clget(9) and kill two birds^W layer violations in one commit. ok henning@
2015-03-14 | Remove some includes include-what-you-use claims don't | Jonathan Gray
have any direct symbols used. Tested for indirect use by compiling amd64/i386/sparc64 kernels. ok tedu@ deraadt@
2015-02-07 | make mq_enlist drop mbufs if the queue length is exceeded. | David Gwynne
ok mpi@ claudio@ henning@ and more at s2k15
2015-02-07 | add ml_filter and mq_filter functions to the mbuf list and queue apis. | David Gwynne
this lets you run a filter function against each mbuf on a list or queue. if the filter matches on an mbuf, it can return non-zero to have ml_filter or mq_filter remove the mbuf and return it as part of a chain of mbufs. ok mpi@ claudio@ henning@ and s2k15 generally.
2014-12-11 | convert bcopy to memcpy/memmove. ok krw | Ted Unangst
2014-11-05 | change the mbuf pool wait channel name from mbpl to mbufpl. "mb" | David Gwynne
isnt descriptive enough for me. ok deraadt@
2014-10-03 | if you're adding the first cluster reference, you dont have to | David Gwynne
coordinate with other mbufs so you can add all the pointers without taking the extref lock. looks good deraadt@
2014-10-03 | i moved some macros into functions, and a trailing \ on a statement | David Gwynne
snuck in. someone who knows how cpp/cc works can explain to me why this compiled.