path: root/sys/kern/uipc_mbuf.c
Age Commit message [Author]
2019-06-10 add m_microtime for getting the wall clock time associated with a packet [David Gwynne]
if the packet has the M_TIMESTAMP csum_flag, ph_timestamp is added to the boottime clock, otherwise it just uses microtime().
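The branch described above can be sketched in plain C. This is a user-space illustration, not the kernel code: the `sketch_` types and names are hypothetical, the boot time and current time are passed in as parameters instead of being read from kernel clocks, and the nanosecond unit for `ph_timestamp` is an assumption.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
#define SKETCH_M_TIMESTAMP	0x01

struct sketch_timeval {
	int64_t tv_sec;
	int64_t tv_usec;
};

struct sketch_pkthdr {
	unsigned int csum_flags;
	uint64_t ph_timestamp;	/* assumed unit: nanoseconds since boot */
};

/*
 * Fill *tv with the wall clock time for a packet.  If the packet carries
 * a timestamp, add it to the boot time; otherwise fall back to "now".
 */
void
sketch_m_microtime(const struct sketch_pkthdr *ph,
    const struct sketch_timeval *boottime,
    const struct sketch_timeval *now, struct sketch_timeval *tv)
{
	if (ph->csum_flags & SKETCH_M_TIMESTAMP) {
		uint64_t ns = ph->ph_timestamp;

		tv->tv_sec = boottime->tv_sec + (int64_t)(ns / 1000000000ULL);
		tv->tv_usec = boottime->tv_usec +
		    (int64_t)((ns % 1000000000ULL) / 1000);
		/* Carry microsecond overflow into the seconds field. */
		if (tv->tv_usec >= 1000000) {
			tv->tv_sec++;
			tv->tv_usec -= 1000000;
		}
	} else {
		*tv = *now;
	}
}
```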
2019-02-10 revert revert revert. there are many other archs that use custom allocs. [Ted Unangst]
2019-02-10 make it possible to reduce kmem pressure by letting some pools use a more [Ted Unangst]
accommodating allocator. an interrupt safe pool may also be used in process context, as indicated by waitok flags. thanks to the garbage collector, we can always free pages in process context. the only complication is where to put the pages. solve this by saving the allocation flags in the pool page header so the free function can examine them. not actually used in this diff. (coming soon.) arm testing and compile fixes from phessler
2019-02-01 make m_pullup use the first mbuf with data to measure alignment. [David Gwynne]
this fixes an issue found by a regress test on sparc64 by claudio, and between us took about half a day of work to understand and fix at a2k19. ok claudio@
2019-01-09 Eliminate an else branch from m_extunref(). [Visa Hankala]
OK millert@ bluhm@
2019-01-08 If the mbuf cluster in m_zero() is read only, propagate the M_ZEROIZE [Alexander Bluhm]
flag to the other references. Then the final m_free() will clear the memory. OK claudio@
2019-01-07 It is possible to call m_zero with a read-only cluster. In that case just [Claudio Jeker]
return. Hopefully the other reference holder has the M_ZEROIZE flag set as well. Triggered by syzkaller. OK deraadt@ visa@ Reported-by: syzbot+c578107d70008715d41f@syzkaller.appspotmail.com
2018-11-30 Trivial MH_ALIGN/M_ALIGN to m_align conversions. [Claudio Jeker]
OK bluhm@
2018-11-12 Introduce m_align() a function that works like M_ALIGN() but works with [Claudio Jeker]
all types of mbufs. Also introduce some KASSERT in the m_*space() functions to ensure that no negative number is returned. This also introduces two internal macros M_SIZE() & M_DATABUF() which return the right size and start pointer of the mbuf data area. Use it in a few obvious places to simplify code. OK bluhm@
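The placement arithmetic behind an M_ALIGN()-style helper can be illustrated as follows. This is a sketch under assumptions, not the kernel implementation: `sketch_align_offset` is a hypothetical name, `bufsize` stands in for what M_SIZE() would return, the returned offset is relative to what M_DATABUF() would return, and the buffer start is assumed to be long-aligned.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the offset into a data buffer of bufsize bytes at which an
 * object of len bytes should start so that it sits as close to the end
 * of the buffer as possible while staying aligned to sizeof(long).
 */
size_t
sketch_align_offset(size_t bufsize, size_t len)
{
	/* Mirrors the idea of asserting that no negative space is computed. */
	assert(len <= bufsize);

	/* Round the remaining space down to a multiple of sizeof(long). */
	return (bufsize - len) & ~(sizeof(long) - 1);
}
```

Placing the object at the tail of the buffer leaves the leading space free for headers that are prepended later.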
2018-11-09 M_LEADINGSPACE() and M_TRAILINGSPACE() are just wrappers for [Claudio Jeker]
m_leadingspace() and m_trailingspace(). Convert all callers to call the functions directly and remove the defines. OK krw@, mpi@
2018-09-13 Add reference counting for inet pcb, this will be needed when we [Alexander Bluhm]
start locking the socket. An inp can be referenced by the PCB queue and hashes, by a pf mbuf header, or by a pf state key. OK visa@
2018-09-10 Instead of calculating the mbuf packet header length here and there, [Alexander Bluhm]
put the algorithm into a new function m_calchdrlen(). Also set an uninitialized m_len to 0 in NFS code. OK claudio@
2018-09-10 During fragment reassembly, mbuf chains with packet headers were [Alexander Bluhm]
created. Add a new function m_removehdr() to convert packet header mbufs within the chain to regular mbufs. Assert that the mbuf at the beginning of the chain has a packet header. found by Maxime Villard in NetBSD; from markus@; OK claudio@
2018-03-18 NULL deref on armv7 performing NFS, within 10 seconds. [Theo de Raadt]
Previous commit has no OK's or discussion about testing.
2018-03-13 make m_pullup skip over empty mbufs when finding the payload alignment. [David Gwynne]
2018-03-12 make m_adj keep m_data aligned when removing all the data in an mbuf. [David Gwynne]
previously it took a shortcut when emptying an mbuf by only setting m_len to 0, but leaving m_data alone. this interacts badly with m_pullup, which tries to maintain the alignment of the data payload. if there was a 14 byte ethernet header on its own that was m_adjed off, and then the stack wants an ip header, m_pullup would put the ip header on the ethernet header alignment, which is off by 2 bytes. found by stsp@ with pair(4) on sparc64. ok stsp@ too
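The 2 byte misalignment described above can be reproduced with a small model. This is only an illustration: `struct sketch_buf` and both functions are hypothetical, offsets are relative to a 4 byte aligned buffer start, and the "new" variant reflects the idea of advancing the data pointer, not the exact kernel code.

```c
#include <stddef.h>

/* Hypothetical model of one mbuf: where its data starts and how long it is. */
struct sketch_buf {
	size_t data_off;	/* offset of m_data from an aligned buffer start */
	size_t len;		/* bytes of data, like m_len */
};

/* Old shortcut: when emptying the buffer, only the length is reset. */
void
sketch_adj_old(struct sketch_buf *m, size_t n)
{
	if (n >= m->len) {
		m->len = 0;	/* data_off is left pointing at the old start */
	} else {
		m->data_off += n;
		m->len -= n;
	}
}

/* Fixed behaviour: always advance the data offset past the removed bytes. */
void
sketch_adj_new(struct sketch_buf *m, size_t n)
{
	if (n > m->len)
		n = m->len;
	m->data_off += n;
	m->len -= n;
}

/* Misalignment that a following header would inherit from this buffer. */
size_t
sketch_misalign(const struct sketch_buf *m)
{
	return m->data_off % 4;
}
```

With a 14 byte Ethernet header stored at offset 2, the old variant leaves the offset at 2, so an IP header placed on that alignment is off by 2 bytes. The new variant moves the offset to 16, which is 4 byte aligned.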
2018-01-16 garbage collect an unused variable [Sebastian Benoit]
ok dlg@
2017-12-29 Make sure that pf_mbuf_link_state_key() does not overwrite an [Alexander Bluhm]
existing statekey in the mbuf header. Reset the statekey in m_dup_pkthdr(). suggested by and OK sashan@
2017-12-29 Make the functions which link the pf state keys to mbufs, inpcbs, [Alexander Bluhm]
or other states more consistent. OK visa@ sashan@ on a previous version
2017-10-12 Move sysctl_mq() where it can safely mess with mbuf queue internals. [Martin Pieuchot]
ok visa@, bluhm@, deraadt@
2017-09-15 Coverity complained that the while loop at the end of m_adj() could [Alexander Bluhm]
dereference m if it is NULL. See CID 501458.
- Remove the m NULL check from the final for loop, it is not necessary. This cannot happen due to the length calculation. The inconsistent code caused the coverity issue.
- Move the m = mp close to all the loops where the mbuf chain is traversed.
- Use mp to access the m_pkthdr consistently.
- Move the next assignment from for (;;m = m->m_next) to the end of the loop to make it consistent to the previous for (;;) where the total length is calculated.
OK visa@ mpi@
2017-05-27 Put an assert that M_PKTHDR is set before accessing m_pkthdr in the [Alexander Bluhm]
mbuf functions. OK claudio@
2017-05-27 Refactor m_makespace() using MCLGETI to simplify the logic of this function. [Claudio Jeker]
Still quite complicated but more legible in the end and it will do less M_GET calls for huge packets. OK bluhm@
2017-05-08 add a compile time assertion MSIZE == sizeof(struct mbuf) [Ted Unangst]
ok kettenis mpi tom
2017-02-07 enable per cpu caches on the mbuf pools. [David Gwynne]
this didnt make sense previously since the mbuf pools had item limits that meant the cpus had to coordinate via a single counter to make sure the limit wasnt exceeded. mbufs are now limited by how much memory can be allocated for pages from the system. individual pool items are no longer counted and therefore do not have to be coordinated. ok bluhm@ as part of a larger diff.
2017-02-07 move the mbuf pools to m_pool_init and a single global memory limit [David Gwynne]
this replaces individual calls to pool_init, pool_set_constraints, and pool_sethardlimit with calls to m_pool_init. m_pool_init inits the mbuf pools with the mbuf pool allocator, and because of that doesnt set per pool limits. ok bluhm@ as part of a larger diff
2017-02-07 add m_pool_init(), a wrapper around pool_init for mbuf clusters. [David Gwynne]
m_pool_init is basically a call to pool_init with everything except the size and alignment specified, and a call to pool_set_constraints so the memory is always dma reachable. it also wires up the memory with the custom mbuf pool allocator. ok bluhm@ as part of a larger diff
2017-02-07 provide a custom pool page allocator for mbufs, but dont use it yet. [David Gwynne]
the custom allocator is basically a wrapper around the multi page pool allocator, but it has a single global memory limit managed by the wrapper. currently each of the mbuf pools has their own memory limit (or none in the case of the myx pool) independent of the other pools. this means each pool can allocate up to nmbclust worth of mbufs, rather than all of them sharing the one limit. wrapping the allocator like this means we can move to a single memory limit for all mbufs in the system. ok bluhm@ as part of a larger diff
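The wrapping idea can be sketched in user space as follows. Everything here is hypothetical: the `sketch_` names are invented for illustration, `malloc`/`free` stand in for the underlying multi page pool allocator, and the locking that a kernel allocator would need is left out.

```c
#include <stddef.h>
#include <stdlib.h>

/* One limit and one usage counter shared by every pool that uses the wrapper. */
size_t sketch_mbuf_mem_limit;
size_t sketch_mbuf_mem_alloc;

void
sketch_mbuf_set_limit(size_t limit)
{
	sketch_mbuf_mem_limit = limit;
}

/* Allocate pages for any mbuf pool, refusing once the global limit is hit. */
void *
sketch_m_page_alloc(size_t sz)
{
	void *p;

	if (sketch_mbuf_mem_alloc > sketch_mbuf_mem_limit ||
	    sz > sketch_mbuf_mem_limit - sketch_mbuf_mem_alloc)
		return NULL;

	p = malloc(sz);		/* stand-in for the wrapped allocator */
	if (p != NULL)
		sketch_mbuf_mem_alloc += sz;
	return p;
}

/* Return pages and give the memory back to the shared budget. */
void
sketch_m_page_free(void *p, size_t sz)
{
	free(p);
	sketch_mbuf_mem_alloc -= sz;
}
```

Because the accounting lives in the wrapper and not in each pool, a pool of small mbufs and a pool of large clusters draw from the same budget.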
2017-02-05 Always allocate counters memory using type M_COUNTERS. [Jeremie Courreges-Anglas]
This makes the API simpler, and is probably more useful than spreading counters memory over several types, making it harder to track. Prodded by mpi, ok mpi@ stsp@
2017-01-25 Clear the reference of the original mbuf chain after m_split()'ing [Martin Pieuchot]
an mbuf and properly initialize m_len. From FreeBSD via Imre Vadasz. ok bluhm@
2016-11-29 m_free() and m_freem() test for NULL. Simplify callers which had their own [Jonathan Gray]
NULL tests. ok mpi@
2016-11-09 Do not dereference a variable without initializing it beforehand. [Martin Pieuchot]
Fix a typo introduced in m_pullup(9) refactoring and found the hard way by semarie@ while testing another diff. ok mikeb@, dlg@
2016-10-27 refactor m_pullup a bit. [David Gwynne]
the most important change is that if the requested data is already in the first mbuf in the chain, return quickly. if that isnt true, the code will try to use the first mbuf to fit the requested data. if that isnt true, it will prepend an mbuf, and maybe a cluster, to fit the requested data. m_pullup will now try to maintain the alignment of the original payload, even when prepending a new mbuf for it. ok mikeb@
2016-10-27 add a new pool for 2k + 2 byte (mcl2k2) clusters. [David Gwynne]
a certain vendor likes to make chips that specify the rx buffer sizes in kilobyte increments. unfortunately it places the ethernet header on the start of the rx buffer, which means if you give it a mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos mcl2k clusters are always allocated on 2k boundaries (cos they pack into pages well). that in turn means the ip header wont be aligned correctly. the current workaround on these chips has been to let non-strict alignment archs just use the normal 2k cluster, but use whatever cluster can fit 2k + 2 on strict archs. that turns out to be the 4k cluster, meaning we waste nearly 2k of space on every packet. properly aligning the ethernet header and ip headers gives a performance boost, even on non-strict archs.
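The arithmetic behind the 2k + 2 size can be checked with a short sketch. The constants and function names are hypothetical; the 14 byte Ethernet header and 4 byte IP header alignment are standard values, and the 2048 byte rx buffer matches the "kilobyte increments" case described above.

```c
#include <stddef.h>

#define SKETCH_ETHER_HDR_LEN	14	/* Ethernet header length in bytes */
#define SKETCH_RXBUF_LEN	2048	/* rx buffer size the chip is told about */

/* How far off 4 byte alignment the IP header lands for a given buffer start. */
size_t
sketch_ip_hdr_misalign(size_t start)
{
	return (start + SKETCH_ETHER_HDR_LEN) % 4;
}

/* Does an rx buffer starting at offset start fit inside the cluster? */
int
sketch_rxbuf_fits(size_t cluster_len, size_t start)
{
	return start + SKETCH_RXBUF_LEN <= cluster_len;
}
```

Starting the buffer at offset 0 of a 2k cluster puts the IP header at offset 14, which is 2 bytes off. Starting at offset 2 fixes the alignment but needs 2050 bytes, which a 2k cluster cannot hold and a 4k cluster holds with 2046 bytes to spare.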
2016-10-24 avoid using realloc in the name of things that dont work like realloc. [David Gwynne]
cpumem_realloc and counters_realloc actually allocated new per cpu data for new cpus, they didnt resize the existing allocation. specifically, this renames cpumem_realloc to cpumem_malloc_ncpus, and counters_realloc to counters_alloc_ncpus. ok (and with some fixes by) bluhm@
2016-10-24 move the mbstat structure to percpu counters [David Gwynne]
each cpus counters still have to be protected by splnet, but this is better than a single set of counters protected by a global mutex. ok bluhm@
2016-10-10 white space fixes. [David Gwynne]
no functional change
2016-10-10 copy the offset of data inside mbufs in m_copym(). [David Gwynne]
this is cheap since it is basic math. it also means that payloads which have been aligned carefully will also be aligned in their copy. ok yasuoka@ claudio@
2016-09-15 all pools have their ipl set via pool_setipl, so fold it into pool_init. [David Gwynne]
the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl. most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand. the manpage and subr_pool.c bits i did myself. ok tedu@ jmatthew@
@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);
2016-09-15 we dont need m_copym0 with m_copym as a single wrapper, so merge them. [David Gwynne]
cos m_copym only does shallow copies, we can make the code do them unconditionally. for millert@
2016-09-15 remove m_copym2 as its use has been replaced by m_dup_pkt [David Gwynne]
ok millert@ mpi@ henning@ claudio@ markus@
2016-09-13 avoid extensive mbuf allocation for IPsec by replacing m_inject(4) [Markus Friedl]
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@
2016-09-03 Limit all mbuf cluster pools to the same memory size. Having limits [Alexander Bluhm]
by number would allow the large clusters to use too much memory. Set size of mclsizes array explicitly to keep it in sync with mclpools. OK claudio@
2016-06-13 On localhost a user program may create a socket splicing loop. [Alexander Bluhm]
After writing data into this loop, it was spinning forever causing a kernel hang. Detect the loop by counting how often the same mbuf is spliced. If that happens 128 times, assume that there is a loop and abort the splicing with ELOOP. Bug found by tedu@; OK tedu@ millert@ benno@
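The counting approach can be sketched like this. The struct, the field name, and the exact point at which the counter is compared are assumptions for illustration; only the threshold of 128 and the ELOOP error come from the commit message.

```c
#include <errno.h>

#define SKETCH_SPLICE_LOOP_MAX	128	/* threshold named in the commit message */

/* Hypothetical per-packet state: how often this packet has been spliced. */
struct sketch_pkt {
	unsigned int loopcnt;
};

/*
 * Account for one more splice of the same packet.  Returns 0 while the
 * packet may still be forwarded and ELOOP once it has been spliced
 * SKETCH_SPLICE_LOOP_MAX times, at which point splicing should be aborted.
 */
int
sketch_splice_once(struct sketch_pkt *p)
{
	if (++p->loopcnt >= SKETCH_SPLICE_LOOP_MAX)
		return ELOOP;
	return 0;
}
```

A fixed bound like this does not prove that a loop exists, but legitimate splicing setups are assumed never to pass the same data through that many hops.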
2016-05-23 remove the function pointer from mbufs. this memory is shared with data [Ted Unangst]
via unions, and we don't want to make it easy to control the target. instead an integer index into an array of acceptable functions is used. drivers using custom functions must register them to receive an index. ok deraadt
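A minimal sketch of the index-based scheme follows. The names, the table size, and the error handling are hypothetical and do not reproduce the kernel interface; the point is only that the buffer stores a small integer that is validated against a table of registered functions, so corrupted data cannot redirect the call to an arbitrary address.

```c
#include <stddef.h>

#define SKETCH_MAX_FREE_FNS	8	/* hypothetical table size */

typedef void (*sketch_free_fn)(void *buf, size_t size, void *arg);

/* Table of acceptable free functions; buffers only store an index into it. */
static sketch_free_fn sketch_free_fns[SKETCH_MAX_FREE_FNS];
static unsigned int sketch_num_free_fns;

/* Register a free function and return its index, or -1 if the table is full. */
int
sketch_free_register(sketch_free_fn fn)
{
	if (fn == NULL || sketch_num_free_fns >= SKETCH_MAX_FREE_FNS)
		return -1;
	sketch_free_fns[sketch_num_free_fns] = fn;
	return (int)sketch_num_free_fns++;
}

/* Call the registered function for idx; reject indexes never handed out. */
int
sketch_ext_free(unsigned int idx, void *buf, size_t size, void *arg)
{
	if (idx >= sketch_num_free_fns)
		return -1;
	sketch_free_fns[idx](buf, size, arg);
	return 0;
}

/* Example callback a driver might register: counts how often it was called. */
void
sketch_count_free(void *buf, size_t size, void *arg)
{
	(void)buf;
	(void)size;
	(*(int *)arg)++;
}
```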
2016-04-15 remove ml_filter, mq_filter, niq_filter. [David Gwynne]
theyre currently unused, so no functional change.
2016-04-08 add m_purge for freeing a list of mbufs linked via m_nextpkt [David Gwynne]
this tweaks m_freem so it returns the m_nextpkt from the mbuf it freed, like how m_free returns the m_next from the mbuf it frees. ok mpi@
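The return-the-next-pointer pattern can be illustrated with a stripped-down model. The struct and the `sketch_` functions are hypothetical, `malloc`/`free` replace the mbuf pools, and the global counter exists only so the behaviour can be observed.

```c
#include <stdlib.h>

/* Hypothetical two-link buffer: m_next chains a packet, m_nextpkt chains packets. */
struct sketch_mbuf {
	struct sketch_mbuf *m_next;
	struct sketch_mbuf *m_nextpkt;
};

unsigned int sketch_freed;	/* number of buffers freed so far */

struct sketch_mbuf *
sketch_get(struct sketch_mbuf *next, struct sketch_mbuf *nextpkt)
{
	struct sketch_mbuf *m = malloc(sizeof(*m));

	if (m != NULL) {
		m->m_next = next;
		m->m_nextpkt = nextpkt;
	}
	return m;
}

/* Free one buffer and return the next buffer of the same packet. */
struct sketch_mbuf *
sketch_free(struct sketch_mbuf *m)
{
	struct sketch_mbuf *n;

	if (m == NULL)
		return NULL;
	n = m->m_next;
	free(m);
	sketch_freed++;
	return n;
}

/* Free a whole packet and return the next packet on the list. */
struct sketch_mbuf *
sketch_freem(struct sketch_mbuf *m)
{
	struct sketch_mbuf *n;

	if (m == NULL)
		return NULL;
	n = m->m_nextpkt;
	do {
		m = sketch_free(m);
	} while (m != NULL);
	return n;
}

/* Free every packet on a list linked via m_nextpkt. */
void
sketch_purge(struct sketch_mbuf *m)
{
	while (m != NULL)
		m = sketch_freem(m);
}
```

Having the packet-level free return the following packet is what lets the purge loop be a single line.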
2016-04-06 correct the order of arguments to m_get in m_dup_pkt [David Gwynne]
2016-03-29 - packet must keep reference to statekey [Alexandr Nedvedicky]
this is the second attempt to get it in, the first attempt got backed out on Jan 31 2016 the change also contains fixes contributed by Stefan Kempf in earlier iteration. OK srhen@
2016-03-22 dont mix up the len and flags argument to MCLGETI in m_dup_pkt [David Gwynne]