summaryrefslogtreecommitdiff
path: root/sys/net/if_var.h
AgeCommit message (Collapse)Author
2018-01-10get rid of struct carp_if by moving the srpl into struct ifnet if_carp.David Gwynne
currently carp uses a struct carp_if to hold an srp list head, which is accessed by both if_carp in struct ifnet, and via the if input handlers list. this gets rid of some indirection by making if_carp itself the list head, rather than a pointer to the list head via a struct carp_if. it also makes accessing the list consistent by only using if_carp to get to it. ok mpi@
2018-01-08Convert IF_CLONE_INITIALIZER() into C99 initializer.Alexander Bluhm
OK mpi@
2018-01-04Include timeout & tasks in 'struct ifnet' instead of always allocatingMartin Pieuchot
them as M_TEMP. ok visa@
2018-01-02Move the NET_LOCK() inside the switch and start documenting which fieldMartin Pieuchot
is protected by which lock. ok bluhm@, visa@
2017-12-15add ifiqueues for mp safety and nics with multiple rx rings.David Gwynne
currently there is a single mbuf_queue per interface, which all rings on a nic shove packets onto. while the list inside this queue is protected by a mutex, the counters around it (ie, ipackets, ibytes, idrops) are not. this means updates can be lost, and reading the statistics is also inconsistent. having a single queue means that busy rx rings can dominate and then starve the others. ifiqueue structs are like ifqueue structs. they provide per ring queues, and independent counters for each ring. when ifdata is read for userland, these counters are aggregated. having a queue per ring now allows for per ring backpressure to be applied. MCLGETI will have it's day again. right now we assume every interface wants an input queue and unconditionally provide one. individual interfaces can opt into more. im not completely happy about the shape of this atm, but shuffling it around more makes the diff bigger. ok visa@
2017-11-17add if_rxr_livelocked so rxr users can request backpressure themselves.David Gwynne
right now the rx ring moderation code makes a decision globally that a machine is livelocked, and uses that to apply backpressure on all the rx rings. we're moving toward having the network stack run on multiple cpus, and fed from multiple rx rings. if_rxr_livelocked lets a driver apply backpressure explicitely if something tells it that whatever is consuming previous packets cannot keep up. while here expose the current ring watermark with if_rxr_cwm. tweaks and ok visa@
2017-10-31- add one more softnet taskqAlexandr Nedvedicky
NOTE: code still runs with single softnet task. change definition of SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task OK mpi@, OK phessler@
2017-10-12Move sysctl_mq() where it can safely mess with mbuf queue internals.Martin Pieuchot
ok visa@, bluhm@, deraadt@
2017-05-08Added initial IPv6 multicast routing support for multiple rdomains:Rafael Zalamena
* don't share mifs (multicast interface) between rdomains * allow multiple routing sockets connected at the same time if they are in different rdomains. ok bluhm@
2017-01-24add support for multiple transmit ifqueues per network interface.David Gwynne
an ifq to transmit a packet is picked by the current traffic conditioner (ie, priq or hfsc) by providing an index into an array of ifqs. by default interfaces get a single ifq but can ask for more using if_attach_queues(). the vast majority of our drivers still think there's a 1:1 mapping between interfaces and transmit queues, so their if_start routines take an ifnet pointer instead of a pointer to the ifqueue struct. instead of changing all the drivers in the tree, drivers can opt into using an if_qstart routine and setting the IFXF_MPSAFE flag. the stack provides a compatability wrapper from the new if_qstart handler to the previous if_start handlers if IFXF_MPSAFE isnt set. enabling hfsc on an interface configures it to transmit everything through the first ifq. any other ifqs are left configured as priq, but unused, when hfsc is enabled. getting this in now so everyone can kick the tyres. ok mpi@ visa@ (who provided some tweaks for cnmac).
2017-01-21Make the if_flags member unsigned. This was prompted by clangPatrick Wildt
complaining that assigning the MULTICAST flag, which sets the uppermost bit, would invert the meaning of MULTICAST flag's numeric value. ok claudio@ deraadt@ tom@ visa@
2017-01-06Remove the global viftable vector that holds the virtual interfacesRafael Zalamena
configuration and instead use ifnet to store the configuration and counters. With this we can safely use multicast routing daemons on multiple domains without vif id colisions. ok mpi@
2016-11-14Automatically create a default lo(4) interface per rdomain.Martin Pieuchot
In order to stop abusing lo0 for all rdomains, a new loopback interface will be created every time a rdomain is created. The unit number will be the same as the rdomain, i.e. lo1 will be attached to rdomain 1. If this loopback interface is already in use it wont be possible to create the corresponding rdomain. In order to know which lo(4) interface is attached to a rdomain, its index is stored in the rtable/rdomain map. This is a long overdue since the introduction of rtable/rdomain. It also fixes a recent regression due to resetting the rdomain of an incoming packet reported by semarie@, Andreas Bartelt and Nils Frohberg. ok claudio@
2016-11-08RIP ifa_ifwithnet()Martin Pieuchot
ok vgross@
2016-09-04When auto-creating an interface when opening a /dev/{tun,tap,switch}Reyk Floeter
device, inherit the rdomain from the calling process. This adds an rdomain argument to if_clone_create(). OK mpi@ henning@
2016-09-03Use per-ifp tasks to process incoming packets.Martin Pieuchot
Reduce the number of if_get/if_put from one per packet to one per ring since we now know that all the packets are coming from the same interface. Improve forwarding performances by 10Kpps in Hrvoje Popovski's test setup. ok bluhm@, henning@, dlg@
2016-09-01Import switch(4), an in-kernel OpenFlow switch which can work alone.Kazuya Goda
switch(4) currently supports OpenFlow 1.3.5. Currently, it's disabled by the kernel config. With help from yasuoka@ reyk@ jsg@. ok deraadt@ yasuoka@ reyk@ henning@
2016-06-10Add the "llprio" field to struct ifnet, and the corresponding keywordVincent Gross
to ifconfig. "llprio" allows one to set the priority of packets that do not go through pf(4), as the case is for arp(4) or bpf(4). ok sthen@ mikeb@
2016-04-15remove ml_filter, mq_filter, niq_filter.David Gwynne
theyre currently unused, so no functional change.
2016-04-13We're always ready! So send IFQ_SET_READY() to the bitbucket.Martin Pieuchot
2015-12-18Remove leftover prototype.Visa Hankala
ok mpi@
2015-12-09Keep all ether prototypes in one place.Martin Pieuchot
2015-12-09rework the if_start mpsafe serialisation so it can serialise arbitrary workDavid Gwynne
work is represented by struct task. the start routine is now wrapped by a task which is serialised by the infrastructure. if_start_barrier has been renamed to ifq_barrier and is now implemented as a task that gets serialised with the start routine. this also adds an ifq_restart() function. it serialises a call to ifq_clr_oactive and calls the start routine again. it exists to avoid a race that kettenis@ identified in between when a start routine discovers theres no space left on a ring, and when it calls ifq_set_oactive. if the txeof side of the driver empties the ring and calls ifq_clr_oactive in between the above calls in start, the queue will be marked oactive and the stack will never call the start routine again. by serialising the ifq_set_oactive call in the start routine and ifq_clr_oactive calls we avoid that race. tested on various nics ok mpi@
2015-12-08if_stop is unused, so kill it.David Gwynne
ok mpi@
2015-12-08split the interface send queue (struct ifqueue) implementation out.David Gwynne
the intention is to make it more clear what belongs to a transmit queue and what belongs to an interface. suggested by and ok mpi@
2015-12-05Keep kernel definitions under _KERNEL to unbreak ports that includeMartin Pieuchot
<net/if_var.h> because some other operating systems have defines in there. ok jasper@
2015-12-03ip_send()/ip6_send() allow PF to send response packet in ipsoftnet task.Alexandr Nedvedicky
this avoids current recursion to pf_test() function. the change also switches icmp_error()/icmp6_error() to use ip_send()/ip6_send() so they are safe for PF. The idea comes from Markus Friedl. bluhm, mikeb and mpi helped me a lot to get it into shape. OK bluhm@, mpi@
2015-12-03Use SRPL_HEAD() and SRPL_ENTRY() to be consistent with and allow toMartin Pieuchot
fallback to a SLIST. ok dlg@, jasper@
2015-12-03rework if_start to allow nics to provide an mpsafe start routine.David Gwynne
existing start routines will still be called under the kernel lock and at IPL_NET. mpsafe start routines will be serialised so only one instance of each interfaces function will be running in the kernel at any point in time. this guarantees packets will be dequeued in order, and the start routines dont have to lock against themselves because if_start does it for them. the code to do that is based on the scsi runqueue code. this also provides an if_start_barrier() function that should wait until any currently running instances of if_start have finished. a driver can opt in to the mpsafe if_start call by doing the following: 1. setting ifp->if_xflags = IFXF_MPSAFE 2. only calling if_start() instead of its own start routine 3. clearing IFF_RUNNING before calling if_start_barrier() on its way down 4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback) to simplify the implementation the tx mitigation code has been removed. tested by several ok mpi@ jmatthew@
2015-12-02Remove forward declarations that are no longer needed, times and APIs areMartin Pieuchot
changing.
2015-11-27Keep lo(4) definitions inside if_loop.cMartin Pieuchot
2015-11-25replace IFF_OACTIVE manipulation with mpsafe operations.David Gwynne
there are two things shared between the network stack and drivers in the send path: the send queue and the IFF_OACTIVE flag. the send queue is now protected by a mutex. this diff makes the oactive functionality mpsafe too. IFF_OACTIVE is part of if_flags. there are two problems with that. firstly, if_flags is a short and we dont have any MI atomic operations to manipulate a short. secondly, while we could make the IFF_OACTIVE operates mpsafe, all changes to other flags would have to be made safe at the same time, otherwise a read-modify-write cycle on their updates could clobber the oactive change. instead, this moves the oactive mark into struct ifqueue and provides an API for changing it. there's ifq_set_oactive, ifq_clr_oactive, and ifq_is_oactive. these are modelled on ifsq_set_oactive, ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd. this diff includes changes to all the drivers manipulating IFF_OACTIVE to now use the ifsq_{set,clr_is}_oactive API too. ok kettenis@ mpi@ jmatthew@ deraadt@
2015-11-23There's no longer a need to include <net/hfsc.h> in <net/if_var.h>Martin Pieuchot
2015-11-21simplify ifq_deq_rollback by only having it unlock.David Gwynne
hfsc needed a rollback ifqop to requeue the mbuf because it used ml_dequeue in the begin op. now it uses MBUF_LIST_FIRST to get a ref to the first mbuf in deq_begin. now the disciplines dont need a rollback op, so ifq_deq_rollback can be simplified to just releasing the mutex. based on a discussion with kenjiro cho
2015-11-20i made a mistake. rename ifq_enq and ifq_deq to ifq_enqueue and ifq_dequeueDavid Gwynne
fixing it now before i regret it more.
2015-11-20shuffle struct ifqueue so in flight mbufs are protected by a mutex.David Gwynne
the code is refactored so the IFQ macros call newly implemented ifq functions. the ifq code is split so each discipline (priq and hfsc in our case) is an opaque set of operations that the common ifq code can call. the common code does the locking, accounting (ifq_len manipulation), and freeing of the mbuf if the disciplines enqueue function rejects it. theyre kind of like bufqs in the block layer with their fifo and nscan disciplines. the new api also supports atomic switching of disciplines at runtime. the hfsc setup in pf_ioctl.c has been tweaked to build a complete hfsc_if structure which it attaches to the send queue in a single operation, rather than attaching to the interface up front and building up a list of queues. the send queue is now mutexed, which raises the expectation that packets can be enqueued or purged on one cpu while another cpu is dequeueing them in a driver for transmission. a lot of drivers use IFQ_POLL to peek at an mbuf and attempt to fit it on the ring before committing to it with a later IFQ_DEQUEUE operation. if the mbuf gets freed in between the POLL and DEQUEUE operations, fireworks will ensue. to avoid this, the ifq api introduces ifq_deq_begin, ifq_deq_rollback, and ifq_deq_commit. ifq_deq_begin allows a driver to take the ifq mutex and get a reference to the mbuf they wish to try and tx. if there's space, they can ifq_deq_commit it to remove the mbuf and release the mutex. if there's no space, ifq_deq_rollback simply releases the mutex. this api was developed to make updating the drivers using IFQ_POLL easy, instead of having to do significant semantic changes to avoid POLL that we cannot test on all the hardware. the common code has been tested pretty hard, and all the driver modifications are straightforward except for de(4). if that breaks it can be dealt with later. ok mpi@ jmatthew@
2015-11-18Factorize the bits to check if a L2 route is connected, wether it isMartin Pieuchot
attached to a carp(4) or bridge(4) member, to not dereference rt_ifp directly. ok visa@
2015-11-11Store the index of the lo0 interface instead of a pointer to itsMartin Pieuchot
descriptor. Allow to get rid of two if_ref() in the output paths. ok dlg@
2015-10-25Introduce if_rtrequest() the successor of ifa_rtrequest().Martin Pieuchot
L2 resolution depends on the protocol (encoded in the route entry) and an ``ifp''. Not having to care about an ``ifa'' makes our life easier in our MP effort. Fewer dependencies between data structures implies fewer headaches. Discussed with bluhm@, ok claudio@
2015-10-24Add pair(4), a vether-based virtual Ethernet driver to interconnectReyk Floeter
rdomains and bridges on the local system. This can be used to route through local rdomains, to create L2 devices (like trunks) between them, and many other things. Discussed with many, with input from mpi@ OK sthen@ phessler@ yasuoka@ mikeb@
2015-10-22Kill link_rtrequest(), introduce in 1990 to "fix" the resultMartin Pieuchot
of rt_getifa() when adding link level route from outside the kernel. ok claudio@
2015-10-12the pattr argument to IFQ_ENQUEUE is unused, so let's get rid of it.David Gwynne
also the comment above IFQ_ENQUEUE that says the pattr argument is unused. ok mpi@
2015-10-05Add ifi_oqdrops and its alias to struct if_data.Masao Uebayashi
Necessary bumps in Ports will be handled by sthen@. OK mpi@ dlg@
2015-09-30sleep until all references to an interface have been released during detach.David Gwynne
this is done by moving to the refcnt api and using refcnt_finalize. tested by Hrjove Popovski ok mpi@
2015-09-28Remove "if_tp" from the "struct ifnet".Martin Pieuchot
Instead of violating a layer of abstraction by keeping per pseudo-driver informations in "struct ifnet", the port trunk is now passed as a cookie to the interface input handler (ifih). The time of per pseudo-driver hack in the network stack is over! ok mikeb@
2015-09-27pull the m_freem calls out of hfsc_enqueue by having IFQ_ENQUEUE freeDavid Gwynne
the mbuf in both the hfsc and priq error paths. ok mikeb@ mpi@ claudio@ henning@
2015-09-13There's no point in abstracting ifp->if_output() as long as pf_test()Martin Pieuchot
needs to see lo0 in the output path. ok claudio@
2015-09-13Run the interface watchdog timer routine as a task such that we have processMark Kettenis
context. ok mpi@, claudio@
2015-09-12Stop overwriting the rt_ifp pointer of RTF_LOCAL routes with lo0ifp.Martin Pieuchot
Use instead the RTF_LOCAL flag to loop local traffic back to the corresponding protocol queue. With this change rt_ifp is now always the same as rt_ifa->ifa_ifp. ok claudio@
2015-09-12Introduce if_input_local() a function to feed local traffic back toMartin Pieuchot
the protocol queues. It basically does what looutput() was doing but having a generic function will allow us to get rid of the loopback hack overwwritting the rt_ifp field of RTF_LOCAL routes. ok mikeb@, dlg@, claudio@