path: root/sys/net
Age  Commit message  Author
2016-06-14Convert the links between art data structures used during lookups into srps.Jonathan Matthew
art_lookup and art_match now return an active srp_ref, which the caller must leave when it's done with the returned route (if any). This allows lookups to be done without holding any locks. The art_table and art_node garbage collectors are still responsible for freeing items removed from the routing table, so they now use srp_finalize to wait out any active references, and updates are done using srp_swap operations. ok dlg@ mpi@
2016-06-10Add the "llprio" field to struct ifnet, and the corresponding keywordVincent Gross
to ifconfig. "llprio" allows one to set the priority of packets that do not go through pf(4), as the case is for arp(4) or bpf(4). ok sthen@ mikeb@
2016-06-08Revert previous, it breaks regression tests.Martin Pieuchot
2016-06-08Move ND resolution logic from nd6_output() to nd6_storelladdr() andMartin Pieuchot
rename it to nd6_resolve(). This allows us to get rid of non-Ethernet hacks by moving Ethernet-specific logic into the appropriate layer. ok sthen@
2016-06-07Multicast packets are already duplicated in bridge_process() soMartin Pieuchot
no need to loop another copy on the receiving interface. Reported by and ok uebayasi@
2016-06-07Use rtalloc(9) instead of ifa_ifwithnet() to find an interfaceMartin Pieuchot
when adding a route to gateway to ensure a most specific match. This makes "# route add" coherent to "# route get" even with p2p interfaces. Fix a problem reported by Mart Tõnso. ok vgross@
2016-06-07per trending style, add continue to empty loops.Ted Unangst
ok mglocker
2016-06-03Remove superfluous parentheses to shut up clang, from David Hill.Martin Pieuchot
2016-06-03defer the freeing of art tables and nodes to a task.David Gwynne
this will allow us to sleep in srp_finalize before freeing the memory. the defer is done by putting the tables and nodes on a list which is serviced by a task. the task removes all the entries from the list and pool_puts them. the art_tables gc code uses at_parent as its list entry, and the art_node gc code uses a union with the an_dst pointer. both at_parent and an_dst are only used when theyre active as part of an art data structure, and are not used in lookups. once the art is done with them we can reuse these pointers safely. ok mpi@
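The deferral described above can be sketched in userland C. This is only an illustration under stated assumptions: `struct node`, `node_defer_free()` and `node_gc_task()` are hypothetical names, `free()` stands in for pool_put, and the wait for readers is reduced to a comment. It shows the pointer reuse the message describes: a field that is only meaningful while the node is live doubles as the list link once the node is queued.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical node. "dst" is only meaningful while the node is part
 * of the tree, so the same storage can hold the gc list link later.
 */
struct node {
	union {
		void		*dst;		/* used while the node is live */
		struct node	*gc_next;	/* used once queued for freeing */
	} u;
};

struct node *gc_list;	/* nodes waiting to be freed */

/* Queue a node that has already been removed from the tree. */
void
node_defer_free(struct node *n)
{
	assert(n != NULL);
	n->u.gc_next = gc_list;
	gc_list = n;
}

/*
 * Body of the deferred task: detach the whole list, then free every
 * entry. Returns the number of nodes freed.
 */
int
node_gc_task(void)
{
	struct node *n, *next;
	int freed = 0;

	n = gc_list;
	gc_list = NULL;
	for (; n != NULL; n = next) {
		next = n->u.gc_next;
		/* a kernel would wait out active readers at this point */
		free(n);
		freed++;
	}
	return freed;
}
```

Detaching the whole list before walking it keeps the sketch simple: nodes queued while the task runs end up on a fresh list and are handled by the next run.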
2016-06-03set rt_expire times against time_uptime, not time_second.David Gwynne
time_second is unix time so it can be affected by clock changes. time_uptime is monotonic so it isnt affected by clock changes. that in turn means route expiries wont jump with clock changes if set against time_uptime. the expiry is translated into unix time for export to userland though. ok mpi@
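A small sketch of the arithmetic, assuming a hypothetical `struct clocks` snapshot and helper names that do not come from the kernel, and assuming that an expiry of 0 means "never expires": the expiry is stored against the monotonic clock and only converted to unix time when it is exported.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of the two clocks, in seconds. */
struct clocks {
	int64_t uptime;		/* monotonic, seconds since boot */
	int64_t walltime;	/* unix time, may jump when the clock is set */
};

/* Store the expiry against the monotonic clock. */
int64_t
expire_set(const struct clocks *c, int64_t lifetime)
{
	assert(lifetime > 0);
	return c->uptime + lifetime;
}

/* Translate a stored expiry into unix time for export. */
int64_t
expire_export(const struct clocks *c, int64_t expire)
{
	if (expire == 0)	/* assumption: 0 means no expiry */
		return 0;
	return expire - c->uptime + c->walltime;
}

/* A route has expired once the monotonic clock reaches the expiry. */
int
expire_reached(const struct clocks *c, int64_t expire)
{
	return expire != 0 && expire <= c->uptime;
}
```

Stepping `walltime` forward or backward changes only the exported value; `expire_reached()` depends on `uptime` alone.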
2016-06-02pool_setipl at IPL_SOFTNET for all the art structures.David Gwynne
2016-06-02always clean up the heap in art_table_delete, even for the last at_refcntDavid Gwynne
in the future a table may also be referenced by a cpu reading it with srp as well as the art rtable, so try and make sure it is always usable. ok mpi@
2016-06-01Remove ART-specific hack now that route reference counting is similarMartin Pieuchot
to the original BSD routing table. All route(8) and arp(8) tests still pass. Fix a harmless underflow reported by Hrvoje Popovski.
2016-06-01shuffle the code in rtable_insert so it inserts a populated art_node.David Gwynne
this makes the node usable as soon as it is in the tree, rather than after it inserts the rtentry on the node. ok mpi@
2016-06-01s/stall/stale/ in a comment about old interfaces.David Gwynne
ok mpi@
2016-06-01rtref and rtfree around moving the rt in rtable_mpath_reprio so the listDavid Gwynne
operations cant drop the refcount to 0. ok mpi@
2016-06-01move all the art_node initialisation to art_get in art.cDavid Gwynne
ok mpi@
2016-05-31Ensure that a valid route entry is passed to ether_output() if L2Martin Pieuchot
resolution is required. This will allow us to enforce that no route entry is inserted in the routing table after ether_output(). This is now possible because if_output() is no longer called with a NULL route argument. Tested by Hrvoje Popovski, ok visa@, bluhm@
2016-05-31Flush dynamic route entries attached to an interface when its link stateMartin Pieuchot
becomes DOWN. This follows the same reasoning as for L2 (cloned) entries. Hopefully enough to fix tedu@'s stale RTF_DYNAMIC routes when switching WiFi network during suspend/resume. ok sthen@
2016-05-31Do not call nd6_output() without route entry argument.Martin Pieuchot
ok sthen@, bluhm@
2016-05-31Plug a route entry leak triggered under memory pressure.Martin Pieuchot
Help to track the leak from Hrvoje Popovski, ok bluhm@
2016-05-30Set pppoe(4) control frames to high (NC, "network control")Stuart Henderson
priority. This is translated into an 802.1p priority tag when sent over a vlan interface, reducing the risk of them being crowded out by data packets on a busy link. Some users have problems with ISPs that place specific requirements on vlan priority (typically the packet header value must be '0', relating to priority 1). This diff doesn't fix that yet, but gives a single place to patch to change tags on control packets without affecting normal vlan priority operation on other interfaces. ok mikeb.
2016-05-30Insert a hack to deal with interfaces removing the VLAN header beforeMartin Pieuchot
the packet has been fed to the pseudo-interfaces input handlers. To fix that without introducing a layer violation we should be able to disable HW-vlan on parent when in use with different pseudo-interfaces. In the case of bridge(4) for example it makes no sense to let the interface remove the VLAN header if the kernel has to add it back for every packet. Fix issues reported by sebastia@ and markus@ From dlg@, ok claudio@
2016-05-28Backout pf.c r1.972, pf_norm.c r1.184, ok claudioStuart Henderson
pf_test calls pf_refragment6 with dst=NULL, which is passed down to rtable_match which attempts to dereference it.
2016-05-24Do not call nd6_output() without route entry argument.Martin Pieuchot
ok bluhm@
2016-05-23remove the function pointer from mbufs. this memory is shared with dataTed Unangst
via unions, and we don't want to make it easy to control the target. instead an integer index into an array of acceptable functions is used. drivers using custom functions must register them to receive an index. ok deraadt
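The idea of storing an index rather than a function pointer can be sketched as follows. All names here (`ext_free_register()`, `ext_free_call()`, the table size, `demo_free()`) are invented for illustration and are not the kernel's actual interface. The point is that data overlapping the stored value can at worst select another registered function, never an arbitrary address.

```c
#include <assert.h>
#include <stddef.h>

#define EXT_FREE_MAX 8	/* hypothetical table size */

typedef void (*ext_free_fn)(void *buf);

ext_free_fn ext_free_fns[EXT_FREE_MAX];	/* acceptable functions */
unsigned int ext_free_count;

/* Register a function; returns its index, or -1 on failure. */
int
ext_free_register(ext_free_fn fn)
{
	assert(ext_free_count <= EXT_FREE_MAX);
	if (fn == NULL || ext_free_count == EXT_FREE_MAX)
		return -1;
	ext_free_fns[ext_free_count] = fn;
	return (int)ext_free_count++;
}

/* Call the function stored at idx; reject indexes never handed out. */
int
ext_free_call(unsigned int idx, void *buf)
{
	if (idx >= ext_free_count)
		return -1;
	ext_free_fns[idx](buf);
	return 0;
}

/* Example function a driver might register; it only counts calls. */
int demo_calls;

void
demo_free(void *buf)
{
	(void)buf;
	demo_calls++;
}
```

A driver would call `ext_free_register(demo_free)` once at attach time and store the returned index in each buffer it hands out.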
2016-05-23Pass a route entry to if_output() instead of relying on arpresolve() magic.Martin Pieuchot
This refactoring aims to reduce the number of places where a route entry is inserted in the routing table. ok bluhm@
2016-05-18Remove some superfluous if_get(9)/if_put(9) dances now that ARP inputMartin Pieuchot
routines are called directly by ether_input(). ok visa@, dlg@
2016-05-18rework the srp api so it takes an srp_ref struct that the caller provides.David Gwynne
the srp_ref struct is used to track the location of the callers hazard pointer so later calls to srp_follow and srp_enter already know what to clear. this in turn means most of the caveats around using srps go away. specifically, you can now:
- switch cpus while holding an srp ref
  - ie, you can sleep while holding an srp ref
- you can take and release srp refs in any order

the original intent was to simplify use of the api when dealing with complicated data structures. the caller now no longer has to track the location of the srp a value was fetched from, the srp_ref effectively does that for you.

srp lists have been refactored to use srp_refs instead of srpl_iter structs.

this is in preparation of using srps inside the ART code. ART is a complicated data structure, and lookups require overlapping holds of srp references. ok mpi@ jmatthew@
2016-05-10make the bpf tap functions take const struct mbuf *David Gwynne
this makes it more obvious that the bpf code should only read packets, never modify them. now possible because the paths that care about M_FILDROP set it after calling bpf_mtap. ok mpi@ visa@ deraadt@
2016-05-10make bpf_mtap callers set the M_FILDROP flag if they care about it.David Gwynne
ok mpi@
2016-05-08Do not export the IFXF_MPSAFE flag to userland, it is a kernel-onlyMartin Pieuchot
hint. ok kettenis@, deraadt@
2016-05-03Stop using a soft-interrupt context to process incoming network packets.Martin Pieuchot
Use a new task that runs holding the KERNEL_LOCK to execute mp-unsafe code. Our current goal is to progressively move input functions to the unlocked task. This gives a small performance boost confirmed by Hrvoje Popovski's IPv4 forwarding measurement:

	before:                   after:
	send       receive        send       receive
	400kpps    400kpps        400kpps    400kpps
	500kpps    500kpps        500kpps    500kpps
	600kpps    600kpps        600kpps    600kpps
	650kpps    650kpps        650kpps    640kpps
	700kpps    700kpps        700kpps    700kpps
	720kpps    640kpps        720kpps    710kpps
	800kpps    640kpps        800kpps    650kpps
	1.4Mpps    570kpps        1.4Mpps    590kpps
	14Mpps     570kpps        14Mpps     590kpps

ok kettenis@, bluhm@, dlg@
2016-05-03Put back a panic() if an incoming packet already has a statekey.Martin Pieuchot
Apparently nobody can hit this condition anymore or people do not report bugs if their kernel does not panic. ok dlg@, sashan@
2016-05-02Simplify life for routing table implementations by requiring that rtable_walkJonathan Matthew
callbacks return EAGAIN if they modify the routing table. While we're here, simplify life for rtable_walk callers by moving the loop that restarts the walk on EAGAIN into rtable_walk itself. Flushing cloned routes on interface state changes becomes a bit more inefficient, but this can be improved later. ok mpi@ dlg@
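The restart convention can be illustrated with a toy table. This is a sketch under assumptions, not the rtable code: `struct table`, `table_walk()` and `delete_matching()` are hypothetical, and a fixed array stands in for the routing table. A callback that modifies the table returns EAGAIN, and the walker itself starts over, so callers need no retry loop.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NENTRIES 4

/* Toy table: a fixed array with a "present" flag per slot. */
struct table {
	int entries[NENTRIES];
	int present[NENTRIES];
};

typedef int (*walk_cb)(struct table *, int idx, void *arg);

/*
 * Visit every present entry. If the callback reports EAGAIN (it
 * changed the table), restart from the beginning; any other non-zero
 * value aborts the walk and is returned to the caller.
 */
int
table_walk(struct table *t, walk_cb cb, void *arg)
{
	int i, error;

	assert(cb != NULL);
again:
	for (i = 0; i < NENTRIES; i++) {
		if (!t->present[i])
			continue;
		error = cb(t, i, arg);
		if (error == EAGAIN)
			goto again;
		if (error != 0)
			return error;
	}
	return 0;
}

/* Example callback: delete entries equal to *arg, then ask for a restart. */
int
delete_matching(struct table *t, int idx, void *arg)
{
	int victim = *(int *)arg;

	if (t->entries[idx] != victim)
		return 0;
	t->present[idx] = 0;
	return EAGAIN;
}
```

Restarting from the first entry after every deletion revisits entries already seen, which matches the message's remark that flushing becomes somewhat less efficient. The walk terminates because each restart follows the removal of one entry.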
2016-05-01Remove a bogus "else" that was causing breakage with LCP echoes,Stuart Henderson
bug introduced in r1.138. Reported at https://twitter.com/DarkSoul4242/status/722365165262405633 (twitter is *NOT* the place to report bugs!) and in https://marc.info/?l=openbsd-bugs&m=145988918010707&w=2, pointed out by tb@
2016-04-29Make if_output() return EAFNOSUPPORT instead of just dropping packetsKenneth R Westerback
and pretending the output succeeded. Packets are still dropped! Idea from jsg@ following same change to bridge(4). ok mpi@
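A minimal sketch of the behaviour change, with a hypothetical `struct pkt` and `demo_output()` and an arbitrary choice of supported address families: the unsupported case still frees the packet but now reports the failure to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/socket.h>

struct pkt {		/* hypothetical packet */
	int len;
};

/* Consumes the packet in every case; returns 0 or an errno value. */
int
demo_output(struct pkt *p, int family)
{
	assert(p != NULL);

	switch (family) {
	case AF_INET:
	case AF_INET6:
		/* a real driver would encapsulate and enqueue here */
		free(p);
		return 0;
	default:
		free(p);		/* the packet is still dropped */
		return EAFNOSUPPORT;	/* but the caller is told about it */
	}
}
```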
2016-04-27Remove unused arguments from rt_checkgate().Martin Pieuchot
Since the rtalloc(9) rewrite no route lookup is done in this function so there's no need for a destination or a rtable ID.
2016-04-19tabs, not spacesDavid Gwynne
no functional change
2016-04-19make setting a vlan interfaces lladdr more likely to workDavid Gwynne
the recent vlan code sets the vlan interfaces mac address to the parent interfaces mac address when it is brought up, and resets it when the vlan interface is brought down. now, if you set a mac address manually (eg, ifconfig vlanX lladdr f0:0b:a7:ba:2b:00), vlan(4) ignores the parents mac address and never resets its own. to make this work, setting a custom lladdr on a vlan interface makes the parent interface promisc so the packets wont be filtered by the hardware interface. setting the mac address to 00:00:00:00:00:00 resets this behavior and makes the interface inherit the parents mac again. issue reported by and fix tested by paul de weerd
2016-04-18Remove the hack that prevents changing pppoe params at runtime.Mike Belopuhov
The EBUSY hack imposes an order on the ifconfig commands issued against the pppoe interface used to configure the sppp layer below. To counter this we use the ENETRESET trick that other drivers use to tell the pppoe layer that sppp has requested a stop/init reset sequence to proceed which we oblige with in case pppoe is UP and RUNNING. Tested by semarie@ and Jan Schreiber <jes@posteo.de>, thanks!
2016-04-15remove ml_filter, mq_filter, niq_filter.David Gwynne
theyre currently unused, so no functional change.
2016-04-15rename ifv_p to ifv_ifp0David Gwynne
this makes it more clear to the casual reader that it refers to the parent interface, which is consistently referred to as ifp0 in the rest of the vlan (and carp) code. this is a good idea from mpi@
2016-04-15rework vlan config to make it mpsafe and done by standard ioctlsDavid Gwynne
configuration of the vlan parent interface and the vlan id should come via the IFPARENT and VNETID ioctls now. the vlan specific ioctls are still available via a compat layer, but that will go away a bit further into this release cycle. the parent interface may only be configured while the vlan is down. the vnetid may be changed at runtime, but will generate link state changes across that event. the vlan is implicitly brought up when an address is assigned, which brings it in line with all our other network drivers. the legacy vlan ioctls still imply bringing the interface up because that's what they used to do. the code that brings vlans up and down is now simplified because it no longer supports changing the parent at run time. most of that code now adds state to the parent when bringing the vlan up, and bringing the interface down just removes it in reverse. these simplifications in turn make it possible for us to transmit packets on vlan interfaces without holding the big lock, so its now marked as MPSAFE. ok jmatthew@ sthen@ mpi@
2016-04-15replace m_copym2 with m_dup_pkt for the dup-to handling.David Gwynne
note that this uses max_linkhdr as the adjustment arg. this follows what the ip stack does when generating packets as it provides space for link headers (like ethernet headers) to be prepended on the new packet. ok henning@
2016-04-14Enable device cloning for bpf. This allows having just one bpf deviceMartin Natano
node in /dev, that services all bpf consumers (up to 1024). Also, disallow the usage of all but the first minor device, so accidental use of another minor device will attract attention.

Cloning bpf offers some advantages:
- Users with high bpf usage won't have to clutter their /dev with device nodes.
- A lot of programs in base use a pattern like this to access bpf:

	int fd, n = 0;
	do {
		(void)snprintf(device, sizeof device, "/dev/bpf%d", n++);
		fd = open(device, mode);
	} while (fd < 0 && errno == EBUSY);

  Those can now be replaced by a simple open(), without loop.

ok mikeb "right time in the cycle to try" deraadt
2016-04-13We're always ready! So send IFQ_SET_READY() to the bitbucket.Martin Pieuchot
2016-04-13Keep all pools in the same place.Martin Pieuchot
ok jmatthew@
2016-04-12Remove unneeded art_free().Martin Pieuchot
Reported by and ok jmatthew@
2016-04-12Set bridge(4)'s if_output to a dummy function returning EAFNOSUPPORT asMartin Pieuchot
it should not be used to output packets but we have to respect the ifp driver API to some extent. Prevent a panic found the hard way by espie@. ok claudio@, mikeb@, jsg@, krw@