path: root/sys/net
Age  Commit message  Author
2015-09-09Kill a couple of if_get()s only needed to increment per-ifp IPv6 stats.Martin Pieuchot
We do not export those per-ifp statistics and they will soon all die. "We're putting inet6 on a diet" claudio@ ok dlg@, mikeb@, claudio@
2015-09-09convert bpf to using an srp list for the list of descriptors.David Gwynne
this replaces the hand rolled list. the code has always used hand rolled lists, but that gets a bit cumbersome when theyre SRPs. requested ages ago by mpi@
2015-09-06The pppx_if_pl pool will never be used in interrupt context, so pass theMark Kettenis
PR_WAITOK flag to pool_init and pass NULL as the pool allocator. ok dlg@
2015-09-04The pf_osfp_pl and pf_osfp_entry_pl never get used in interrupt context.Mark Kettenis
Drop the explicit pool backend allocator here and add PR_WAITOK to the flags passed to pool_init(9). The pfi_addr_pl and pf_rule_pl can get used in interrupt context though. So simply drop the explicit pool backend allocator without adding PR_WAITOK to the flags passed to pool_init(9). ok mikeb@
2015-09-04pflow_flush() still needs sc->send_nam; free it later.Florian Obser
2015-09-04Make every subsystem using a radix tree call rn_init() and pass theMartin Pieuchot
length of the key as argument. This way every consumer of the radix tree has a chance to explicitly initialize the shared data structures and no longer relies on another subsystem to do the initialization. As a bonus ``dom_maxrtkey'' is no longer used and dies. ART kernels should now be fully usable because pf(4) and IPSEC properly initialize the radix tree. ok chris@, reyk@
2015-09-04Fix an mbuf use-after-fruit in pflow_clone_create().Martin Pieuchot
Issue reported by semarie@ on bugs@ who also isolated the use-after-fruit to pflow(4) using dlg@'s tracing mbuf diff. Inputs from and ok florian@, semarie@, benno@
2015-09-03Unconditionally set the RTF_UP flags when adding a route to the table.Martin Pieuchot
This makes dhclient(8) configured default routes usable without relying on the link-state change hooks not present in RAMDISK kernels. ok krw@, claudio@
2015-09-01Replace sockaddr casts with the proper satosin(), ... calls.Alexander Bluhm
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@
2015-09-01- route-to, dup-to, reply-to should not override the block actionAlexandr Nedvedicky
Spotted by Dilli Paudel <dilli ! paudel at oracle ! com> ok jung@, ok mikeb@
2015-09-01Introduce rtisvalid(9) a function to check if a (cached) route entryMartin Pieuchot
can be used or should be released by rtfree(9). It currently checks if the route is UP and is not attached to a stale ifa. ok bluhm@, claudio@
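A minimal userland sketch of the check described above, assuming the two conditions are "route is up" and "its ifa is not stale". The names `rtv_route`, `rtv_ifa`, `RTV_UP` and `rtv_isvalid()` are hypothetical stand-ins, not the kernel definitions, and treating NULL as invalid is an assumption.

```c
#include <stddef.h>

#define RTV_UP	0x1	/* hypothetical stand-in for RTF_UP */

struct rtv_ifa {
	int	stale;		/* nonzero once the address is gone */
};

struct rtv_route {
	int		 flags;
	struct rtv_ifa	*ifa;
};

/* Returns 1 if the cached route may be used, 0 if it should be released. */
static int
rtv_isvalid(const struct rtv_route *rt)
{
	if (rt == NULL)			/* assumption: no route is not valid */
		return 0;
	if ((rt->flags & RTV_UP) == 0)	/* route is not up */
		return 0;
	if (rt->ifa == NULL || rt->ifa->stale)
		return 0;		/* attached to a stale ifa */
	return 1;
}
```

A caller holding a cached route would release it and look it up again whenever the check returns 0.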
2015-09-01Do not try to find a possible ``ifa'' in rt_ifal_del(9) and trust theMartin Pieuchot
checks done in rtrequest1(9). This chunk was introduced in 1991 when rtrequest1(RTM_DELETE...) was not doing a route lookup and no longer makes any sense. ok bluhm@
2015-09-01dont need the kernel lock for mpsafe bpfs (again)David Gwynne
2015-09-01reintroduce bpf.c r1.121.David Gwynne
this differs slightly from 1.121 in that it uses the new srp_follow() to walk the list of descriptors on an interface. this is instead of interleaving srp_enter() and srp_leave(), which can lead to races and corruption if you're touching the same SRPs at different IPLs on the same CPU. ok deraadt@ jmatthew@
2015-08-30Use a global table for domains instead of building a list at run time.Martin Pieuchot
As a side effect there's no need to run if_attachdomain() after the list of domains has been built. ok claudio@, reyk@
2015-08-28Fix compiling a kernel without NBPFILTER > 0.Reyk Floeter
OK mikeb@
2015-08-26Use the specialized m_copym2() preserving the alignment of the payloadMartin Pieuchot
in bridge_localbroadcast() too. This should fix another alignment issue kettenis@ is seeing. ok dlg@
2015-08-25#if INET && INET6 -> #ifdef INET6, the kernel no longer defines INETJonathan Gray
since July. The code involved deals with af-to handling.
2015-08-24Always increment the reference counter of the returned route entry inMartin Pieuchot
rtrequest1(9). This simplifies rtfree(9) dances and will prevent another CPU from freeing the entry before we're done with it as soon as routing functions can be executed in parallel. ok bluhm@, mikeb@
2015-08-24The bridge list is a relict, delete the remaining LIST_REMOVE.Alexander Bluhm
This fixes a crash during ifconfig bridge0 destroy. OK mpi@
2015-08-24Rework the code to decide when to perform DAD to no longer rely on theMartin Pieuchot
IN6_IFF_NODAD pseudo-flag not being set. This was just a flag for spaghetti code that should not exist in the first place. Tested by sebastia@, ok sthen@
2015-08-24In kernel initialize struct sockaddr_in and sockaddr_in6 to zeroAlexander Bluhm
everywhere to avoid passing around pointers to uninitialized stack memory. While there, fix the call to in6_recoverscope() in fill_drlist(). OK deraadt@ mpi@
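A small sketch of the zero-before-fill pattern this commit describes, written for userland. The helper name `fill_sin_model()` is hypothetical. BSD systems also have a `sin_len` member that real kernel code sets; it is left out here so the sketch stays portable.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Fill a caller-provided sockaddr_in. Clearing the whole structure first
 * means padding and unused members never carry stale stack contents.
 */
static void
fill_sin_model(struct sockaddr_in *sin, in_addr_t addr_net_order)
{
	memset(sin, 0, sizeof(*sin));
	sin->sin_family = AF_INET;
	sin->sin_addr.s_addr = addr_net_order;
}
```

Without the `memset()`, members such as `sin_port` and `sin_zero` would hold whatever was on the stack before the call.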
2015-08-23back out bpf+srp. its blowing up in a bridge setup.David Gwynne
ill debug this out of the tree.
2015-08-23bpf+srp is blowing up, so its being backed out. bpf will need the big lock.David Gwynne
2015-08-20Make ART internals free of 'struct sockaddr'.Martin Pieuchot
Keep route entry/BSD compatibility goo in the rtable layer. The way addresses and masks (prefix-lengths) are encoded is really tied to the radix-tree implementation. Since we decided to no longer support non-contiguous masks, we could get rid of some extra "sockaddr" allocations and reduce the memory growth related to the use of a multibit-trie.
2015-08-20In an email dated 11 Feb 2015, Yoichi Hariguchi agreed to re-licenseMartin Pieuchot
his reference ART implementation from a BSD 4-clause to ISC. Thanks a lot to him!
2015-08-20Import an alternative routing table backend based on Yoichi Hariguchi'sMartin Pieuchot
ART implementation. ART (Allotment Routing Table) is a multibit-trie algorithm invented by D. Knuth while reviewing Yoichi's SMART [0] (Smart Multi-Array Routing Table) paper. This implementation, unlike the one from the KAME project, supports variable stride lengths, which makes it easier to tune the memory/speed trade-off. It also lets you use a bigger first-level table, which other algorithms such as POPTRIE [1] need to implement separately. Adaptation to the OpenBSD kernel has been done with two different data structures. ART nodes and route entries are managed separately, which makes the algorithm implementation free of any MULTIPATH logic. This implementation does not include Path Compression. [0] http://www.hariguchi.org/art/smart.pdf [1] http://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p57.pdf ok dlg@, reyk@
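For readers unfamiliar with allotment tables, here is a toy single-level sketch with an assumed 4-bit stride. It is not the imported code: the names `art_model`, `art_baseidx()`, `art_allot()`, `art_insert()` and `art_lookup()` are hypothetical, routes are plain integer ids assumed to be unique and nonzero, and deletion and multiple levels are left out.

```c
#include <stddef.h>

#define ART_SL		4			/* stride length in bits (assumed) */
#define ART_FRINGE	(1u << ART_SL)		/* first fringe index */
#define ART_SIZE	(1u << (ART_SL + 1))	/* index 0 is unused */

struct art_model {
	int	heap[ART_SIZE];	/* route id per index, 0 = no route */
};

/* Base index of prefix addr/plen, with addr an ART_SL-bit value. */
static unsigned int
art_baseidx(unsigned int addr, unsigned int plen)
{
	return (addr >> (ART_SL - plen)) + (1u << plen);
}

/* Replace `old' by `rt' in the subtree rooted at k, skipping more specific routes. */
static void
art_allot(struct art_model *t, unsigned int k, int old, int rt)
{
	if (t->heap[k] != old)
		return;
	t->heap[k] = rt;
	if (k >= ART_FRINGE)
		return;
	art_allot(t, 2 * k, old, rt);
	art_allot(t, 2 * k + 1, old, rt);
}

static void
art_insert(struct art_model *t, unsigned int addr, unsigned int plen, int rt)
{
	unsigned int b = art_baseidx(addr, plen);

	art_allot(t, b, t->heap[b], rt);
}

/* Longest-prefix match is a single read of the fringe entry. */
static int
art_lookup(const struct art_model *t, unsigned int addr)
{
	return t->heap[addr + ART_FRINGE];
}
```

The insert does all the work up front by pushing the route down to every fringe entry it covers, which is why the lookup needs no loop within one level.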
2015-08-19PF must keep IPv6 fragment size as chosen by sender also for packets,Alexandr Nedvedicky
which are routed on behalf of a route-to action. OK bluhm@
2015-08-19Convert all calls to rtrequest1() and the following error checkAlexander Bluhm
into a common pattern. In the man page clarify the usage of the returned route. OK mpi@ mikeb@ jmc@
2015-08-19Use rtfree(9) instead of decrementing rt_refcnt in rt_getifa().Martin Pieuchot
Note that it is safe to keep a reference to the ifa pointed to by a route entry after freeing the entry iff the ifa is valid. ok bluhm@
2015-08-18Apply the logic used for "protocol" queues to bridge(4). This allowsMartin Pieuchot
the work currently done in bridge_input() and requiring the KERNEL_LOCK to be deferred to bridgeintr(). Tested by sthen@ ok rzalamena@, dlg@, bluhm@
2015-08-18Remove PF_KEY-specific hacks from rtalloc(9). They are dead since SPDMartin Pieuchot
lookups use the radix API directly. ok mikeb@
2015-08-18Check the error value returned by in6_ifattach().Martin Pieuchot
Prodded by and ok bluhm@
2015-08-17Remove unused variable in rt_ifa_add(), prodded by bluhm@Martin Pieuchot
2015-08-17Convert two rt->rt_refcnt-- into rtfree(9) making sure the route entryMartin Pieuchot
is freed when we no longer need it. In this case both code paths are executed in process context and thus serialized by the KERNEL_LOCK. Since we are adding a route entry to the table in both cases, rtfree(9) will not actually free the entry because it is still RT_VALID. ok bluhm@
2015-08-17Match the free(3) semantics and accept NULL pointers in rtfree(9).Martin Pieuchot
This will simplify upcoming conversions of rt_refcnt-- to rtfree(9). Such conversions are needed for proper MP refcounting. ok deraadt@, dlg@
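A rough userland model of a release function that tolerates NULL the way free(3) does. The names `rtf_route` and `rtf_release()` are hypothetical, and the model frees as soon as the count reaches zero; any additional conditions the kernel's rtfree(9) applies are deliberately omitted.

```c
#include <stdlib.h>

struct rtf_route {
	int	refcnt;
};

/* Drop one reference. Returns 1 if the entry was freed, 0 otherwise. */
static int
rtf_release(struct rtf_route *rt)
{
	if (rt == NULL)		/* like free(3): nothing to do */
		return 0;
	if (--rt->refcnt > 0)	/* someone else still holds a reference */
		return 0;
	free(rt);
	return 1;
}
```

Callers can then write `rtf_release(rt);` on every exit path without first testing the pointer.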
2015-08-16dont need the biglock to call bpf now.David Gwynne
2015-08-16make bpf_mtap mpsafe by using SRPs.David Gwynne
this was originally implemented by jmatthew@ last year, and updated by us both during s2k15. there are four data structures that need to be looked after. the first is the bpf interface itself. it is allocated and freed at the same time as an actual interface, so if you're able to send or receive packets, you're able to run bpf on an interface too. dont need to do any work there. the second are bpf descriptors. these represent userland attaching to a bpf interface, so you can have many of them on a single bpf interface. they were arranged in a singly linked list before. now the head and next pointers are replaced with SRP pointers and followed by srp_enter. the list updates are serialised by the kernel lock. the third are the bpf filters. there is an inbound and outbound filter on each bpf descriptor, and a process can replace them at any time. the pointers from the descriptor to those are also changed to be accessed via srp_enter. updates are serialised by the kernel lock. the fourth thing is the ring that bpf writes to for userland to read. there's one of these per descriptor. because these are only updated when a filter matches (which is hopefully a relatively rare event), we take the kernel lock to serialise the writes to the ring. all this together means you can run bpf against a packet without taking the kernel lock unless you actually caught a packet and need to send it to userland. even better, you can run bpf in parallel, so if we ever support multiple rings on a single interface, we can run bpf on each ring on different cpus safely. ive hit this pretty hard in production at work (yay dhcrelay) on myx (which does rx outside the biglock). ok jmatthew@ mpi@ millert@
2015-08-16avoid a toctou problem in if_input in the bpf handling.David Gwynne
load ifp->if_bpf into a local variable, test that, and pass it to bpf. this is instead of assuming ifp->if_bpf wont change between checking it and passing it to bpf.
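The pattern can be sketched in userland C as follows. The types `tt_bpf_if` and `tt_ifnet` and the functions `tt_tap()` and `tt_input()` are hypothetical models, not the kernel code; real concurrent code would also need the load itself to be a single access that the compiler cannot repeat.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types. */
struct tt_bpf_if {
	int	taps;		/* packets handed to bpf */
};

struct tt_ifnet {
	struct tt_bpf_if *if_bpf;	/* may be changed by another context */
};

static void
tt_tap(struct tt_bpf_if *bif)
{
	bif->taps++;
}

/* Returns 1 if the packet was tapped, 0 otherwise. */
static int
tt_input(struct tt_ifnet *ifp)
{
	/* Read the shared pointer once, then test and use that same value. */
	struct tt_bpf_if *if_bpf = ifp->if_bpf;

	if (if_bpf == NULL)
		return 0;
	tt_tap(if_bpf);
	return 1;
}
```

The racy form would test `ifp->if_bpf` and then pass `ifp->if_bpf` again, leaving a window in which the second read sees a different pointer than the one that was checked.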
2015-08-13If no handler consumed an mbuf, free it. This also applies if an interfaceMartin Pieuchot
does not have any registered handler. Plug an mbuf leak found by sthen@ with gif(4) in a bridge. ok sthen@, claudio@
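A minimal model of that rule, assuming each handler reports whether it took ownership of the packet. The names `mb_model`, `mb_handler`, `mb_decline()`, `mb_consume()` and `mb_input()` are hypothetical, and setting `freed` stands in for the kernel's m_freem().

```c
#include <stddef.h>

struct mb_model {
	int	freed;		/* set where the kernel would call m_freem() */
};

/* A handler returns 1 if it consumed the packet, 0 to pass it on. */
typedef int (*mb_handler)(struct mb_model *);

static int
mb_decline(struct mb_model *m)
{
	(void)m;
	return 0;
}

static int
mb_consume(struct mb_model *m)
{
	(void)m;
	return 1;
}

/* Offer the packet to each handler; free it if nobody takes it. */
static int
mb_input(struct mb_model *m, const mb_handler *handlers, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (handlers[i](m))
			return 1;	/* the handler now owns the packet */
	}
	m->freed = 1;	/* no handler, or none consumed it: do not leak */
	return 0;
}
```

The empty-list case takes the same path as the "every handler declined" case, which matches the second sentence of the commit message.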
2015-08-03A recently added sanity check panic in pf_postprocess_addr() wasJonathan Gray
triggered for a reply-to rule. It turns out this case has been using uninitialised memory as if it were a valid pf pool. As the rest of the function assumes a valid pool for now just return. Problem reported by RD Thrush. ok jung@ mikeb@
2015-07-29Don't use mpls_input() as input handler anymore and instead call itRafael Zalamena
directly. Also protect non mp-safe functions while at it. ok mpi@.
2015-07-21Added OpenBSD CVS tag.Rafael Zalamena
2015-07-21No more AF_LINK addresses on the per-ifp address lists. ok mpi@Jeremie Courreges-Anglas
2015-07-21We don't do 'ARGSUSED' anymoreFlorian Obser
2015-07-21use curproc instead of proc0Florian Obser
pointed out by and OK bluhm@
2015-07-21Put the mbuf_list inside "#ifdef MPLS".Martin Pieuchot
reported by rpe@
2015-07-21- added /* FALLTHROUGH */ comments, typecasts (u_int32_t)-1, ...Alexandr Nedvedicky
ok mpi@
2015-07-20Use the kernel socket interface (sosend(9) etc) instead of shovingFlorian Obser
packets directly into the network stack with ip_output(). The locking is intentionally left as is and will be improved in another commit. Input / OK bluhm@, OK benno@
2015-07-20Remove splassert(IPL_NET) from if_input().Martin Pieuchot
if_input() has been designed to be able to safely handle a batch of packets from physical drivers to the network stack. Most of these drivers have an interrupt routine executed at IPL_NET and the check made sense during the conversion. However we also want to re-enqueue packets with if_input() from the network stack currently running at IPL_SOFTNET. ok claudio@