path: root/sys/net/bpf.c
2018-07-13  Alexander Bluhm
Some USB network interfaces like rum(4) report ENXIO from their ioctl function after the device has been pulled out. Also accept this error code in bpf_detachd() to prevent a kernel panic. tcpdump(8) may run while the interface is detached. from Moritz Buhl; OK stsp@
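A minimal user-space sketch of the idea above (the real change lives in the kernel's bpf_detachd(); the function name here is hypothetical): when the driver reports ENXIO because the hardware is already gone, treat it as success rather than a fatal error.

```c
#include <errno.h>

/* Hypothetical sketch: 'error' is what the driver's ioctl handler
 * returned while we were tearing down promiscuous mode. */
static int
detach_promisc(int error)
{
	if (error == ENXIO)	/* device was pulled out; nothing to undo */
		error = 0;
	return (error);
}
```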
2018-03-02  Alexander Bluhm
Protect the calls to ifpromisc() in bpf(4) with net lock. This affects the bpfioctl() and bpfclose() path. lock assertion reported and fix tested by Pierre Emeriaud; OK visa@
2018-02-19  Martin Pieuchot
Remove the almost unused `flags' argument of suser(). The accounting flag `ASU' will no longer be set, but that makes suser() mpsafe since it no longer messes with a per-process field. No objection from millert@; ok tedu@, bluhm@
2018-02-01  David Gwynne
add bpf_tap_hdr(), for handling a buffer (not an mbuf) with a header. internally it uses mbufs to handle the chain of buffers, but the caller doesn't have to deal with that or allocate a temporary buffer with the header attached. ok mpi@
2018-01-24  David Gwynne
add support for bpf on "subsystems", not just network interfaces.

bpf assumed that it was being unconditionally attached to network interfaces, and maintained a pointer to a struct ifnet *. this was mostly used to get at the name of the interface, which is how userland asks to be attached to a particular interface.

this diff adds a pointer to the name and uses it instead of the interface pointer for these lookups. this in turn allows bpf to be attached to arbitrary subsystems in the kernel which just have to supply a name rather than an interface pointer. for example, bpf could be attached to pf_test so you can see what packets are about to be filtered. mpi@ is using this to look at usb transfers.

bpf still uses the interface pointer for bpfwrite, and for enabling and disabling promisc. however, these are nopped out for subsystems.

ok mpi@
2017-12-30  Philip Guenther
Don't pull in <sys/file.h> just to get fcntl.h. ok deraadt@ krw@
2017-08-11  Martin Pieuchot
Remove NET_LOCK()'s argument. Tested by Hrvoje Popovski, ok bluhm@
2017-05-24  Alexander Bluhm
When using "tcpdump proto 128" the filter never matched. A sign-extension bug in bpf prevented matching protocols above 127: m_data is signed, while bpf_mbuf_ldb() returns unsigned. bug report Matthias Pitzl; OK deraadt@ millert@
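The bug class above can be reproduced in a few lines (names are illustrative, not the kernel's): loading a byte through a signed pointer sign-extends values above 127, so a comparison against 128 can never match.

```c
/* mbuf data is addressed through a signed char pointer; returning it
 * directly sign-extends, so protocol 128 becomes 0xffffff80. */
static unsigned int
load_byte_buggy(const signed char *p)
{
	return (*p);			/* sign-extended */
}

/* Casting to unsigned char first keeps 128 as 128. */
static unsigned int
load_byte_fixed(const signed char *p)
{
	return ((unsigned char)*p);
}
```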
2017-05-04  Alexander Bluhm
Introduce sstosa() for converting sockaddr_storage with a type safe inline function instead of casting it to sockaddr. While there, use inline instead of __inline for all these conversions. Some struct sockaddr casts can be avoided completely. OK dhill@ mpi@
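The conversion reads roughly like this (a sketch; the kernel's actual definition may differ in detail):

```c
#include <sys/socket.h>

/* Type-safe sockaddr_storage -> sockaddr conversion as an inline
 * function, replacing bare casts scattered around the tree. */
static inline struct sockaddr *
sstosa(struct sockaddr_storage *ss)
{
	return ((struct sockaddr *)(ss));
}
```

The win over an open-coded cast is that the compiler now checks the argument really is a sockaddr_storage pointer.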
2017-04-20  Visa Hankala
Tweak lock inits to make the system runnable with witness(4) on amd64 and i386.
2017-01-24  Martin Pieuchot
splsoftnet() to NET_LOCK() in bpfwrite(). ok dlg@, visa@
2017-01-24  Kenneth R Westerback
A space here, a space there. Soon we're talking real whitespace rectification.
2017-01-09  Martin Pieuchot
Use a mutex to serialize accesses to buffer slots. With this change bpf_catchpacket() no longer needs the KERNEL_LOCK(). Tested by Hrvoje Popovski, who reported a recursion in the previous attempt. ok bluhm@
2017-01-03  Martin Pieuchot
Revert previous, there's still a problem with recursive entries in bpf_mpath_ether(). Problem reported by Hrvoje Popovski.
2017-01-02  Martin Pieuchot
Use a mutex to serialize accesses to buffer slots. With this change bpf_catchpacket() no longer needs the KERNEL_LOCK(). ok bluhm@, jmatthew@
2016-11-28  Martin Pieuchot
Make sure the descriptor has been removed from the interface list before we call ifpromisc() and possibly sleep. ok bluhm@
2016-11-21  Martin Pieuchot
Make sure bpf_wakeup() is called at most once when matching conditions are fulfilled in bpf_catchpacket().
2016-11-21  Martin Pieuchot
Rename bpf_reset_d() to match bpf_{attach,detach}d().
2016-11-16  Martin Pieuchot
Use goto in bpf{read,write}() to ease review of locked sections. While here properly account for used reference in bpfwrite(). ok bluhm@
2016-11-16  Martin Pieuchot
Allow bpf_allocbufs() to fail when allocating memory. This will help trading the KERNEL_LOCK for a mutex. ok bluhm@
2016-10-16  Jeremie Courreges-Anglas
Fix bpf_catchpacket comment.
2016-09-12  Kenneth R Westerback
bpf_tap() is long dead! Long live bpf_mtap() & friends. ok natano@ deraadt@
2016-08-22  Martin Pieuchot
Call csignal() and selwakeup() from a KERNEL_LOCK'd task. This will allow us to make bpf_tap() KERNEL_LOCK() free. Discussed with dlg@ and input from guenther@
2016-08-15  Martin Pieuchot
No need to reset si_selpid after calling selwakeup(); the function already does it.
2016-08-15  Martin Pieuchot
Introduce bpf_put() and bpf_get() instead of mixing macros and functions for the reference counting. ok dlg@
2016-08-15  Martin Pieuchot
Check if ``bd_bif'' is NULL inside bpf_catchpacket() to match bpfread() and bpfwrite(), all of which will need to grab a lock to protect the buffers. ok dlg@
2016-08-15  Martin Pieuchot
Merge bpfilter_create() into bpfopen() and make it such that the descriptor is referenced before it is inserted in the global list. ok dlg@
2016-07-25  Martin Natano
Make sure closed bpf devices are removed from bpf_d_list to free the minor number for reuse by the device cloning code. This fixes a panic reported by bluhm@. initial diff from tedu; ok deraadt
2016-06-10  Vincent Gross
Add the "llprio" field to struct ifnet, and the corresponding keyword to ifconfig. "llprio" allows one to set the priority of packets that do not go through pf(4), as is the case for arp(4) or bpf(4). ok sthen@ mikeb@
2016-05-18  David Gwynne
rework the srp api so it takes an srp_ref struct that the caller provides.

the srp_ref struct is used to track the location of the caller's hazard pointer so later calls to srp_follow and srp_enter already know what to clear. this in turn means most of the caveats around using srps go away. specifically, you can now:

- switch cpus while holding an srp ref
- ie, you can sleep while holding an srp ref
- you can take and release srp refs in any order

the original intent was to simplify use of the api when dealing with complicated data structures. the caller now no longer has to track the location of the srp a value was fetched from; the srp_ref effectively does that for you.

srp lists have been refactored to use srp_refs instead of srpl_iter structs. this is in preparation of using srps inside the ART code. ART is a complicated data structure, and lookups require overlapping holds of srp references.

ok mpi@ jmatthew@
2016-05-10  David Gwynne
make the bpf tap functions take const struct mbuf *. this makes it more obvious that the bpf code should only read packets, never modify them. now possible because the paths that care about M_FILDROP set it after calling bpf_mtap. ok mpi@ visa@ deraadt@
2016-04-14  Martin Natano
Enable device cloning for bpf. This allows having just one bpf device node in /dev that services all bpf consumers (up to 1024). Also, disallow the usage of all but the first minor device, so accidental use of another minor device will attract attention.

Cloning bpf offers some advantages:

- Users with high bpf usage won't have to clutter their /dev with device nodes.
- A lot of programs in base use a pattern like this to access bpf:

	int fd, n = 0;
	do {
		(void)snprintf(device, sizeof device, "/dev/bpf%d", n++);
		fd = open(device, mode);
	} while (fd < 0 && errno == EBUSY);

  Those can now be replaced by a simple open(), without the loop.

ok mikeb "right time in the cycle to try" deraadt
2016-04-02  David Gwynne
refactor bpf_filter a bit.

the code was confusing around how it dealt with packets in mbufs vs plain memory buffers with a length. this renames bpf_filter to _bpf_filter, and changes it so the packet memory is referred to by an opaque pointer, and callers have to provide a set of operations to extract values from that opaque pointer.

bpf_filter is now provided as a wrapper around _bpf_filter. it provides a set of operators that work on a straight buffer with a length. this also adds a bpf_mfilter function which takes an mbuf instead of a buffer, and it provides explicit operations for extracting values from mbufs.

if we want to use bpf filters against other data structures (usb or scsi packets maybe?) we are able to provide functions for extracting payloads from them and use _bpf_filter as is.

ok canacar@
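The shape of that refactor can be sketched in user space like this (struct and function names here are illustrative, not the kernel's): the filter core sees only an opaque packet pointer plus caller-supplied operations, so flat buffers and mbuf chains just supply different ops.

```c
#include <stddef.h>

/* Caller-provided operations for extracting values from an opaque
 * packet representation. */
struct pkt_ops {
	unsigned char (*load_byte)(const void *pkt, size_t off);
};

/* Ops for a plain flat buffer. */
static unsigned char
buf_load_byte(const void *pkt, size_t off)
{
	return (((const unsigned char *)pkt)[off]);
}

static const struct pkt_ops buf_ops = { buf_load_byte };

/* A toy "filter": match if the byte at 'off' equals 'want'.  The real
 * engine interprets a whole bpf instruction array the same way. */
static int
filter_match(const struct pkt_ops *ops, const void *pkt,
    size_t off, unsigned char want)
{
	return (ops->load_byte(pkt, off) == want);
}
```

An mbuf-backed variant would only need its own load_byte that walks the chain; the filter core stays unchanged.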
2016-03-30  David Gwynne
remove support for BIOCGQUEUE and BIOCSQUEUE. nothing uses them, and the implementation makes incorrect assumptions about mbufs within bpf processing that could lead to some weird failures. ok sthen@ deraadt@ mpi@
2016-03-29  David Gwynne
make bpf_mtap et al return whether the mbuf should be dropped. ok mpi@
2016-02-12  Stefan Kempf
Convert to uiomove. From Martin Natano.
2016-02-10  David Gwynne
protect the bpf ring with splnet as well as the kernel lock. kernel lock protects it against other cpus, but splnet prevents bpf code running at splsoftnet (eg, like bridge does) from having the rings trampled by a hardware interrupt on the same cpu. ok mpi@ jmatthew@
2016-02-05  David Gwynne
return if the bpf_if passed to bpf_tap and _bpf_mtap is NULL. this works around a TOCTOU bug in a very common idiom in our tree, in between the two lines below:

	if (ifp->if_bpf)
		bpf_mtap(ifp->if_bpf, m, BPF_DIRECTION_OUT);

figured out by and diff from haesbart
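The workaround moves the NULL check into the callee, so the window between the caller's test and the call becomes harmless. A sketch with hypothetical names:

```c
#include <stddef.h>

/* If another CPU cleared ifp->if_bpf between the caller's check and
 * this call, we simply see NULL here and do nothing. */
static int
tap_sketch(const void *bpf_if)
{
	if (bpf_if == NULL)
		return (0);	/* nothing attached; keep the packet */
	/* ... run the attached filters against the packet here ... */
	return (0);
}
```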
2016-01-07  Philip Guenther
Make open(O_NONBLOCK) of tun, tap, and bpf behave like open+ioctl(FIONBIO). problem noted by yasuoka@; ok yasuoka@ millert@
2015-12-05  Ted Unangst
remove old lint annotations
2015-10-07  Martin Pieuchot
Do not call bpf_catchpacket() if another CPU detached a file from the corresponding interface. bpf_tap() and _bpf_mtap() are mostly run without the KERNEL_LOCK. The use of SRPs in these functions gives us the guarantee that manipulated BPF descriptors are alive, but not the associated interface descriptor! And indeed they can be cleared by another CPU running bpf_detachd(). Prevent a race reported by Hrvoje Popovski when closing tcpdump(8) with an IPL_MPSAFE ix(4). ok mikeb@, dlg@, deraadt@
2015-09-29  David Gwynne
make the bpf filters a bpf_program instead of an array of bpf_insn. bpf_program contains a pointer to that same array, but also the number of elements in it. this allows us to know the size when we want to free them. ok deraadt@
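Carrying the count next to the array is what makes a sized free possible (the entry below adds sizes to free calls). A self-contained sketch, with local mirrors of the net/bpf.h types:

```c
#include <stddef.h>

/* Local mirrors of the types from <net/bpf.h>, declared here so the
 * sketch stands alone. */
struct bpf_insn {
	unsigned short	 code;
	unsigned char	 jt, jf;
	unsigned int	 k;
};

struct bpf_program {
	unsigned int	 bf_len;	/* number of instructions */
	struct bpf_insn	*bf_insns;	/* the instruction array */
};

/* With bf_len at hand, the allocation size can be recomputed at free
 * time, e.g. for the kernel's sized free(9). */
static size_t
prog_size(const struct bpf_program *bf)
{
	return (bf->bf_len * sizeof(*bf->bf_insns));
}
```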
2015-09-29  Theo de Raadt
add sizes to some of the simpler free calls. ok mpi
2015-09-13  Martin Pieuchot
There's no point in abstracting ifp->if_output() as long as pf_test() needs to see lo0 in the output path. ok claudio@
2015-09-12  Martin Pieuchot
Stop overwriting the rt_ifp pointer of RTF_LOCAL routes with lo0ifp. Use instead the RTF_LOCAL flag to loop local traffic back to the corresponding protocol queue. With this change rt_ifp is now always the same as rt_ifa->ifa_ifp. ok claudio@
2015-09-11  Martin Pieuchot
FOREACH macro is not safe to use when removing elements from a list. Should fix a NULL dereference reported by guenther@. ok dlg@
2015-09-09  David Gwynne
convert bpf to using an srp list for the list of descriptors. this replaces the hand-rolled list. the code has always used hand-rolled lists, but that gets a bit cumbersome when they're SRPs. requested ages ago by mpi@
2015-09-01  David Gwynne
reintroduce bpf.c r1.121. this differs slightly from 1.121 in that it uses the new srp_follow() to walk the list of descriptors on an interface. this is instead of interleaving srp_enter() and srp_leave(), which can lead to races and corruption if you're touching the same SRPs at different IPLs on the same CPU. ok deraadt@ jmatthew@
2015-08-23  David Gwynne
back out bpf+srp. it's blowing up in a bridge setup. i'll debug this out of the tree.
2015-08-16  David Gwynne
make bpf_mtap mpsafe by using SRPs.

this was originally implemented by jmatthew@ last year, and updated by us both during s2k15. there are four data structures that need to be looked after.

the first is the bpf interface itself. it is allocated and freed at the same time as an actual interface, so if you're able to send or receive packets, you're able to run bpf on an interface too. don't need to do any work there.

the second are bpf descriptors. these represent userland attaching to a bpf interface, so you can have many of them on a single bpf interface. they were arranged in a singly linked list before. now the head and next pointers are replaced with SRP pointers and followed by srp_enter. the list updates are serialised by the kernel lock.

the third are the bpf filters. there is an inbound and an outbound filter on each bpf descriptor, and a process can replace them at any time. the pointers from the descriptor to those are also changed to be accessed via srp_enter. updates are serialised by the kernel lock.

the fourth thing is the ring that bpf writes to for userland to read. there's one of these per descriptor. because these are only updated when a filter matches (which is hopefully a relatively rare event), we take the kernel lock to serialise the writes to the ring.

all this together means you can run bpf against a packet without taking the kernel lock unless you actually caught a packet and need to send it to userland. even better, you can run bpf in parallel, so if we ever support multiple rings on a single interface, we can run bpf on each ring on different cpus safely.

i've hit this pretty hard in production at work (yay dhcrelay) on myx (which does rx outside the biglock). ok jmatthew@ mpi@ millert@