path: root/sys/net/bpf.c
Age | Commit message | Author
2016-01-07 | Make open(O_NONBLOCK) of tun, tap, and bpf behave like open+ioctl(FIONBIO) | Philip Guenther
problem noted by yasuoka@ ok yasuoka@ millert@
2015-12-05 | remove old lint annotations | Ted Unangst
2015-10-07 | Do not call bpf_catchpacket() if another CPU detached a file from the corresponding interface. | Martin Pieuchot
bpf_tap() and _bpf_mtap() are mostly run without the KERNEL_LOCK. The use of SRPs in these functions gives us the guarantee that manipulated BPF descriptors are alive but not the associated interface descriptor! And indeed they can be cleared by another CPU running bpf_detachd(). Prevent a race reported by Hrvoje Popovski when closing tcpdump(8) with an IPL_MPSAFE ix(4). ok mikeb@, dlg@, deraadt@
2015-09-29 | make the bpf filters a bpf_program instead of an array of bpf_insn. | David Gwynne
bpf_program contains a pointer to that same array, but also the number of elements in it. this allows us to know the size when we want to free them. ok deraadt@
2015-09-29 | add sizes to some of the simpler free calls | Theo de Raadt
ok mpi
2015-09-13 | There's no point in abstracting ifp->if_output() as long as pf_test() needs to see lo0 in the output path. | Martin Pieuchot
ok claudio@
2015-09-12 | Stop overwriting the rt_ifp pointer of RTF_LOCAL routes with lo0ifp. | Martin Pieuchot
Use instead the RTF_LOCAL flag to loop local traffic back to the corresponding protocol queue. With this change rt_ifp is now always the same as rt_ifa->ifa_ifp. ok claudio@
2015-09-11 | FOREACH macro is not safe to use when removing elements on a list. | Martin Pieuchot
Should fix a NULL dereference reported by guenther@. ok dlg@
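The hazard this commit names can be sketched outside the kernel. This is a minimal illustration, not the bpf.c code: `struct sketch_node`, `sketch_push()` and `sketch_remove_matching()` are hypothetical names, and the list is hand rolled rather than built on the queue(3) macros. The point is that the successor is saved before the current element may be freed, which a plain FOREACH-style loop does not do.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical singly linked list node, for illustration only. */
struct sketch_node {
	int val;
	struct sketch_node *next;
};

/* Prepend a node and return the new head. */
static struct sketch_node *
sketch_push(struct sketch_node *head, int val)
{
	struct sketch_node *n = malloc(sizeof(*n));

	if (n == NULL)
		abort();
	n->val = val;
	n->next = head;
	return n;
}

/* Remove and free every node holding val; returns how many were removed. */
static int
sketch_remove_matching(struct sketch_node **head, int val)
{
	struct sketch_node **pp = head, *n, *next;
	int removed = 0;

	assert(head != NULL);
	for (n = *head; n != NULL; n = next) {
		next = n->next;		/* saved before n may be freed */
		if (n->val == val) {
			*pp = next;	/* unlink n from its predecessor */
			free(n);
			removed++;
		} else
			pp = &n->next;
	}
	return removed;
}
```

A loop that instead advanced with `n = n->next` after `free(n)` would read freed memory, which is the kind of failure the commit describes.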
2015-09-09 | convert bpf to using an srp list for the list of descriptors. | David Gwynne
this replaces the hand rolled list. the code has always used hand rolled lists, but that gets a bit cumbersome when theyre SRPs. requested ages ago by mpi@
2015-09-01 | reintroduce bpf.c r1.121. | David Gwynne
this differs slightly from 1.121 in that it uses the new srp_follow() to walk the list of descriptors on an interface. this is instead of interleaving srp_enter() and srp_leave(), which can lead to races and corruption if you're touching the same SRPs at different IPLs on the same CPU. ok deraadt@ jmatthew@
2015-08-23 | back out bpf+srp. its blowing up in a bridge setup. | David Gwynne
ill debug this out of the tree.
2015-08-16 | make bpf_mtap mpsafe by using SRPs. | David Gwynne
this was originally implemented by jmatthew@ last year, and updated by us both during s2k15.
there are four data structures that need to be looked after.
the first is the bpf interface itself. it is allocated and freed at the same time as an actual interface, so if you're able to send or receive packets, you're able to run bpf on an interface too. dont need to do any work there.
the second are bpf descriptors. these represent userland attaching to a bpf interface, so you can have many of them on a single bpf interface. they were arranged in a singly linked list before. now the head and next pointers are replaced with SRP pointers and followed by srp_enter. the list updates are serialised by the kernel lock.
the third are the bpf filters. there is an inbound and outbound filter on each bpf descriptor, and a process can replace them at any time. the pointers from the descriptor to those are also changed to be accessed via srp_enter. updates are serialised by the kernel lock.
the fourth thing is the ring that bpf writes to for userland to read. there's one of these per descriptor. because these are only updated when a filter matches (which is hopefully a relatively rare event), we take the kernel lock to serialise the writes to the ring.
all this together means you can run bpf against a packet without taking the kernel lock unless you actually caught a packet and need to send it to userland. even better, you can run bpf in parallel, so if we ever support multiple rings on a single interface, we can run bpf on each ring on different cpus safely.
ive hit this pretty hard in production at work (yay dhcrelay) on myx (which does rx outside the biglock).
ok jmatthew@ mpi@ millert@
2015-06-16 | Store a unique ID, an interface index, rather than a pointer to the receiving interface in the packet header of every mbuf. | Martin Pieuchot
The interface pointer should now be retrieved when necessary with if_get(). If a NULL pointer is returned by if_get(), the interface has probably been destroyed/removed and the mbuf should be freed. Such a mechanism will simplify garbage collection of mbufs and limit problems with dangling ifp pointers. Tested by jmatthew@ and krw@, discussed with many. ok mikeb@, bluhm@, dlg@
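The index-instead-of-pointer idea can be sketched as follows. This is an illustration under stated assumptions, not the kernel implementation: `sketch_iftable`, `sketch_if_get()`, `struct sketch_pkt` and `sketch_input()` are hypothetical stand-ins, and the real if_get() does more than an array lookup. What it shows is that a stale index resolves to NULL instead of to a dangling pointer.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAXIF 8	/* hypothetical table size */

/* Hypothetical stand-in for an interface descriptor. */
struct sketch_ifnet {
	const char *name;
};

/* Slot i holds the interface with index i, or NULL once it is gone. */
static struct sketch_ifnet *sketch_iftable[SKETCH_MAXIF];

/* Resolve an index to an interface; NULL means "no such interface". */
static struct sketch_ifnet *
sketch_if_get(unsigned int idx)
{
	if (idx >= SKETCH_MAXIF)
		return NULL;
	return sketch_iftable[idx];
}

/* A packet records only the index of its receiving interface. */
struct sketch_pkt {
	unsigned int rcvif_idx;
};

/* Returns 1 if the packet can be processed, 0 if it should be dropped. */
static int
sketch_input(const struct sketch_pkt *p)
{
	struct sketch_ifnet *ifp;

	assert(p != NULL);
	ifp = sketch_if_get(p->rcvif_idx);
	if (ifp == NULL)
		return 0;	/* interface went away: caller frees the packet */
	return 1;
}
```

Because the packet never holds the pointer itself, removing an interface only requires clearing its table slot; packets still queued with that index are detected at lookup time.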
2015-05-13 | test mbuf pointers against NULL not 0 | Jonathan Gray
ok krw@ miod@
2015-02-10 | First step towards making uiomove() take a size_t size argument: | Miod Vallat
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@
2015-02-10 | make bpf(4) able to filter based on a pf(4) queue ID for tcpdump -Q qname | Martin Pelikan
ALTQ version has been on tech@ for years, people were generally ok with it. ok henning
2015-01-29 | back bpf.c down to 1.113, from before most recent timeout changes. | Ted Unangst
nmap is broken, as reported by kent fritz. pending further investigation, we should keep nmap working until a better fix is developed for the original problem.
2015-01-28 | when doing a blocking read with a timeout, after the sleep reset the start time so the next read behaves the same. | David Gwynne
from Simon Mages
2015-01-09 | correctly handle no timeouts and make timeout handling in general better. | Ted Unangst
problem reported by Mages Simon ok guenther
2014-12-16 | primary change: move uvm_vnode out of vnode, keeping only a pointer. | Ted Unangst
objective: vnode.h doesn't include uvm_extern.h anymore. followup changes: include uvm_extern.h or lock.h where necessary. ok and help from deraadt
2014-12-02 | replace some malloc multiplies with mallocarray. ok deraadt henning | Ted Unangst
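The reason to avoid an open-coded multiplication in an allocation size is that `nmemb * size` can wrap around and yield a buffer that is too small. The sketch below is a userland illustration of that overflow check only; `sketch_allocarray()` is a hypothetical name and this is not the kernel's mallocarray(9) implementation.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate room for nmemb objects of the given size, refusing the
 * request if the multiplication would not fit in a size_t.
 */
static void *
sketch_allocarray(size_t nmemb, size_t size)
{
	if (size != 0 && nmemb > SIZE_MAX / size)
		return NULL;	/* nmemb * size would wrap around */
	return malloc(nmemb * size);
}
```

The division-based test is used because the product itself cannot be inspected for overflow after the fact: by then it has already wrapped.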
2014-11-23 | length argument for some free() calls; ok doug | Theo de Raadt
2014-10-07 | when running bpf on an outgoing vlan interface whose parent doesnt offload the tag insertion, we need to chop the vlan subheader out before the filter is run, not after. | David Gwynne
this moves the mbuf surgery out from the bpf layer into the vlan layer. ok henning@ jmatthew@
2014-09-23 | lock around the sysctl code that sets the bpf buffer sizes so if we ever get multiple processes in the kernel these sets cant race and allow people to set the default greater than the max. | David Gwynne
2014-09-22 | remove a stupid comment above bpfilterattach about how we dont do anything in it cos its only called on new systems, when it actually does. | David Gwynne
we dont care about old or new systems, just ours. the code is called, the fact that it exists is enough to demonstrate that.
2014-09-22 | stash a pointer to bpf_d in the knotes kn_hook instead of the device id. | David Gwynne
we refcount the bpf_d memory correctly so it cant go away. possibly worse is the bpf minor id could be reused between the kq calls, so this seems safer to me. also avoids a list walk on each op cos the ptr is just there.
2014-09-22 | it's easy to allow bpfwrites bigger than MCLBYTES now that we have large cluster pools and MCLGETI. | David Gwynne
we could chain mbufs if we want to go even bigger. with a fix from Mathieu- <naabed at poolp dot org>
2014-09-22 | if you request a read timeout and then use kqueues to wait for them, you end up waiting until the ring is full cos the timeout doesnt get set up when the knote is registered. | David Gwynne
2014-09-19 | passing M_NOWAIT to m_tag_get means it can fail, which could hit the failure path which leaks all the stuff the previous code in bpf_movein allocates. | David Gwynne
since it's only called from bpfwrite, use M_WAIT instead to make it reliable and just get rid of the bogus failure code. ok miod@
2014-07-12 | add a size argument to free. will be used soon, but for now default to 0. | Ted Unangst
after discussions with beck deraadt kettenis.
2014-07-12 | sizeof(afh), afh being uint32, is cooler than literal "4" | Henning Brauer
spotted by Kent R. Spillner <kspillner acm org>
2014-07-10 | time to claim copyright | Henning Brauer
2014-07-10 | some say you don't need NULL checks before free(). Not 0 either. | Henning Brauer
2014-07-10 | introduce the revolutionary concept of NULL pointers. ok gcc | Henning Brauer
2014-07-10 | introduce bpf_mcopy_stripvlan, which cuts the 4 extra bytes out of the ether_vlan_header to make it a regular ether_header while copying into the bpf buffer. | Henning Brauer
add bpf_mtap_stripvlan, which is a 1-line wrapper around _bpf_mtap passing this copy function in. ok benno
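The "cut 4 bytes while copying" step can be sketched on a flat buffer. This is only an illustration of the byte layout, assuming a standard 802.1Q frame (12 bytes of destination and source address, then a 4-byte tag, then the original EtherType); `sketch_strip_vlan()` and the two constants are hypothetical names, and the real copy function works on mbuf chains rather than a contiguous array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_ADDRS_LEN	12	/* destination + source MAC address */
#define SKETCH_VLAN_TAG_LEN	4	/* 802.1Q TPID + TCI */

/*
 * Copy a VLAN-tagged frame from src to dst, leaving out the 4-byte tag.
 * Returns the number of bytes written, or 0 if the frame is too short.
 * dst must have room for at least len - 4 bytes.
 */
static size_t
sketch_strip_vlan(const unsigned char *src, size_t len, unsigned char *dst)
{
	size_t rest;

	assert(src != NULL && dst != NULL);
	if (len < SKETCH_ADDRS_LEN + SKETCH_VLAN_TAG_LEN)
		return 0;
	rest = len - SKETCH_ADDRS_LEN - SKETCH_VLAN_TAG_LEN;
	/* the two addresses are copied unchanged */
	memcpy(dst, src, SKETCH_ADDRS_LEN);
	/* skip the tag; the inner EtherType and payload follow the addresses */
	memcpy(dst + SKETCH_ADDRS_LEN,
	    src + SKETCH_ADDRS_LEN + SKETCH_VLAN_TAG_LEN, rest);
	return len - SKETCH_VLAN_TAG_LEN;
}
```

After the copy, the inner EtherType sits directly behind the addresses, so the result parses as an ordinary Ethernet header.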
2014-07-09 | Add support for bpfwrite on DLT_LOOP interfaces. | YASUOKA Masahiko
ok guenther
2014-07-09 | Herr Reyk correctly pointed out that we don't need the if_pflog.h include here any more | Henning Brauer
2014-07-09 | tedu bpf_mtap_pflog(). | Henning Brauer
now that it is a trivial wrapper around the extended bpf_mtap_hdr, we can use bpf_mtap_hdr directly. added benefit: pflog_bpfcopy doesn't need to be exported any more and can stay private to if_pflog.c ok benno bluhm reyk
2014-07-09 | bpf code surgery / shuffling / simplification. | Henning Brauer
the various bpf_mtap_* are very similar, they differ in what (and to some extent how) they prepend something, and what copy function they pass to bpf_catchpacket.
use an internal _bpf_mtap as "backend" for bpf_mtap and friends.
extend bpf_mtap_hdr so that it covers all common cases: if dlen is 0, nothing gets prepended. copy function can be given, if NULL the default bpf_mcopy is used.
adjust the existing bpf_mtap_hdr users to pass a NULL ptr for the copy fn.
re-implement bpf_mtap_af as simple wrapper for bpf_mtap_hdr.
re-implement bpf_mtap_ether using bpf_mtap_hdr
re-implement bpf_mtap_pflog as trivial bpf_mtap_hdr wrapper
ok bluhm benno
2014-04-23 | Don't attempt to deal with link types supported by no drivers in the tree. | Jeremie Courreges-Anglas
ok henning@
2014-04-14 | "struct pkthdr" holds a routing table ID, not a routing domain one. | Martin Pieuchot
Avoid the confusion by using an appropriate name for the variable. Note that since routing domain IDs are a subset of the set of routing table IDs, the following idiom is correct: rtableid = rdomain But to get the routing domain ID corresponding to a given routing table ID, you must call rtable_l2(9). claudio@ likes it, ok mikeb@
2014-03-30 | Eliminates struct pcred by moving the real and saved ugids into struct ucred; struct process then directly links to the ucred | Philip Guenther
Based on a discussion at c2k10 or so before noting that FreeBSD and NetBSD did this too. ok matthew@
2013-12-24 | rearrange/correct timeout conditionals to work better. | Ted Unangst
fixes negative timeout panics. tested by sthen.
2013-11-29 | panics still being reported. send bpf.c back to 1.85 | Ted Unangst
2013-11-17 | speeling | David Gwynne
2013-11-15 | calculate the line in the sand before comparing it to ticks, which looks more like the original conditional. | David Gwynne
if this doesnt fix rd thrushs panic, then this should be reverted to r1.85.
2013-11-12 | try bpf.c r1.84 again, this time without semantic changes to if statements. | David Gwynne
cheers to sthen@ and krw@ for properly dealing with the fallout of my first commit.
2013-11-11 | Revert bpf.c 1.84 / bpfdesc.h 1.19 for now, "panic: timeout_add: to_ticks (-1) < 0" seen by RD Thrush, http://article.gmane.org/gmane.os.openbsd.bugs/20113 where he has a long-running process using bpf which is active at the time of panic. | Stuart Henderson
krw@ agrees with reverting for now.
2013-11-11 | replace the use of ticks in a condition like "interval + start < ticks" with "ticks - start > interval" because the latter copes with the ticks value wrapping. | David Gwynne
pointed out by guenther@ ok krw@
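The difference between the two conditions shows up only when the tick counter wraps. The sketch below is an illustration, not the bpf.c code: `sketch_elapsed()` and `sketch_elapsed_naive()` are hypothetical helpers, and the subtraction is done on unsigned values so the wrap is well defined in portable C (the commit's condition is written on the kernel's own tick variable).

```c
#include <assert.h>
#include <limits.h>

/*
 * Wrap-tolerant form, "ticks - start > interval": the difference is
 * taken modulo 2^N, so it stays correct when now has wrapped past
 * start.
 */
static int
sketch_elapsed(int now, int start, int interval)
{
	unsigned int delta;

	assert(interval >= 0);
	delta = (unsigned int)now - (unsigned int)start;
	return delta > (unsigned int)interval;
}

/*
 * Naive form, "interval + start < ticks": the deadline is compared
 * against the raw counter, so a wrapped (now very negative) counter
 * never appears to pass it. The sum is widened here only to keep the
 * sketch free of signed overflow.
 */
static int
sketch_elapsed_naive(int now, int start, int interval)
{
	return (long long)interval + start < (long long)now;
}
```

With `start` a few ticks below `INT_MAX` and `now` a few ticks above `INT_MIN`, ten ticks have really passed; the subtraction form sees that, while the comparison against `interval + start` does not.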
2012-12-28 | change the malloc(9) flags from M_DONTWAIT to M_NOWAIT; OK millert@ | Gleydson Soares