Age | Commit message | Author |
|
ioctl function after the device has been pulled out. Also accept
this error code in bpf_detachd() to prevent a kernel panic. tcpdump(8)
may run while the interface is detached.
from Moritz Buhl; OK stsp@
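A minimal sketch of the shape of this fix; the exact errno is named in the (truncated) first line of the commit, so ENXIO below is an assumption:
	/* sketch, not the committed diff: the interface backing this
	 * descriptor is gone, so fail the ioctl instead of panicking */
	if (d->bd_bif == NULL)
		return (ENXIO);		/* assumed errno */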
|
|
affects the bpfioctl() and bpfclose() path.
lock assertion reported and fix tested by Pierre Emeriaud; OK visa@
|
|
The account flag `ASU' will no longer be set, but that makes suser()
mpsafe since it no longer messes with a per-process field.
No objection from millert@, ok tedu@, bluhm@
|
|
internally it uses mbufs to handle the chain of buffers, but the
caller doesn't have to deal with that or allocate a temporary buffer
with the header attached.
ok mpi@
|
|
bpf assumed that it was being unconditionally attached to network
interfaces, and maintained a pointer to a struct ifnet *. this was
mostly used to get at the name of the interface, which is how
userland asks to be attached to a particular interface. this diff
adds a pointer to the name and uses it instead of the interface
pointer for these lookups. this in turn allows bpf to be attached
to arbitrary subsystems in the kernel which just have to supply a
name rather than an interface pointer. for example, bpf could be
attached to pf_test so you can see what packets are about to be
filtered. mpi@ is using this to look at usb transfers.
bpf still uses the interface pointer for bpfwrite, and for enabling
and disabling promisc. however, these are nopped out for subsystems.
ok mpi@
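A rough sketch of what name-based attachment enables; bpfsattach() and the names below are illustrative, not necessarily the committed API:
	/* a kernel subsystem with no struct ifnet can still offer a tap */
	caddr_t pftest_bpf;	/* hypothetical tap pointer for pf_test */

	/* attach by name only; no interface pointer required */
	bpfsattach(&pftest_bpf, "pftest", DLT_LOOP, 0);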
|
|
ok deraadt@ krw@
|
|
Tested by Hrvoje Popovski, ok bluhm@
|
|
expansion bug in bpf prevented protocols above 127. m_data is
signed, bpf_mbuf_ldb() returns unsigned.
bug report Matthias Pitzl; OK deraadt@ millert@
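The bug class is easy to demonstrate in a standalone program (an illustration, not the kernel code): a byte read through a signed char sign-extends, so values above 127 turn negative.
	#include <stdio.h>

	int
	main(void)
	{
		char m_data[] = { (char)0x88 };	/* e.g. protocol 136 */
		unsigned char *u = (unsigned char *)m_data;

		/* where char is signed, 0x88 promotes to -120 */
		printf("signed read:   %d\n", m_data[0]);
		/* an unsigned read keeps the value: 136 */
		printf("unsigned read: %d\n", u[0]);
		return (0);
	}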
|
|
inline function instead of casting it to sockaddr. While there,
use inline instead of __inline for all these conversions. Some
struct sockaddr casts can be avoided completely.
OK dhill@ mpi@
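A minimal example of the pattern (a sketch in the style of the tree's conversion helpers):
	/* a typed inline conversion instead of a bare cast at each site */
	static inline struct sockaddr *
	sintosa(struct sockaddr_in *sin)
	{
		return ((struct sockaddr *)(sin));
	}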
|
|
on amd64 and i386.
|
|
ok dlg@, visa@
|
|
rectification.
|
|
With this change bpf_catchpacket() no longer needs the KERNEL_LOCK().
Tested by Hrvoje Popovski, who reported a recursion in the previous
attempt.
ok bluhm@
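A sketch of the direction, assuming the buffer state becomes protected by a per-descriptor mutex (bd_mtx below is an assumption):
	/* serialise buffer state with a mutex instead of the big lock */
	mtx_enter(&d->bd_mtx);
	bpf_catchpacket(d, pkt, pktlen, slen, cpfn);
	mtx_leave(&d->bd_mtx);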
|
|
bpf_mpath_ether().
Problem reported by Hrvoje Popovski.
|
|
With this change bpf_catchpacket() no longer needs the KERNEL_LOCK().
ok bluhm@, jmatthew@
|
|
before we call ifpromisc() and possibly sleep.
ok bluhm@
|
|
are fulfilled in bpf_catchpacket().
|
|
|
|
While here, properly account for the used reference in bpfwrite().
ok bluhm@
|
|
This will help trade the KERNEL_LOCK for a mutex.
ok bluhm@
|
|
|
|
ok natano@ deraadt@
|
|
This will allow us to make bpf_tap() KERNEL_LOCK() free.
Discussed with dlg@ and input from guenther@
|
|
already does it.
|
|
for the reference counting.
ok dlg@
|
|
and bpfwrite(), all of which will need to grab a lock to protect the
buffers.
ok dlg@
|
|
descriptor is referenced before it is inserted in the global list.
ok dlg@
|
|
minor number for reuse by the device cloning code. This fixes a panic
reported by bluhm@.
initial diff from tedu
ok deraadt
|
|
to ifconfig.
"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).
ok sthen@ mikeb@
|
|
the srp_ref struct is used to track the location of the caller's
hazard pointer so later calls to srp_follow and srp_enter already
know what to clear. this in turn means most of the caveats around
using srps go away. specifically, you can now:
- switch cpus while holding an srp ref
- ie, you can sleep while holding an srp ref
- you can take and release srp refs in any order
the original intent was to simplify use of the api when dealing
with complicated data structures. the caller no longer has to
track the location of the srp a value was fetched from; the srp_ref
effectively does that for you.
srp lists have been refactored to use srp_refs instead of srpl_iter
structs.
this is in preparation of using srps inside the ART code. ART is a
complicated data structure, and lookups require overlapping holds
of srp references.
ok mpi@ jmatthew@
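A sketch of the resulting calling convention (list and member names are illustrative):
	struct srp_ref sr;
	struct bpf_d *d;

	/* sr records where the hazard pointer lives, so the walk needs
	 * no extra bookkeeping from the caller */
	SRPL_FOREACH(d, &sr, &bp->bif_dlist, bd_next) {
		/* safe to sleep or migrate cpus while sr is held */
	}
	SRPL_LEAVE(&sr);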
|
|
this makes it more obvious that the bpf code should only read
packets, never modify them.
now possible because the paths that care about M_FILDROP set it
after calling bpf_mtap.
ok mpi@ visa@ deraadt@
|
|
node in /dev that services all bpf consumers (up to 1024). Also,
disallow the usage of all but the first minor device, so accidental use
of another minor device will attract attention.
Cloning bpf offers some advantages:
- Users with high bpf usage won't have to clutter their /dev with device
nodes.
- A lot of programs in base use a pattern like this to access bpf:
	int fd, n = 0;
	do {
		(void)snprintf(device, sizeof device, "/dev/bpf%d", n++);
		fd = open(device, mode);
	} while (fd < 0 && errno == EBUSY);
Those can now be replaced by a simple open(), without loop.
ok mikeb
"right time in the cycle to try" deraadt
|
|
the code was confusing around how it dealt with packets in mbufs
vs plain memory buffers with a length.
this renames bpf_filter to _bpf_filter, and changes it so the packet
memory is referred to by an opaque pointer, and callers have to
provide a set of operations to extract values from that opaque pointer.
bpf_filter is now provided as a wrapper around _bpf_filter. it
provides a set of operators that work on a straight buffer with a
length.
this also adds a bpf_mfilter function which takes an mbuf instead
of a buffer, and it provides explicit operations for extracting
values from mbufs.
if we want to use bpf filters against other data structures (usb
or scsi packets maybe?) we are able to provide functions for
extracting payloads from them and use _bpf_filter as is.
ok canacar@
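A sketch of the resulting split; the prototypes below illustrate the shape and may not match the committed ones exactly:
	/* callers hand _bpf_filter() an opaque packet plus load ops */
	struct bpf_ops {
		u_int32_t	(*ldw)(const void *, u_int32_t, int *);
		u_int32_t	(*ldh)(const void *, u_int32_t, int *);
		u_int32_t	(*ldb)(const void *, u_int32_t, int *);
	};

	u_int	_bpf_filter(const struct bpf_insn *, const struct bpf_ops *,
		    const void *, u_int);
bpf_filter() then wraps this with straight-buffer operations, and bpf_mfilter() with mbuf-aware ones.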
|
|
nothing uses them, and the implementation makes incorrect assumptions
about mbufs within bpf processing that could lead to some weird
failures.
ok sthen@ deraadt@ mpi@
|
|
ok mpi@
|
|
|
|
kernel lock protects it against other cpus, but splnet prevents bpf
code running at splsoftnet (eg, like bridge does) from having the
rings trampled by a hardware interrupt on the same cpu.
ok mpi@ jmatthew@
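The combination looks roughly like this (sketch):
	int s;

	KERNEL_LOCK();		/* keep other cpus away from the rings */
	s = splnet();		/* keep this cpu's hw interrupts away too */
	/* ... touch the buffer rings ... */
	splx(s);
	KERNEL_UNLOCK();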
|
|
this works around a toctou bug in a very common idiom in our tree,
in between the two lines below:
	if (ifp->if_bpf)
		bpf_mtap(ifp->if_bpf, m, BPF_DIRECTION_OUT);
figured out by and diff from haesbaert
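One way to close that window, shown as a sketch (the committed workaround may differ): read the pointer once and use only the local copy.
	caddr_t if_bpf = ifp->if_bpf;

	if (if_bpf)
		bpf_mtap(if_bpf, m, BPF_DIRECTION_OUT);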
|
|
problem noted by yasuoka@
ok yasuoka@ millert@
|
|
|
|
corresponding interface.
bpf_tap() and _bpf_mtap() are mostly run without the KERNEL_LOCK. The
use of SRPs in these functions guarantees that the manipulated BPF
descriptors are alive, but not the associated interface descriptor!
And indeed they can be cleared by another CPU running bpf_detachd().
Prevent a race reported by Hrvoje Popovski when closing tcpdump(8) with
an IPL_MPSAFE ix(4).
ok mikeb@, dlg@, deraadt@
|
|
bpf_program contains a pointer to that same array, but also the
number of elements in it. this allows us to know the size when we
want to free them.
ok deraadt@
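This matches the public structure, and allows a sized free (sketch; M_DEVBUF and the variable names are illustrative):
	struct bpf_program {
		u_int		 bf_len;	/* number of instructions */
		struct bpf_insn	*bf_insns;	/* the instruction array */
	};

	/* with the count kept alongside the pointer, freeing is easy */
	free(prog.bf_insns, M_DEVBUF, prog.bf_len * sizeof(*prog.bf_insns));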
|
|
ok mpi
|
|
needs to see lo0 in the output path.
ok claudio@
|
|
Use instead the RTF_LOCAL flag to loop local traffic back to the
corresponding protocol queue.
With this change rt_ifp is now always the same as rt_ifa->ifa_ifp.
ok claudio@
|
|
Should fix a NULL dereference reported by guenther@.
ok dlg@
|
|
this replaces the hand rolled list. the code has always used hand
rolled lists, but that gets a bit cumbersome when they're SRPs.
requested ages ago by mpi@
|
|
this differs slightly from 1.121 in that it uses the new srp_follow()
to walk the list of descriptors on an interface. this is instead
of interleaving srp_enter() and srp_leave(), which can lead to races
and corruption if you're touching the same SRPs at different IPLs
on the same CPU.
ok deraadt@ jmatthew@
|
|
i'll debug this out of the tree.
|
|
this was originally implemented by jmatthew@ last year, and updated
by us both during s2k15.
there are four data structures that need to be looked after.
the first is the bpf interface itself. it is allocated and freed
at the same time as an actual interface, so if you're able to send
or receive packets, you're able to run bpf on an interface too.
don't need to do any work there.
the second are bpf descriptors. these represent userland attaching
to a bpf interface, so you can have many of them on a single bpf
interface. they were arranged in a singly linked list before. now
the head and next pointers are replaced with SRP pointers and
followed by srp_enter. the list updates are serialised by the kernel
lock.
the third are the bpf filters. there is an inbound and outbound
filter on each bpf descriptor, and a process can replace them at
any time. the pointers from the descriptor to those are also changed
to be accessed via srp_enter. updates are serialised by the kernel
lock.
the fourth thing is the ring that bpf writes to for userland to
read. there's one of these per descriptor. because these are only
updated when a filter matches (which is hopefully a relatively rare
event), we take the kernel lock to serialise the writes to the ring.
all this together means you can run bpf against a packet without
taking the kernel lock unless you actually caught a packet and need
to send it to userland. even better, you can run bpf in parallel,
so if we ever support multiple rings on a single interface, we can
run bpf on each ring on different cpus safely.
i've hit this pretty hard in production at work (yay dhcrelay) on
myx (which does rx outside the biglock).
ok jmatthew@ mpi@ millert@
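Put together, the unlocked hot path looks roughly like this sketch; the srp calls follow the two-argument srp_leave() of that era and the names are illustrative, not the committed diff:
	/* 3: fetch and run the filter without the kernel lock */
	bf = srp_enter(&d->bd_rfilter);
	slen = bpf_filter(bf->bf_insns, pkt, pktlen, 0);
	srp_leave(&d->bd_rfilter, bf);

	if (slen > 0) {
		/* 4: only a caught packet takes the kernel lock */
		KERNEL_LOCK();
		bpf_catchpacket(d, pkt, pktlen, slen, cpfn);
		KERNEL_UNLOCK();
	}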
|