Age | Commit message | Author |
|
corresponding interface.
bpf_tap() and _bpf_mtap() are mostly run without the KERNEL_LOCK. The
use of SRPs in these functions gives us the guarantee that manipulated
BPF descriptors are alive but not the associated interface descriptor!
And indeed they can be cleared by another CPU running bpf_detachd().
Prevent a race reported by Hrvoje Popovski when closing tcpdump(8) with
an IPL_MPSAFE ix(4).
ok mikeb@, dlg@, deraadt@
|
|
bpf_program contains a pointer to that same array, but also the
number of elements in it. this allows us to know the size when we
want to free them.
ok deraadt@
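
A minimal userland sketch of the idea, not the kernel code: keeping the element count next to the array pointer lets the owner compute the allocation size at free time. The names `struct insn`, `struct prog`, `prog_alloc()`, `prog_size()` and `prog_free()` are hypothetical and only stand in for the real structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a single filter instruction. */
struct insn {
	unsigned short code;
	unsigned int k;
};

/* Hypothetical wrapper: the array pointer plus its element count. */
struct prog {
	size_t len;		/* number of elements in insns */
	struct insn *insns;	/* the instruction array */
};

/* Allocate a program with room for n instructions; NULL on failure. */
struct prog *
prog_alloc(size_t n)
{
	struct prog *p = malloc(sizeof(*p));

	if (p == NULL)
		return NULL;
	p->insns = calloc(n, sizeof(*p->insns));
	if (p->insns == NULL && n != 0) {
		free(p);
		return NULL;
	}
	p->len = n;
	return p;
}

/* Size in bytes of the array, derived from the stored count. */
size_t
prog_size(const struct prog *p)
{
	assert(p != NULL);
	return p->len * sizeof(*p->insns);
}

/* Release the array and the wrapper. */
void
prog_free(struct prog *p)
{
	if (p == NULL)
		return;
	free(p->insns);
	free(p);
}
```

In a kernel whose free routine takes a size argument, the value returned by something like `prog_size()` is what would be passed along with the pointer.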
|
|
ok mpi
|
|
needs to see lo0 in the output path.
ok claudio@
|
|
Use instead the RTF_LOCAL flag to loop local traffic back to the
corresponding protocol queue.
With this change rt_ifp is now always the same as rt_ifa->ifa_ifp.
ok claudio@
|
|
Should fix a NULL dereference reported by guenther@.
ok dlg@
|
|
this replaces the hand rolled list. the code has always used hand
rolled lists, but that gets a bit cumbersome when theyre SRPs.
requested ages ago by mpi@
|
|
this differs slightly from 1.121 in that it uses the new srp_follow()
to walk the list of descriptors on an interface. this is instead
of interleaving srp_enter() and srp_leave(), which can lead to races
and corruption if you're touching the same SRPs at different IPLs
on the same CPU.
ok deraadt@ jmatthew@
|
|
ill debug this out of the tree.
|
|
this was originally implemented by jmatthew@ last year, and updated
by us both during s2k15.
there are four data structures that need to be looked after.
the first is the bpf interface itself. it is allocated and freed
at the same time as an actual interface, so if you're able to send
or receive packets, you're able to run bpf on an interface too.
dont need to do any work there.
the second are bpf descriptors. these represent userland attaching
to a bpf interface, so you can have many of them on a single bpf
interface. they were arranged in a singly linked list before. now
the head and next pointers are replaced with SRP pointers and
followed by srp_enter. the list updates are serialised by the kernel
lock.
the third are the bpf filters. there is an inbound and outbound
filter on each bpf descriptor, and a process can replace them at
any time. the pointers from the descriptor to those are also changed
to be accessed via srp_enter. updates are serialised by the kernel
lock.
the fourth thing is the ring that bpf writes to for userland to
read. there's one of these per descriptor. because these are only
updated when a filter matches (which is hopefully a relatively rare
event), we take the kernel lock to serialise the writes to the ring.
all this together means you can run bpf against a packet without
taking the kernel lock unless you actually caught a packet and need
to send it to userland. even better, you can run bpf in parallel,
so if we ever support multiple rings on a single interface, we can
run bpf on each ring on different cpus safely.
ive hit this pretty hard in production at work (yay dhcrelay) on
myx (which does rx outside the biglock).
ok jmatthew@ mpi@ millert@
|
|
receiving interface in the packet header of every mbuf.
The interface pointer should now be retrieved when necessary with
if_get(). If a NULL pointer is returned by if_get(), the interface
has probably been destroyed/removed and the mbuf should be freed.
Such a mechanism will simplify garbage collection of mbufs and limit
problems with dangling ifp pointers.
Tested by jmatthew@ and krw@, discussed with many.
ok mikeb@, bluhm@, dlg@
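
A minimal userland sketch of the lookup-then-check pattern described above, assuming the packet header carries an interface index that is resolved through a table. `struct iface`, `struct pkt`, `iface_attach()`, `iface_detach()`, `iface_get()` and `pkt_input()` are hypothetical names for illustration; the real if_get() lives in the kernel and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IFS 8

/* Hypothetical interface descriptor. */
struct iface {
	unsigned int index;
	const char *name;
};

/* Hypothetical packet header: stores an index, not a pointer. */
struct pkt {
	unsigned int rcvif_idx;
};

/* A slot is NULL when no interface with that index exists. */
static struct iface *if_table[MAX_IFS];

void
iface_attach(struct iface *ifp)
{
	assert(ifp != NULL && ifp->index < MAX_IFS);
	if_table[ifp->index] = ifp;
}

void
iface_detach(unsigned int idx)
{
	if (idx < MAX_IFS)
		if_table[idx] = NULL;
}

/* Resolve an index; NULL means the interface is gone. */
struct iface *
iface_get(unsigned int idx)
{
	if (idx >= MAX_IFS)
		return NULL;
	return if_table[idx];
}

/* Returns 1 if the packet was processed, 0 if it had to be dropped. */
int
pkt_input(const struct pkt *p)
{
	struct iface *ifp = iface_get(p->rcvif_idx);

	if (ifp == NULL)
		return 0;	/* interface went away: caller frees the packet */
	return 1;
}
```

Because the packet holds only an index, a packet that outlives its interface resolves to NULL instead of following a dangling pointer.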
|
|
ok krw@ miod@
|
|
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as a uiomove() wrapper.
ok kettenis@
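
A rough sketch of the wrapper pattern in the list above, using hypothetical userland names: `copy_bytes()` stands in for the size_t-based routine and `copy_bytes_i()` for the legacy int-based one. Returning -1 for a negative length is an assumption made for this sketch; the log does not say how the real wrapper handles that case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical size_t-based routine; returns 0 on success. */
int
copy_bytes(void *dst, const void *src, size_t n)
{
	assert(dst != NULL && src != NULL);
	memcpy(dst, src, n);
	return 0;
}

/*
 * Hypothetical legacy variant that takes an int length.
 * It only validates the sign and forwards to copy_bytes().
 * Returning -1 on a negative length is an assumption of this sketch.
 */
int
copy_bytes_i(void *dst, const void *src, int n)
{
	if (n < 0)
		return -1;
	return copy_bytes(dst, src, (size_t)n);
}
```

Existing callers keep the int signature while new callers can move to the size_t version one at a time.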
|
|
ALTQ version has been on tech@ for years, people were generally ok with it.
ok henning
|
|
nmap is broken, as reported by kent fritz.
pending further investigation, we should keep nmap working until a
better fix is developed for the original problem.
|
|
the start time so the next read behaves the same.
from Simon Mages
|
|
problem reported by Mages Simon
ok guenther
|
|
objective: vnode.h doesn't include uvm_extern.h anymore.
followup changes: include uvm_extern.h or lock.h where necessary.
ok and help from deraadt
|
|
parent that doesnt offload the tag insertion, we need to chop the
vlan subheader out before the filter is run, not after.
this moves the mbuf surgery out from the bpf layer into the vlan
layer.
ok henning@ jmatthew@
|
|
get multiple processes in the kernel these sets cant race and allow people
to set the default greater than the max.
|
|
in it cos its only called on new systems, when it actually does.
we dont care about old or new systems, just ours. the code is called, the
fact that it exists is enough to demonstrate that.
|
|
we refcount the bpf_d memory correctly so it cant go away. possibly worse
is the bpf minor id could be reused between the kq calls, so this seems
safer to me. also avoids a list walk on each op cos the ptr is just there.
|
|
large cluster pools and MCLGETI.
we could chain mbufs if we want to go even bigger.
with a fix from Mathieu- <naabed at poolp dot org>
|
|
end up waiting until the ring is full cos the timeout doesnt get set up
when the knote is registered.
|
|
the failure path which leaks all the stuff the previous code in
bpf_movein allocates.
since it's only called from bpfwrite, use M_WAIT instead to make
it reliable and just get rid of the bogus failure code.
ok miod@
|
|
after discussions with beck deraadt kettenis.
|
|
spotted by Kent R. Spillner <kspillner acm org>
|
|
ether_vlan_header to make it a regular ether_header while copying into
the bpf buffer.
add bpf_mtap_stripvlan, which is a 1-line wrapper around _bpf_mtap passing
this copy function in.
ok benno
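
A minimal sketch of a copy function that drops the VLAN subheader while copying, operating on a flat byte buffer rather than an mbuf chain. It assumes the standard 802.1Q layout (12 bytes of addresses, a 4-byte tag, then the encapsulated ethertype). `copy_strip_vlan()` and the two constants are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ETHER_ADDRS_LEN	12	/* destination + source MAC addresses */
#define VLAN_TAG_LEN	4	/* 802.1Q TPID (2 bytes) + TCI (2 bytes) */

/*
 * Copy a VLAN-tagged frame from src to dst without the 4-byte tag,
 * so dst starts with a regular Ethernet header.
 * dst must have room for len - VLAN_TAG_LEN bytes.
 * Returns the number of bytes written, or 0 if the frame is too
 * short to contain a tagged header.
 */
size_t
copy_strip_vlan(unsigned char *dst, const unsigned char *src, size_t len)
{
	size_t rest;

	assert(dst != NULL && src != NULL);
	if (len < ETHER_ADDRS_LEN + VLAN_TAG_LEN + 2)
		return 0;

	/* The addresses are copied unchanged. */
	memcpy(dst, src, ETHER_ADDRS_LEN);

	/* Skip the tag; the encapsulated ethertype and payload follow. */
	rest = len - ETHER_ADDRS_LEN - VLAN_TAG_LEN;
	memcpy(dst + ETHER_ADDRS_LEN,
	    src + ETHER_ADDRS_LEN + VLAN_TAG_LEN, rest);

	return len - VLAN_TAG_LEN;
}
```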
|
|
ok guenther
|
|
here any more
|
|
now that it is a trivial wrapper around the extended bpf_mtap_hdr, we can
use bpf_mtap_hdr directly. added benefit: pflog_bpfcopy doesn't need to
be exported any more and can stay private to if_pflog.c
ok benno bluhm reyk
|
|
the various bpf_mtap_* are very similar, they differ in what (and to some
extent how) they prepend something, and what copy function they pass to
bpf_catchpacket.
use an internal _bpf_mtap as "backend" for bpf_mtap and friends.
extend bpf_mtap_hdr so that it covers all common cases:
if dlen is 0, nothing gets prepended.
copy function can be given, if NULL the default bpf_mcopy is used.
adjust the existing bpf_mtap_hdr users to pass a NULL ptr for the copy fn.
re-implement bpf_mtap_af as simple wrapper for bpf_mtap_hdr.
re-implement bpf_mtap_ether using bpf_mtap_hdr
re-implement bpf_mtap_pflog as trivial bpf_mtap_hdr wrapper
ok bluhm benno
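
The shape of that refactoring can be sketched in userland C as follows. This is an illustration on flat buffers, not the kernel code: `tap_hdr()`, `tap_af()`, `default_copy()` and the `copy_fn` type are hypothetical names. The sketch shows the two rules from the message: a zero header length prepends nothing, and a NULL copy function selects the default.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical copy callback type. */
typedef void (*copy_fn)(const void *src, void *dst, size_t len);

/* Default copy, used when the caller passes NULL. */
static void
default_copy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/*
 * Common backend: optionally prepend hdr, then copy the packet with
 * cpfn. Returns the bytes written to buf, or 0 if buf is too small.
 */
size_t
tap_hdr(unsigned char *buf, size_t buflen, const void *hdr, size_t hdrlen,
    const void *pkt, size_t pktlen, copy_fn cpfn)
{
	assert(buf != NULL);
	if (cpfn == NULL)
		cpfn = default_copy;
	if (hdrlen > buflen || pktlen > buflen - hdrlen)
		return 0;
	if (hdrlen != 0)
		memcpy(buf, hdr, hdrlen);	/* nothing prepended when 0 */
	cpfn(pkt, buf + hdrlen, pktlen);
	return hdrlen + pktlen;
}

/* Thin wrapper: prepend an address family word, default copy. */
size_t
tap_af(unsigned char *buf, size_t buflen, unsigned int af,
    const void *pkt, size_t pktlen)
{
	return tap_hdr(buf, buflen, &af, sizeof(af), pkt, pktlen, NULL);
}
```

Each specialised entry point then reduces to choosing a header and a copy function, which is why the wrappers can stay this small.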
|
|
tree. ok henning@
|
|
Avoid the confusion by using an appropriate name for the variable.
Note that since routing domain IDs are a subset of the set of routing
table IDs, the following idiom is correct:
rtableid = rdomain
But to get the routing domain ID corresponding to a given routing table
ID, you must call rtable_l2(9).
claudio@ likes it, ok mikeb@
|
|
struct ucred; struct process then directly links to the ucred
Based on a discussion at c2k10 or so before noting that FreeBSD and
NetBSD did this too.
ok matthew@
|
|
fixes negative timeout panics. tested by sthen.
|
|
more like the original conditional.
if this doesnt fix rd thrushs panic, then this should be reverted to
r1.85.
|
|
cheers to sthen@ and krw@ for properly dealing with the fallout of my
first commit.
|
|
< 0" seen by RD Thrush, http://article.gmane.org/gmane.os.openbsd.bugs/20113
where he has a long-running process using bpf which is active at the time of
panic. krw@ agrees with reverting for now.
|
|
with "ticks - start > interval" because the latter copes with the ticks
value wrapping.
pointed out by guenther@
ok krw@
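
A small sketch of why the "ticks - start > interval" form copes with wrapping. It uses unsigned arithmetic, where subtraction is defined modulo 2^N; the kernel's own counter type may differ, so treat the types as an assumption. The function names are hypothetical, and `timeout_expired_naive()` is only an illustration of a wrap-unsafe comparison, since the log does not show the expression that was replaced.

```c
#include <assert.h>
#include <limits.h>

/* Ticks elapsed since start; correct even if the counter wrapped. */
unsigned int
ticks_elapsed(unsigned int now, unsigned int start)
{
	return now - start;
}

/* Wrap-safe check: compare the elapsed time with the interval. */
int
timeout_expired(unsigned int now, unsigned int start, unsigned int interval)
{
	/* The interval should be small relative to the counter range. */
	assert(interval <= UINT_MAX / 2);
	return ticks_elapsed(now, start) > interval;
}

/*
 * Illustrative wrap-unsafe check: compares against an absolute
 * deadline, which is wrong once start + interval wraps around.
 */
int
timeout_expired_naive(unsigned int now, unsigned int start,
    unsigned int interval)
{
	return now > start + interval;
}
```

With a start value six ticks before the counter wraps and an interval of ten, the deadline wraps to a small number, so the naive check reports expiry after only two ticks while the elapsed-time form does not.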
|
|
on a packet, make bpf_catchpacket take a timeval indicating when the
packet was captured. Move microtime to the calling functions and grab
the timestamp as soon as we know that we're going to call catchpacket
at least once.
From NetBSD, ok deraadt, claudio, sthen
|
|
waiting for memory to become available
obtained from netbsd with tweaks, with input from deraadt and
blambert, ok deraadt, claudio
|