Age | Commit message | Author |
|
From David Hill; OK mpi@; tested kspillner@; tweaks bluhm@
|
|
Spotted by Dilli Paudel <dilli ! paudel at oracle ! com>
ok jung@, ok mikeb@
|
|
can be used or should be released by rtfree(9).
It currently checks if the route is UP and is not attached to a stale
ifa.
ok bluhm@, claudio@
|
|
checks done in rtrequest1(9).
This chunk was introduced in 1991, when rtrequest1(RTM_DELETE...)
was not doing a route lookup, and no longer makes any sense.
ok bluhm@
|
|
|
|
this differs slightly from 1.121 in that it uses the new srp_follow()
to walk the list of descriptors on an interface. this is instead
of interleaving srp_enter() and srp_leave(), which can lead to races
and corruption if you're touching the same SRPs at different IPLs
on the same CPU.
ok deraadt@ jmatthew@
|
|
As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.
ok claudio@, reyk@
|
|
OK mikeb@
|
|
in bridge_localbroadcast() too.
This should fix another alignment issue kettenis@ is seeing.
ok dlg@
|
|
since July. The code involved deals with af-to handling.
|
|
rtrequest1(9).
This simplifies rtfree(9) dances and will prevent another CPU from
freeing the entry before we're done with it as soon as routing functions
can be executed in parallel.
ok bluhm@, mikeb@
|
|
This fixes a crash during ifconfig bridge0 destroy.
OK mpi@
|
|
IN6_IFF_NODAD pseudo-flag not being set.
This was just a flag for spaghetti code that should not exist in the
first place.
Tested by sebastia@, ok sthen@
|
|
everywhere to avoid passing around pointers to uninitialized stack
memory. While there, fix the call to in6_recoverscope() in
fill_drlist().
OK deraadt@ mpi@
|
|
ill debug this out of the tree.
|
|
|
|
Keep route entry/BSD compatibility goo in the rtable layer. The way
addresses and masks (prefix-lengths) are encoded is really tied to the
radix-tree implementation.
Since we decided to no longer support non-contiguous masks, we could get
rid of some extra "sockaddr" allocations and reduce the memory growth
related to the use of a multibit-trie.
|
|
his reference ART implementation from a BSD 4-clause to ISC.
Thanks a lot to him!
|
|
ART implementation.
ART (Allotment Routing Table) is a multibit-trie algorithm invented by
D. Knuth while reviewing Yoichi's SMART [0] (Smart Multi-Array Routing
Table) paper.
This implementation, unlike the one from the KAME project, supports
variable stride lengths, which makes it easier to tune the
memory/speed trade-off. It also lets you use a bigger first-level
table, something other algorithms such as POPTRIE [1] need to implement
separately.
Adaptation to the OpenBSD kernel has been done with two different data
structures. ART nodes and route entries are managed separately which
makes the algorithm implementation free of any MULTIPATH logic.
This implementation does not include Path Compression.
[0] http://www.hariguchi.org/art/smart.pdf
[1] http://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p57.pdf
ok dlg@, reyk@
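The allotment idea behind ART can be sketched in a few lines of C. This is
only a minimal illustration under stated assumptions, not the code from this
commit: it uses a single table with a fixed 4-bit stride (the real
implementation supports variable strides and several levels), routes are
plain non-zero integers that are unique per prefix, and every name with a
"sketch" or "SK_" marker is hypothetical.

```c
#include <assert.h>
#include <string.h>

#define SK_STRIDE 4			/* bits resolved by this one table */
#define SK_FRINGE (1u << SK_STRIDE)	/* first fringe (full-length) index */
#define SK_SIZE   (2u * SK_FRINGE)	/* heap layout, index 0 is unused */

/* Hypothetical table: route[i] == 0 means "no route". */
struct art_sketch {
	int route[SK_SIZE];
};

static void
art_sketch_init(struct art_sketch *t)
{
	memset(t, 0, sizeof(*t));
}

/* Base index of prefix addr/plen: its plen leading bits plus 2^plen. */
static unsigned
art_sketch_base(unsigned addr, unsigned plen)
{
	assert(plen <= SK_STRIDE && addr < SK_FRINGE);
	return (addr >> (SK_STRIDE - plen)) + (1u << plen);
}

/*
 * Allotment: replace "old" by "new_route" in the subtree rooted at b.
 * A slot holding anything else belongs to a more specific prefix, so
 * the walk stops there.
 */
static void
art_sketch_allot(struct art_sketch *t, unsigned b, int old, int new_route)
{
	if (t->route[b] != old)
		return;
	t->route[b] = new_route;
	if (b >= SK_FRINGE)		/* fringe slots have no children */
		return;
	art_sketch_allot(t, 2 * b, old, new_route);
	art_sketch_allot(t, 2 * b + 1, old, new_route);
}

static void
art_sketch_insert(struct art_sketch *t, unsigned addr, unsigned plen,
    int route)
{
	unsigned b = art_sketch_base(addr, plen);

	art_sketch_allot(t, b, t->route[b], route);
}

/* Longest-prefix match is a single read of the fringe slot. */
static int
art_sketch_lookup(const struct art_sketch *t, unsigned addr)
{
	assert(addr < SK_FRINGE);
	return t->route[SK_FRINGE + addr];
}
```

The work is done at insert time: each prefix pushes its route down to every
fringe slot it covers that is not already claimed by a longer prefix, so a
lookup never has to walk back up. Deletion and path compression are left out
of this sketch.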
|
|
which are routed on behalf of a route-to action.
OK bluhm@
|
|
into a common pattern. In the man page clarify the usage of the
returned route.
OK mpi@ mikeb@ jmc@
|
|
Note that it is safe to keep a reference to the ifa pointed to by a
route entry after freeing the entry iff the ifa is valid.
ok bluhm@
|
|
to defer the work currently done in bridge_input() and requiring the
KERNEL_LOCK to bridgeintr().
Tested by sthen@
ok rzalamena@, dlg@, bluhm@
|
|
lookups use the radix API directly.
ok mikeb@
|
|
Prodded by and ok bluhm@
|
|
|
|
is freed when we no longer need it.
In this case both code paths are executed in process context and thus
serialized by the KERNEL_LOCK. Since we are adding a route entry to
the table in both cases, rtfree(9) will not actually free the entry
because it is still RT_VALID.
ok bluhm@
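The behaviour described above, where dropping a reference does not free an
entry that is still valid, can be sketched in userland C. This is a rough
model only, assuming a plain integer reference count and a single "valid"
flag; the struct, the functions and the freed counter are hypothetical and
are not the kernel's rtentry or rtfree(9).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical route entry: a reference count and a "still valid" flag. */
struct rt_sketch {
	int refcnt;
	int valid;
};

static int rt_sketch_freed;	/* counts frees, for illustration only */

static struct rt_sketch *
rt_sketch_alloc(void)
{
	struct rt_sketch *rt = calloc(1, sizeof(*rt));

	assert(rt != NULL);
	rt->refcnt = 1;		/* the caller's reference */
	return rt;
}

/* Free the entry only when nobody references it and it is not valid. */
static void
rt_sketch_maybe_free(struct rt_sketch *rt)
{
	if (rt->refcnt == 0 && !rt->valid) {
		free(rt);
		rt_sketch_freed++;
	}
}

/* Adding the entry to the table marks it valid. */
static void
rt_sketch_table_add(struct rt_sketch *rt)
{
	rt->valid = 1;
}

/* Drop one reference; a valid entry survives even at refcnt 0. */
static void
rt_sketch_release(struct rt_sketch *rt)
{
	assert(rt->refcnt > 0);
	rt->refcnt--;
	rt_sketch_maybe_free(rt);
}

/* Removing the entry from the table clears the flag and may free it. */
static void
rt_sketch_table_remove(struct rt_sketch *rt)
{
	rt->valid = 0;
	rt_sketch_maybe_free(rt);
}
```

In this model the caller can always release its reference once it is done
with the entry. Whether the memory goes away at that point depends only on
whether the entry is still in the table.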
|
|
This will simplify upcoming conversions of rt_refcnt-- to rtfree(9).
Such conversions are needed for proper MP refcounting.
ok deraadt@, dlg@
|
|
|
|
this was originally implemented by jmatthew@ last year, and updated
by us both during s2k15.
there are four data structures that need to be looked after.
the first is the bpf interface itself. it is allocated and freed
at the same time as an actual interface, so if you're able to send
or receive packets, you're able to run bpf on an interface too.
dont need to do any work there.
the second are bpf descriptors. these represent userland attaching
to a bpf interface, so you can have many of them on a single bpf
interface. they were arranged in a singly linked list before. now
the head and next pointers are replaced with SRP pointers and
followed by srp_enter. the list updates are serialised by the kernel
lock.
the third are the bpf filters. there is an inbound and outbound
filter on each bpf descriptor, and a process can replace them at
any time. the pointers from the descriptor to those are also changed
to be accessed via srp_enter. updates are serialised by the kernel
lock.
the fourth thing is the ring that bpf writes to for userland to
read. there's one of these per descriptor. because these are only
updated when a filter matches (which is hopefully a relatively rare
event), we take the kernel lock to serialise the writes to the ring.
all this together means you can run bpf against a packet without
taking the kernel lock unless you actually caught a packet and need
to send it to userland. even better, you can run bpf in parallel,
so if we ever support multiple rings on a single interface, we can
run bpf on each ring on different cpus safely.
ive hit this pretty hard in production at work (yay dhcrelay) on
myx (which does rx outside the biglock).
ok jmatthew@ mpi@ millert@
|
|
load ifp->if_bpf into a local variable, test that, and pass it to bpf.
this is instead of assuming ifp->if_bpf wont change between
checking it and passing it to bpf.
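the pattern can be sketched in plain C as follows. this is an illustration
only: the two structs and both functions are hypothetical stand-ins, not
the kernel's ifnet or bpf code, and a real concurrent version would also
need whatever the platform requires to stop the compiler from reloading
the field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct bpf_if_sketch {
	int taps;			/* packets handed to bpf */
};

struct ifnet_sketch {
	struct bpf_if_sketch *if_bpf;	/* may be cleared by another context */
};

static void
bpf_tap_sketch(struct bpf_if_sketch *bif)
{
	assert(bif != NULL);
	bif->taps++;
}

/* Returns 1 if the packet was handed to bpf, 0 otherwise. */
static int
input_sketch(struct ifnet_sketch *ifp)
{
	/* load the pointer once, then test and use the same value */
	struct bpf_if_sketch *if_bpf = ifp->if_bpf;

	if (if_bpf == NULL)
		return 0;
	bpf_tap_sketch(if_bpf);
	return 1;
}
```

the point is that the value tested against NULL and the value passed on are
the same one, so a change to ifp->if_bpf in between cannot hand a NULL
pointer to bpf.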
|
|
does not have any registered handler.
Plug a mbuf leak found by sthen@ with gif(4) in a bridge.
ok sthen@, claudio@
|
|
triggered for a reply-to rule. It turns out this case has been using
uninitialised memory as if it were a valid pf pool.
As the rest of the function assumes a valid pool, for now just return.
Problem reported by RD Thrush.
ok jung@ mikeb@
|
|
directly. Also protect non mp-safe functions while at it.
ok mpi@.
|
|
|
|
|
|
|
|
pointed out by and OK bluhm@
|
|
reported by rpe@
|
|
ok mpi@
|
|
packets directly into the network stack with ip_output().
The locking is intentionally left as is and will be improved in
another commit.
Input / OK bluhm@, OK benno@
|
|
if_input() has been designed to be able to safely handle a batch of
packets from physical drivers to the network stack. Most of these
drivers have an interrupt routine executed at IPL_NET and the check
made sense during the conversion. However we also want to re-enqueue
packets with if_input() from the network stack currently running at
IPL_SOFTNET.
ok claudio@
|
|
ok mpi@, claudio@.
|
|
with MPLS packets.
ok mpi@, claudio@
|
|
path was taken. This both prevents warnings from clang and acts as a
sanity check.
ok mcbride@ henning@
|
|
to optimize for an INET-only kernel, as well as the fantasy unicorn
INET6-only kernel. (INET-only kernel still works)
prompted by deraadt
ok bluhm sashan
|
|
OK mcbride@
|
|
ok mcbride@
|
|
ok mcbride@
|
|
OK deraadt.
|