Age | Commit message (Collapse) | Author |
|
switch(4) currently supports OpenFlow 1.3.5.
Currently, it's disabled by the kernel config.
With help from yasuoka@ reyk@ jsg@.
ok deraadt@ yasuoka@ reyk@ henning@
|
|
to ifconfig.
"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).
ok sthen@ mikeb@
|
|
they're currently unused, so no functional change.
|
|
|
|
ok mpi@
|
|
|
|
work is represented by struct task.
the start routine is now wrapped by a task which is serialised by the
infrastructure. if_start_barrier has been renamed to ifq_barrier and
is now implemented as a task that gets serialised with the start
routine.
this also adds an ifq_restart() function. it serialises a call to
ifq_clr_oactive and calls the start routine again. it exists to
avoid a race that kettenis@ identified in between when a start
routine discovers there's no space left on a ring, and when it calls
ifq_set_oactive. if the txeof side of the driver empties the ring
and calls ifq_clr_oactive in between the above calls in start, the
queue will be marked oactive and the stack will never call the start
routine again.
by serialising the ifq_set_oactive call in the start routine and
ifq_clr_oactive calls we avoid that race.
tested on various nics
ok mpi@
|
|
ok mpi@
|
|
the intention is to make it more clear what belongs to a transmit
queue and what belongs to an interface.
suggested by and ok mpi@
|
|
<net/if_var.h> because some other operating systems have defines in
there.
ok jasper@
|
|
this avoids the current recursion into the pf_test() function. the change also
switches icmp_error()/icmp6_error() to use ip_send()/ip6_send() so
they are safe for PF.
The idea comes from Markus Friedl. bluhm, mikeb and mpi helped me
a lot to get it into shape.
OK bluhm@, mpi@
|
|
fallback to a SLIST.
ok dlg@, jasper@
|
|
existing start routines will still be called under the kernel lock
and at IPL_NET.
mpsafe start routines will be serialised so only one instance of
each interface's function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines don't have to lock against themselves because if_start
does it for them.
the code to do that is based on the scsi runqueue code.
this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.
a driver can opt in to the mpsafe if_start call by doing the following:
1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)
to simplify the implementation the tx mitigation code has been removed.
tested by several
ok mpi@ jmatthew@
|
|
changing.
|
|
|
|
there are two things shared between the network stack and drivers
in the send path: the send queue and the IFF_OACTIVE flag. the send
queue is now protected by a mutex. this diff makes the oactive
functionality mpsafe too.
IFF_OACTIVE is part of if_flags. there are two problems with that.
firstly, if_flags is a short and we don't have any MI atomic operations
to manipulate a short. secondly, while we could make the IFF_OACTIVE
operations mpsafe, all changes to other flags would have to be made
safe at the same time, otherwise a read-modify-write cycle on their
updates could clobber the oactive change.
instead, this moves the oactive mark into struct ifqueue and provides
an API for changing it. there's ifq_set_oactive, ifq_clr_oactive,
and ifq_is_oactive. these are modelled on ifsq_set_oactive,
ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.
this diff includes changes to all the drivers manipulating IFF_OACTIVE
to now use the ifq_{set,clr,is}_oactive API too.
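a minimal userland sketch of the idea above, assuming C11 atomics in place of the kernel's own primitives. struct ifqueue_sketch and the sk_ prefixed functions are hypothetical names made up for illustration. they are not the kernel's definitions, and only the oactive mark is modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* hypothetical stand-in for struct ifqueue; only the oactive mark is kept */
struct ifqueue_sketch {
	atomic_uint ifq_oactive;
};

/* mark the queue as "output active": the ring is full, stop calling start */
void
sk_ifq_set_oactive(struct ifqueue_sketch *ifq)
{
	assert(ifq != NULL);
	atomic_store(&ifq->ifq_oactive, 1);
}

/* clear the mark once the tx completion side has freed ring space */
void
sk_ifq_clr_oactive(struct ifqueue_sketch *ifq)
{
	assert(ifq != NULL);
	atomic_store(&ifq->ifq_oactive, 0);
}

/* read the mark without touching any unrelated interface flags */
bool
sk_ifq_is_oactive(struct ifqueue_sketch *ifq)
{
	assert(ifq != NULL);
	return atomic_load(&ifq->ifq_oactive) != 0;
}
```

because the mark lives in its own word, setting or clearing it is never a read-modify-write on the other interface flags, which is the clobbering problem described above.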
ok kettenis@ mpi@ jmatthew@ deraadt@
|
|
|
|
hfsc needed a rollback ifqop to requeue the mbuf because it used
ml_dequeue in the begin op. now it uses MBUF_LIST_FIRST to get a
ref to the first mbuf in deq_begin.
now the disciplines don't need a rollback op, so ifq_deq_rollback
can be simplified to just releasing the mutex.
based on a discussion with kenjiro cho
|
|
fixing it now before i regret it more.
|
|
the code is refactored so the IFQ macros call newly implemented ifq
functions. the ifq code is split so each discipline (priq and hfsc
in our case) is an opaque set of operations that the common ifq
code can call. the common code does the locking, accounting (ifq_len
manipulation), and freeing of the mbuf if the discipline's enqueue
function rejects it. they're kind of like bufqs in the block layer
with their fifo and nscan disciplines.
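the split between common code and disciplines can be sketched in userland roughly as follows. every name here (dq, dq_ops, fifo_state and friends) is hypothetical and chosen for illustration. locking is left out, and a bounded fifo stands in for a real discipline such as priq or hfsc.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct dpkt {
	struct dpkt *next;
};

struct dq;

/* the opaque set of operations a discipline provides (hypothetical) */
struct dq_ops {
	int		(*enq)(struct dq *, struct dpkt *);	/* 0 or errno */
	struct dpkt	*(*deq)(struct dq *);
};

struct dq {
	const struct dq_ops	*ops;
	void			*state;	/* private to the discipline */
	unsigned int		 len;	/* maintained by the common code */
};

/* a bounded fifo as an example discipline */
struct fifo_state {
	struct dpkt	*head;
	struct dpkt	*tail;
	unsigned int	 limit;
};

int
fifo_enq(struct dq *q, struct dpkt *p)
{
	struct fifo_state *f = q->state;

	if (q->len >= f->limit)
		return ENOBUFS;		/* reject, the common code frees p */
	p->next = NULL;
	if (f->tail == NULL)
		f->head = p;
	else
		f->tail->next = p;
	f->tail = p;
	return 0;
}

struct dpkt *
fifo_deq(struct dq *q)
{
	struct fifo_state *f = q->state;
	struct dpkt *p = f->head;

	if (p == NULL)
		return NULL;
	f->head = p->next;
	if (f->head == NULL)
		f->tail = NULL;
	p->next = NULL;
	return p;
}

const struct dq_ops fifo_ops = { fifo_enq, fifo_deq };

/* common code: accounting, and freeing the packet on rejection */
int
dq_enqueue(struct dq *q, struct dpkt *p)
{
	int error;

	assert(q != NULL && p != NULL);
	error = q->ops->enq(q, p);
	if (error != 0) {
		free(p);
		return error;
	}
	q->len++;
	return 0;
}

struct dpkt *
dq_dequeue(struct dq *q)
{
	struct dpkt *p = q->ops->deq(q);

	if (p != NULL)
		q->len--;
	return p;
}
```

in this sketch the discipline never touches len and never frees a packet, so swapping the ops pointer is all it takes to change the queueing policy.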
the new api also supports atomic switching of disciplines at runtime.
the hfsc setup in pf_ioctl.c has been tweaked to build a complete
hfsc_if structure which it attaches to the send queue in a single
operation, rather than attaching to the interface up front and
building up a list of queues.
the send queue is now mutexed, which raises the expectation that
packets can be enqueued or purged on one cpu while another cpu is
dequeueing them in a driver for transmission. a lot of drivers use
IFQ_POLL to peek at an mbuf and attempt to fit it on the ring before
committing to it with a later IFQ_DEQUEUE operation. if the mbuf
gets freed in between the POLL and DEQUEUE operations, fireworks
will ensue.
to avoid this, the ifq api introduces ifq_deq_begin, ifq_deq_rollback,
and ifq_deq_commit. ifq_deq_begin allows a driver to take the ifq
mutex and get a reference to the mbuf they wish to try and tx. if
there's space, they can ifq_deq_commit it to remove the mbuf and
release the mutex. if there's no space, ifq_deq_rollback simply
releases the mutex. this api was developed to make updating the
drivers using IFQ_POLL easy, instead of having to do significant
semantic changes to avoid POLL that we cannot test on all the
hardware.
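the begin/commit/rollback pattern can be sketched in userland like this, assuming a pthread mutex in place of the kernel mutex. struct pkt, struct txq, the txq_ functions and the ring_free counter are hypothetical stand-ins, not the kernel's types or signatures. returning NULL with the mutex released on an empty queue is also an assumption of the sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* hypothetical stand-ins for an mbuf and a send queue */
struct pkt {
	struct pkt *next;
	int len;
};

struct txq {
	pthread_mutex_t mtx;
	struct pkt *head;
	unsigned int len;
};

/* take the mutex and peek at the head; it stays held only if a packet is returned */
struct pkt *
txq_deq_begin(struct txq *q)
{
	struct pkt *p;

	pthread_mutex_lock(&q->mtx);
	p = q->head;
	if (p == NULL)
		pthread_mutex_unlock(&q->mtx);
	return p;
}

/* the packet fits on the ring: unlink it and release the mutex */
void
txq_deq_commit(struct txq *q, struct pkt *p)
{
	assert(q->head == p);
	q->head = p->next;
	p->next = NULL;
	q->len--;
	pthread_mutex_unlock(&q->mtx);
}

/* no space on the ring: leave the packet queued and release the mutex */
void
txq_deq_rollback(struct txq *q, struct pkt *p)
{
	assert(q->head == p);
	pthread_mutex_unlock(&q->mtx);
}

/* driver-style start loop; ring_free is a made-up count of free tx slots */
int
txq_start(struct txq *q, int ring_free)
{
	struct pkt *p;
	int sent = 0;

	while ((p = txq_deq_begin(q)) != NULL) {
		if (ring_free == 0) {
			txq_deq_rollback(q, p);
			break;
		}
		txq_deq_commit(q, p);
		ring_free--;
		sent++;
	}
	return sent;
}
```

since the mutex is held from begin until commit or rollback, nothing else can free or purge the packet while the driver is deciding whether it fits.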
the common code has been tested pretty hard, and all the driver
modifications are straightforward except for de(4). if that breaks
it can be dealt with later.
ok mpi@ jmatthew@
|
|
attached to a carp(4) or bridge(4) member, to not dereference rt_ifp
directly.
ok visa@
|
|
descriptor.
This allows getting rid of two if_ref() calls in the output paths.
ok dlg@
|
|
L2 resolution depends on the protocol (encoded in the route entry) and
an ``ifp''. Not having to care about an ``ifa'' makes our life easier
in our MP effort. Fewer dependencies between data structures implies
fewer headaches.
Discussed with bluhm@, ok claudio@
|
|
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.
Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@
|
|
of rt_getifa() when adding link level route from outside the
kernel.
ok claudio@
|
|
also the comment above IFQ_ENQUEUE that says the pattr argument is unused.
ok mpi@
|
|
Necessary bumps in Ports will be handled by sthen@.
OK mpi@ dlg@
|
|
this is done by moving to the refcnt api and using refcnt_finalize.
tested by Hrvoje Popovski
ok mpi@
|
|
Instead of violating a layer of abstraction by keeping per pseudo-driver
information in "struct ifnet", the port trunk is now passed as a cookie
to the interface input handler (ifih).
The time of per pseudo-driver hacks in the network stack is over!
ok mikeb@
|
|
the mbuf in both the hfsc and priq error paths.
ok mikeb@ mpi@ claudio@ henning@
|
|
needs to see lo0 in the output path.
ok claudio@
|
|
context.
ok mpi@, claudio@
|
|
Use instead the RTF_LOCAL flag to loop local traffic back to the
corresponding protocol queue.
With this change rt_ifp is now always the same as rt_ifa->ifa_ifp.
ok claudio@
|
|
the protocol queues.
It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwriting
the rt_ifp field of RTF_LOCAL routes.
ok mikeb@, dlg@, claudio@
|
|
of using SRPs as a backend for if_get.
this also tries to document how things work and what if index 0 is for.
ok mpi@ claudio@
|
|
to pass additional context or transient data with a similar
lifetime.
ok mpi, suggestions, hand holding and ok from dlg
|
|
noticed by deraadt@
|
|
instead of having every driver that manipulates the ifih list
understand SRPLs, this moves that processing into if_ih_insert and
if_ih_remove functions.
we rely on the kernel lock to serialise the modifications to the
list.
tested by mpi@
ok mpi@ claudio@ mikeb@
|
|
if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.
we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.
if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.
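the intended get/put pairing can be sketched in userland as below. it assumes a plain index table and a C11 atomic counter. struct ifnet_sketch, sk_iftable, sk_if_get and sk_if_put are hypothetical names, and the sketch ignores the race between looking up the pointer and taking the reference, which a real implementation has to close.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define SK_NIFS	4	/* hypothetical table size */

/* hypothetical stand-in for struct ifnet */
struct ifnet_sketch {
	unsigned int if_index;
	atomic_uint if_refcnt;
};

/* interface table indexed by if_index; slot 0 is left unused in this sketch */
struct ifnet_sketch *sk_iftable[SK_NIFS];

/* look up an interface by index and take a reference on it */
struct ifnet_sketch *
sk_if_get(unsigned int idx)
{
	struct ifnet_sketch *ifp;

	if (idx >= SK_NIFS || (ifp = sk_iftable[idx]) == NULL)
		return NULL;
	atomic_fetch_add(&ifp->if_refcnt, 1);
	return ifp;
}

/* release a reference; accepting NULL is an assumption of this sketch */
void
sk_if_put(struct ifnet_sketch *ifp)
{
	if (ifp == NULL)
		return;
	assert(atomic_load(&ifp->if_refcnt) > 0);
	atomic_fetch_sub(&ifp->if_refcnt, 1);
}
```

with every sk_if_get paired with an sk_if_put, a detach path could wait for the count to drain to zero before freeing the interface.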
ok mpi@ mikeb@ claudio@
|
|
ifp in order to access its ifih handlers.
So get rid of if_get() in the various ifih handlers, since we know the
ifp is live at this point.
ok dlg@
|
|
talking about (*ifp->if_output)().
ok claudio@, dlg@
|
|
|
|
ok dlg@
|
|
the second (unused) argument of the input packet handlers.
ok dlg@
|
|
To keep the list of input handlers short, multiple vlans share the
same ifih.
if_input_process() now checks whether the interface of an mbuf changed to
make sure the corresponding handlers are executed. This is a hack
and will be improved later.
ok dlg@
|
|
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.
No ABI breakage because it uses a previously unused pad field in if_data.
OK mpi@ deraadt@
|
|
change behaviour for now but will allow sharing the same address with
the parent interface without major hacks.
OK mpi@
|
|
a packet on the sending queue of an interface.
Tested by many, thanks a lot!
ok dlg@, claudio@
|
|
this has a slight semantic change. previously pipex would only
process up to 128 packets on the input and output queues at a time
and would reschedule the softint if there were any left. now it
mq_delists the current set of pending packets and only processes
them. if anything is added to the queues later they'll cause the
softint to run again.
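the new behaviour can be sketched in userland as follows. struct spkt, struct spktq and the functions are hypothetical stand-ins for the mbuf queue and the softint handler. locking is left out and the per-packet work is reduced to counting.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical stand-ins for an mbuf and an mbuf queue */
struct spkt {
	struct spkt *next;
};

struct spktq {
	struct spkt *head;
	struct spkt *tail;
	unsigned int len;
};

void
spktq_enqueue(struct spktq *q, struct spkt *p)
{
	assert(p != NULL);
	p->next = NULL;
	if (q->tail == NULL)
		q->head = p;
	else
		q->tail->next = p;
	q->tail = p;
	q->len++;
}

/* detach everything currently queued in one step and leave the queue empty */
struct spkt *
spktq_delist(struct spktq *q)
{
	struct spkt *list = q->head;

	q->head = q->tail = NULL;
	q->len = 0;
	return list;
}

/* softint-style handler: only the packets pending at delist time are handled */
unsigned int
sk_softint(struct spktq *q)
{
	struct spkt *p, *next;
	unsigned int n = 0;

	for (p = spktq_delist(q); p != NULL; p = next) {
		next = p->next;
		p->next = NULL;
		n++;	/* a real handler would process the packet here */
	}
	return n;
}
```

anything enqueued after the delist stays on the queue for the next run, so the handler no longer needs a fixed per-run packet limit.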
this in turn lets us deprecate sysctl_ifq since nothing uses it
anymore. because niqueues are mostly wrappers around mbuf_queues,
we can provide sysctl_mq and just #define sysctl_niq to it.
pipex bits are ok yasuoka@
|
|
might be overwritten by pseudo-drivers.
ok dlg@, henning@
|