Age | Commit message (Collapse) | Author |
|
interface descriptor. It panics during attach of em(4) device at
boot.
|
|
descriptor.
We have the mess in network interface statistics. Only pseudo drivers
do per-CPU counters allocation, all other network devices use the old
`if_data'. The network stack partially uses per-CPU counters and
partially use `if_data', but the protection is inconsistent: some times
counters accessed with exclusive netlock, some times with shared
netlock, some times with kernel lock, but without netlock, some times
with another locks.
To make network interfaces statistics more consistent, always allocate
per-CPU counters at interface attachment time and use it instead of
`if_data'. At this step only move counters allocation to the if_attach()
internals. The `if_data' removal will be performed with the following
diffs to make review and tests easier.
ok bluhm
|
|
rip6_output() did modify inp_outputopts6 temporarily to provide
different ip6_pktopts to in6_embedscope(). Better pass inp_outputopts6
and inp_moptions6 as separate arguments to in6_embedscope().
Simplify the code that deals with these options in in6_embedscope().
Doucument inp_moptions and inp_moptions6 as protected by net lock.
OK kn@
|
|
Also disable TCP LRO on bridged vlan(4) and default for bpe(4), nvgre(4) and
vxlan(4).
ok bluhm@
|
|
introduce in_hdr_cksum_out(). It is used like in_proto_cksum_out().
OK claudio@
|
|
ok deraadt@ miod@ krw@
|
|
(*if_qstart)() be always called with netlock held doesn't work anymore
with PPPOE sessions.
Introduce `pipex_list_mtx' mutex(9) and use it to protect global pipex(4)
lists and radix trees.
Protect pipex(4) `session' dereference with reference counters, because we
could sleep when accessing pipex(4) from ioctl(2) path, and this is not
possible with mutex(9) held.
ok bluhm@
|
|
ok gnezdo@ semarie@ mpi@
|
|
|
|
|
|
testing has shown up to a 30% improvement in the veb forwarding
rate with this change.
an earlier diff was tested by hrvoje popovski
tested on amd64 and sparc64
|
|
the guts of this are in the etherbridge code which i added for
veb and used in bpe. there's a bit of boilerplate to make sure that
the addresses used for the endpoints will work with the tunnel
addresses that have been configured, but it's not too bad.
again, this is hard to use because ifconfig doesnt (yet) know how
to put ethernet addresses into the "add address" ioctl.
these ioctls could be used for things like evpn via bgpd though.
not sure if that's interesting to anyone though. it would probably
be more useful on vxlan interfaces.
|
|
the "ports" that nvgre provides to etherbridge are ip addresses
used in the underlay network.
ok patrick@ jmatthew@
|
|
using if_vinput factors out a lot of repeated code between tunnel
drivers, and it means monitor mode works on gre and mgre now too.
make the l2 gre interfaces do some things in the same order while
here.
|
|
ok dlg@
|
|
OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@
|
|
Fixed up a reference to gre_wccp where a fixed value from wwcp
standard was intended.
ok gkoehler@
|
|
this helps nvgre follow things like carp masters changing on the
inside of the virtual network.
"makes sense" jmatthew@
|
|
ok deraadt@
|
|
Most clonable interface drivers (except bridge, enc, loop, pppx,
switch, trunk and vlan) initialise the send queue's length to IFQ_MAXLEN
during *_clone_create() even though ifq_init(), which is eventually called
through if_attach(), does the same.
Remove all early "ifq_set_maxlen(&ifq->if_snd, IFQ_MAXLEN);" lines to leave
it to ifq_init() and have clonable drivers a tad more in sync.
OK mvs
|
|
ok dlg@ tobhe@
|
|
i've been wanting to do this for a while, and now that we've got
stoeplitz and it gives us 16 bits, it seems like the right time.
|
|
Prevent concurrency in the socket layer which is not ready for that.
Two recent data corruptions in pfsync(4) and the socket layer pointed
out that, at least, tun(4) was incorrectly using NET_RUNLOCK(). Until
we find a way in software to avoid future mistakes and to make sure that
only the softnet thread and some ioctls are safe to use a read version
of the lock, put everything back to the exclusive version.
ok stsp@, visa@
|
|
|
|
this is largely mechanical, except for carp. this moves the addition
of the carp link state hook after we're committed to using the new
interface as a carpdev. because the add can't fail, we avoid a
complicated unwind dance. also, this tweaks the carp linkstate hook
so it only updates the relevant carp interface, not all of the
carpdevs on the parent.
hrvoje popovski has tested an early version of this diff and it's
generally ok, but there's some splasserts that this diff fires that
i'll fix in an upcoming diff.
ok claudio@
|
|
the main semantic change is that things registering detach hooks
have to allocate and set a task structure that then gets added to
the list. this means if the task is allocated up front (eg, as part
of carps softc or bridges port structure), it avoids the possibility
that adding a hook can fail. a lot of drivers weren't checking for
failure, and unwinding state in the event of failure in other parts
was error prone.
while doing this i discovered that the list operations have to be
in a particular order, but drivers weren't doing that consistently
either. this diff wraps the list ops up so you have to seriously
go out of your way to screw them up.
ive also sprinkled some NET_ASSERT_LOCKED around the list operations
so we can make sure there's no potential for the list to be corrupted,
especially while it's being run.
hrvoje popovski has tested this a bit, and some issues he discovered
have been fixed.
ok sashan@
|
|
gre tunnel is set up. This could cause a panic. In gre(4) reject
outgoing packets during that time window. While there, count
interface errors and use generic unhandled_af().
bug reported by andreas at nullbyte dot se; OK dlg@
|
|
ok dlg@, sthen@, millert@
|
|
makes input bytes and packets consistent
|
|
this let's input processing bypass ifiqs. there's a performance
benefit from this, and it will let me tweak the backpressure detection
mechanism that ifiqs use without impacting on a stack of virtual
interfaces.
ive tested all of these except mpw, which i will end up testing
soon anyway.
|
|
i need to come back to this and make it flow a bit better, but this
is a good start.
|
|
no functional change.
|
|
conditional timeout_barrier(9).
OK kn@ dlg@
|
|
ok claudio@ dlg@
|
|
ok dlg@
|
|
There's nothing underneath the tunnels that needs configuration,
so there's no point in keepign track of configured multicast
addresses. We will at least save a bit of memory.
|
|
ENETRESET in hardware drivers means you should reprogram the hardware.
There's no hardware to reprogram, so just turn it into 0 on the way
out.
|
|
calls to m_get/M_GET calls because M_MOVE_PKTHDR() is initialising
the pkthdr and so it is not needed when allocation the header.
OK bluhm@
|
|
this is a step toward better rfc6040 support
ok claudio@
|
|
no functional change.
|
|
the mbuf prio will still be set according to the llprio value, but the
tos on the packet may be forced to a specific number by txprio
|
|
rfc1853 is about IP in IP Tunneling. rfc2003 about IP Encapsulation
within IP agrees.
|
|
for l3 interfaces (gre and mgre), allow txprio from the payload,
the mbuf, or a hardcoded value. for l2 interfaces (egre, ngre, and
eoip), get txprio from the mbuf or a hardcoded value.
ok claudio@
|
|
|
|
|
|
the llprio is already used to set the gre and eoip packet tos/tclass,
but it was queued at the default prio before this.
|
|
llprios are valued 0 to 7, while the ip tos/dscp/tclass is an 8 bit
value. fortunately the high 3 bits map nicely to the llprio values,
so we shift the llprio into place when generating the keepalive
frames. the llprio is defaulted to the value that cisco uses for
their gre keepalives.
|
|
the timeout gets configured instead of gre_up().
this avoids complex gre_ioctl() ordering rules and
enables the sc_ka_hold timeout before the first packet
is received.
from markus@
|
|
from markus@
|
|
|