Age | Commit message (Collapse) | Author |
|
Use a common struct route for both inet and inet6. Unfortunately
struct sockaddr is shorter than sockaddr_in6, so netinet/in.h has
to be exposed from net/route.h. Struct route has to be bsd visible
for userland as netstat kvm code inspects inp_route. Internet PCB
and TCP SYN cache can use a plain struct route now. All specific
sockaddr types for inet and inet6 are embeded there.
OK claudio@
|
|
interface descriptor. It panics during attach of em(4) device at
boot.
|
|
descriptor.
We have the mess in network interface statistics. Only pseudo drivers
do per-CPU counters allocation, all other network devices use the old
`if_data'. The network stack partially uses per-CPU counters and
partially use `if_data', but the protection is inconsistent: some times
counters accessed with exclusive netlock, some times with shared
netlock, some times with kernel lock, but without netlock, some times
with another locks.
To make network interfaces statistics more consistent, always allocate
per-CPU counters at interface attachment time and use it instead of
`if_data'. At this step only move counters allocation to the if_attach()
internals. The `if_data' removal will be performed with the following
diffs to make review and tests easier.
ok bluhm
|
|
Using a scratch buffer makes it possible to take a consistent snapshot of
per-CPU counters without having to allocate memory.
Makes ddb(4) show uvmexp command work in OOM situations.
ok kn@, mvs@, cheloha@
|
|
This diff introduces separate capabilities for TCP offloading. We split this
into LRO (large receive offloading) and TSO (TCP segmentation offloading).
LRO can be turned on/off via tcprecvoffload option of ifconfig and is not
inherited to sub interfaces.
TSO is inherited by sub interfaces to signal this hardware offloading capability
to the network stack.
With tweaks from bluhm, claudio and dlg
ok bluhm, claudio
|
|
ok miod@ millert@
|
|
Naming the list like the struct itself makes for awful grepping.
Call the global variable "ifnetlist" from now on.
There used to be kvm(3) consumers in base picking up this symbol, but those
have long been converted to other interfaces.
A few potential ports users remain, same deal as sys/net/if_var.h r1.116
"Remove struct ifnet's unused if_switchport member": they get bumped.
Previous users pointed out by deraadt
OK bluhm
|
|
ok gnezdo@ semarie@ mpi@
|
|
pass the uint64_t that ether_input has already converted from a
real ethernet address into carp_input so it can use it without
having to do its own conversion.
tested by hrvoje popovski
tested by me on amd64 and sparc64
ok patrick@ jmatthew@
|
|
a new object that is already refcounted, so carp attach does not
reach into internal structures. Add kasserts to detect counter
overflow or underflow.
OK mvs@
|
|
ok dlg@ bluhm@
|
|
patch submitted by Ralf Horstmann from ackstorm.de
OK dlg@
|
|
ok kn
|
|
an uvm fault. Check that ifp0 is not NULL.
OK sashan@ mvs@
|
|
ok sashan@
|
|
carp_input is only tried after vlan and bridge handling is done,
and after the ethernet packet doesnt match the parent interfaces
mac address.
this has been in snaps as part of a larger diff for over a week.
|
|
these packets have generally already been counted on the interface
because that's where they were sent or received from. the protocol
handling side of things already counts things like packets, which
you see with netstat -sp carp.
|
|
this is modelled on vlan_transmit, and basically enqueues the packet
directly on the parent interface.
even though carp is generally not used to transmit packets, we run
dhcp relays on it at work and hit a situation where we unecessarily
dropped packets because it's ifq maxlen was 1. i've been running
this for a month in production.
ok jmatthew@
|
|
|
|
ifpromisc() already refcounts, so carp doesn't have to do it
implicitly with the carpdev list. there's no functional change, the
code just gets a bit simpler.
|
|
this follows what's been done for detach and link state hooks, and
makes handling of hooks generally more robust.
address hooks are a bit different to detach/link state hooks in
that there's only a few things that register hooks (carp, pf, vxlan),
but a lot of places to run the hooks (lots of ipv4 and ipv6 address
configuration).
an address hook cookie was in struct pfi_kif, which is part of the
pf abi. rather than break pfctl -sI, this maintains the void * used
for the cookie and uses it to store a task, which is then used as
intended with the new api.
|
|
this is largely mechanical, except for carp. this moves the addition
of the carp link state hook after we're committed to using the new
interface as a carpdev. because the add can't fail, we avoid a
complicated unwind dance. also, this tweaks the carp linkstate hook
so it only updates the relevant carp interface, not all of the
carpdevs on the parent.
hrvoje popovski has tested an early version of this diff and it's
generally ok, but there's some splasserts that this diff fires that
i'll fix in an upcoming diff.
ok claudio@
|
|
the main semantic change is that things registering detach hooks
have to allocate and set a task structure that then gets added to
the list. this means if the task is allocated up front (eg, as part
of carps softc or bridges port structure), it avoids the possibility
that adding a hook can fail. a lot of drivers weren't checking for
failure, and unwinding state in the event of failure in other parts
was error prone.
while doing this i discovered that the list operations have to be
in a particular order, but drivers weren't doing that consistently
either. this diff wraps the list ops up so you have to seriously
go out of your way to screw them up.
ive also sprinkled some NET_ASSERT_LOCKED around the list operations
so we can make sure there's no potential for the list to be corrupted,
especially while it's being run.
hrvoje popovski has tested this a bit, and some issues he discovered
have been fixed.
ok sashan@
|
|
ok semarie@, visa@
|
|
this let's input processing bypass ifiqs. there's a performance
benefit from this, and it will let me tweak the backpressure detection
mechanism that ifiqs use without impacting on a stack of virtual
interfaces.
ive tested all of these except mpw, which i will end up testing
soon anyway.
|
|
will reduce the sleep time by one tick which doesn't matter in the common
case. The code never passes a true 0 timeval to timeout_add_tv so the code
will always sleep for at least 1 tick which is good enough.
OK kn@, florian@, visa@, cheloha@
|
|
This also makes the IPv4 and IPv6 code more similar.
OK phessler@
|
|
out of the KERNEL_LOCK().
ok visa@, bluhm@
|
|
why it was necessary.
OK bluhm@
'ok but watch for fallouts' mpi@
|
|
It does not make sense to call if_get() again, just pass ifp as
parameter. Move the IFT_CARP check into the function instead of
doing it everywhere. Replace the inverted match variable logic
with simple returns.
OK mpi@ friehm@
|
|
the caller would leak a mbuf. Convert carp_prepare_ad() to a void
function and remove the error check.
reported by Maxime Villard; OK mpi@
|
|
The account flag `ASU' will no longer be set but that makes suser()
mpsafe since it no longer mess with a per-process field.
No objection from millert@, ok tedu@, bluhm@
|
|
Introduce bridge_ourether() and move carp(4)-specific SRPL code inside
carp_ourether().
ok bluhm@
|
|
A NULL dereference can happen since processing protocol layer is
deffered to a second task. In other words the NET_LOCK() is released
then regrabbed between ip_input() and carp_proto_input().
The same workaround is already in use in carp_output() due to deffered
processing in case of IPsec.
The real fix is to make carp(4) MP-safe and use if_get(9) there, any
taker?
Found & fix tested by Hrvoje Popovski.
|
|
if_deactivate looked for carp parent interfaces and called carp_ifdetach
to have children interfaces unplug themselves. this diff has the
carp interfaces register detach hooks on the parent instead. the
effect is the same, but using the standard every other interface
uses.
while im here i shuffle the order the hooks carp_set_ifp are
estabilshed so it will fail if they arent allocated.
ok visa@ mpi@
|
|
ip_carp.c r1.322 removed the ability to receive carp protocol packets
on !IFT_CARP interfaces. however, carppeers cause the carp protocol
packets to be directed to a unicast address on another interface,
which definitely is not mapped back to a carp interface.
this brings back the ability to get carp packets on parent interfaces.
it is a bit different to a backout because it only allows carp
parents to be ethernet interfaces.
mpi@ told me carp regress tests were failing.
|
|
previously the driver only cared that a carp interface wasnt configured
as a carpdev. because the code only really works on ethernet, it makes
sense to restrict it.
ok visa@ mpi@
|
|
|
|
currently carp uses a struct carp_if to hold an srp list head, which
is accessed by both if_carp in struct ifnet, and via the if input
handlers list.
this gets rid of some indirection by making if_carp itself the list
head, rather than a pointer to the list head via a struct carp_if.
it also makes accessing the list consistent by only using if_carp
to get to it.
ok mpi@
|
|
carp6_proto_input_if only handles packets "received" on real carp
interfaces, which the ethernet stack goes to a lot of trouble to
provide. since carp assumes ethernet, carp_proto_input_if can assume
the packets will come in right too.
ok mpi@
|
|
memory shortage. As it is invoked from a system call, it should
not fail and wait instead.
OK visa@ mpi@
|
|
They have the same functionnality since friehm@ cleaned up
balancing code.
ok florian@, visa@, patrick@, bluhm@, jmatthew@
|
|
to make it symmetric to the addrhook establish which is being done in
carp_clone_create(). This fixes the issue that carp does not recognize
address changes on the carp after an interface has detached, which could
cause issues like carp not recovering or even panics. Unfortunately
there are more bugs lurking in carp.
ok bluhm@
|
|
pr_input handlers without KERNEL_LOCK().
ok visa@
|
|
while packets where being passed to IPsec tasks.
Found the hardway by Hrvoje Popovski.
ok phessler@, claudio@
|
|
Exposes per-CPU counters to real parrallelism.
ok visa@, bluhm@, jca@
|
|
Tested by Hrvoje Popovski, ok bluhm@
|
|
|
|
bugs could easily result in use-after-free or double free. Introduce
m_freemp() which automatically resets the pointer before freeing
it. So we have less dangling pointers in the kernel.
OK krw@ mpi@ claudio@
|
|
ether_input(). Now we use mbuf tags instead of modifying the MAC
address.
ok mpi@
|