summaryrefslogtreecommitdiff
path: root/sys/netinet/ip_carp.c
AgeCommit message (Collapse)Author
2024-02-13Merge struct route and struct route_in6.Alexander Bluhm
Use a common struct route for both inet and inet6. Unfortunately struct sockaddr is shorter than sockaddr_in6, so netinet/in.h has to be exposed from net/route.h. Struct route has to be bsd visible for userland as netstat kvm code inspects inp_route. Internet PCB and TCP SYN cache can use a plain struct route now. All specific sockaddr types for inet and inet6 are embeded there. OK claudio@
2023-12-23Backout always allocate per-CPU statistics counters for networkAlexander Bluhm
interface descriptor. It panics during attach of em(4) device at boot.
2023-12-22Always allocate per-CPU statistics counters for network interfaceVitaliy Makkoveev
descriptor. We have the mess in network interface statistics. Only pseudo drivers do per-CPU counters allocation, all other network devices use the old `if_data'. The network stack partially uses per-CPU counters and partially use `if_data', but the protection is inconsistent: some times counters accessed with exclusive netlock, some times with shared netlock, some times with kernel lock, but without netlock, some times with another locks. To make network interfaces statistics more consistent, always allocate per-CPU counters at interface attachment time and use it instead of `if_data'. At this step only move counters allocation to the if_attach() internals. The `if_data' removal will be performed with the following diffs to make review and tests easier. ok bluhm
2023-09-16Allow counters_read(9) to take an optional scratch buffer.Martin Pieuchot
Using a scratch buffer makes it possible to take a consistent snapshot of per-CPU counters without having to allocate memory. Makes ddb(4) show uvmexp command work in OOM situations. ok kn@, mvs@, cheloha@
2023-05-16Use separate IFCAPs for LRO and TSO.Jan Klemkow
This diff introduces separate capabilities for TCP offloading. We split this into LRO (large receive offloading) and TSO (TCP segmentation offloading). LRO can be turned on/off via tcprecvoffload option of ifconfig and is not inherited to sub interfaces. TSO is inherited by sub interfaces to signal this hardware offloading capability to the network stack. With tweaks from bluhm, claudio and dlg ok bluhm, claudio
2023-03-08Delete obsolete /* ARGSUSED */ lint comments.Philip Guenther
ok miod@ millert@
2022-09-08Rename global ifnet TAILQKlemens Nanni
Naming the list like the struct itself makes for awful grepping. Call the global variable "ifnetlist" from now on. There used to be kvm(3) consumers in base picking up this symbol, but those have long been converted to other interfaces. A few potential ports users remain, same deal as sys/net/if_var.h r1.116 "Remove struct ifnet's unused if_switchport member": they get bumped. Previous users pointed out by deraadt OK bluhm
2021-03-10spellingJonathan Gray
ok gnezdo@ semarie@ mpi@
2021-03-07use uint64_t ethernet addresses for compares in carp.David Gwynne
pass the uint64_t that ether_input has already converted from a real ethernet address into carp_input so it can use it without having to do its own conversion. tested by hrvoje popovski tested by me on amd64 and sparc64 ok patrick@ jmatthew@
2021-02-08Start refcounting interface groups with 1. if_creategroup() returnsAlexander Bluhm
a new object that is already refcounted, so carp attach does not reach into internal structures. Add kasserts to detect counter overflow or underflow. OK mvs@
2021-01-21carp(4): convert ifunit() to if_unit(9)mvs
ok dlg@ bluhm@
2021-01-04- fix use after free, when packet gets dropped.Alexandr Nedvedicky
patch submitted by Ralf Horstmann from ackstorm.de OK dlg@
2020-07-28Don't treat an error if carppeer is an unicast and the peer is down.YASUOKA Masahiko
ok kn
2020-07-28After the previous commit, src/regress/sys/netinet/carp triggeredAlexander Bluhm
an uvm fault. Check that ifp0 is not NULL. OK sashan@ mvs@
2020-07-24Use interface index instead of pointer to `ifnet' in carp(4).mvs
ok sashan@
2020-07-22move carp_input into ether_input, instead of via an input handler.David Gwynne
carp_input is only tried after vlan and bridge handling is done, and after the ethernet packet doesnt match the parent interfaces mac address. this has been in snaps as part of a larger diff for over a week.
2020-05-21don't count packets in the carp protocol handling against an interface.David Gwynne
these packets have generally already been counted on the interface because that's where they were sent or received from. the protocol handling side of things already counts things like packets, which you see with netstat -sp carp.
2020-05-21implement a carp_transmit that bypasses the ifq on output.David Gwynne
this is modelled on vlan_transmit, and basically enqueues the packet directly on the parent interface. even though carp is generally not used to transmit packets, we run dhcp relays on it at work and hit a situation where we unecessarily dropped packets because it's ifq maxlen was 1. i've been running this for a month in production. ok jmatthew@
2020-04-29remove some trailing whitespace. no functional change.David Gwynne
2019-11-08void being too clever about setting/clearing ifpromisc on the parent.David Gwynne
ifpromisc() already refcounts, so carp doesn't have to do it implicitly with the carpdev list. there's no functional change, the code just gets a bit simpler.
2019-11-08convert interface address change hooks to tasks and a task_list.David Gwynne
this follows what's been done for detach and link state hooks, and makes handling of hooks generally more robust. address hooks are a bit different to detach/link state hooks in that there's only a few things that register hooks (carp, pf, vxlan), but a lot of places to run the hooks (lots of ipv4 and ipv6 address configuration). an address hook cookie was in struct pfi_kif, which is part of the pf abi. rather than break pfctl -sI, this maintains the void * used for the cookie and uses it to store a task, which is then used as intended with the new api.
2019-11-07turn the linkstate hooks into a task list, like the detach hooks.David Gwynne
this is largely mechanical, except for carp. this moves the addition of the carp link state hook after we're committed to using the new interface as a carpdev. because the add can't fail, we avoid a complicated unwind dance. also, this tweaks the carp linkstate hook so it only updates the relevant carp interface, not all of the carpdevs on the parent. hrvoje popovski has tested an early version of this diff and it's generally ok, but there's some splasserts that this diff fires that i'll fix in an upcoming diff. ok claudio@
2019-11-06replace the hooks used with if_detachhooks with a task list.David Gwynne
the main semantic change is that things registering detach hooks have to allocate and set a task structure that then gets added to the list. this means if the task is allocated up front (eg, as part of carps softc or bridges port structure), it avoids the possibility that adding a hook can fail. a lot of drivers weren't checking for failure, and unwinding state in the event of failure in other parts was error prone. while doing this i discovered that the list operations have to be in a particular order, but drivers weren't doing that consistently either. this diff wraps the list ops up so you have to seriously go out of your way to screw them up. ive also sprinkled some NET_ASSERT_LOCKED around the list operations so we can make sure there's no potential for the list to be corrupted, especially while it's being run. hrvoje popovski has tested this a bit, and some issues he discovered have been fixed. ok sashan@
2019-06-10Use mallocarray(9) & put some free(9) sizes for M_IPMOPTS allocations.Martin Pieuchot
ok semarie@, visa@
2019-04-23a first cut at converting some virtual ethernet interfaces to if_vinputDavid Gwynne
this let's input processing bypass ifiqs. there's a performance benefit from this, and it will let me tweak the backpressure detection mechanism that ifiqs use without impacting on a stack of virtual interfaces. ive tested all of these except mpw, which i will end up testing soon anyway.
2018-12-17Switch from timeout_add with tvtohz to just timeout_add_tv. Now this changeClaudio Jeker
will reduce the sleep time by one tick which doesn't matter in the common case. The code never passes a true 0 timeval to timeout_add_tv so the code will always sleep for at least 1 tick which is good enough. OK kn@, florian@, visa@, cheloha@
2018-12-04Use m_align() and while there reorder the pkthdr initalisation a bit.Claudio Jeker
This also makes the IPv4 and IPv6 code more similar. OK phessler@
2018-09-24Turn carp_ourether() mp-safe, this is a requirement for taking bridge(4)Martin Pieuchot
out of the KERNEL_LOCK(). ok visa@, bluhm@
2018-07-10Remove DELAY(1000) from carp_send_arp() / carp_send_na() since it is not clearfriehm
why it was necessary. OK bluhm@ 'ok but watch for fallouts' mpi@
2018-05-21All places that call carp_lsdrop() use the interface pointer already.Alexander Bluhm
It does not make sense to call if_get() again, just pass ifp as parameter. Move the IFT_CARP check into the function instead of doing it everywhere. Replace the inverted match variable logic with simple returns. OK mpi@ friehm@
2018-03-21The function carp_prepare_ad() never fails. The error handling inAlexander Bluhm
the caller would leak a mbuf. Convert carp_prepare_ad() to a void function and remove the error check. reported by Maxime Villard; OK mpi@
2018-02-19Remove almost unused `flags' argument of suser().Martin Pieuchot
The account flag `ASU' will no longer be set but that makes suser() mpsafe since it no longer mess with a per-process field. No objection from millert@, ok tedu@, bluhm@
2018-02-07Unbreak carp(4) MAC check in bridge_process().Martin Pieuchot
Introduce bridge_ourether() and move carp(4)-specific SRPL code inside carp_ourether(). ok bluhm@
2018-01-25Use a workaround for detached parent in carp_proto_input_c().Martin Pieuchot
A NULL dereference can happen since processing protocol layer is deffered to a second task. In other words the NET_LOCK() is released then regrabbed between ip_input() and carp_proto_input(). The same workaround is already in use in carp_output() due to deffered processing in case of IPsec. The real fix is to make carp(4) MP-safe and use if_get(9) there, any taker? Found & fix tested by Hrvoje Popovski.
2018-01-12have carp use standard detach hooks instead of getting special handlingDavid Gwynne
if_deactivate looked for carp parent interfaces and called carp_ifdetach to have children interfaces unplug themselves. this diff has the carp interfaces register detach hooks on the parent instead. the effect is the same, but using the standard every other interface uses. while im here i shuffle the order the hooks carp_set_ifp are estabilshed so it will fail if they arent allocated. ok visa@ mpi@
2018-01-12unbreak configurations using carppeersDavid Gwynne
ip_carp.c r1.322 removed the ability to receive carp protocol packets on !IFT_CARP interfaces. however, carppeers cause the carp protocol packets to be directed to a unicast address on another interface, which definitely is not mapped back to a carp interface. this brings back the ability to get carp packets on parent interfaces. it is a bit different to a backout because it only allows carp parents to be ethernet interfaces. mpi@ told me carp regress tests were failing.
2018-01-12restrict carp to configuring ethernet interfaces as carpdevs.David Gwynne
previously the driver only cared that a carp interface wasnt configured as a carpdev. because the code only really works on ethernet, it makes sense to restrict it. ok visa@ mpi@
2018-01-11carp_ourether gets passed the parent interface, not the carp interface.David Gwynne
2018-01-10get rid of struct carp_if by moving the srpl into struct ifnet if_carp.David Gwynne
currently carp uses a struct carp_if to hold an srp list head, which is accessed by both if_carp in struct ifnet, and via the if input handlers list. this gets rid of some indirection by making if_carp itself the list head, rather than a pointer to the list head via a struct carp_if. it also makes accessing the list consistent by only using if_carp to get to it. ok mpi@
2018-01-10simplify the input interface type check in carp_proto_input_if.David Gwynne
carp6_proto_input_if only handles packets "received" on real carp interfaces, which the ethernet stack goes to a lot of trouble to provide. since carp assumes ethernet, carp_proto_input_if can assume the packets will come in right too. ok mpi@
2018-01-09Creating a cloned interface could return ENOMEM due to temporaryAlexander Bluhm
memory shortage. As it is invoked from a system call, it should not fail and wait instead. OK visa@ mpi@
2017-11-23Replace non mp-safe carp_iamatch6() with mp-safe carp_iamatch().Martin Pieuchot
They have the same functionnality since friehm@ cleaned up balancing code. ok florian@, visa@, patrick@, bluhm@, jmatthew@
2017-11-21Move the addrhook disestablish from carpdetach() to carp_clone_destroy()Patrick Wildt
to make it symmetric to the addrhook establish which is being done in carp_clone_create(). This fixes the issue that carp does not recognize address changes on the carp after an interface has detached, which could cause issues like carp not recovering or even panics. Unfortunately there are more bugs lurking in carp. ok bluhm@
2017-11-20Sprinkle some NET_ASSERT_LOCKED(), const and co to prepare runningMartin Pieuchot
pr_input handlers without KERNEL_LOCK(). ok visa@
2017-10-16Handle the case where the parent of a carp(4) is being destroyedMartin Pieuchot
while packets where being passed to IPsec tasks. Found the hardway by Hrvoje Popovski. ok phessler@, claudio@
2017-10-09Reduces the scope of the NET_LOCK() in sysctl(2) path.Martin Pieuchot
Exposes per-CPU counters to real parrallelism. ok visa@, bluhm@, jca@
2017-08-11Remove NET_LOCK()'s argument.Martin Pieuchot
Tested by Hrvoje Popovski, ok bluhm@
2017-06-22Fix the remaining ';;'s in sys/Tom Cosgrove
2017-06-19When dealing with mbuf pointers passed down as function parameters,Alexander Bluhm
bugs could easily result in use-after-free or double free. Introduce m_freemp() which automatically resets the pointer before freeing it. So we have less dangling pointers in the kernel. OK krw@ mpi@ claudio@
2017-05-30Carp balancing ip does not work since there is a mac filter infriehm
ether_input(). Now we use mbuf tags instead of modifying the MAC address. ok mpi@