path: root/sys/netinet/if_ether.c
Age | Commit message | Author
2023-12-18 | Fix race between ifconfig destroy and ARP timer. | Alexander Bluhm
After if_detach() has called if_remove(), if_get() will return NULL. Before if_detach() grabs the net lock, ARP timer can still run. In this case arptfree() should just return, instead of triggering an assertion because ifp is NULL. The ARP route will be deleted later when in_ifdetach() calls in_purgeaddr(). OK kn@ mvs@ claudio@
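The early-return behaviour described in this commit can be sketched in plain C. This is only an illustration under stated assumptions: `struct ifnet_stub`, `iftable`, `if_get_stub()` and `arptfree_stub()` are hypothetical stand-ins invented for the example, not the kernel's `struct ifnet`, `if_get()` or `arptfree()`.

```c
#include <stddef.h>

/* Hypothetical stand-in for an interface table entry. */
struct ifnet_stub {
	unsigned int if_index;
	int          attached;	/* cleared once the interface is removed */
};

#define NIF 4
struct ifnet_stub iftable[NIF];

/* Lookup by index; returns NULL once the interface has been removed. */
struct ifnet_stub *
if_get_stub(unsigned int idx)
{
	if (idx >= NIF || !iftable[idx].attached)
		return NULL;
	return &iftable[idx];
}

/*
 * Timer-side cleanup. Returns 1 if the entry was handled here, 0 if the
 * interface is already gone and cleanup is left to the detach path.
 */
int
arptfree_stub(unsigned int ifidx)
{
	struct ifnet_stub *ifp = if_get_stub(ifidx);

	if (ifp == NULL)
		return 0;	/* interface detached: return instead of asserting */

	/* ... a real implementation would delete the ARP route here ... */
	return 1;
}
```

The point of the sketch is that a NULL lookup result is treated as an expected state during teardown rather than as an invariant violation.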
2023-11-09 | Run arp timeout without kernel lock. | Alexander Bluhm
Since cheloha@ has implemented timeout processes that do not grab the kernel lock, start using TIMEOUT_MPSAFE for arptimer(). OK kn@ mvs@
2023-05-12 | Access rt_llinfo without checking RTF_LLINFO flag before. They are | Alexander Bluhm
always set together with ARP mutex. OK mvs@
2023-05-07 | In preparation for TSO in software, clean up the fragment code. Use | Alexander Bluhm
if_output_ml() to send mbuf lists to interfaces. This can be used for TSO, fragments, ARP and ND6. Rename variable fml to ml. In pf_route6() split the if else block. Put the safety check (hlen + firstlen < tlen) into ip_fragment(). It makes the code correct in case the packet is too short to be fragmented. This should not happen, but other functions also have this logic. No functional change. OK sashan@
2023-04-25 | Exclusive net lock or mutex arp_mtx protect the llinfo_arp fields. | Alexander Bluhm
So kernel lock is only needed for changing the route rt_flags. In arpresolve() protect rt_llinfo lookup and llinfo_arp modification with arp_mtx. Grab kernel lock for rt_flags reject modification only when needed. Tested by Hrvoje Popovski; OK patrick@ kn@
2023-04-12 | Pull MP-safe arprequest() out of kernel lock | Klemens Nanni
Defer sending after unlock, reuse `refresh' from similar construct. OK bluhm
2023-04-07 | Remove kernel locks from the ARP input path. Caller if_netisr() | Alexander Bluhm
grabs the exclusive netlock and that is sufficient for in_arpinput() and arpcache(). with kn@; OK mvs@; tested by Hrvoje Popovski
2023-04-05 | ARP has a sysctl to show the number of packets waiting for an arp | Alexander Bluhm
response. Implement analog sysctl net.inet6.icmp6.nd6_queued for ND6 to reduce places where mbufs can hide within the kernel. Atomic operations operate on unsigned int. Make the type of total hold queue length consistent. Use atomic load to read the value for the sysctl. This clarifies why no lock around sysctl_rdint() is needed. OK mvs@ kn@
2023-04-05 | ARP has a queue of packets that should be sent after name resolution. | Alexander Bluhm
ND6 did only hold a single packet. Unify the logic and add a mbuf hold queue to struct llinfo_nd6. This is MP safe and queue limits are tracked with atomic operations. New function if_mqoutput() has common code for ARP and ND6. ln_saddr6 holds the source address of the requesting packet. That is easier than fiddling with mbuf queue in nd6_ns_output(). OK kn@
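The idea of tracking a global hold-queue limit with atomic operations on an unsigned int can be sketched with C11 atomics. The names `hold_total`, `HOLD_TOTAL_MAX`, `hold_reserve()`, `hold_release()` and `hold_queued()` are assumptions made up for this example; the kernel uses its own atomic primitives and its own limits.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define HOLD_TOTAL_MAX 100	/* hypothetical global limit */

static atomic_uint hold_total;	/* packets currently held, all entries */

/* Try to reserve one slot; false means the caller should drop the packet. */
bool
hold_reserve(void)
{
	if (atomic_fetch_add(&hold_total, 1) >= HOLD_TOTAL_MAX) {
		/* Over the limit: undo the increment and refuse. */
		atomic_fetch_sub(&hold_total, 1);
		return false;
	}
	return true;
}

/* Give back n slots after packets were sent or dropped. */
void
hold_release(unsigned int n)
{
	atomic_fetch_sub(&hold_total, n);
}

/* Read-only view of the counter, e.g. for a statistics interface. */
unsigned int
hold_queued(void)
{
	return atomic_load(&hold_total);
}
```

Because the counter is only ever touched through atomic operations, a reader such as `hold_queued()` needs no additional lock, which mirrors the reasoning given for the sysctl read.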
2023-03-08 | Delete obsolete /* ARGSUSED */ lint comments. | Philip Guenther
ok miod@ millert@
2023-03-04 | properly initialise LIST head | Klemens Nanni
This worked because the global head variable is zero-initialised, but one must not rely on that. OK mvs claudio
2023-01-31 | Remove the last ones route lock references from comments. | Vitaliy Makkoveev
No functional change.
2023-01-31 | Route lock was reverted, adjust forgotten commentary. | Vitaliy Makkoveev
No functional changes.
2023-01-28 | Revert the `rt_lock' rwlock(9) diff to fix the recursive | Vitaliy Makkoveev
rwlock(9) acquisition. Reported-by: syzbot+fbe3acb4886adeef31e0@syzkaller.appspotmail.com
2023-01-21 | Introduce `rt_lock' rwlock(9) and use it instead of kernel lock to | Vitaliy Makkoveev
serialize arpcache() and arpresolve(). In fact, net stack already has sleep points, so the rwlock(9) is better here because we avoid intersection with the rest of kernel locked paths. This new lock is also intended to be used for route layer protection instead of netlock. Hrvoje Popovski had tested this diff and found no visible performance impact. ok bluhm@
2022-12-07 | Fix comment by splitting sentence in two. From kn@ | Claudio Jeker
2022-07-16 | To fix a KASSERT(la != NULL) panic in ARP, protect the rt_llinfo | Alexander Bluhm
field of the route with a mutex. Keep rt_llinfo not NULL consistent with RTF_LLINFO flag is set. Also do not put the mutex in the fast path. OK mpi@
2022-06-27 | Push the kernel lock down into arpresolve(). We still need it to | Alexander Bluhm
prevent concurrent access to rt_llinfo from rtrequest_delete(). But the common case, when the MAC address is already known, works without lock. tested by Hrvoje Popovski; OK mvs@
2022-06-27 | Instead of calling getuptime() all the time in ARP code, do it only | Alexander Bluhm
once per function. This gives a more consistent time value. OK claudio@ miod@ mvs@
2021-04-28 | Use mq_delist() to fetch the ARP mbuf hold queue once and feed the | Alexander Bluhm
mbuf list to if_output(). OK sashan@ mvs@
2021-04-28 | Document the locking mechanism of the global variables in ARP code. | Alexander Bluhm
The global list of ARP llinfo is protected by net lock. This is not sufficient when we switch to shared netlock. Add a mutex for insertion and removal when net lock is not exclusive. This is needed if we want to run IP output on multiple CPUs. Put an assertion for shared net lock into arp_rtrequest. input mvs@; OK sashan@
2021-04-26 | Convert the ARP packet hold queue from mbuf list to mbuf queue which | Alexander Bluhm
contains a mutex. Update la_hold_total with atomic operations. OK sashan@
2021-04-23 | Setting variable arpinit_done is not MP safe if we want to execute | Alexander Bluhm
arp_rtrequest() in parallel. Move initialization to arpinit() function. OK kettenis@ mvs@
2021-04-23 | The variable la_hold_total contains the number of packets currently | Alexander Bluhm
in the arp queue. So the sysctl net.inet.ip.arpqueued must be read only. In if_ether.c include the header with the declaration of la_hold_total to ensure that the definition matches. OK mvs@
2020-06-24 | kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9) | cheloha
time_second(9) and time_uptime(9) are widely used in the kernel to quickly get the system UTC or system uptime as a time_t. However, time_t is 64-bit everywhere, so it is not generally safe to use them on 32-bit platforms: you have a split-read problem if your hardware cannot perform atomic 64-bit reads. This patch replaces time_second(9) with gettime(9), a safer successor interface, throughout the kernel. Similarly, time_uptime(9) is replaced with getuptime(9). There is a performance cost on 32-bit platforms in exchange for eliminating the split-read problem: instead of two register reads you now have a lockless read loop to pull the values from the timehands. This is really not *too* bad in the grand scheme of things, but compared to what we were doing before it is several times slower. There is no performance cost on 64-bit (__LP64__) platforms. With input from visa@, dlg@, and tedu@. Several bugs squashed by visa@. ok kettenis@
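The split-read problem and the lockless read loop mentioned above can be illustrated with a generic generation-counter (seqlock-style) sketch in C11. This is not the kernel's timehands code; `struct split_time`, `split_time_store()` and `split_time_load()` are hypothetical names, and the sketch assumes a single writer.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 64-bit value kept as two 32-bit halves plus a generation. */
struct split_time {
	atomic_uint           gen;	/* odd while an update is in progress */
	atomic_uint_least32_t lo;
	atomic_uint_least32_t hi;
};

/* Single writer: bump the generation around the two half-word stores. */
void
split_time_store(struct split_time *t, int64_t v)
{
	atomic_fetch_add(&t->gen, 1);		/* now odd */
	atomic_store(&t->lo, (uint32_t)v);
	atomic_store(&t->hi, (uint32_t)((uint64_t)v >> 32));
	atomic_fetch_add(&t->gen, 1);		/* even again */
}

/* Reader: retry until both halves were read under one stable generation. */
int64_t
split_time_load(struct split_time *t)
{
	unsigned int g0, g1;
	uint32_t lo, hi;

	do {
		g0 = atomic_load(&t->gen);
		lo = atomic_load(&t->lo);
		hi = atomic_load(&t->hi);
		g1 = atomic_load(&t->gen);
	} while (g0 != g1 || (g0 & 1));

	return (int64_t)(((uint64_t)hi << 32) | lo);
}
```

The retry loop is the extra cost on 32-bit hardware: a plain two-register read is cheaper but can observe halves from two different updates.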
2019-11-07 | Avoid NULL dereference in arpinvalidate() and nd6_invalidate() by | Kenneth R Westerback
making RTM_INVALIDATE code path perform same check as RTM_DELETE does. ok mpi@
2019-10-16 | tsleep(9) -> tsleep_nsec(9) | Martin Pieuchot
ok cheloha@, visa@
2019-07-17 | Introduce ETHER_IS_BROADCAST/ANYADDR/EQ() and use them where appropriate. | Martin Pieuchot
ok dlg@, sthen@, millert@
2019-06-13 | In arp_rtrequest and nd6_rtrequest return early if the RTF_MPLS flag is | Claudio Jeker
set. These mpls routes use the rt_llinfo structure to store the MPLS label and would confuse the arp and nd6 code. OK bluhm@ anton@ Reported-by: syzbot+927e93a362f3ae33dd9c@syzkaller.appspotmail.com
2019-01-20 | Refresh arp entries that are about to expire. Once their life time is less | Claudio Jeker
than 1/8 of net.inet.ip.arptimeout the system will send out an arp request about every 30 seconds until either the entry is updated or expired. Not refreshing arp entries will result in packet drop every time an entry expires which is not ideal for important gateway entries. Came up with this after a discussion with deraadt@. OK benno@ deraadt@
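The refresh rule stated in this commit (remaining lifetime below 1/8 of the timeout, at most one request about every 30 seconds) can be written down as a small decision function. This is a sketch under assumptions: `struct arp_entry_stub`, its fields and `arp_needs_refresh()` are hypothetical, and the way the kernel actually tracks the last refresh may differ.

```c
#include <stdbool.h>
#include <time.h>

#define REFRESH_INTERVAL 30	/* seconds between refresh requests */

/* Hypothetical per-entry state; not the kernel's struct llinfo_arp. */
struct arp_entry_stub {
	time_t expire;		/* uptime at which the entry expires */
	time_t last_refresh;	/* uptime of the last refresh, 0 if none */
};

/*
 * Decide whether to send a refresh request now. On a positive answer the
 * entry records the current uptime so the next request is rate limited.
 */
bool
arp_needs_refresh(struct arp_entry_stub *e, time_t uptime, time_t arptimeout)
{
	if (e->expire <= uptime)
		return false;	/* already expired: nothing to refresh */
	if (e->expire - uptime >= arptimeout / 8)
		return false;	/* plenty of lifetime left */
	if (e->last_refresh != 0 &&
	    uptime - e->last_refresh < REFRESH_INTERVAL)
		return false;	/* a request went out recently */

	e->last_refresh = uptime;
	return true;
}
```

As a worked example with an assumed timeout of 1200 seconds, the threshold is 150 seconds: an entry expiring at uptime 1000 is first refreshed once the uptime passes 850, and then again no sooner than 30 seconds later.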
2018-11-30 | MH_ALIGN -> m_align. In revarprequest() set the ph_rtableid so that | Claudio Jeker
the function is doing the same initialisation as arprequest(). OK bluhm@
2018-06-11 | Push the KERNEL_LOCK() inside route_input(). | Martin Pieuchot
ok visa@, tb@
2018-03-31 | When reusing an mbuf to send an ARP response, don't forget to clear | Stefan Sperling
the mbuf packet header. Otherwise, stale mbuf state related to the ARP request packet might affect the fate of the ARP reply packet. For example, I observed that for an ARP request to a carp IP, where the underlying carpdev interface is part of a bridge, ARP replies were always sent out on the carpdev interface, even if the corresponding ARP request was received not on the carpdev but on a different bridge member interface. This happened because the M_PROTO1 mbuf flag was set on the ARP request mbuf when it left the bridge towards carp, and was still set on the ARP reply, which reused the same mbuf, sent back towards the bridge. The bridge's loop detection saw the M_PROTO1 flag and prevented the ARP reply from entering the bridge, so the reply was instead sent out directly on the carpdev... ok bluhm@ mpi@
2018-03-13 | Mbuf data is used as struct ether_header before it has been made | Alexander Bluhm
continuous. The length of the hardware and protocol address are provided in the network packet and have to be checked first. So enforce that we only deal with internet over ethernet arp headers with the address length filled correctly. found by Maxime Villard; OK claudio@
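The check described here, validating the address lengths carried in the packet before trusting the rest of the header, can be sketched on a raw byte buffer. `arp_header_ok()` and the `*_VAL` macro names are hypothetical and chosen for this example; the numeric values are the standard ones for IPv4 over Ethernet ARP, and the kernel operates on mbufs rather than flat buffers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ARP_FIXED_HDR_LEN  8		/* hrd(2) pro(2) hln(1) pln(1) op(2) */
#define ARPHRD_ETHER_VAL   1		/* Ethernet hardware type */
#define ETHERTYPE_IP_VAL   0x0800	/* IPv4 protocol type */
#define ETHER_ADDR_LEN_VAL 6
#define IPV4_ADDR_LEN_VAL  4

/* Accept only well-formed IPv4-over-Ethernet ARP packets. */
bool
arp_header_ok(const uint8_t *buf, size_t len)
{
	uint16_t hrd, pro;
	uint8_t hln, pln;

	/* The fixed part must be present before any field is read. */
	if (len < ARP_FIXED_HDR_LEN)
		return false;

	hrd = (uint16_t)((buf[0] << 8) | buf[1]);
	pro = (uint16_t)((buf[2] << 8) | buf[3]);
	hln = buf[4];
	pln = buf[5];

	if (hrd != ARPHRD_ETHER_VAL || pro != ETHERTYPE_IP_VAL)
		return false;
	if (hln != ETHER_ADDR_LEN_VAL || pln != IPV4_ADDR_LEN_VAL)
		return false;

	/* Sender and target each carry one hardware and one protocol address. */
	if (len < ARP_FIXED_HDR_LEN + 2u * (hln + pln))
		return false;

	return true;
}
```

Only after these checks pass is it safe to interpret the bytes following the fixed header as fixed-size address fields.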
2018-01-16 | Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership | Martin Pieuchot
of IFF* flags. inputs from jmc@, ok bluhm@, visa@
2018-01-15 | There was an issue that dynamic path MTU discovery together with | Alexander Bluhm
ARP or ND timeout could delete local routes. Put an assert into arptfree() and nd6_free() so this cannot happen again. OK mpi@
2017-08-11 | Remove NET_LOCK()'s argument. | Martin Pieuchot
Tested by Hrvoje Popovski, ok bluhm@
2017-07-30 | Switch installer to Allotment Routing Table (ART). | Florian Obser
Prompted by a bugreport by naddy that IPv6 autoconfiguration is broken in the installer. OK mpi, "go for it" deraadt
2017-07-28 | Add an error argument to rtm_send() instead of rerolling it inside | Martin Pieuchot
rtdeletemsg(). ok bluhm@
2017-03-06 | Prefix functions dealing with routing messages with 'rtm_' and keep | Martin Pieuchot
them all in net/rtsock.c. This allows to easily spot which functions are doing a copyout(9) when dealing with the routing midlayer. ok phessler@, bluhm@, dhill@, krw@, claudio@
2016-12-19 | Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts | Martin Pieuchot
of the network stack that are not yet ready to be executed in parallel or where new sleeping points are not possible. This first pass replaces all the entry points leading to ip_output(). This is done to not introduce new sleeping points when trying to acquire ART's write lock, needed when a new L2 entry is created via the RT_RESOLVE. Inputs from and ok bluhm@, ok dlg@
2016-11-20 | Make rtable_iterate(9) mpsafe by using the new SRPL_NEXT(9). | Martin Pieuchot
ok dlg@, jmatthew@
2016-11-07 | ARP and NDP timeouts mess with the routing table, so they need a process | Martin Pieuchot
context. Convert them to timeout_set_proc(9).
2016-09-15 | all pools have their ipl set via pool_setipl, so fold it into pool_init. | David Gwynne
the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl. most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand. the manpage and subr_pool.c bits i did myself. ok tedu@ jmatthew@

    @ipl@
    expression pp;
    expression ipl;
    expression s, a, o, f, m, p;
    @@
    -pool_init(pp, s, a, o, f, m, p);
    -pool_setipl(pp, ipl);
    +pool_init(pp, s, a, ipl, f, m, p);
2016-09-07 | Rename rtable_mpath_next() into rtable_iterate() and make it do a proper | Martin Pieuchot
reference count. rtable_iterate() frees the passed ``rt'' and returns the next one on the multipath list or NULL if there's none. ok dlg@
2016-09-06 | pool_setipl for various netinet and netinet6 bits | David Gwynne
thank you to everyone who helped review these diffs ok mpi@
2016-08-22 | Make the ``rt_gwroute'' pointer of RTF_GATEWAY entries immutable. | Martin Pieuchot
This means that no protection is needed to guarantee that the next hop route won't be modified by CPU1 while CPU0 is dereferencing it in an L2 resolution function. While here also fix an ``ifa'' leak resulting in RTF_GATEWAY being always invalid. dlg@ likes it, inputs and ok bluhm@
2016-07-14 | Prevent a use-after-free by not updating an ARP entry that has been | Martin Pieuchot
removed from the table. Currently the storage for L2 addresses is freed when an entry is removed from the table. That means that we cannot access this chunk of memory between RTM_DELETE and rtfree(9). Note that this doesn't apply to MPLS because the associated storage is currently released by the last rtfree(9). ok mikeb@
2016-07-13 | Move ARP processing back to the KERNEL_LOCK()ed task until the race | Martin Pieuchot
triggered by updating a cached, but removed from the table, entry is properly fixed. Diff from dlg@, prodding deraadt@
2016-07-13 | Introduce RTF_MULTICAST and flag corresponding IPv6 routes as such | Martin Pieuchot
instead of abusing RTF_CLONING. Fix a leak reported by Aaron Riekenberg on misc@, ok sthen@