Age | Commit message | Author |
|
After if_detach() has called if_remove(), if_get() will return NULL.
Before if_detach() grabs the net lock, the ARP timer can still run. In
this case arptfree() should just return, instead of triggering an
assertion because ifp is NULL. The ARP route will be deleted later
when in_ifdetach() calls in_purgeaddr().
OK kn@ mvs@ claudio@
|
|
Since cheloha@ has implemented timeout processes that do not grab
the kernel lock, start using TIMEOUT_MPSAFE for arptimer().
OK kn@ mvs@
|
|
always set together with ARP mutex.
OK mvs@
|
|
if_output_ml() to send mbuf lists to interfaces. This can be used
for TSO, fragments, ARP and ND6. Rename variable fml to ml. In
pf_route6() split the if else block. Put the safety check (hlen +
firstlen < tlen) into ip_fragment(). It makes the code correct in
case the packet is too short to be fragmented. This should not
happen, but other functions also have this logic.
No functional change. OK sashan@
|
|
So kernel lock is only needed for changing the route rt_flags. In
arpresolve() protect rt_llinfo lookup and llinfo_arp modification
with arp_mtx. Grab kernel lock for rt_flags reject modification
only when needed.
Tested by Hrvoje Popovski; OK patrick@ kn@
|
|
Defer sending after unlock, reuse `refresh' from similar construct.
OK bluhm
|
|
grabs the exclusive netlock and that is sufficient for in_arpinput()
and arpcache().
with kn@; OK mvs@; tested by Hrvoje Popovski
|
|
response. Implement analog sysctl net.inet6.icmp6.nd6_queued for
ND6 to reduce places where mbufs can hide within the kernel.
Atomic operations operate on unsigned int. Make the type of total
hold queue length consistent.
Use atomic load to read the value for the sysctl. This clarifies
why no lock around sysctl_rdint() is needed.
OK mvs@ kn@
|
|
ND6 did only hold a single packet. Unify the logic and add an mbuf
hold queue to struct llinfo_nd6. This is MP safe and queue limits
are tracked with atomic operations. New function if_mqoutput() has
common code for ARP and ND6. ln_saddr6 holds the source address
of the requesting packet. That is easier than fiddling with mbuf
queue in nd6_ns_output().
OK kn@
|
|
ok miod@ millert@
|
|
This worked because the global head variable is zero-initialised,
but one must not rely on that.
OK mvs claudio
|
|
No functional change.
|
|
No functional changes.
|
|
rwlock(9) acquisition.
Reported-by: syzbot+fbe3acb4886adeef31e0@syzkaller.appspotmail.com
|
|
serialize arpcache() and arpresolve(). In fact, the net stack already has
sleep points, so the rwlock(9) is better here because we avoid
intersection with the rest of the kernel-locked paths. This new lock is
also intended to protect the route layer instead of netlock.
Hrvoje Popovski had tested this diff and found no visible performance
impact.
ok bluhm@
|
|
field of the route with a mutex. Keep rt_llinfo being non-NULL consistent
with the RTF_LLINFO flag being set. Also do not put the mutex in the fast
path.
OK mpi@
|
|
prevent concurrent access to rt_llinfo from rtrequest_delete().
But the common case, when the MAC address is already known, works
without lock.
tested by Hrvoje Popovski; OK mvs@
|
|
once per function. This gives a more consistent time value.
OK claudio@ miod@ mvs@
|
|
mbuf list to if_output().
OK sashan@ mvs@
|
|
The global list of ARP llinfo is protected by net lock. This is
not sufficient when we switch to shared netlock. Add a mutex for
insertion and removal when net lock is not exclusive. This is
needed if we want to run IP output on multiple CPUs.
Put an assertion for shared net lock into arp_rtrequest.
input mvs@; OK sashan@
|
|
contains a mutex. Update la_hold_total with atomic operations.
OK sashan@
|
|
arp_rtrequest() in parallel. Move initialization to arpinit()
function.
OK kettenis@ mvs@
|
|
in the arp queue. So the sysctl net.inet.ip.arpqueued must be read
only. In if_ether.c include the header with the declaration of
la_hold_total to ensure that the definition matches.
OK mvs@
|
|
time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.
This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).
There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.
There is no performance cost on 64-bit (__LP64__) platforms.
With input from visa@, dlg@, and tedu@.
Several bugs squashed by visa@.
ok kettenis@
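
To illustrate the split-read problem and the lockless read loop mentioned above, here is a generic generation-counter sketch in userland C11. It is not the kernel's timehands code: struct sketch_clock, sketch_wallclock, sketch_settime() and sketch_gettime() are hypothetical names, and a single writer is assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* A 64-bit time value stored as two 32-bit halves, as on a 32-bit CPU. */
struct sketch_clock {
	atomic_uint		gen;	/* odd while an update is in progress */
	atomic_uint_least32_t	lo;	/* low 32 bits of the time value */
	atomic_uint_least32_t	hi;	/* high 32 bits of the time value */
};

/* Zero-initialised example instance. */
static struct sketch_clock sketch_wallclock;

/* Writer side; this sketch assumes there is only ever one writer. */
static void
sketch_settime(struct sketch_clock *c, int64_t now)
{
	uint64_t u = (uint64_t)now;

	atomic_fetch_add(&c->gen, 1U);	/* generation becomes odd */
	atomic_store(&c->lo, (uint32_t)(u & 0xffffffffU));
	atomic_store(&c->hi, (uint32_t)(u >> 32));
	atomic_fetch_add(&c->gen, 1U);	/* generation becomes even again */
}

/* Reader side: retry until both halves come from the same generation. */
static int64_t
sketch_gettime(struct sketch_clock *c)
{
	unsigned int before, after;
	uint32_t lo, hi;

	do {
		before = atomic_load(&c->gen);
		lo = atomic_load(&c->lo);
		hi = atomic_load(&c->hi);
		after = atomic_load(&c->gen);
	} while ((before & 1U) != 0 || before != after);

	return (int64_t)(((uint64_t)hi << 32) | lo);
}
```

A plain 64-bit read on such hardware could combine the low half of one value with the high half of another; the retry loop trades a few extra loads for a consistent result.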
|
|
making the RTM_INVALIDATE code path perform the same check as RTM_DELETE does.
ok mpi@
|
|
ok cheloha@, visa@
|
|
ok dlg@, sthen@, millert@
|
|
set. These mpls routes use the rt_llinfo structure to store the MPLS label
and would confuse the arp and nd6 code.
OK bluhm@ anton@
Reported-by: syzbot+927e93a362f3ae33dd9c@syzkaller.appspotmail.com
|
|
than 1/8 of net.inet.ip.arptimeout the system will send out an ARP request
about every 30 seconds until either the entry is updated or expired.
Not refreshing ARP entries will result in packet drops every time an entry
expires, which is not ideal for important gateway entries.
Came up with this after a discussion with deraadt@. OK benno@ deraadt@
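
A rough sketch of the refresh decision follows. The start of the message above is cut off, so this assumes the condition is "remaining lifetime below 1/8 of the timeout"; struct sketch_arp_entry, sketch_arp_should_refresh() and SKETCH_REFRESH_INTERVAL are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Seconds between refresh requests, taken from the "about every 30 seconds". */
#define SKETCH_REFRESH_INTERVAL	30

struct sketch_arp_entry {
	int64_t	expire;		/* uptime at which the entry expires */
	int64_t	refreshed;	/* uptime of the last refresh request */
};

/*
 * Return true if an ARP request should be sent now to refresh the entry.
 * Assumption: refresh only while the entry is still valid, its remaining
 * lifetime is below arptimeout / 8, and the last request is old enough.
 */
static bool
sketch_arp_should_refresh(struct sketch_arp_entry *e, int64_t now,
    int64_t arptimeout)
{
	if (e->expire <= now)
		return false;		/* already expired */
	if (e->expire - now >= arptimeout / 8)
		return false;		/* still plenty of lifetime left */
	if (now - e->refreshed < SKETCH_REFRESH_INTERVAL)
		return false;		/* a request was sent recently */

	e->refreshed = now;		/* rate-limit the next request */
	return true;
}
```

With a timeout of 1200 seconds, for example, this sketch starts refreshing during the last 150 seconds of an entry's lifetime.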
|
|
the function is doing the same initialisation as arprequest().
OK bluhm@
|
|
ok visa@, tb@
|
|
the mbuf packet header. Otherwise, stale mbuf state related to the
ARP request packet might affect the fate of the ARP reply packet.
For example, I observed that for an ARP request to a carp IP, where the
underlying carpdev interface is part of a bridge, ARP replies were always
sent out on the carpdev interface, even if the corresponding ARP request
was received not on the carpdev but on a different bridge member interface.
This happened because the M_PROTO1 mbuf flag was set on the ARP request mbuf
when it left the bridge towards carp, and was still set on the ARP reply,
which reused the same mbuf, sent back towards the bridge. The bridge's loop
detection saw the M_PROTO1 flag and prevented the ARP reply from entering
the bridge, so the reply was instead sent out directly on the carpdev...
ok bluhm@ mpi@
|
|
continuous. The lengths of the hardware and protocol addresses are
provided in the network packet and have to be checked first. So
enforce that we only deal with internet over ethernet ARP headers
with the address lengths filled in correctly.
found by Maxime Villard; OK claudio@
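
The kind of check described above can be sketched on a raw byte buffer. This is a userland illustration, not the kernel code: sketch_arp_header_ok() and the SKETCH_* constants are hypothetical names, while the values (hardware type 1 for Ethernet, protocol 0x0800 for IPv4, address lengths 6 and 4) follow the standard ARP packet layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ARPHRD_ETHER	1	/* hardware type: Ethernet */
#define SKETCH_ETHERTYPE_IP	0x0800	/* protocol type: IPv4 */
#define SKETCH_ARP_FIXED_LEN	8	/* hrd, pro, hln, pln, op */
#define SKETCH_ETHER_ADDR_LEN	6
#define SKETCH_IP_ADDR_LEN	4

/*
 * Validate the fixed ARP header before trusting the address fields.
 * The lengths come from the packet itself, so they are checked first.
 */
static bool
sketch_arp_header_ok(const uint8_t *buf, size_t len)
{
	uint16_t hrd, pro;

	if (len < SKETCH_ARP_FIXED_LEN)
		return false;		/* fixed part is not all there */

	hrd = (uint16_t)((buf[0] << 8) | buf[1]);
	pro = (uint16_t)((buf[2] << 8) | buf[3]);
	if (hrd != SKETCH_ARPHRD_ETHER || pro != SKETCH_ETHERTYPE_IP)
		return false;		/* not internet over ethernet */

	if (buf[4] != SKETCH_ETHER_ADDR_LEN || buf[5] != SKETCH_IP_ADDR_LEN)
		return false;		/* address lengths filled in wrongly */

	/* Sender and target each carry one hardware and one protocol address. */
	return len >= SKETCH_ARP_FIXED_LEN +
	    2 * (SKETCH_ETHER_ADDR_LEN + SKETCH_IP_ADDR_LEN);
}
```

Only after all of these checks pass is it safe to read the sender and target addresses at their fixed offsets.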
|
|
of IFF* flags.
inputs from jmc@, ok bluhm@, visa@
|
|
ARP or ND timeout could delete local routes. Put an assert into
arptfree() and nd6_free() so this cannot happen again.
OK mpi@
|
|
Tested by Hrvoje Popovski, ok bluhm@
|
|
Prompted by a bug report by naddy that IPv6 autoconfiguration is broken
in the installer.
OK mpi, "go for it" deraadt
|
|
rtdeletemsg().
ok bluhm@
|
|
them all in net/rtsock.c.
This makes it easy to spot which functions are doing a copyout(9)
when dealing with the routing midlayer.
ok phessler@, bluhm@, dhill@, krw@, claudio@
|
|
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.
This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.
Inputs from and ok bluhm@, ok dlg@
|
|
ok dlg@, jmatthew@
|
|
context.
Convert them to timeout_set_proc(9).
|
|
the ioff argument to pool_init() is unused and has been for many
years, so this replaces it with an ipl argument. because the ipl
will be set on init we no longer need pool_setipl.
most of these changes have been done with coccinelle using the spatch
below. cocci sucks at formatting code though, so i fixed that by hand.
the manpage and subr_pool.c bits i did myself.
ok tedu@ jmatthew@
@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);
|
|
reference count.
rtable_iterate() frees the passed ``rt'' and returns the next one on the
multipath list or NULL if there's none.
ok dlg@
|
|
thank you to everyone who helped review these diffs
ok mpi@
|
|
This means that no protection is needed to guarantee that the next hop
route won't be modified by CPU1 while CPU0 is dereferencing it in an L2
resolution function.
While here also fix an ``ifa'' leak resulting in RTF_GATEWAY being always
invalid.
dlg@ likes it, inputs and ok bluhm@
|
|
removed from the table.
Currently the storage for L2 addresses is freed when an entry is
removed from the table. That means that we cannot access this
chunk of memory between RTM_DELETE and rtfree(9).
Note that this doesn't apply to MPLS because the associated storage
is currently released by the last rtfree(9).
ok mikeb@
|
|
triggered by updating a cached, but removed from the table, entry is
properly fixed.
Diff from dlg@, prodding deraadt@
|
|
instead of abusing RTF_CLONING.
Fix a leak reported by Aaron Riekenberg on misc@, ok sthen@
|