Age | Commit message (Collapse) | Author |
|
Replace hand-rolled reference counting with refcnt_init(9) and hook it up
with a new dt(4) probe.
OK bluhm mvs
|
|
|
|
|
|
in{,6}_addmulti(). Since kernel lock is no more taken while following
setsockopt() path, it should be taken in this place. Corresponding
in{,6}_delmulti() already acquire kernel lock around (*if_ioctl)().
Problem reported and diff tested by weerd@
ok kn@ bluhm@
|
|
First the right address is picked from the net lock protected if_addrlist.
Then all ioctls just copy out the address, nothing requires the kernel lock.
SIOCGIFDSTADDR_IN6 checks the net lock protected if_flags,
SIOCGIFALIFETIME_IN6 computes lifetimes which only need the address.
This removes the last kernel lock from IPv6 read ioctls (multicast being
the untouched exception here).
Users of these ioctl(2)s are route6d(8), rad(8), slaacd(8), isakmpd(8) and
of course ifconfig(8).
OK mvs
|
|
Neighbour Discovery information is protected by the net lock, as
documented in nd6.h struct nd_ifinfo.
ndp(8) is the only SIOCGIFINFO_IN6 and SIOCGNBRINFO_IN6 user in base.
nd6_lookup(), also used in ICMP6 input and IPv6 forwarding, only needs
the net lock.
OK mvs
|
|
*if_afdata[] and struct domain's dom_if{at,de}tach() are only used with
IPv6 Neighbour Discovery in6_dom{at,de}tach(), which allocate/init and
free single struct nd_ifinfo.
Set up a new ND-specific *if_nd member directly to avoid yet another
layer of indirection and thus make the generic domain API obsolete.
The per-interface data is only accessed in nd6.c and nd6_nbr.c through
the ND_IFINFO() macro; it is allocated and freed exactly once during
interface at/detach, so document it as [I]mmutable.
OK bluhm mvs claudio
|
|
This was the right diff after all, I just confused myself between trees.
OK bluhm
---
Remove useless struct in6_ifextra
in6_var.h r1.75 removed all other struct members.
Now It only contains a single struct nd_ifinfo pointer, so address family
specific data might as well be just that.
ND_IFINFO() is the only way nd6_nbr.c and nd6.c access this data, there is
no other usage of if_afdata[].
One allocation and unhelpful indirection less per interface.
All under _KERNEL.
OK claudio
|
|
I committed the wrong iteration of this diff, sorry for the noise.
|
|
All prior lines in this function already use it, do so on the last one.
OK claudio
|
|
in6_var.h r1.75 removed all other struct members.
Now It only contains a single struct nd_ifinfo pointer, so address family
specific data might as well be just that.
ND_IFINFO() is the only way nd6_nbr.c and nd6.c access this data, there is
no other usage of if_afdata[].
One allocation and unhelpful indirection less per interface.
All under _KERNEL.
OK claudio
|
|
so->so_state is already read without kernel lock inside soo_ioctl()
which calls pru_control() aka in6_control() and in_control().
OK mvs
|
|
This is all under _KERNEL:
- rs_lhcookie was added in 2014 110585f259f4974284e531f0a1e121b001a580dc
"Move sending of router solicitations to the kernel; [...]"
but never used
- nprefixes and ndefrouters became obsolete with 2017
4a2f474d14c160dc7829cce0149ead09d473ece9
"Remove sending of router solicitations and processing of router
advertisements from the kernel. [...]"
OK mpi
|
|
Naming the list like the struct itself makes for awful grepping.
Call the global variable "ifnetlist" from now on.
There used to be kvm(3) consumers in base picking up this symbol, but those
have long been converted to other interfaces.
A few potential ports users remain, same deal as sys/net/if_var.h r1.116
"Remove struct ifnet's unused if_switchport member": they get bumped.
Previous users pointed out by deraadt
OK bluhm
|
|
There was a crash due to use after free of the ifa although it is
ref counted. As ifa_refcnt was a simple integer increment, there
may be a path where multiple CPUs access it concurrently. So change
to struct refcnt which is MP safe and provides dt(4) leak debugging.
Link level address for IPsec enc(4) and various MPLS interfaces is
special. There ifa is part of struct sc. Use refcount anyway and
add a panic to detect use after free.
bug report stsp@; OK mvs@
|
|
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@
|
|
Revert the pr_usrreqs move: syzkaller found a NULL pointer deref
and I won't be available to monitor for followup issues for a bit
|
|
don't declare it again in the .c file
ok dlg@ mvs@ bluhm@
|
|
ok jmc@ reads ok tb@
|
|
ok gnezdo@ semarie@ mpi@
|
|
to the previous behavior: In case of a tie the new implementation
would keep the current best address while the old implementation
replaced the best address.
Since IPv6 addresses are stored in a TAILQ this meant that the rewrite
would use the "oldest" address while the previous behavior was to use
the "newest".
RFC 6724 section 5 has no opinion which one is better and leaves the
tie break up to implementers.
naddy found out the hard way that this breaks his IPv6 connectivity in
case of flash renumbering events when the link on his cpe flaps and a
new prefix is used since we would always pick an old address.
While we could pick the newest address in a tie break this feels too
much like an implementation detail, a solution much more in the spirit
of IPv6 is to pick the address with the highest preferred lifetime (or
valid lifetime in case of another tie).
very patient testing naddy@
|
|
Fixes a bunch of panics reported by syzkaller.
ok florian@
Reported-by: syzbot+02f2e07964a89ab65ea4@syzkaller.appspotmail.com
Reported-by: syzbot+c26b058a499ce38f689f@syzkaller.appspotmail.com
Reported-by: syzbot+62af76d8cb7c09ac017c@syzkaller.appspotmail.com
Reported-by: syzbot+d70144b3ae2ec068e318@syzkaller.appspotmail.com
Reported-by: syzbot+3c87ca9873bfd0492f5c@syzkaller.appspotmail.com
Reported-by: syzbot+323549177062adb80f84@syzkaller.appspotmail.com
Reported-by: syzbot+e745c1c29d960337ce14@syzkaller.appspotmail.com
Reported-by: syzbot+91da988a445013baf925@syzkaller.appspotmail.com
Reported-by: syzbot+747cbcbbed6318542061@syzkaller.appspotmail.com
Reported-by: syzbot+ca5efa23e00130bc8000@syzkaller.appspotmail.com
Reported-by: syzbot+731ab8c9a0342ace4189@syzkaller.appspotmail.com
Reported-by: syzbot+6c80b815a0ff8f09be69@syzkaller.appspotmail.com
Reported-by: syzbot+7939d2c4bc9a5dfa707a@syzkaller.appspotmail.com
Reported-by: syzbot+e893fb0259640a314d06@syzkaller.appspotmail.com
Reported-by: syzbot+b6a3447070ae8ffcb125@syzkaller.appspotmail.com
Reported-by: syzbot+23c0824b688f28c79c1b@syzkaller.appspotmail.com
Reported-by: syzbot+6cc72412d8ddcf87f8a1@syzkaller.appspotmail.com
|
|
RFC 6724 section 5.
This simplifies the code considerably while extensive testing shows no
change in behaviour. It is time to volunteer some more testers.
OK denis@ some time ago.
|
|
This is the name the other BSDs use for this, there is no reason to
be different, the IPv6 RFCs call these addresses temporary, and some
software in ports wants to use this as well.
Most recently pointed out for firefox by landry.
OK claudio, sthen
|
|
ok claudio@
|
|
time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.
This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).
There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.
There is no performance cost on 64-bit (__LP64__) platforms.
With input from visa@, dlg@, and tedu@.
Several bugs squashed by visa@.
ok kettenis@
|
|
Since our last concurrency mistake only ioctl(2) ans sysctl(2) code path
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.
dlg@ said that comments should be good enough.
ok sashan@
|
|
made from socket close path. Most device drivers are not MP-safe yet,
and the closing of AF_INET and AF_INET6 sockets is no longer under the
kernel lock.
This fixes a panic seen by jcs@.
OK mpi@
|
|
dhcpcd from ports uses SIOCGIFAFLAG_IN6 without setting sin6_len.
OK deraadt@ millert@
|
|
addresses. Implement in6_sa2sin6() to validate inet6 address family
and address length. The SIOCGIFDSTADDR_IN6, SIOCGIFNETMASK_IN6,
SIOCGIFAFLAG_IN6, SIOCGIFALIFETIME_IN6, and SIOCDIFADDR_IN6 ioctl(2)
are safe now.
OK visa@
|
|
this follows what's been done for detach and link state hooks, and
makes handling of hooks generally more robust.
address hooks are a bit different to detach/link state hooks in
that there's only a few things that register hooks (carp, pf, vxlan),
but a lot of places to run the hooks (lots of ipv4 and ipv6 address
configuration).
an address hook cookie was in struct pfi_kif, which is part of the
pf abi. rather than break pfctl -sI, this maintains the void * used
for the cookie and uses it to store a task, which is then used as
intended with the new api.
|
|
|
|
this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.
previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.
ok mpi@
|
|
MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label
that they listen on for incoming packets, while every other use of
rt_ifa_add is for adding addresses on local interfaces. MPLS does
this cos the addresses involved are in basically the same shape as
ones used for setting up local addresses.
It is appropriate for interfaces to want RTF_MPATH on local addresses,
but in the MPLS case it means you can have multiple local things
listening on the same label, which doesn't actually work. mpe in
particular keeps track of in use labels to it can handle collisions,
however, mpw does not. It is currently possible to have multiple
mpw interfaces on the same local label, and sharing the same label
as mpe or possible normal forwarding labels.
Moving the RTF_MPATH flag out of rt_ifa_add means all the callers
that still want it need to pass it themselves. The mpe and mpw
callers are left alone without the flag, and will now get EEXIST
from rt_ifa_add when a label is already in use.
ok (and a huge amount of patience and help) mpi@
claudio@ is ok with the idea, but saw a much much earlier solution
to the problem
|
|
Benno removed code to answer ICMP queries over 4 years ago.
Aham Brahmasmi (aham.brahmasmi AT gmx.com) points out
that we still joined the group though.
OK sthen, bluhm, kn
|
|
scope check and clearing of the scope id into separate functions.
input & ok visa, mpi
|
|
OK tb
|
|
Hoist privilege check to the top and split out handling of
SIOCAIFADDR_IN6 and SIOCDIFADDR_IN6 into a separate function.
Merge tangled switches and simplify the code paths.
tested by hrvoje
ok visa
|
|
that only needs a read lock.
Tested by hrvoje
ok visa
|
|
for in_control(). Protect mrt6_ioctl() and nd6_ioctl() with a read
lock and in6_ioctl with the NET_LOCK() while establishing a single
exit point.
tested by kn
ok florian, mpi, visa
|
|
Found the hard way.
|
|
For the PRU_CONTROL bit the NET_LOCK surrounds in[6]_control() and
on the ENOTSUPP case we guard the driver if_ioctl functions.
OK mpi@
|
|
updated from userland that was marked duplicated or tentative.
Otherwise we would just lose the duplicated / tentative state and assume
that the address is now unique and usable.
OK kn
|
|
They have the same functionnality since friehm@ cleaned up
balancing code.
ok florian@, visa@, patrick@, bluhm@, jmatthew@
|
|
Instead return EOPNOTSUPP and call it from ifioctl(). This will help
getting per-driver ioctl routines outside of need the NET_LOCK().
While here always return ENXIO when ``ifp'' is NULL.
ok visa@, florian@
|
|
Also it does not change behaviour.
OK jca
|
|
ok florian@, sthen@, jsg@
|
|
if_attach() enforces it is properly defined.
|
|
ok florian@, claudio@, bluhm@
|
|
ok florian@, claudio@, visa@, bluhm@
|