summaryrefslogtreecommitdiff
path: root/sys/netinet/in.c
AgeCommit message (Collapse)Author
2024-01-06Take net lock before kernel lock.Alexander Bluhm
Doing KERNEL_LOCK() just before NET_LOCK() does not make sense. Net lock is a rwlock that releases kernel lock during sleep. To avoid an unnecessary release and take kernel lock cycle, move KERNEL_LOCK() after NET_LOCK(). There is no lock order reversal deadlock issue. Both locks are used in any order thoughout the kernel. As NET_LOCK() releases the kernel lock when it cannot take the lock immediately and has to sleep, we always end in the order kernel lock before net lock after sleeping. OK sashan@
2023-06-28use refcnt API for multicast addresses, add tracepoint:refcnt:ifmaddr probeKlemens Nanni
Replace hand-rolled reference counting with refcnt_init(9) and hook it up with a new dt(4) probe. OK bluhm mvs
2023-04-24Hoist privilege checks furtherKlemens Nanni
in6.c already has the privilege check as early as possible, make in.c match. For unprivileged IPv4 ioctl calls with invalid args, this changes errno from E* to EPERM. OK bluhm
2023-04-21Drop error variable and return directly; OK mvs tbKlemens Nanni
2023-04-19move kernel lock into multicast ioctl handlers; OK mvsKlemens Nanni
2023-04-18Hoist identical privilege checks in in_ioctl*()Klemens Nanni
All cases do the same check as first step, so merge it before the switch and before grapping exclusive locks. OK mvs
2023-04-15Unlock in_ioctl_get(), push kernel lock into in_ioctl_{set,change}_ifaddr()Klemens Nanni
Just like in6_ioctl_get(), read ioctls are safe with the shared net lock to protect interface addresses and flags. OK mvs
2022-12-06Add missing kernel lock around (*if_ioctl)() call withinVitaliy Makkoveev
in{,6}_addmulti(). Since kernel lock is no more taken while following setsockopt() path, it should be taken in this place. Corresponding in{,6}_delmulti() already acquire kernel lock around (*if_ioctl)(). Problem reported and diff tested by weerd@ ok kn@ bluhm@
2022-11-19Push kernel lock into pru_control() aka. in6_control() / in_control()Klemens Nanni
so->so_state is already read without kernel lock inside soo_ioctl() which calls pru_control() aka in6_control() and in_control(). OK mvs
2022-09-08Rename global ifnet TAILQKlemens Nanni
Naming the list like the struct itself makes for awful grepping. Call the global variable "ifnetlist" from now on. There used to be kvm(3) consumers in base picking up this symbol, but those have long been converted to other interfaces. A few potential ports users remain, same deal as sys/net/if_var.h r1.116 "Remove struct ifnet's unused if_switchport member": they get bumped. Previous users pointed out by deraadt OK bluhm
2022-08-29Use struct refcnt for interface address reference counting.Alexander Bluhm
There was a crash due to use after free of the ifa although it is ref counted. As ifa_refcnt was a simple integer increment, there may be a path where multiple CPUs access it concurrently. So change to struct refcnt which is MP safe and provides dt(4) leak debugging. Link level address for IPsec enc(4) and various MPLS interfaces is special. There ifa is part of struct sc. Use refcount anyway and add a panic to detect use after free. bug report stsp@; OK mvs@
2022-08-06Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET andAlexander Bluhm
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and W are hard to see, call the new macro NET_LOCK_SHARED. Rename the opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE. Update some outdated comments about net locking. OK mpi@ mvs@
2022-05-05No longer consider IN_EXPERIMENTAL aka 240/4 as not forwardable.Claudio Jeker
We already allow 240/4 in and out so lets allow it through as well. One of many steps to make 240/4 useable. Diff by Seth David Schoen (schoen at loyalty.org) OK bluhm@ djm@
2022-03-28if_detach() does if_remove(ifp); NET_LOCK(); rti_delete(). NewAlexander Bluhm
igmp groups may join while sleeping in interface destruction. In this case if_get() in igmp_joingroup() fails and rti_fill() is not called. Then inm->inm_rti may be NULL. This is the condition when syzkaller crashes in igmp_leavegroup(). Pass the ifp the current CPU is already holding down to igmp_joingroup() and igmp_leavegroup() to avoid half constructed igmp groups. Calling if_get() in caller and callee makes no sense anyway. Reported-by: syzbot+146823a676b7bea83649@syzkaller.appspotmail.com OK denis@
2022-03-04in_addmulti() is only called from ioctl(2) or setsockopt(2). WaitAlexander Bluhm
for malloc(9) to make the system call reliable. OK mvs@
2021-03-10spellingJonathan Gray
ok gnezdo@ semarie@ mpi@
2020-05-27Document the various flavors of NET_LOCK() and rename the reader version.Martin Pieuchot
Since our last concurrency mistake only ioctl(2) ans sysctl(2) code path take the reader lock. This is mostly for documentation purpose as long as the softnet thread is converted back to use a read lock. dlg@ said that comments should be good enough. ok sashan@
2020-03-15Guard SIOCDELMULTI if_ioctl calls with KERNEL_LOCK() where the call isVisa Hankala
made from socket close path. Most device drivers are not MP-safe yet, and the closing of AF_INET and AF_INET6 sockets is no longer under the kernel lock. This fixes a panic seen by jcs@. OK mpi@
2019-12-01Don't require a valid sa_len for a bunch of IPv4 "get" ioctlsJeremie Courreges-Anglas
Same fix as for the IPv6 case. Fixes a regression in ports/net/openvpn spotted by landry@, ok bluhm@
2019-11-28Although ifconfig(8) checks it already, enforce contiguous inetAlexander Bluhm
netmask in the kernel. OK visa@
2019-11-08convert interface address change hooks to tasks and a task_list.David Gwynne
this follows what's been done for detach and link state hooks, and makes handling of hooks generally more robust. address hooks are a bit different to detach/link state hooks in that there's only a few things that register hooks (carp, pf, vxlan), but a lot of places to run the hooks (lots of ipv4 and ipv6 address configuration). an address hook cookie was in struct pfi_kif, which is part of the pf abi. rather than break pfctl -sI, this maintains the void * used for the cookie and uses it to store a task, which is then used as intended with the new api.
2019-11-07Do propper kernel input validation for in_control() ioctl(2)Alexander Bluhm
SIOCGIFADDR, SIOCGIFNETMASK, SIOCGIFDSTADDR, SIOCGIFBRDADDR, SIOCSIFADDR, SIOCSIFNETMASK, SIOCSIFDSTADDR, and SIOCSIFBRDADDR. Name in_ioctl_set_ifaddr() consistently. Use in_sa2sin() to validate inet address. Combine if_addrlist loops and add comment. Although netmask is not a inet address, length must be valid. Reported-by: syzbot+5fc6da002fc4e8d994be@syzkaller.appspotmail.com OK visa@
2019-10-23Kernel is missing propper input validation when configuring addresses.Alexander Bluhm
Fix the SIOCAIFADDR and SIOCDIFADDR ioctl(2) by implementing in_sa2sin() to validate inet address family and address length. OK visa@
2019-07-25Add IFXF_AUTOCONF4 to if_xflags to match IFXF_AUTOCONF6. LetKenneth R Westerback
ifconfig set/unset it. ok deraadt@ kmos@
2019-02-13change rt_ifa_add and rt_ifa_del so they take an rdomain argument.David Gwynne
this allows mpls interfaces (mpe, mpw) to pass the rdomain they wish the local label to be in, rather than have it implicitly forced to 0 by these functions. right now they'll pass 0, but it will soon be possible to have them rx packets in other rdomains. previously the functions used ifp->if_rdomain for the rdomain. everything other than mpls still passes ifp->if_rdomain. ok mpi@
2019-02-10remove the implict RTF_MPATH flag that rt_ifa_add() sets on new routes.David Gwynne
MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label that they listen on for incoming packets, while every other use of rt_ifa_add is for adding addresses on local interfaces. MPLS does this cos the addresses involved are in basically the same shape as ones used for setting up local addresses. It is appropriate for interfaces to want RTF_MPATH on local addresses, but in the MPLS case it means you can have multiple local things listening on the same label, which doesn't actually work. mpe in particular keeps track of in use labels to it can handle collisions, however, mpw does not. It is currently possible to have multiple mpw interfaces on the same local label, and sharing the same label as mpe or possible normal forwarding labels. Moving the RTF_MPATH flag out of rt_ifa_add means all the callers that still want it need to pass it themselves. The mpe and mpw callers are left alone without the flag, and will now get EEXIST from rt_ifa_add when a label is already in use. ok (and a huge amount of patience and help) mpi@ claudio@ is ok with the idea, but saw a much much earlier solution to the problem
2018-07-11Disambiguate the source of panics in sys/net* by adding __func__Nayden Markatchev
magic constant to panic() calls. ok benno@ henning@ tb@
2018-06-03SIOCSIFNETMASK takes a 'struct ifreq', so use 'ifr' for it and drop theTheo Buehler
now unused 'ifra' from in_ioctl(). Discussed with mpi and visa
2018-06-03Better version of the refactoring attempted in r1.154. Move SIOCSIFADDRTheo Buehler
to its own function and merge the two switches in in_ioctl_change_ifaddr(). Finally: each ioctl has its own case and privilege check. ok visa As an aside, an audit of the ports tree has shown that we should continue to support the legacy ioctls SIOCSIF{,BRD,DST}ADDR, SIOCSIFNETMASK despite the fact that they have been deprecated for the better part of two decades and FreeBSD dropped support 7 years ago. Too many ports still rely on them. Thanks to sthen and visa for their help with that.
2018-05-31zap trailing whitespaceTheo Buehler
2018-05-31Re-commit the correct half of revision 1.154:Theo Buehler
Some more code shuffling to get rid of one switch in in_ioctl(). This way there is one case for each of SIOCSIFBRDADDR, SIOCSIFDSTADDR and SIOCSIFNETMASK, starting with a privilege check before any global data is modified. ok visa
2018-05-31Back out previous. The handling of SIOCSIFADDR is now wrong.Theo Buehler
2018-05-31Some more code shuffling to get rid of one switch in each, in_ioctl()Theo Buehler
and in_ioctl_change_ifaddr(). This way there is one case per ioctl starting with a privilege check before any global data is modified. The code paths are now straightforward. Some code duplication between SIOCSIFADDR and SIOCAIFADDR, but that can be addressed later. tested by hrvoje ok visa
2018-05-28Next step in disentangling the switches in in_ioctl(). Split out theTheo Buehler
handling of SIOCAIFADDR, SIOCDIFADDR, SIOCSIFADDR into a separate function, analogously to what was done in in6_ioctl(). tested by hrvoje ok visa
2018-05-25Prepare in_ioctl() for further refactoring with the goal of merging theTheo Buehler
two big switches in this function. Error out early in the default case without grabbing the NET_LOCK() and move SIOCSIFNETMASK a bit up. This will reduce the noise in an upcoming diff. ok visa
2018-05-02Split out the handling of SIOCGIF* into a new function, in_ioctl_get(),Theo Buehler
that only needs the read lock. ok visa, mpi
2018-04-30Reduce the scope of the NET_LOCK() in in_control(). Two functions wereTheo Buehler
protected: mrt_ioctl() and in_ioctl(). The former has no other callers and only needs a read lock. The latter will need refactoring to reduce the lock's scope further. In a first step, establish a single exit point and protect most of the function body with the NET_LOCK() while removing the NET_LOCK() from a handful of callers. suggested by & ok mpi, ok visa
2018-04-24Make the ramdisks compile again, we don't have MROUTING on them.Florian Obser
Found the hard way.
2018-04-24Push NET_LOCK down in the default ifioctl case.Paul Irofti
For the PRU_CONTROL bit the NET_LOCK surrounds in[6]_control() and on the ENOTSUPP case we guard the driver if_ioctl functions. OK mpi@
2018-03-02Revert all the bits of the autocreate 127.0.0.1 on lo(4) creation for now.Claudio Jeker
This needs to go back to the drawing board.
2018-02-10Kill unneeded caddr_t cast.Claudio Jeker
2018-02-10Similar to the IPv6 case create 127.0.0.1/8 on lo(4) interfaces which actClaudio Jeker
as loopback interfaces for each rdomain (including lo0). This is done when the interface is brought up. This is now also done by default (either on attach of lo0 or when creating the rdomain). OK mpi@
2017-11-04Stop calling ifp->if_ioctl() inside in{,6}_ioctl().Martin Pieuchot
Instead return EOPNOTSUPP and call it from ifioctl(). This will help getting per-driver ioctl routines outside of need the NET_LOCK(). While here always return ENXIO when ``ifp'' is NULL. ok visa@, florian@
2017-10-24Remove some more tests checking for a non-NULL `ifp->if_ioctl'.Martin Pieuchot
if_attach() enforces it is properly defined.
2017-10-24Remove support for never used ioctls originating from KAME.Martin Pieuchot
ok florian@, claudio@, bluhm@
2017-10-20Do not test if if_ioctl is NULL, it isn't.Martin Pieuchot
ok florian@, claudio@, visa@, bluhm@
2017-08-11Validate sockaddr from userland in central functions. This resultsAlexander Bluhm
in common checks for unix, inet, inet6 instead of partial checks here and there. Some checks are already done at a higher layer, but better be paranoid with user input. OK claudio@ millert@
2017-05-29Per-interface list of addresses, both multicast and unicast, areMartin Pieuchot
currently protected by the NET_LOCK(). They are not accessed in the hot path, so protecting them with a mutex could be an option. However since we're now going to run with a NET_LOCK() for some time, assert that it is held. IPsec is not yet ready to run without KERNEL_LOCK(), so assert it is held, even in the forwarding path. Tested by sthen@, ok visa@, claudio@, bluhm@
2017-05-16Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().Martin Pieuchot
ok visa@
2017-05-04Introduce sstosa() for converting sockaddr_storage with a type safeAlexander Bluhm
inline function instead of casting it to sockaddr. While there, use inline instead of __inline for all these conversions. Some struct sockaddr casts can be avoided completely. OK dhill@ mpi@