path: root/sys/netinet6
Age  Commit message  Author
2022-11-10  Use local variable for consistency  Klemens Nanni
OK claudio
2022-11-07  Run the ND6 expiry timer without kernel lock  Klemens Nanni
Added in 2017 to "Reduce contention on the NET_LOCK() by moving the nd6 address expiration task to the `softnettq`". This should no longer be needed thanks to sys/net/if.c r1.652 in 2022: "Activate parallel IP forwarding. Start 4 softnet tasks. Limit the usage to the number of CPUs." Nothing in nd6_expire() or nd6_expire_timer_update() requires protection by the kernel lock. The interface list and per-interface address lists remain protected by the net lock. Tests by Hrvoje OK mpi
2022-10-17  Change pru_abort() return type to the type of void and make pru_abort()  Vitaliy Makkoveev
optional. We have no interest in the pru_abort() return value. We call it only from soabort(), which is a dummy pru_abort() wrapper and has no return value. Only the connection-oriented sockets need to implement the (*pru_abort)() handler. Such sockets are tcp(4) and unix(4) sockets, so remove the existing code for all others; it is never called. ok guenther@
2022-10-03  System calls should not fail due to temporary memory shortage in  Alexander Bluhm
malloc(9) or pool_get(9). Pass down a wait flag to pru_attach(). During syscall socket(2) it is ok to wait, this logic was missing for internet pcb. Pfkey and route sockets were already waiting. sonewconn() must not wait when called during TCP 3-way handshake. This logic has been preserved. Unix domain stream socket connect(2) can wait until the other side has created the socket to accept. OK mvs@
2022-09-13  Do soreceive() with shared netlock for raw sockets.  Vitaliy Makkoveev
ok bluhm@
2022-09-09  Clarify/typofix comments  Klemens Nanni
OpenBSD is not FreeBSD and has no stf(4) interface. No object change.
2022-09-08  Rename global ifnet TAILQ  Klemens Nanni
Naming the list like the struct itself makes for awful grepping. Call the global variable "ifnetlist" from now on. There used to be kvm(3) consumers in base picking up this symbol, but those have long been converted to other interfaces. A few potential ports users remain, same deal as sys/net/if_var.h r1.116 "Remove struct ifnet's unused if_switchport member": they get bumped. Previous users pointed out by deraadt OK bluhm
2022-09-05  Move mld6 address variables from data to stack memory to make them  Alexander Bluhm
MP safe. Due to the KAME scope address hack, the link-local all nodes and all routers IPv6 addresses cannot be const. OK benno@
2022-09-05  Use shared netlock in soreceive(). The UDP and IP divert layer  Alexander Bluhm
provide locking of the PCB. If that is possible, use shared instead of exclusive netlock in soreceive(). The PCB mutex provides a per socket lock against multiple soreceive() running in parallel. Release and regrab both locks in sosleep_nsec(). OK mvs@
2022-09-04  spelling  Jonathan Gray
2022-09-03  Move PRU_PEERADDR request to (*pru_peeraddr)().  Vitaliy Makkoveev
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case. Also remove *_usrreq() handlers. ok bluhm@
2022-09-03  Move PRU_SOCKADDR request to (*pru_sockaddr)()  Vitaliy Makkoveev
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability. The key management and route domain sockets return an EINVAL error for the PRU_SOCKADDR request, so keep this behaviour for a while instead of making the pru_sockaddr handler optional and returning EOPNOTSUPP. ok bluhm@
2022-09-02  Move PRU_CONTROL request to (*pru_control)().  Vitaliy Makkoveev
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper. Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case. ok guenther@ bluhm@
2022-09-01  Move PRU_CONNECT2 request to (*pru_connect2)().  Vitaliy Makkoveev
ok bluhm@
2022-08-31  Move PRU_SENDOOB request to (*pru_sendoob)().  Vitaliy Makkoveev
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path. Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path. ok bluhm@
2022-08-30  Refactor internet PCB lookup function. Rename in_pcbhashlookup()  Alexander Bluhm
so the public API is in_pcblookup() and in_pcblookup_listen(). For internal use introduce in_pcbhash_insert() and in_pcbhash_lookup() to avoid code duplication. Routing domain is unsigned, change the type to u_int. OK mvs@
2022-08-29  Move PRU_RCVOOB request to (*pru_rcvoob)().  Vitaliy Makkoveev
ok bluhm@
2022-08-29  Use struct refcnt for interface address reference counting.  Alexander Bluhm
There was a crash due to use after free of the ifa although it is ref counted. As ifa_refcnt was a simple integer increment, there may be a path where multiple CPUs access it concurrently. So change to struct refcnt which is MP safe and provides dt(4) leak debugging. Link level address for IPsec enc(4) and various MPLS interfaces is special. There ifa is part of struct sc. Use refcount anyway and add a panic to detect use after free. bug report stsp@; OK mvs@
2022-08-28  Move PRU_SENSE request to (*pru_sense)().  Vitaliy Makkoveev
ok bluhm@
2022-08-28  Move PRU_ABORT request to (*pru_abort)().  Vitaliy Makkoveev
We abort only the sockets which are linked to the `so_q' or `so_q0' queues of a listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT is used to destroy them on listening socket destruction. Currently all our sockets support the PRU_ABORT request, but it is actually required only for tcp(4) and unix(4) sockets, so it should be optional. However, the unneeded handlers will be removed with a separate diff, and this time PRU_ABORT requests were converted as is. Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called. ok bluhm@
2022-08-27  Move PRU_SEND request to (*pru_send)().  Vitaliy Makkoveev
The former PRU_SEND error path of gre_usrreq() had a `control' mbuf(9) leak. It was fixed in the new gre_send(). The former pfkeyv2_send() was renamed to pfkeyv2_dosend(). ok bluhm@
2022-08-26  Move PRU_RCVD request to (*pru_rcvd)().  Vitaliy Makkoveev
ok bluhm@
2022-08-22  Move PRU_SHUTDOWN request to (*pru_shutdown)().  Vitaliy Makkoveev
ok bluhm@
2022-08-22  Document that igmp_timers_are_running and mld6_timers_are_running  Alexander Bluhm
are protected by netlock. They are only used as shortcut in fast timer. Common prefix in mld6.c is mld6. OK mvs@
2022-08-22  Move PRU_DISCONNECT request to (*pru_disconnect).  Vitaliy Makkoveev
ok bluhm@
2022-08-22  Use rwlock per inpcb table to protect notify list. The notify  Alexander Bluhm
function may sleep, so holding a mutex is not possible. The same list entry and rwlock is used for UDP multicast and raw IP delivery. By adding a write lock, exclusive netlock is no longer necessary for PCB notify and UDP and raw IP input. OK mvs@
2022-08-22  Move PRU_ACCEPT request to (*pru_accept)().  Vitaliy Makkoveev
ok bluhm@
2022-08-21  Only grab netlock in igmp and mld6 fast timer when necessary. There  Alexander Bluhm
are status variables that can be used to avoid locking if timers are not running. This should reduce contention on exclusive netlock. OK kn@ mvs@
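The shortcut described in this commit can be illustrated with a minimal userland sketch. It is not the kernel code: `sketch_netlock`, `sketch_timers_are_running`, `sketch_start_timer()` and `sketch_fasttimo()` are hypothetical names, a pthread mutex stands in for the netlock, and the status variable is made a C11 atomic so that the unlocked read is well defined in portable C.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins: a mutex for the netlock, a flag for *_timers_are_running. */
static pthread_mutex_t sketch_netlock = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool sketch_timers_are_running;	/* written only under the lock */
static int sketch_pending_ticks;		/* protected by the lock */
static int sketch_lock_acquisitions;		/* counts locked fast timer runs */

/* Arm a timer that expires after the given number of fast timer runs. */
static void
sketch_start_timer(int ticks)
{
	assert(ticks > 0);
	pthread_mutex_lock(&sketch_netlock);
	sketch_pending_ticks = ticks;
	atomic_store(&sketch_timers_are_running, true);
	pthread_mutex_unlock(&sketch_netlock);
}

/* Fast timer: take the lock only if some timer is actually running. */
static void
sketch_fasttimo(void)
{
	/* Shortcut: nothing is pending, so do not touch the lock at all. */
	if (!atomic_load_explicit(&sketch_timers_are_running,
	    memory_order_relaxed))
		return;

	pthread_mutex_lock(&sketch_netlock);
	sketch_lock_acquisitions++;
	if (sketch_pending_ticks > 0 && --sketch_pending_ticks == 0)
		atomic_store(&sketch_timers_are_running, false);
	pthread_mutex_unlock(&sketch_netlock);
}
```

An idle fast timer run returns before reaching the lock, which is the contention saving the commit message refers to.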
2022-08-21  Move PRU_CONNECT request to (*pru_connect)() handler.  Vitaliy Makkoveev
ok bluhm@
2022-08-21  Move PRU_LISTEN request to (*pru_listen)() handler.  Vitaliy Makkoveev
ok bluhm@
2022-08-21  Remove ip_local() and ip6_local(). After moving the IPv4 fragment  Alexander Bluhm
reassembly and IPv6 hop-by-hop header chain processing out of ip_local() and ip6_local(), they are almost empty stubs. The check for local delivery loop in ip_ours() and ip6_ours() is sufficient. Recover mbuf offset and next protocol directly in ipintr() and ip6intr(). OK mvs@
2022-08-21  Introduce a mutex per inpcb to serialize access to socket receive  Alexander Bluhm
buffer. Later it may be used to protect more of the PCB or socket. In divert input replace the kernel lock with this mutex. OK mvs@
2022-08-20  Move PRU_BIND request to (*pru_bind)() handler.  Vitaliy Makkoveev
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers. ok bluhm@ guenther@
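The optional-handler convention described here can be sketched in plain C. This is only an illustration under simplified assumptions: the `sketch_` structures, the integer `port` argument and the two example protocols are hypothetical and do not reproduce the kernel's `struct pr_usrreqs` or its real handler signatures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sketch_socket;

struct sketch_usrreqs {
	/* Left NULL by protocols that do not support the request. */
	int (*pru_bind)(struct sketch_socket *, int);
};

struct sketch_socket {
	const struct sketch_usrreqs *so_usrreqs;
	int so_port;
};

/* Wrapper: the NULL check lives here once, not in every protocol. */
static int
sketch_pru_bind(struct sketch_socket *so, int port)
{
	assert(so != NULL && so->so_usrreqs != NULL);
	if (so->so_usrreqs->pru_bind == NULL)
		return (EOPNOTSUPP);
	return ((*so->so_usrreqs->pru_bind)(so, port));
}

/* A protocol that supports the request provides a handler. */
static int
sketch_udp_bind(struct sketch_socket *so, int port)
{
	so->so_port = port;
	return (0);
}

static const struct sketch_usrreqs sketch_udp_usrreqs = {
	.pru_bind = sketch_udp_bind,
};

/* A protocol without bind support simply leaves the pointer NULL. */
static const struct sketch_usrreqs sketch_nobind_usrreqs = {
	.pru_bind = NULL,
};
```

Keeping the check in the wrapper means a protocol without support needs no dummy handler that only returns an error.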
2022-08-15  Run IPv6 hop-by-hop options processing in parallel. The ip6_hbhchcheck()  Alexander Bluhm
code is MP safe and moves from ip6_local() to ip6_ours(). If there are any options, store the chain offset and next protocol in a mbuf tag. When dequeuing without tag, it is a regular IPv6 header. As mbuf tags degrade performance, use them only if a hop-by-hop header is present. Such packets are rare and pf drops them by default. OK mvs@
2022-08-15  Introduce 'pr_usrreqs' structure and move existing user-protocol  Vitaliy Makkoveev
handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs. Based on reverted diff from guenther@. ok bluhm@
2022-08-12  Remove differences between ip_fragment() and ip6_fragment(). They  Alexander Bluhm
do nearly the same thing, so they should look similar. OK sashan@
2022-08-12  There are some places in ip and ip6 input where operations fail due  Alexander Bluhm
to out of memory. Use a generic idropped counter for those. OK mvs@
2022-08-12  At successful return ip6_check_rh0hdr() keeps *offp unmodified.  Alexander Bluhm
The IPv6 routing header type 0 check should modify *offp only in case of an error, so that the generated icmp6 packet has the correct pointer. OK sashan@
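The "write the out-parameter only on error" contract can be sketched as follows. This is a toy model, not ip6_check_rh0hdr() itself: the `sketch_exthdr` array replaces real packet parsing, the function name and return values are hypothetical, and for simplicity the reported offset is the start of the offending header rather than a specific field inside it.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_HDR_ROUTING	43	/* IPv6 next-header value of the routing header */
#define SKETCH_HDR_DSTOPTS	60	/* IPv6 next-header value of destination options */

/* Toy description of one extension header in a chain (an assumption). */
struct sketch_exthdr {
	int type;	/* kind of this header */
	int rtype;	/* routing type, meaningful for routing headers only */
	int len;	/* header length in bytes */
};

/*
 * Returns 0 if the chain contains no type 0 routing header, -1 otherwise.
 * *offp is written only on the error path, so that on success the caller
 * still sees the offset it passed in.
 */
static int
sketch_check_rh0(const struct sketch_exthdr *hdrs, size_t n, int *offp)
{
	size_t i;
	int off;

	assert(hdrs != NULL && offp != NULL);
	off = *offp;		/* walk the chain with a private copy */
	for (i = 0; i < n; i++) {
		if (hdrs[i].type == SKETCH_HDR_ROUTING && hdrs[i].rtype == 0) {
			*offp = off;	/* report where the problem is */
			return (-1);
		}
		off += hdrs[i].len;
	}
	return (0);
}
```

Walking with a local copy makes the success path side-effect free, while the error path hands the caller the position an ICMPv6 error would need to point at.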
2022-08-09  Backout "Call getuptime() just once per function"  Klemens Nanni
This caused stuck ndp cache entries as found by naddy, sorry.
2022-08-08  If interface drivers had enabled transmit offloading of the payload  Alexander Bluhm
checksum, IPv6 fragments contained invalid checksum. For fragments the protocol checksum has to be calculated before fragmentation. Hardware cannot do this as it is too late. Do it earlier in software. tested and OK mbuhl@
2022-08-08  Constify in6_addr pointer arguments in nd6_*() functions  Klemens Nanni
All of them are passed to inspect/copy out fields, none of the functions writes to the struct. This makes it easier to argue about code (in MP context). OK bluhm
2022-08-08  Call getuptime() just once per function  Klemens Nanni
IPv6 counterpart to bluhm's sys/netinet/if_ether.c r1.249: Instead of calling getuptime() all the time in ARP code, do it only once per function. This gives a more consistent time value. OK claudio@ miod@ mvs@ OK bluhm
2022-08-08  To make protocol input functions MP safe, internet PCB need protection.  Alexander Bluhm
Use their reference counter in more places. The in_pcb lookup functions hold the PCBs in hash tables protected by table->inpt_mtx mutex. Whenever a result is returned, increment the ref count before releasing the mutex. Then the inp can be used as long as necessary. Unref it at the end of all functions that call in_pcb lookup. As a shortcut, pf may also hold a reference to the PCB. When pf_inp_lookup() returns it, it also increments the ref count and the caller can handle it like the inp from table lookup. OK sashan@
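The lookup rule in this commit (take the reference while the table mutex is still held, drop it when the caller is done) can be sketched in userland C. Everything here is an assumption for illustration: the `sketch_` names, the fixed-size array standing in for the hash table, and the pthread mutex and C11 atomic standing in for `inpt_mtx` and the kernel's reference counter.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_NPCB	4

struct sketch_pcb {
	atomic_uint refcnt;	/* the table itself holds one reference */
	unsigned short port;
	int in_use;		/* protected by the table mutex */
};

struct sketch_table {
	pthread_mutex_t mtx;
	struct sketch_pcb pcbs[SKETCH_NPCB];
};

static void
sketch_table_init(struct sketch_table *t)
{
	size_t i;

	pthread_mutex_init(&t->mtx, NULL);
	for (i = 0; i < SKETCH_NPCB; i++) {
		atomic_init(&t->pcbs[i].refcnt, 0);
		t->pcbs[i].port = 0;
		t->pcbs[i].in_use = 0;
	}
}

/* Put a PCB into a free slot; returns 0 on success, -1 if the table is full. */
static int
sketch_pcb_insert(struct sketch_table *t, unsigned short port)
{
	size_t i;
	int error = -1;

	pthread_mutex_lock(&t->mtx);
	for (i = 0; i < SKETCH_NPCB; i++) {
		if (!t->pcbs[i].in_use) {
			t->pcbs[i].port = port;
			t->pcbs[i].in_use = 1;
			atomic_store(&t->pcbs[i].refcnt, 1);
			error = 0;
			break;
		}
	}
	pthread_mutex_unlock(&t->mtx);
	return (error);
}

/* Look up a PCB; the reference is taken before the mutex is released. */
static struct sketch_pcb *
sketch_pcb_lookup(struct sketch_table *t, unsigned short port)
{
	struct sketch_pcb *found = NULL;
	size_t i;

	pthread_mutex_lock(&t->mtx);
	for (i = 0; i < SKETCH_NPCB; i++) {
		if (t->pcbs[i].in_use && t->pcbs[i].port == port) {
			found = &t->pcbs[i];
			atomic_fetch_add(&found->refcnt, 1);
			break;
		}
	}
	pthread_mutex_unlock(&t->mtx);
	return (found);
}

/* Drop a reference; returns the number of references that remain. */
static unsigned int
sketch_pcb_unref(struct sketch_pcb *pcb)
{
	unsigned int old = atomic_fetch_sub(&pcb->refcnt, 1);

	assert(old > 0);	/* catches an unref without a matching ref */
	return (old - 1);
}
```

Because the count is raised before the mutex is dropped, there is no window in which another thread could release the last reference between the lookup and the caller's first use of the returned pointer.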
2022-08-06  Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and  Alexander Bluhm
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and W are hard to see, call the new macro NET_LOCK_SHARED. Rename the opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE. Update some outdated comments about net locking. OK mpi@ mvs@
2022-07-28  Zap prototypes for nonexistent nd6_setmtu() and in6_ifdel()  Klemens Nanni
Removed in 2015 and 2002, respectively. OK claudio
2022-07-28  Zap outdated nd6_free() comment about static  Klemens Nanni
Added in 2002 r1.48 "sync with latest KAME [...]" along with the attribute, but nd6_free() became a global void function in 2017 r1.212. Afaik static kernel functions are avoided to aid ddb'ugging and I presume the "significant changes in the kernel" bits of the comment stem from something 20 years ago no longer holding true today. After all, this change has been safe for five years. OK claudio
2022-07-24  Fix assertion for write netlock in rip6_input(). ip6_input() has  Alexander Bluhm
shared net lock. ip_deliver() needs exclusive net lock. Instead of calling ip_deliver() directly, use ip6_ours() to queue the packet. Move the write lock assertion into ip_deliver() to catch such bugs earlier. The assertion was only triggered with IPv6 multicast forwarding or router alert hop by hop option. Found by regress test. OK kn@ mvs@
2022-07-22  Zap nd6_recalc_reachtm_interval indirection  Klemens Nanni
Only used once, so use the macro directly like ND6_SLOWTIMER_INTERVAL is used in many places. OK florian
2022-07-22  Leftovers from florian's RS/NA purge from the kernel in 2017.  Klemens Nanni
OK bluhm
2022-07-22  Zap dead store nd6_allocated  Klemens Nanni
There since KAME IPv6 import in 1999. OK "Pool statistics has this info already." bluhm