path: root/sys/netinet
Age | Commit message | Author
2017-09-07 | Replace a goto found in the ipq foreach loop with a simple break. This is a common idiom when a list element has been found. OK visa@ mpi@ | Alexander Bluhm
2017-09-06 | Replace the call to ifa_ifwithaddr() in divert6_output() with a route lookup to make it MP safe. Only set the mbuf header fields that are needed. Validate the name input. Also use the same variables in IPv4 and IPv6 functions and avoid unnecessary initialization. OK mpi@ | Alexander Bluhm
2017-09-06 | Replace the call to ifa_ifwithaddr() in divert_output() with a route lookup to make it MP safe. Only set the mbuf header fields that are needed. Validate the name input. OK mpi@ | Alexander Bluhm
2017-09-05 | Replace NET_ASSERT_LOCKED() by soassertlocked() in *_usrreq(). Not all of them need the NET_LOCK(). ok bluhm@ | Martin Pieuchot
2017-09-05 | Serialize access to IP reassembly queue with a mutex. This lets ip_local(), ip_slowtimo() and ip_drain() run without KERNEL_LOCK() and NET_LOCK(). Input and OK mpi@, bluhm@ | Visa Hankala
2017-09-01 | Simplify list traversal in ip_freef(), and replace a hand-rolled list traversal with LIST_FOREACH_SAFE(). OK bluhm@, mpi@ | Visa Hankala
2017-09-01 | Change sosetopt() to no longer free the mbuf it receives and change all the callers to call m_freem(9). Support from deraadt@ and tedu@, ok visa@, bluhm@ | Martin Pieuchot
2017-08-22 | Prevent a race against ipsec_in_use. Problem reported and fix tested by Hrvoje Popovski. ok bluhm@, visa@ | Martin Pieuchot
2017-08-15 | Convert hand-rolled sockaddr checks to the nam2sin functions. In particular, in the tcp_usrreq() connect case, detect the correct address family based on the inp_flags instead of the sa_family user input. OK mpi@ | Alexander Bluhm
2017-08-11 | Remove NET_LOCK()'s argument. Tested by Hrvoje Popovski, ok bluhm@ | Martin Pieuchot
2017-08-11 | Validate sockaddr from userland in central functions. This results in common checks for unix, inet, inet6 instead of partial checks here and there. Some checks are already done at a higher layer, but better be paranoid with user input. OK claudio@ millert@ | Alexander Bluhm
2017-08-10 | icmp_mtudisc() might be called by TCP even on loopback after a retransmit timeout. Do not run path MTU discovery on local routes as we never want that on loopback. For permanent ARP or ND entries disable path MTU discovery as they use the same rt_expire field. This prevents permanent routes and entries from disappearing. bug analysis friehm@; OK mpi@ | Alexander Bluhm
2017-08-08 | fix typo in previous commit. | T.J. Townsend
2017-08-08 | Stop running nd6_expire every second. We know when pltime or vltime decrease to zero. Run nd6_expire then. Input & OK mpi, bluhm | Florian Obser
2017-08-08 | Increase the limit of the IP protocol queues from 256 to 2048 mbufs. The interface congestion algorithm kills performance at this place; with the large queues it never triggers. OK mpi@ claudio@ | Alexander Bluhm
2017-08-04 | The in_pcbhashlookup() in in_pcbconnect() enforces that the 4 tuple of src/dst ip/port is unique for TCP. But if the socket is not bound, the automatic bind by connect happens after the check. If the socket has the SO_REUSEADDR flag, in_pcbbind() may select an existing local port. Then we had two colliding TCP PCBs. This resulted in a packet storm of ACK packets on loopback. The softnet task was constantly holding the netlock and has a high priority, so the system hung. Do the in_pcbhashlookup() again after in_pcbbind(). This creates sporadic "connect: Address already in use" errors instead of a hang. bug report and testing Olivier Antoine; OK mpi@ | Alexander Bluhm
2017-08-04 | We have had SO_TIMESTAMP for some time, and other code in the kernel uses it without the #ifdef guard. OK bluhm | Florian Obser
2017-08-03 | For nearly 20 years the correct spelling of ICMP6_DST_UNREACH_NOTNEIGHBOR has been ICMP6_DST_UNREACH_BEYONDSCOPE (RFC 1885 was obsoleted). sthen grepped the ports sources to make sure nothing uses it. OK millert, jca | Florian Obser
2017-07-30 | Switch installer to Allotment Routing Table (ART). Prompted by a bug report by naddy that IPv6 autoconfiguration is broken in the installer. OK mpi, "go for it" deraadt | Florian Obser
2017-07-28 | Add an error argument to rtm_send() instead of rerolling it inside rtdeletemsg(). ok bluhm@ | Martin Pieuchot
2017-07-27 | Grab the KERNEL_LOCK() before calling sorwakeup(). In the forwarding path, pf_test() is executed w/o KERNEL_LOCK() and in case of divert ends up calling sowakeup(). However selwakeup() and csignal() are not yet ready to be executed w/o KERNEL_LOCK(). ok bluhm@ | Martin Pieuchot
2017-07-14 | kernels don't build without MROUTING because ip_var.h only sometimes introduces a forward decl for socket. turns out the affected file doesn't need ip_var.h, so remove it. then move the decl to the bottom to prevent the problem from recurring. bug report by Nick Briggs ok mpi | Ted Unangst
2017-07-12 | Get rid of ICMPV6CTL_ND6_DRLIST and ICMPV6CTL_ND6_PRLIST sysctls. With this we can also get rid of in6_prefix and in6_defrouter. They are meaningless, the kernel no longer tracks this information. Pointed out by & OK mpi | Florian Obser
2017-07-05 | Fix RAMDISK build. OK bluhm@ | Visa Hankala
2017-07-05 | The IP in IP input function strips the outer header and reinserts the inner IP packet into the internet queue. The IPv6 local delivery code has a loop to deal with header chains. The idea is to use this loop and avoid the queueing and rescheduling. The IPsec packet will be processed in a single flow. Merge the IP deliver loop from both IP versions into a single ip_deliver() function that can handle both address families. This allows processing an IP in IP header like a normal extension header. If af != AF_UNSPEC, we are already in a deliver loop and have the kernel lock. Then we can just return the next protocol. Otherwise we enqueue. The dequeue thread has the kernel lock and starts an IP delivery loop. OK mpi@ | Alexander Bluhm
2017-06-26 | Convert ip_input() to a pr_input style function. Goal is to process IPsec packets without additional enqueueing. OK mpi@ | Alexander Bluhm
2017-06-26 | Assert that the corresponding socket is locked when manipulating socket buffers. This is one step towards unlocking TCP input path. Note that all the functions asserting for the socket lock are not necessarily MP-safe. All the fields of 'struct socket' aren't protected. Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to tell when a filter needs to lock the underlying data structures. Logic and name taken from NetBSD. Tested by Hrvoje Popovski. ok claudio@, bluhm@, mikeb@ | Martin Pieuchot
2017-06-26 | Split a part of tdb_delete() into tdb_unlink() so that we can remove a TDB from the hash table without actually free()ing it. That way we can modify the TDB and then put it back in using puttdb(). ok claudio@ | Patrick Wildt
2017-06-22 | Fix the remaining ';;'s in sys/ | Tom Cosgrove
2017-06-20 | Do not use the interface pointer after if_put(). Rename ipip_input_gif() to ipip_input_if() and always pass the ifp. Only dump the packet to bpf if we are called with a gif(4) interface. OK mpi@ | Alexander Bluhm
2017-06-19 | When dealing with mbuf pointers passed down as function parameters, bugs could easily result in use-after-free or double free. Introduce m_freemp() which automatically resets the pointer before freeing it. So we have fewer dangling pointers in the kernel. OK krw@ mpi@ claudio@ | Alexander Bluhm
2017-06-19 | The IP multicast forward functions return an errno, call the variable error. Make the ip_mforward() return value consistent. Simplify the caller logic in ipv6_input() like in IPv4. OK mpi@ | Alexander Bluhm
2017-06-11 | Use a common 'goto bad' style and set mp to NULL after freeing it in ipip_input_gif(). This prevents a use-after-free if there is a bug in the IP input functions. OK mpi@ | Alexander Bluhm
2017-06-09 | Replace rtrequest(RTM_DELETE...) with rtrequest_delete() and do not even try to remove a route from the table if it is an invalid cache. This is a step towards decoupling code dealing with userland and kernel inserted routes. ok bluhm@ | Martin Pieuchot
2017-06-07 | Grab the KERNEL_LOCK() around rtm*() functions. Routing sockets globals aren't protected by the NET_LOCK(). While here change lock assertions in rt_{set,put}gwroute(), the NET_LOCK() is enough. Tested by Hrvoje Popovski. ok jmatthew@, claudio@ | Martin Pieuchot
2017-05-31 | Move IPv4 & IPv6 incoming/forwarding path, PIPEX ppp processing and IPv4 & IPv6 dispatch functions outside the KERNEL_LOCK(). We currently rely on the NET_LOCK() serializing access to most global data structures for that. IP input queues are no longer used in the forwarding case. They still exist as boundary between the network and transport layers because TCP/UDP & friends still need the KERNEL_LOCK(). Since we do not want to grab the NET_LOCK() for every packet, the softnet thread will do it once before processing a batch. That means the L2 processing path, which is currently running without lock, will now run with the NET_LOCK(). IPsec isn't ready to run without KERNEL_LOCK(), so the softnet thread will grab the KERNEL_LOCK() as soon as ``ipsec_in_use'' is set. Tested by Hrvoje Popovski. ok visa@, bluhm@, henning@ | Martin Pieuchot
2017-05-30 | add sizes to free() calls | Theo de Raadt
2017-05-30 | Carp balancing ip does not work since there is a mac filter in ether_input(). Now we use mbuf tags instead of modifying the MAC address. ok mpi@ | friehm
2017-05-30 | Introduce ipv{4,6}_input(), two wrappers around IP queues. This will help transitioning to an un-KERNEL_LOCK()ed IP forwarding path. Discussed with bluhm@, ok claudio@ | Martin Pieuchot
2017-05-29 | Per-interface lists of addresses, both multicast and unicast, are currently protected by the NET_LOCK(). They are not accessed in the hot path, so protecting them with a mutex could be an option. However since we're now going to run with a NET_LOCK() for some time, assert that it is held. IPsec is not yet ready to run without KERNEL_LOCK(), so assert it is held, even in the forwarding path. Tested by sthen@, ok visa@, claudio@, bluhm@ | Martin Pieuchot
2017-05-28 | Call bpf_mtap_af() a bit earlier in ipip_input(). This prepares upcoming diffs, no functional change. OK mpi@ | Alexander Bluhm
2017-05-28 | Leaving IP multicast group requires the NET_LOCK(). Grab the lock before calling carpdetach(). ok bluhm@ | Martin Pieuchot
2017-05-28 | clang warns on unused labels. Place a recently introduced label under ifdef IPSEC to fix the clang build when IPSEC is not defined. ok deraadt@ bluhm@ | Jonathan Gray
2017-05-28 | Rename ip_local() to ip_deliver() and give it the same parameters as the pr_input functions. Add an assert that IPv4 delivery ends in IP proto done to assure that IPv4 protocol functions work like IPv6. OK mpi@ | Alexander Bluhm
2017-05-27 | Fix the carp mode 'balancing ip-stealth'. Set the link state UP if at least one vhid is in state MASTER. from Florian Riehm; OK florian@ | Alexander Bluhm
2017-05-26 | In IPIP input rename the variable ipo to ip as it is used for inner and outer header. Reset values depending on the mbuf when the mbuf is adjusted. Check the length of the inner IP header with the correct size in case of IPv6. Check the IPv4 header size including IP options. For the IPIP statistics the inner header length has to be subtracted from the packet size as the outer header has already been stripped off. OK mpi@ | Alexander Bluhm
2017-05-26 | Instead of looking at the IP version of the header, use the outer address family passed to ipip_input(). OK mpi@ | Alexander Bluhm
2017-05-22 | Move IPsec forward and local policy check functions to ipsec_input.c and give them better names. input and OK mikeb@ | Alexander Bluhm
2017-05-22 | Use the IPsec policy check from IPv4 also when doing local delivery in ip6_local() to our IPv6 stack. OK mikeb@ | Alexander Bluhm
2017-05-22 | Fix a mbuf leak when reflecting an ICMP packet with IP options. Free the options in icmp_input_if() after a successful call to icmp_reflect(). bug report and analysis by Hendrik Gerlach OK krw@ claudio@ phessler@ | Alexander Bluhm