path: root/sys/netinet
Age | Commit message | Author
2019-12-06 | Checking the IPsec policy is expensive. Check only when IPsec is used. | tobhe
ok bluhm@
2019-12-01 | Don't require a valid sa_len for a bunch of IPv4 "get" ioctls | Jeremie Courreges-Anglas
Same fix as for the IPv6 case. Fixes a regression in ports/net/openvpn spotted by landry@, ok bluhm@
2019-11-29 | Change the default security level for incoming IPsec flows from | tobhe
isakmpd and iked to REQUIRE. Filter policy violations earlier. ok sashan@ bluhm@
2019-11-28 | Although ifconfig(8) checks it already, enforce contiguous inet | Alexander Bluhm
netmask in the kernel. OK visa@
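A contiguous netmask is a run of one bits followed only by zero bits. The check can be sketched as follows; this is a minimal illustration, not the kernel's code, and the helper name `netmask_is_contiguous` and the host-byte-order argument are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when `mask` (host byte order) consists
 * of leading one bits followed only by zero bits.
 */
static bool
netmask_is_contiguous(uint32_t mask)
{
	uint32_t inv = ~mask;

	/*
	 * For a contiguous mask, the inverted value looks like 0...01...1,
	 * so adding one yields a power of two (or wraps to 0) that shares
	 * no set bit with the inverted value.
	 */
	return (inv & (inv + 1)) == 0;
}
```

With this helper, 255.255.255.0 (0xffffff00) is accepted, while 255.0.255.0 (0xff00ff00) is rejected.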
2019-11-13 | Add DoT 853 to DEFBADDYNAMICPORTS_TCP. This port will be increasingly | Theo de Raadt
unfiltered in the future, so this prevents rresvport_af(3) from randomly exposing a service intended for local visibility only. ok florian
2019-11-11 | Prevent underflows in tp->snd_wnd if the remote side ACKs more than | Alexander Bluhm
tp->snd_wnd. This can happen, for example, when the remote side responds to a window probe by ACKing the one byte it contains. from FreeBSD; via markus@; OK sashan@ tobhe@
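The underflow arises because the window is unsigned: subtracting a larger ACKed byte count wraps around to a huge value. A minimal sketch of a clamped subtraction, assuming a hypothetical helper `window_after_ack` (the actual change in the TCP input path may be structured differently):

```c
#include <stdint.h>

/*
 * Hypothetical helper: shrink the send window by the number of bytes
 * ACKed, clamping at zero instead of wrapping around.
 */
static uint32_t
window_after_ack(uint32_t snd_wnd, uint32_t acked)
{
	if (acked > snd_wnd)
		return 0;	/* e.g. a window probe byte ACKed with snd_wnd == 0 */
	return snd_wnd - acked;
}
```

For the window-probe case described above, `window_after_ack(0, 1)` gives 0 rather than 0xffffffff.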
2019-11-08 | avoid being too clever about setting/clearing ifpromisc on the parent. | David Gwynne
ifpromisc() already refcounts, so carp doesn't have to do it implicitly with the carpdev list. there's no functional change, the code just gets a bit simpler.
2019-11-08 | convert interface address change hooks to tasks and a task_list. | David Gwynne
this follows what's been done for detach and link state hooks, and makes handling of hooks generally more robust. address hooks are a bit different to detach/link state hooks in that there's only a few things that register hooks (carp, pf, vxlan), but a lot of places to run the hooks (lots of ipv4 and ipv6 address configuration). an address hook cookie was in struct pfi_kif, which is part of the pf abi. rather than break pfctl -sI, this maintains the void * used for the cookie and uses it to store a task, which is then used as intended with the new api.
2019-11-07 | Do proper kernel input validation for in_control() ioctl(2) | Alexander Bluhm
SIOCGIFADDR, SIOCGIFNETMASK, SIOCGIFDSTADDR, SIOCGIFBRDADDR, SIOCSIFADDR, SIOCSIFNETMASK, SIOCSIFDSTADDR, and SIOCSIFBRDADDR. Name in_ioctl_set_ifaddr() consistently. Use in_sa2sin() to validate inet address. Combine if_addrlist loops and add comment. Although netmask is not an inet address, length must be valid. Reported-by: syzbot+5fc6da002fc4e8d994be@syzkaller.appspotmail.com OK visa@
2019-11-07 | Avoid NULL dereference in arpinvalidate() and nd6_invalidate() by | Kenneth R Westerback
making RTM_INVALIDATE code path perform same check as RTM_DELETE does. ok mpi@
2019-11-07 | turn the linkstate hooks into a task list, like the detach hooks. | David Gwynne
this is largely mechanical, except for carp. this moves the addition of the carp link state hook after we're committed to using the new interface as a carpdev. because the add can't fail, we avoid a complicated unwind dance. also, this tweaks the carp linkstate hook so it only updates the relevant carp interface, not all of the carpdevs on the parent. hrvoje popovski has tested an early version of this diff and it's generally ok, but there's some splasserts that this diff fires that i'll fix in an upcoming diff. ok claudio@
2019-11-06 | replace the hooks used with if_detachhooks with a task list. | David Gwynne
the main semantic change is that things registering detach hooks have to allocate and set a task structure that then gets added to the list. this means if the task is allocated up front (eg, as part of carps softc or bridges port structure), it avoids the possibility that adding a hook can fail. a lot of drivers weren't checking for failure, and unwinding state in the event of failure in other parts was error prone. while doing this i discovered that the list operations have to be in a particular order, but drivers weren't doing that consistently either. this diff wraps the list ops up so you have to seriously go out of your way to screw them up. ive also sprinkled some NET_ASSERT_LOCKED around the list operations so we can make sure there's no potential for the list to be corrupted, especially while it's being run. hrvoje popovski has tested this a bit, and some issues he discovered have been fixed. ok sashan@
2019-11-04 | remove mobileip(4) | David Gwynne
no one seems to use it, and we should not encourage people to use it by having it available. it's been disabled for most of the last release and no one's asked for it in 6.6, so i'm taking that as an ok for this removal.
2019-10-25 | make whitespace in the IPPROTO defines consistent. no functional change. | David Gwynne
2019-10-25 | +#define IPPROTO_UDPLITE 136, as per RFC 3828 and the IANA allocation | David Gwynne
please don't interpret this as an intention on my part to implement UDP-Lite.
2019-10-23 | Kernel is missing proper input validation when configuring addresses. | Alexander Bluhm
Fix the SIOCAIFADDR and SIOCDIFADDR ioctl(2) by implementing in_sa2sin() to validate inet address family and address length. OK visa@
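The kind of validation described here (check the address family and the address length before treating a generic sockaddr as an inet address) can be sketched in portable C. This is not the kernel's in_sa2sin(); the function name `check_inet_sockaddr`, its signature, and the error codes chosen are assumptions for illustration. The length is passed separately because `sa_len` is not available on every platform.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical validator: accept `sa` only if it has exactly the size of
 * a struct sockaddr_in and carries the AF_INET family. On success the
 * typed pointer is stored in *sinp and 0 is returned.
 */
static int
check_inet_sockaddr(const struct sockaddr *sa, size_t len,
    const struct sockaddr_in **sinp)
{
	if (sa == NULL || len != sizeof(struct sockaddr_in))
		return EINVAL;		/* wrong or missing length */
	if (sa->sa_family != AF_INET)
		return EAFNOSUPPORT;	/* not an inet address */
	*sinp = (const struct sockaddr_in *)sa;
	return 0;
}
```

Doing both checks in one place means each ioctl handler can reject malformed input before it touches any address field.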
2019-10-17 | in6_setsockaddr and in6_setpeeraddr can't fail, so let them return void. | David Gwynne
this also brings them in line with the AF_INET equivalents. ok visa@ bluhm@
2019-10-16 | tsleep(9) -> tsleep_nsec(9) | Martin Pieuchot
ok cheloha@, visa@
2019-10-07 | ip_ether.c is empty, and now unlinked from the build. | David Gwynne
ok jca@ deraadt@ claudio@ visa@
2019-10-04 | gif shouldn't include netinet/ip_ether.h, cos gif doesn't do etherip. | David Gwynne
ip_ether.h is where netinet/ip_ipip.h got the forward declaration for struct tdb from though, so fix that before cutting ip_ether.h out of gif.
2019-10-04 | get rid of prototypes for mplsip_input and mplsip_output. they don't exist. | David Gwynne
2019-09-30 | remove the "copy function" argument to bpf_mtap_hdr. | David Gwynne
it was previously (ab)used by pflog, which has since been fixed. apart from that nothing else used it, so we can trim the cruft. ok kn@ claudio@ visa@. visa@ also made sure i fixed ipw(4) so i386 won't break.
2019-09-02 | Fix a route use after free in multicast route. Move the rt_mcast_del() | Alexander Bluhm
out of the rtable_walk(). This avoids recursion to prevent stack overflow. Also it allows freeing the route outside of the walk. Now mrt_mcast_del() frees the route only when it is deleted from the routing table. If that fails, it must not be freed. After the route is returned by mfc_find(), it is reference counted. Then we need a rtfree(), but not in the other cases. Move rt_timer_remove_all() into rt_mcast_del(). OK mpi@
2019-08-06 | When we needed the kernel lock for local IP packet delivery, mpi@ | Alexander Bluhm
introduced a queue to grab the lock for multiple packets. Now we have only netlock for both IP and protocol input. So the queue is not necessary anymore. It just switches CPU and decreases performance. So remove the inet and inet6 ip queue for local packets. To get TCP running on loopback, we have to queue once between TCP input and output of the two sockets. So use the loopback queue in looutput() unconditionally. OK visa@
2019-07-25 | Add IFXF_AUTOCONF4 to if_xflags to match IFXF_AUTOCONF6. Let | Kenneth R Westerback
ifconfig set/unset it. ok deraadt@ kmos@
2019-07-17 | Introduce ETHER_IS_BROADCAST/ANYADDR/EQ() and use them where appropriate. | Martin Pieuchot
ok dlg@, sthen@, millert@
2019-07-15 | Initialize struct inpcb pool not on demand, but during initialization. | Alexander Bluhm
Removes a global variable and avoids MP problems. OK mpi@ visa@
2019-07-12 | Count the number of TCP SACK options that were dropped due to the | Alexander Bluhm
sack hole list length or pool limit. OK claudio@
2019-07-10 | Received SACK options are managed by a linked list at the TCP socket. | Alexander Bluhm
There is a global tunable limit net.inet.tcp.sackholelimit, default is 32768. If an attacker manages to attach all these sack holes to a few TCP connections, the lists may grow long. Traversing them might cause higher CPU consumption on the victim machine. In practice such a situation is hard to create as the TCP retransmit and 2*msl timer flush the list periodically. For additional protection, enforce a per connection limit of 128 SACK holes in the list. reported by Reuven Plevinsky and Tal Vainshtein discussed with claudio@ and procter@; OK deraadt@
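The two-level limit described above (a global pool limit plus a per-connection cap of 128 holes) can be sketched as a small accounting routine. The structure, variable, and function names below are hypothetical and only illustrate the idea; the values 128 and 32768 come from the commit message.

```c
#include <stdbool.h>

#define SACK_HOLES_PER_CONN	128	/* per-connection cap from the commit */

/* Hypothetical per-connection state: number of holes currently listed. */
struct conn_sack {
	int nholes;
};

static int global_holes;		/* holes in use across all connections */
static int global_limit = 32768;	/* stands in for net.inet.tcp.sackholelimit */

/*
 * Hypothetical helper: account for one more SACK hole on connection `c`.
 * Returns false, and changes nothing, when either limit is reached.
 */
static bool
sack_hole_reserve(struct conn_sack *c)
{
	if (c->nholes >= SACK_HOLES_PER_CONN || global_holes >= global_limit)
		return false;
	c->nholes++;
	global_holes++;
	return true;
}
```

With the per-connection cap in place, a single connection can no longer build a list longer than 128 entries, which bounds the cost of each traversal.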
2019-07-08 | free(9) sizes for M_RTABLE. | Martin Pieuchot
ok kn@
2019-07-05 | add ac_trunkport to arpcom so trunks can coordinate owning an interface | David Gwynne
Ethernet interfaces can be used by trunk(4), and i'm about to commit a new aggr(4) driver which should not be able to use an interface while trunk owns it and vice versa.
2019-06-21 | Prevent recursions by not deleting entries inside rtable_walk(9). | Martin Pieuchot
rtable_walk(9) now passes a routing entry back to the caller when a non zero value is returned and if it asked for it. This allows us to call rtdeletemsg()/rtrequest_delete() from the caller without creating a recursion because of rtflushclone(). Multicast code hasn't been adapted and is still possibly creating recursions. However multicast route entries aren't cloned so if a recursion exists it isn't because of rtflushclone(). Fix stack exhaustion triggered by the use of "-msave-args". Issue reported by Dániel Lévai on bugs@ confirmed by and ok bluhm@.
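The pattern described here (the walk hands the matching entry back and the caller performs the deletion) can be illustrated with a plain singly linked list. This is only a sketch of the control flow; the types and the functions `table_walk`, `match_key`, `unlink_entry`, and `delete_matching` are hypothetical and are not the rtable_walk(9) API.

```c
#include <stddef.h>
#include <stdlib.h>

struct entry {
	struct entry *next;
	int key;
};

/*
 * Walk the list. When `fn` returns non-zero, stop and pass the current
 * entry back through `found` (if the caller asked for it).
 */
static int
table_walk(struct entry *head, int (*fn)(struct entry *, void *), void *arg,
    struct entry **found)
{
	struct entry *e;
	int rv;

	for (e = head; e != NULL; e = e->next) {
		rv = fn(e, arg);
		if (rv != 0) {
			if (found != NULL)
				*found = e;
			return rv;
		}
	}
	return 0;
}

/* Walk callback: only reports a match, never modifies the list. */
static int
match_key(struct entry *e, void *arg)
{
	return e->key == *(int *)arg;
}

/* Remove `victim` from the list without freeing it. */
static void
unlink_entry(struct entry **headp, struct entry *victim)
{
	struct entry **pp;

	for (pp = headp; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			return;
		}
	}
}

/* Delete every entry with `key`; deletion happens outside the walk. */
static int
delete_matching(struct entry **headp, int key)
{
	struct entry *e;
	int n = 0;

	while (table_walk(*headp, match_key, &key, &e) != 0) {
		unlink_entry(headp, e);
		free(e);
		n++;
	}
	return n;
}
```

Because the callback never deletes anything, the walk cannot re-enter itself; the cost is that the caller restarts the walk after each deletion.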
2019-06-13 | In arp_rtrequest and nd6_rtrequest return early if the RTF_MPLS flag is | Claudio Jeker
set. These mpls routes use the rt_llinfo structure to store the MPLS label and would confuse the arp and nd6 code. OK bluhm@ anton@ Reported-by: syzbot+927e93a362f3ae33dd9c@syzkaller.appspotmail.com
2019-06-13 | Copy the user provided sockaddr into a normalized sockaddr in rtrequest() | Claudio Jeker
before adding it to the routing table. The rtable code is doing memcmp() of those rt_dest sockaddrs so it is important that they are stored in a canonical form. To do this struct domain is extended to include the sockaddr size for this address family. OK bluhm@ anton@ Reported-by: syzbot+10fe9cd8d0211c562ead@syzkaller.appspotmail.com
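Why a canonical form matters for memcmp() can be shown with two sockaddrs that hold the same address but differ in bytes the lookup should ignore. The sketch below is an assumption-laden illustration for AF_INET only; `normalize_sin` is a hypothetical helper, and it leaves out the BSD `sin_len` field so that it also compiles on systems without it.

```c
#include <string.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: build a canonical copy of `src` in `dst`. Every
 * byte that is not part of the key (port, padding) is zeroed so that two
 * equal addresses compare equal with memcmp().
 */
static void
normalize_sin(struct sockaddr_in *dst, const struct sockaddr_in *src)
{
	memset(dst, 0, sizeof(*dst));
	dst->sin_family = AF_INET;
	dst->sin_addr = src->sin_addr;
}
```

Two user-supplied structures carrying the same address but different garbage in `sin_zero` differ under memcmp() before normalization and compare equal afterwards.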
2019-06-10 | use m_microtime instead of microtime for SO_TIMESTAMP socketopt handling | David Gwynne
drivers can set ph_timestamp when packets are received by the hardware, which should be more accurate and cheaper than getting the clock when the packet is queued on the socket.
2019-06-10 | Use mallocarray(9) & put some free(9) sizes for M_IPMOPTS allocations. | Martin Pieuchot
ok semarie@, visa@
2019-06-04 | Add missing NULL check for the protocol control block (pcb) pointer in | anton
mrt{6,}_ioctl. Calling shutdown(2) on the socket prior to the ioctl command can cause it to be NULL. ok bluhm@ claudio@ Reported-by: syzbot+bdc489ecb509995a21ed@syzkaller.appspotmail.com Reported-by: syzbot+156405fdea9f2ab15d40@syzkaller.appspotmail.com
2019-05-11 | unbreak the build without IPSEC. | Sebastian Benoit
ok claudio@ deraadt@
2019-04-28 | Removes the KERNEL_LOCK() from bridge(4)'s output fast-path. | Martin Pieuchot
This redefines the ifp <-> bridge relationship. No lock can be currently used across the multiple contexts where the bridge has tentacles to protect a pointer, use an interface index. Tested by various, ok dlg@, visa@
2019-04-23 | a first cut at converting some virtual ethernet interfaces to if_vinput | David Gwynne
this lets input processing bypass ifiqs. there's a performance benefit from this, and it will let me tweak the backpressure detection mechanism that ifiqs use without impacting on a stack of virtual interfaces. ive tested all of these except mpw, which i will end up testing soon anyway.
2019-04-22 | In in_cksum() and in6_cksum() convert types to C99 style and make | Alexander Bluhm
both functions consistent. In in_cksum() panic if len is longer than mbuf, but in in6_cksum() do not panic if off and len match exactly to the end of mbuf. OK claudio@
2019-04-05 | In debug mode print TCP flag names to console correctly. | Alexander Bluhm
from Mitchell Krome
2019-02-13 | change rt_ifa_add and rt_ifa_del so they take an rdomain argument. | David Gwynne
this allows mpls interfaces (mpe, mpw) to pass the rdomain they wish the local label to be in, rather than have it implicitly forced to 0 by these functions. right now they'll pass 0, but it will soon be possible to have them rx packets in other rdomains. previously the functions used ifp->if_rdomain for the rdomain. everything other than mpls still passes ifp->if_rdomain. ok mpi@
2019-02-10 | remove the implicit RTF_MPATH flag that rt_ifa_add() sets on new routes. | David Gwynne
MPLS interfaces (ab)use rt_ifa_add for adding the local MPLS label that they listen on for incoming packets, while every other use of rt_ifa_add is for adding addresses on local interfaces. MPLS does this cos the addresses involved are in basically the same shape as ones used for setting up local addresses. It is appropriate for interfaces to want RTF_MPATH on local addresses, but in the MPLS case it means you can have multiple local things listening on the same label, which doesn't actually work. mpe in particular keeps track of in use labels so it can handle collisions, however, mpw does not. It is currently possible to have multiple mpw interfaces on the same local label, and sharing the same label as mpe or possibly normal forwarding labels. Moving the RTF_MPATH flag out of rt_ifa_add means all the callers that still want it need to pass it themselves. The mpe and mpw callers are left alone without the flag, and will now get EEXIST from rt_ifa_add when a label is already in use. ok (and a huge amount of patience and help) mpi@ claudio@ is ok with the idea, but saw a much much earlier solution to the problem
2019-02-06 | Fix a possible mbuf leak in tcp_usrreq(). Make the error handling | Alexander Bluhm
more consistent to the other protocols' usrreq functions. OK visa@ claudio@
2019-02-04 | Avoid an mbuf double free in the oob soreceive() path. In the | Alexander Bluhm
usrreq functions move the mbuf m_freem() logic to the release block instead of distributing it over the switch statement. Then the goto release in the initial check, whether the pcb still exists, will not free the mbuf for the PRU_RCVD, PRU_RCVOOB, PRU_SENSE command. OK claudio@ mpi@ visa@ Reported-by: syzbot+8e7997d4036ae523c79c@syzkaller.appspotmail.com
2019-01-20 | Refresh arp entries that are about to expire. Once their life time is less | Claudio Jeker
than 1/8 of net.inet.ip.arptimeout the system will send out an arp request about every 30 seconds until either the entry is updated or expired. Not refreshing arp entries will result in packet drop every time an entry expires which is not ideal for important gateway entries. Came up with this after a discussion with deraadt@. OK benno@ deraadt@
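The refresh threshold is simple arithmetic: an entry becomes a refresh candidate once its remaining lifetime drops below one eighth of the configured timeout. A minimal sketch, assuming a hypothetical helper `arp_needs_refresh` that takes both values in seconds (the kernel's bookkeeping may differ):

```c
#include <stdbool.h>

/*
 * Hypothetical helper: true when the remaining lifetime of an entry is
 * below 1/8 of the ARP timeout, i.e. when a refresh request is due.
 */
static bool
arp_needs_refresh(long remaining, long arptimeout)
{
	return remaining < arptimeout / 8;
}
```

Taking a timeout of 1200 seconds as an example value, the threshold is 150 seconds, so an entry with 149 seconds left qualifies and one with 150 seconds left does not.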
2019-01-18 | Bring back the ip_pcbopts() refactor. Pad the option buffer and therefore | Claudio Jeker
the mbuf to the next word length as it is required by the standard. Also use the correct offset from the input mbuf. OK visa@, input & OK bluhm@
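IP header lengths are counted in 32-bit words, so an option buffer has to be padded to a multiple of 4 bytes. The rounding can be sketched as below; `ipopt_padded_len` is a hypothetical name and this is not the ip_pcbopts() code.

```c
#include <stddef.h>

/*
 * Hypothetical helper: round an option length in bytes up to the next
 * multiple of 4, the size of one 32-bit word.
 */
static size_t
ipopt_padded_len(size_t len)
{
	return (len + 3) & ~(size_t)3;
}
```

For example, 5 bytes of options need 3 bytes of padding, giving a padded length of 8.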
2019-01-18 | Revert Rev 1.351, the change is not quite right yet. | Claudio Jeker
2019-01-08 | Botched up an if conditional in the last commit. The IP length needs to | Claudio Jeker
be bigger than the IP header len to be valid. With this I can traceroute again.