path: root/sys/netinet6
Age | Commit message | Author
2022-05-09 | Protect sbappendaddr() in divert_packet() with kernel lock. With | Alexander Bluhm
divert-packet rules pf calls directly from IP layer to protocol layer. As the former has only shared net lock, additional protection against parallel access is needed. Kernel lock is a temporary workaround until the socket layer is MP safe. discussed with kettenis@ mvs@
2022-05-05 | Clean up divert_packet(). Function does not return error, make it | Alexander Bluhm
void. Introduce mutex and refcounting for inp like in the other PCB functions. OK sashan@
2022-05-05 | Use static objects for struct rttimer_queue instead of dynamically | Claudio Jeker
allocating them. Currently there are 6 rttimer_queues and not many more will follow. So change rt_timer_queue_create() to rt_timer_queue_init() which now takes a struct rttimer_queue * as argument which will be initialized. Since this changes the global vars from pointer to struct, adjust other callers as well. OK bluhm@
2022-05-04 | Move rttimer callback function from the rttimer itself to rttimer_queue. | Claudio Jeker
All users use the same callback per queue so that makes sense. Also replace rt_timer_queue_destroy() with rt_timer_queue_flush(). OK bluhm@
2022-04-30 | Convert the 2nd rttimer callback from struct rttimer to u_int rtableid. | Claudio Jeker
The callback only needs to know the rtableid; all the other info from struct rttimer is not needed. Also change the default rttimer callback to only delete routes that are RTF_HOST and RTF_DYNAMIC. This way 2 of the ICMP handlers can use NULL as the callback. OK bluhm@
2022-04-28 | In the multicast router code don't allocate a rt timer queue for each | Claudio Jeker
rdomain. The rttimer API is rtable/rdomain aware and so there is no need to have so many queues. Also init the two queues (one for IPv4 and one for IPv6) early on. This will allow the rttable code to become simpler. OK bluhm@
2022-04-28 | Decouple IP input and forwarding from protocol input. This allows | Alexander Bluhm
to have parallel IP processing while the upper layers are still not MP safe. Introduce ip_ours() that enqueues the packets and ipintr() that dequeues and processes them with an exclusive netlock. Note that we still have only one softnet task. Running IP processing on multiple CPUs will be the next step. lots of testing Hrvoje Popovski; OK sashan@
2022-04-20 | Route timeout was a mixture of int, u_int and long. Use type int | Alexander Bluhm
for timeout, add sysctl bounds checking between 0 and max int, and use time_t for absolute times. Some code assumes that the route timeout queue can be NULL and at some places this was checked. Better make sure that all queues always exist. The pool_get for struct rttimer_queue is only called from initialization and from syscall, so PR_WAITOK is possible. Keep the special hack when ip_mtudisc is set to 0. Destroy the queue and generate an empty one. If redirect timeout is 0, it should not time out. Check the value in IPv6 to make the behavior like IPv4. Sysctl net.inet6.icmp6.redirtimeout had no effect as the queue timeout was not modified. Make icmp6_sysctl() look like icmp_sysctl(). OK claudio@
2022-04-14 | Relax address availability check for multicast binds. | Claudio Jeker
While it makes sense to limit bind(2) of unicast addresses that overlap each other to be all from the same UID (like 0.0.0.0:53 and 127.0.0.1:53) it makes little sense for multicast. Multicast is delivered to all sockets that match so there is no risk of someone stealing traffic from someone else. This should hopefully help with mDNS as reported by robert@ OK deraadt@ bluhm@
2022-03-23 | For raw IPv6 packets rip6_input() traverses the loop of all PCBs. | Alexander Bluhm
From there it calls sbappendaddr() while holding the raw6 table mutex. This ends in sorwakeup() where we finally grab the kernel lock while holding a mutex. Witness detects this misuse. Use the same solution as for PCB notify. Collect the affected PCBs in a temporary list. The list is protected by exclusive net lock. Reported-by: syzbot+5b2679ee9be0895d26f9@syzkaller.appspotmail.com OK claudio@
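The collect-then-deliver approach in this commit message can be sketched in userland C. This is a minimal illustration under stated assumptions, not the kernel code: `struct pcb`, `table_insert()`, `input()` and the flag-based toy lock are all hypothetical names, and the sketch is single-threaded, so it leaves out the exclusive net lock that protects the temporary list in the commit.

```c
#include <assert.h>
#include <stddef.h>

/* Toy lock (assumption): a flag that only records whether it is held. */
static int table_locked;

static void
table_enter(void)
{
	assert(!table_locked);
	table_locked = 1;
}

static void
table_leave(void)
{
	assert(table_locked);
	table_locked = 0;
}

/* Hypothetical stand-in for a raw IPv6 PCB. */
struct pcb {
	int		 proto;		/* protocol this PCB wants */
	int		 delivered;	/* packets delivered so far */
	struct pcb	*next;		/* table link, guarded by the toy lock */
	struct pcb	*notify_next;	/* link in the temporary list */
};

static struct pcb *table_head;

void
table_insert(struct pcb *p)
{
	table_enter();
	p->next = table_head;
	table_head = p;
	table_leave();
}

/* Stands in for the delivery path that must not run under the mutex. */
static void
deliver(struct pcb *p)
{
	assert(!table_locked);
	p->delivered++;
}

/* Returns the number of PCBs the packet was delivered to. */
int
input(int proto)
{
	struct pcb *p, *list = NULL;
	int n = 0;

	/* Phase 1: walk the table under the lock, only collecting matches. */
	table_enter();
	for (p = table_head; p != NULL; p = p->next) {
		if (p->proto == proto) {
			p->notify_next = list;
			list = p;
		}
	}
	table_leave();

	/* Phase 2: deliver after the lock has been dropped. */
	for (p = list; p != NULL; p = p->notify_next) {
		deliver(p);
		n++;
	}
	return n;
}
```

The assertion in `deliver()` encodes the rule that witness enforces in the kernel: the delivery path is never entered while the table lock is held.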
2022-03-22 | Extract the type from the ICMP6 header before looping over Raw IPv6 | Alexander Bluhm
PCBs. This makes mutex and error handling easier. OK claudio@
2022-03-21 | Header netinet/in_pcb.h includes sys/mutex.h now. Recommit mutex | Alexander Bluhm
for PCB tables. It does not break userland build anymore. pf_socket_lookup() calls in_pcbhashlookup() in the PCB layer. To run pf in parallel, make parts of the stack MP safe. Protect the list and hashes in the PCB tables with a mutex. Note that the protocol notify functions may call pf via tcp_output(). As the pf lock is a sleeping rw_lock, we must not hold a mutex. To solve this for now, collect these PCBs in inp_notify list and protect it with exclusive netlock. OK sashan@
2022-03-14 | Unbreak the tree, revert commitid aZ8fm4iaUnTCc0ul | Theo Buehler
This reverts the commit protecting the list and hashes in the PCB tables with a mutex since the build of sysctl(8) breaks, as found by kettenis. ok sthen
2022-03-14 | pf_socket_lookup() calls in_pcbhashlookup() in the PCB layer. To | Alexander Bluhm
run pf in parallel, make parts of the stack MP safe. Protect the list and hashes in the PCB tables with a mutex. Note that the protocol notify functions may call pf via tcp_output(). As the pf lock is a sleeping rw_lock, we must not hold a mutex. To solve this for now, collect these PCBs in inp_notify list and protect it with exclusive netlock. OK sashan@
2022-03-02 | The return value of in6_pcbnotify() is never used. Make it a void | Alexander Bluhm
function. OK gnezdo@ mvs@ florian@ sashan@
2022-02-25 | Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com | Philip Guenther
Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
2022-02-25 | Move pr_attach and pr_detach to a new structure pr_usrreqs that can | Philip Guenther
then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this. Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts. ok mvs@ bluhm@
2022-02-25 | in6_ioctl() is declared in in6_var.h as it's used in if_umb.c, so | Philip Guenther
don't declare it again in the .c file ok dlg@ mvs@ bluhm@
2022-02-22 | Delete unnecessary #includes of <netinet6/ip6protosw.h>: some never | Philip Guenther
needed it and some no longer need it after moving the externs from there to <sys/protosw.h> ok jsg@
2022-02-22 | Delete unnecessary #includes of <sys/domain.h> and/or <sys/protosw.h> | Philip Guenther
net/if_pppx.c pointed out by jsg@ ok gnezdo@ deraadt@ jsg@ mpi@ millert@
2022-02-22 | Move declarations of ip6_protox[] and inet6sw[] to <sys/protosw.h> | Philip Guenther
where the IPv4 versions have been forever ok gnezdo@ deraadt@ jsg@ mpi@ millert@
2022-02-21 | futther -> further | Jonathan Gray
2022-02-07 | Checking ifaddr pointer for NULL without checking in6_ifaddr works | Alexander Bluhm
as ifaddr ia_ifa is the first field of in6_ifaddr. So the pointers are the same, and one NULL check works for both. But in ISO C NULL has some kind of type and this is undefined behavior. So add a second NULL check that the compiler can optimize away. The resulting assembler is the same. found by kubsan; OK tobhe@
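The pointer relationship behind this commit can be sketched as follows. The two structures are simplified stand-ins (the real kernel definitions have many more fields), and `in6_to_ifaddr()` is a hypothetical helper written for illustration, not a function from the tree.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption). */
struct ifaddr {
	int	ifa_flags;
};

struct in6_ifaddr {
	struct ifaddr	ia_ifa;		/* first member, so offset 0 */
	int		ia6_flags;
};

/* The single NULL check only appeared to work because of this layout. */
_Static_assert(offsetof(struct in6_ifaddr, ia_ifa) == 0,
    "ia_ifa must be the first member");

/* Hypothetical helper: convert without touching a NULL outer pointer. */
struct ifaddr *
in6_to_ifaddr(struct in6_ifaddr *ia6)
{
	struct ifaddr *ifa;

	/* Check the outer pointer before forming &ia6->ia_ifa. */
	if (ia6 == NULL)
		return NULL;
	ifa = &ia6->ia_ifa;
	/* Both pointers refer to the same address. */
	assert((void *)ifa == (void *)ia6);
	return ifa;
}
```

Because the member sits at offset 0, a compiler can fold the explicit check into the caller's existing one, which matches the commit's note that the generated assembler did not change.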
2022-01-04 | Add `ipsec_flows_mtx' mutex(9) to protect `ipsp_ids_*' list and | YASUOKA Masahiko
trees. ipsp_ids_lookup() returns `ids' with bumped reference counter. original diff from mvs ok mvs
2022-01-02 | spelling | Jonathan Gray
ok jmc@ reads ok tb@
2021-12-25 | For a long time ip_ours() and ip6_ours() have been calling ip_deliver() | Alexander Bluhm
without kernel lock. Unlock the two callers in ip6_input_if() that have been forgotten. OK mvs@ kn@
2021-12-23 | IPsec is not MP safe yet. To allow forwarding in parallel without | Alexander Bluhm
dirty hacks, it is better to protect IPsec input and output with kernel lock. Not much is lost as crypto needs the kernel lock anyway. From here we can refine the lock later. Note that there is no kernel lock in the SPD lookup path. Goal is to keep that lock free to allow fast forwarding with non IPsec traffic. tested by Hrvoje Popovski; OK tobhe@
2021-12-20 | Use per-CPU counters for tunnel descriptor block (TDB) statistics. | Vitaliy Makkoveev
'tdb_data' struct became unused and was removed. Tested by Hrvoje Popovski. ok bluhm@
2021-12-15 | structure pads can leak uninitialized memory to userland via copyout, | Theo de Raadt
therefore the mandatory idiom is completely clearing structs before building them for copyout -- that means ALMOST ALL STRUCTS, because we never know when some architecture will pad a struct. In two more cases, the clearing wasn't performed. from Reno Robert ZDI ok millert bluhm
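The clearing idiom can be sketched in userland C. Everything named here is an assumption made for illustration: `struct stats_reply`, `build_reply()` and `export_reply()` are hypothetical, and `memcpy()` merely stands in for copyout(9), which does not exist outside the kernel.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical reply structure; most ABIs pad between the two members. */
struct stats_reply {
	char	kind;
	long	value;
};

void
build_reply(struct stats_reply *out, char kind, long value)
{
	assert(out != NULL);
	/* Zero the whole object first, padding bytes included. */
	memset(out, 0, sizeof(*out));
	out->kind = kind;
	out->value = value;
}

/* memcpy() stands in for copyout(9) in this userland sketch. */
void
export_reply(void *dst, char kind, long value)
{
	struct stats_reply r;	/* starts out holding stack garbage */

	build_reply(&r, kind, value);
	memcpy(dst, &r, sizeof(r));
}
```

Without the `memset()`, the bytes between `kind` and `value` would carry whatever was previously on the stack, and copying the full structure would hand them to the receiver.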
2021-12-13 | nd6_dad_ns_input() could trigger a NULL deref in nd6_dad_duplicated(). | Alexander Bluhm
It checks dp in two of three places. One check got lost in revision 1.83. Do a dp == NULL once at the beginning. OK jsg@ Reported-by: syzbot+88c0ce914a0b10b7e1c8@syzkaller.appspotmail.com
2021-12-03 | Add TDB reference counting to ipsp_spd_lookup(). If an output | Alexander Bluhm
pointer is passed to the function, it will return a refcounted TDB. The ref happens when ipsp_spd_inp() copies the pointer from ipo->ipo_tdb. The caller of ipsp_spd_lookup() has to unref after using it. tested by Hrvoje Popovski; OK mvs@ tobhe@
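The calling convention described here (an error code as the return value, plus an optional output pointer that receives a referenced object) can be sketched as below. This is not the signature of ipsp_spd_lookup(); `struct tdb` is reduced to a bare counter, and `spd_lookup()`, `tdb_ref()` and `tdb_unref()` are hypothetical names chosen for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for a TDB, reduced to its reference counter (assumption). */
struct tdb {
	int	refcnt;
};

struct tdb *
tdb_ref(struct tdb *t)
{
	t->refcnt++;
	return t;
}

void
tdb_unref(struct tdb *t)
{
	assert(t->refcnt > 0);
	t->refcnt--;
}

/*
 * Hypothetical lookup: returns 0 or an errno value.  If tdbout is not
 * NULL, it receives a referenced TDB on success and NULL on failure.
 */
int
spd_lookup(struct tdb *policy_tdb, struct tdb **tdbout)
{
	if (tdbout != NULL)
		*tdbout = NULL;
	if (policy_tdb == NULL)
		return ENOENT;
	/* Take the reference only when the caller asked for the TDB. */
	if (tdbout != NULL)
		*tdbout = tdb_ref(policy_tdb);
	return 0;
}
```

A caller that only needs the error passes NULL and owns nothing afterwards; a caller that passes a pointer must call `tdb_unref()` once it is done with the result.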
2021-12-01 | Let ipsp_spd_lookup() return an error instead of a TDB. The TDB | Alexander Bluhm
is not always needed, but the error value is necessary for the caller. As TDB should be refcounted, it makes no sense to always return it. Pass an output pointer for the TDB which can be NULL. OK mvs@ tobhe@
2021-11-24 | When sending ICMP packets for IPsec path MTU discovery, the first | Alexander Bluhm
ICMP packet could be wrong. The mtu was taken from the loopback interface as the tdb mtu was copied to the route too late. Without crypto task, ipsp_process_packet() returns the EMSGSIZE error earlier. Immediately update tdb and route mtu. IPv4 part from markus@; OK tobhe@
2021-11-22 | Copy code from ip_forward() to ip6_forward() to fix Path MTU discovery | Alexander Bluhm
in IPsec IPv6 tunnel. Implement sending ICMP6 packet too big messages. Also implement the pf error case in ip6_forward(). While there, do some cleanup and make the IPv4 and IPv6 code look similar. OK tobhe@
2021-11-07 | net.inet6.icmp6.nd6_debug doesn't need to warn about RDNSS/DNSSL options | Stuart Henderson
ok phessler@
2021-10-24 | Remove code duplication by merging the v4 and v6 input functions | Alexander Bluhm
for ah, esp, and ipcomp. Move common code into ipsec_protoff() which finds the offset of the next protocol field in the previous header. OK tobhe@
2021-10-14 | ip6_output_ipsec_send() may change the route embedded in struct ro | Alexander Bluhm
during path MTU discovery. ip6_forward() has to update its rt variable to the new route in ro. Otherwise it could operate on a freed route. from markus@
2021-07-27 | Revert "Use per-CPU counters for tunnel descriptor block" diff. | mvs
Panic reported by Hrvoje Popovski.
2021-07-26 | Use per-CPU counters for tunnel descriptor block (tdb) statistics. | mvs
'tdb_data' struct became unused and was removed. ok bluhm@
2021-07-26 | The mbuf header cleanup in revision 1.173 of ip_icmp.c was too | Alexander Bluhm
strict. ICMP error packets generated by pf were not passed immediately, but could be blocked. Preserve PF_TAG_GENERATED flag in icmp_reflect() and icmp6_reflect(). reported by sf@; OK patrick@ kn@
2021-07-08 | Debug printfs in encdebug were inconsistent, some missing newlines | Alexander Bluhm
produced ugly output. Move the function name and the newline into the DPRINTF macro. This simplifies the debug statements. OK tobhe@
2021-06-03 | ip6_input_if used the ip6_hdr pointer uninitted after i refactored it. | David Gwynne
i did test this, but i guess i was lucky. very lucky. Coverity CID 1505114
2021-06-02 | don't init a pointer just to immediately set it again. | David Gwynne
this is in ip6_input_if just before ipv6_check returns the pointer we end up using. pointed out by bluhm@
2021-06-02 | factor out the code that does sanity checks on ipv6 headers and addresses. | David Gwynne
this will allow these checks to be reused for ip packet inspection in bridge, veb, and tpmr. ok bluhm@ sashan@
2021-05-25 | As network features are not added dynamically, the domain structures | Alexander Bluhm
are constant. Having more const makes MP review easier. More pointers are mapped read-only in the kernel image. OK deraadt@ mvs@
2021-05-17 | Stop setting IPV6_MINMTU in ip6_send() which is used by the ICMP code. | Claudio Jeker
Because of this large ping packets were fragmented even if the MTU did not indicate the need for it. This causes some trouble when systems do not expect to receive a fragmented answer from a system. One such case is the automated link test from Google routers before allowing to establish a BGP peering session with them. In general PMTU problems should be an issue from the past and if not it may be better to also break on ping packets and not only for UDP and TCP. ICMP ping is normally the first tool in the admin's toolbox to figure out network issues. OK phessler@ florian@ bluhm@
2021-05-12 | Use local copy of `ps_rtableid' in ip{,6}_ctloutput() and mark | mvs
`ps_rtableid' as atomic. This allows us to unlock setrtable(2). ok claudio@ mpi@
2021-04-30 | Rearrange the implementation of bounded sysctl. The primitive | Alexander Bluhm
functions are sysctl_int() and sysctl_rdint(). This brings us back the 4.4BSD implementation. Then sysctl_int_bounded() builds the magic for range checks on top. sysctl_bounded_arr() is a wrapper around it to support multiple variables. Introduce macros that describe the meaning of the magic boundary values. Use these macros in obvious places. input and OK gnezdo@ mvs@
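The layering described here, a plain integer primitive with the range check built on top, can be sketched in userland C. The functions below are hypothetical and do not reproduce the kernel's sysctl_int() or sysctl_int_bounded() signatures or its magic boundary values; they only show the bounded wrapper validating a temporary copy before committing it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical primitive: store *newp into *valp if a new value is given. */
int
set_int(int *valp, const int *newp)
{
	assert(valp != NULL);
	if (newp != NULL)
		*valp = *newp;
	return 0;
}

/*
 * Hypothetical bounded wrapper built on the primitive.  The update goes
 * to a temporary first, so a rejected value never reaches *valp.
 */
int
set_int_bounded(int *valp, const int *newp, int minimum, int maximum)
{
	int val = *valp;
	int error;

	error = set_int(&val, newp);
	if (error)
		return error;
	if (val < minimum || val > maximum)
		return EINVAL;
	*valp = val;
	return 0;
}
```

Keeping the check in a wrapper leaves the primitive unchanged for callers that need no bounds, which is the split the commit message describes.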
2021-03-15 | Clear AUTOCONF6TEMP flag when we detach inet6. | Florian Obser
2021-03-10 | spelling | Jonathan Gray
ok gnezdo@ semarie@ mpi@