src - OpenBSD base system

Age	Commit message	Author
2022-07-14	Use capital letters for global ipsec(4) locks description. Use 'D'	Vitaliy Makkoveev
	instead of 's' for `tdb_sadb_mtx' mutex(9) because this is 'D'atabase. No functional changes. ok bluhm@
2022-06-29	Nullify `ipsecflowinfo' when mbuf(9) has no ipsec flowinfo data.	Vitaliy Makkoveev
	Otherwise we use `ipsecflowinfo' obtained from previous packet. ok claudio@
2022-06-28	Use btrace(8) to debug reference counting. dt(4) provides a static	Alexander Bluhm
	tracepoint for each type of refcnt we have. As a start, add inpcb and tdb refcnt. When the counter changes, btrace may print the actual object, the current counter, the change value and optionally the stack trace. discussed with visa@; OK mpi@
2022-06-27	Push the kernel lock down into arpresolve(). We still need it to	Alexander Bluhm
	prevent concurrent access to rt_llinfo from rtrequest_delete(). But the common case, when the MAC address is already known, works without lock. tested by Hrvoje Popovski; OK mvs@
2022-06-27	Instead of calling getuptime() all the time in ARP code, do it only	Alexander Bluhm
	once per function. This gives a more consistent time value. OK claudio@ miod@ mvs@
2022-06-26	The "ifq_set_maxlen(..., 1);" hack we use to enforce pipex(4) related	Vitaliy Makkoveev
	(*if_qstart)() be always called with netlock held doesn't work anymore with PPPOE sessions. Introduce `pipex_list_mtx' mutex(9) and use it to protect global pipex(4) lists and radix trees. Protect pipex(4) `session' dereference with reference counters, because we could sleep when accessing pipex(4) from ioctl(2) path, and this is not possible with mutex(9) held. ok bluhm@
2022-06-17	The timeout for ipsec acquire does not decrement the reference	Alexander Bluhm
	counter to 0 properly. We have one reference count for the lists, and one for the timeout handler. When the timeout fires, it has to decrement the reference to itself. Then the ipa is removed from the lists and decremented again. from Stefan Butz; OK tobhe@ mvs@
2022-06-06	Simplify solock() and sounlock(). There is no reason to return a value	Claudio Jeker
	for the lock operation and to pass a value to the unlock operation. sofree() still needs an extra flag to know if sounlock() should be called or not. But sofree() is called less often and mostly without keeping the lock. OK mpi@ mvs@
2022-05-25	Call if_put(9) after we finish with `ia' within ip_getmoptions().	Vitaliy Makkoveev
	if_put(9) call means we finish work with `ifp' and it could be destroyed. `ia' is the pointer to 'in_ifaddr' data belonging to `ifp', so we need to release the corresponding `ifp' after we finish dealing with `ia'. `if_addrlist' list destruction and ip_getmoptions() are serialized with kernel and net locks so this is not critical, but looks inconsistent. ok bluhm@
2022-05-15	have in_pcbselsrc copy the selected address to memory provided by the caller.	David Gwynne
	having it return a pointer to something that has a lifetime managed by a lock without accounting for it or taking a reference count or anything like that is asking for trouble. copying the address to caller provided memory while still inside the lock is a lot safer. discussed with visa@ ok bluhm@ claudio@
2022-05-09	Protect sbappendaddr() in divert_packet() with kernel lock. With	Alexander Bluhm
	divert-packet rules pf calls directly from IP layer to protocol layer. As the former has only shared net lock, additional protection against parallel access is needed. Kernel lock is a temporary workaround until the socket layer is MP safe. discussed with kettenis@ mvs@
2022-05-05	Clean up divert_packet(). Function does not return error, make it	Alexander Bluhm
	void. Introduce mutex and refcounting for inp like in the other PCB functions. OK sashan@
2022-05-05	Use static objects for struct rttimer_queue instead of dynamically	Claudio Jeker
	allocating them. Currently there are 6 rttimer_queues and not many more will follow. So change rt_timer_queue_create() to rt_timer_queue_init() which now takes a struct rttimer_queue * as argument which will be initialized. Since this changes the global vars from pointer to struct, adjust other callers as well. OK bluhm@
2022-05-05	No longer consider IN_EXPERIMENTAL aka 240/4 as not forwardable.	Claudio Jeker
	We already allow 240/4 in and out so let's allow it through as well. One of many steps to make 240/4 usable. Diff by Seth David Schoen (schoen at loyalty.org) OK bluhm@ djm@
2022-05-04	Move rttimer callback function from the rttimer itself to rttimer_queue.	Claudio Jeker
	All users use the same callback per queue so that makes sense. Also replace rt_timer_queue_destroy() with rt_timer_queue_flush(). OK bluhm@
2022-05-04	In ipsp_spd_lookup() rename the parameter tdbp to tdbin as it is	Alexander Bluhm
	always the incoming TDB that has to be checked. from markus@
2022-05-03	Retire CRYPTO_F_MPSAFE it is no longer of any use. The crypto framework	Claudio Jeker
	no longer uses a callback and so there is no need to define the callback as MPSAFE. OK bluhm@
2022-04-30	When performing ipsp_ids_free(), grab `ipsec_flows_mtx' mutex(9) before do	Vitaliy Makkoveev
	`id_refcount' decrement. This should be consistent with `ipsp_ids_gc_list' list modifications, otherwise concurrent ipsp_ids_insert() could remove this dying `ids' from the list before it was placed there by ipsp_ids_free(). This makes atomic operations with `id_refcount' useless. Also prevent ipsp_ids_lookup() from returning dying `ids'. ok bluhm@
2022-04-30	Convert the 2nd rttimer callback from struct rttimer to u_int rtableid.	Claudio Jeker
	The callback only needs to know the rtableid; all the other info from struct rttimer is not needed. Also change the default rttimer callback to only delete routes that are RTF_HOST and RTF_DYNAMIC. This way 2 of the ICMP handlers can use NULL as the callback. OK bluhm@
2022-04-28	In the multicast router code don't allocate a rt timer queue for each	Claudio Jeker
	rdomain. The rttimer API is rtable/rdomain aware and so there is no need to have so many queues. Also init the two queues (one for IPv4 and one for IPv6) early on. This will allow the rttable code to become simpler. OK bluhm@
2022-04-28	Decouple IP input and forwarding from protocol input. This allows	Alexander Bluhm
	to have parallel IP processing while the upper layers are still not MP safe. Introduce ip_ours() that enqueues the packets and ipintr() that dequeues and processes them with an exclusive netlock. Note that we still have only one softnet task. Running IP processing on multiple CPUs will be the next step. lots of testing Hrvoje Popovski; OK sashan@
2022-04-21	Introduce a dedicated link entries for snapshots in pfsync(4). The purpose	Alexandr Nedvedicky
	of snapshots is to allow pfsync(4) to move items from global lists to local lists (a.k.a. snapshots) under a mutex protection. Snapshots are then processed without holding any mutexes. Such an idea does not fly well if the link entry is currently used for global lists as well as snapshots. Feedback by bluhm@ Credit also goes to hrvoje@ for extensive testing. OK bluhm@
2022-04-20	Route timeout was a mixture of int, u_int and long. Use type int	Alexander Bluhm
	for timeout, add sysctl bounds checking between 0 and max int, and use time_t for absolute times. Some code assumes that the route timeout queue can be NULL and at some places this was checked. Better make sure that all queues always exist. The pool_get for struct rttimer_queue is only called from initialization and from syscall, so PR_WAITOK is possible. Keep the special hack when ip_mtudisc is set to 0. Destroy the queue and generate an empty one. If redirect timeout is 0, it should not time out. Check the value in IPv6 to make the behavior like IPv4. Sysctl net.inet6.icmp6.redirtimeout had no effect as the queue timeout was not modified. Make icmp6_sysctl() look like icmp_sysctl(). OK claudio@
2022-04-14	Relax address availability check for multicast binds.	Claudio Jeker
	While it makes sense to limit bind(2) of unicast addresses that overlap each other to be all from the same UID (like 0.0.0.0:53 and 127.0.0.1:53) it makes little sense for multicast. Multicast is delivered to all sockets that match so there is no risk of someone stealing traffic from someone else. This should hopefully help with mDNS as reported by robert@ OK deraadt@ bluhm@
2022-03-28	if_detach() does if_remove(ifp); NET_LOCK(); rti_delete(). New	Alexander Bluhm
	igmp groups may join while sleeping in interface destruction. In this case if_get() in igmp_joingroup() fails and rti_fill() is not called. Then inm->inm_rti may be NULL. This is the condition when syzkaller crashes in igmp_leavegroup(). Pass the ifp the current CPU is already holding down to igmp_joingroup() and igmp_leavegroup() to avoid half constructed igmp groups. Calling if_get() in caller and callee makes no sense anyway. Reported-by: syzbot+146823a676b7bea83649@syzkaller.appspotmail.com OK denis@
2022-03-23	Move global variable ripsrc onto stack, it is only used once within	Alexander Bluhm
	rip_input(). from dhill@
2022-03-22	For raw IP packets rip_input() traverses the loop of all PCBs. From	Alexander Bluhm
	there it calls sbappendaddr() while holding the raw table mutex. This ends in sorwakeup() where we finally grab the kernel lock while holding a mutex. Witness detects this misuse. Use the same solution as for PCB notify. Collect the affected PCBs in a temporary list. The list is protected by exclusive net lock. syzbot+ebe3f03a472fecf5e42e@syzkaller.appspotmail.com OK claudio@
2022-03-22	Fix whitespace.	Alexander Bluhm

2022-03-21	For multicast and broadcast packets udp_input() traverses the loop	Alexander Bluhm
	of all UDP PCBs. From there it calls udp_sbappend() while holding the UDP table mutex. This ends in sorwakeup() where we finally grab the kernel lock while holding a mutex. Witness detects this misuse. Use the same solution as for PCB notify. Collect the affected PCBs in a temporary list. The list is protected by exclusive net lock. Reported-by: syzbot+7596cb96fb9f3c9d6f4f@syzkaller.appspotmail.com OK sashan@
2022-03-21	Fix whitespace. Wrap long lines. Adjust outdated comment.	Alexander Bluhm

2022-03-21	Header netinet/in_pcb.h includes sys/mutex.h now. Recommit mutex	Alexander Bluhm
	for PCB tables. It does not break userland build anymore. pf_socket_lookup() calls in_pcbhashlookup() in the PCB layer. To run pf in parallel, make parts of the stack MP safe. Protect the list and hashes in the PCB tables with a mutex. Note that the protocol notify functions may call pf via tcp_output(). As the pf lock is a sleeping rw_lock, we must not hold a mutex. To solve this for now, collect these PCBs in inp_notify list and protect it with exclusive netlock. OK sashan@
2022-03-21	call in_pcbselsrc from rip_output so route sourceaddr can take effect.	David Gwynne
	previously things that used sendto or similar with raw sockets would ignore any configured sourceaddr. this made it inconsistent with other traffic, which in turn makes things confusing to debug if you're using ping or traceroute (which use raw sockets) to figure out what's happening to other packets. the ipv6 equiv already does this too. ok sthen@ claudio@
2022-03-21	treat 255.255.255.255 like an mcast address in in_pcbselsrc.	David Gwynne
	this allows the IP_MULTICAST_IF sockopt to specify which address you want to send a limited broadcast (255.255.255.255) packet out of. requested by and ok claudio@
2022-03-20	Include sys/mutex.h from netinet/in_pcb.h. Struct mutex will be	Alexander Bluhm
	needed to make inpcb in kernel MP safe. To build sysctl and libkvm based programs, we have to export it to userland. OK claudio@
2022-03-14	Unbreak the tree, revert commitid aZ8fm4iaUnTCc0ul	Theo Buehler
	This reverts the commit protecting the list and hashes in the PCB tables with a mutex since the build of sysctl(8) breaks, as found by kettenis. ok sthen
2022-03-14	pf_socket_lookup() calls in_pcbhashlookup() in the PCB layer. To	Alexander Bluhm
	run pf in parallel, make parts of the stack MP safe. Protect the list and hashes in the PCB tables with a mutex. Note that the protocol notify functions may call pf via tcp_output(). As the pf lock is a sleeping rw_lock, we must not hold a mutex. To solve this for now, collect these PCBs in inp_notify list and protect it with exclusive netlock. OK sashan@
2022-03-13	Hrvoje has hit a crash with IPsec acquire while testing the parallel	Alexander Bluhm
	IP forwarding diff. Add mutex and refcount to make memory management of struct ipsec_acquire MP safe. testing Hrvoje Popovski; input sashan@; OK mvs@
2022-03-10	Use atomic load and store functions to access refcnt and wait	Alexander Bluhm
	variables. Although not necessary everywhere, using atomic functions exclusively for variables marked as atomic is clearer. OK mvs@ visa@
2022-03-08	In IPsec policy replace integer refcount with atomic refcount.	Alexander Bluhm
	OK tobhe@ mvs@
2022-03-06	Usually we check ipsec_in_use as shortcut to avoid IPsec lookups,	Alexander Bluhm
	but that does not work when coming from tcp_output() as inp != NULL. This seems to be done to block packets from sockets with options in inp_seclevel. But instead of doing the route lookup, go directly to ipsp_spd_inp() where the socket policy checks are done. Calling rtable_l2() before the shortcut also costs a bit, do it when needed. OK tobhe@
2022-03-04	in_addmulti() is only called from ioctl(2) or setsockopt(2). Wait	Alexander Bluhm
	for malloc(9) to make the system call reliable. OK mvs@
2022-03-04	in_pcbinit() is called during boot. There malloc(9) cannot fail,	Alexander Bluhm
	but would panic instead of waiting. Remove needless error handling. OK mvs@
2022-03-02	Use NULL instead of 0 for pointer.	Alexander Bluhm

2022-03-02	Merge two comments describing the locks into one.	Alexander Bluhm

2022-03-02	The return value of in6_pcbnotify() is never used. Make it a void	Alexander Bluhm
	function. OK gnezdo@ mvs@ florian@ sashan@
2022-03-01	Remove outdated comment about v4-mapped v6 addresses. They are not	Alexander Bluhm
	supported anymore.
2022-02-25	Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com	Philip Guenther
	Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
2022-02-25	Move pr_attach and pr_detach to a new structure pr_usrreqs that can	Philip Guenther
	then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this. Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts. ok mvs@ bluhm@
2022-02-22	Delete unnecessary #includes of <netinet6/ip6protosw.h>: some never	Philip Guenther
	needed it and some no longer need it after moving the externs from there to <sys/protosw.h> ok jsg@
2022-02-22	Delete unnecessary #includes of <sys/domain.h> and/or <sys/protosw.h>	Philip Guenther
	net/if_pppx.c pointed out by jsg@ ok gnezdo@ deraadt@ jsg@ mpi@ millert@
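
Several entries above (2022-03-21 udp_input(), 2022-03-22 rip_input(), and the PCB notify change) describe the same pattern: walk the PCB table while holding its mutex, collect the affected PCBs in a temporary list, then do the work that may take a sleeping lock only after the mutex is released. The following is a minimal userland sketch of that idea, not the kernel code. `struct pcb`, `struct pcb_table` and `table_deliver()` are hypothetical names, pthread mutexes stand in for mutex(9), and the sketch assumes a single caller at a time, which is the role the exclusive net lock plays for the temporary list in the commits.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a protocol control block. */
struct pcb {
	struct pcb *next;		/* link on the table, protected by mtx */
	struct pcb *notify_next;	/* link on the temporary list */
	int refcnt;			/* protected by mtx in this sketch */
	int port;
	int delivered;			/* counts simulated deliveries */
};

/* Hypothetical stand-in for a PCB table. */
struct pcb_table {
	pthread_mutex_t mtx;
	struct pcb *head;
};

/*
 * Deliver to every PCB bound to `port'.  Returns the number of PCBs
 * that were processed.
 */
int
table_deliver(struct pcb_table *t, int port)
{
	struct pcb *p, *list = NULL;
	int n = 0;

	assert(t != NULL);

	/* Phase 1: collect matches while holding the table mutex. */
	pthread_mutex_lock(&t->mtx);
	for (p = t->head; p != NULL; p = p->next) {
		if (p->port != port)
			continue;
		p->refcnt++;		/* keep the PCB alive after unlock */
		p->notify_next = list;
		list = p;
	}
	pthread_mutex_unlock(&t->mtx);

	/* Phase 2: process the temporary list without the mutex held. */
	while ((p = list) != NULL) {
		list = p->notify_next;
		p->delivered++;		/* stands in for work that may sleep */
		pthread_mutex_lock(&t->mtx);
		p->refcnt--;		/* drop the temporary reference */
		pthread_mutex_unlock(&t->mtx);
		n++;
	}
	return n;
}
```

The reference taken in phase 1 is what allows the mutex to be dropped before phase 2: a PCB on the temporary list cannot be freed underneath the caller, which is the same reason the commits pair the mutex with reference counting.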