src - OpenBSD base system

Age	Commit message	Author
2022-03-22	For raw IP packets rip_input() traverses the loop of all PCBs. From	Alexander Bluhm
	there it calls sbappendaddr() while holding the raw table mutex. This ends in sorwakeup() where we finally grab the kernel lock while holding a mutex. Witness detects this misuse. Use the same solution as for PCB notify. Collect the affected PCBs in a temporary list. The list is protected by exclusive net lock. syzbot+ebe3f03a472fecf5e42e@syzkaller.appspotmail.com OK claudio@
2022-03-22	Fix whitespace.	Alexander Bluhm

2022-03-21	For multicast and broadcast packets udp_input() traverses the loop	Alexander Bluhm
	of all UDP PCBs. From there it calls udp_sbappend() while holding the UDP table mutex. This ends in sorwakeup() where we finally grab the kernel lock while holding a mutex. Witness detects this misuse. Use the same solution as for PCB notify. Collect the affected PCBs in a temporary list. The list is protected by exclusive net lock. Reported-by: syzbot+7596cb96fb9f3c9d6f4f@syzkaller.appspotmail.com OK sashan@
2022-03-21	Fix whitespace. Wrap long lines. Adjust outdated comment.	Alexander Bluhm

2022-03-21	Header netinet/in_pcb.h includes sys/mutex.h now. Recommit mutex	Alexander Bluhm
	for PCB tables. It does not break userland build anymore. pf_socket_lookup() calls in_pcbhashlookup() in the PCB layer. To run pf in parallel, make parts of the stack MP safe. Protect the list and hashes in the PCB tables with a mutex. Note that the protocol notify functions may call pf via tcp_output(). As the pf lock is a sleeping rw_lock, we must not hold a mutex. To solve this for now, collect these PCBs in inp_notify list and protect it with exclusive netlock. OK sashan@
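
The collect-then-notify approach described in this commit can be sketched in userland C. This is only an illustration under stated assumptions: `toy_mutex`, `struct pcb`, `deliver()` and `toy_input()` are hypothetical names, and the mutex is a single-threaded stand-in that merely records whether it is held. In the kernel the temporary list is protected by the exclusive net lock, which this sketch does not model.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a kernel mutex: it only records whether it is held. */
struct toy_mutex {
	int	held;
};

static void
toy_enter(struct toy_mutex *m)
{
	assert(!m->held);
	m->held = 1;
}

static void
toy_leave(struct toy_mutex *m)
{
	assert(m->held);
	m->held = 0;
}

struct pcb {
	int		 port;
	int		 delivered;
	struct pcb	*next_table;	/* table linkage, guarded by the mutex */
	struct pcb	*next_notify;	/* temporary list linkage */
};

struct pcb_table {
	struct toy_mutex mtx;
	struct pcb	*head;
};

/* Stands in for work that may sleep, so it must not run under the mutex. */
static void
deliver(struct pcb_table *t, struct pcb *p)
{
	assert(!t->mtx.held);
	p->delivered++;
}

/* Returns the number of PCBs the packet was delivered to. */
static int
toy_input(struct pcb_table *t, int port)
{
	struct pcb *p, *notify = NULL, **tail = &notify;
	int n = 0;

	/* Phase 1: walk the table under the mutex and only collect matches. */
	toy_enter(&t->mtx);
	for (p = t->head; p != NULL; p = p->next_table) {
		if (p->port != port)
			continue;
		p->next_notify = NULL;
		*tail = p;
		tail = &p->next_notify;
	}
	toy_leave(&t->mtx);

	/* Phase 2: the mutex is released, so the slow path is now safe. */
	for (p = notify; p != NULL; p = p->next_notify) {
		deliver(t, p);
		n++;
	}
	return n;
}
```

The assertion in `deliver()` is the point of the sketch: calling it from inside the first loop would trip it, which mirrors the misuse that witness reports in the kernel.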
2022-03-21	call in_pcbselsrc from rip_output so route sourceaddr can take effect.	David Gwynne
	previously things that used sendto or similar with raw sockets would ignore any configured sourceaddr. this made it inconsistent with other traffic, which in turn makes things confusing to debug if you're using ping or traceroute (which use raw sockets) to figure out what's happening to other packets. the ipv6 equiv already does this too. ok sthen@ claudio@
2022-03-21	treat 255.255.255.255 like an mcast address in in_pcbselsrc.	David Gwynne
	this allows the IP_MULTICAST_IF sockopt to specify which address you want to send a limited broadcast (255.255.255.255) packet out of. requested by and ok claudio@
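
A minimal sketch of the destination test this commit describes, assuming addresses are held in host byte order. The helper names are hypothetical and this is not the in_pcbselsrc() code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses are in host byte order in this sketch. */
#define TOY_LIMITED_BROADCAST	0xffffffffU	/* 255.255.255.255 */

/* 224.0.0.0/4 is the IPv4 multicast range. */
static int
toy_is_multicast(uint32_t addr)
{
	return (addr & 0xf0000000U) == 0xe0000000U;
}

/* True when the multicast interface option should pick the source. */
static int
toy_uses_multicast_if(uint32_t dst)
{
	return toy_is_multicast(dst) || dst == TOY_LIMITED_BROADCAST;
}
```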
2022-03-20	Include sys/mutex.h from netinet/in_pcb.h. Struct mutex will be	Alexander Bluhm
	needed to make inpcb in kernel MP safe. To build sysctl and libkvm based programs, we have to export it to userland. OK claudio@
2022-03-14	Unbreak the tree, revert commitid aZ8fm4iaUnTCc0ul	Theo Buehler
	This reverts the commit protecting the list and hashes in the PCB tables with a mutex since the build of sysctl(8) breaks, as found by kettenis. ok sthen
2022-03-14	pf_socket_lookup() calls in_pcbhashlookup() in the PCB layer. To	Alexander Bluhm
	run pf in parallel, make parts of the stack MP safe. Protect the list and hashes in the PCB tables with a mutex. Note that the protocol notify functions may call pf via tcp_output(). As the pf lock is a sleeping rw_lock, we must not hold a mutex. To solve this for now, collect these PCBs in inp_notify list and protect it with exclusive netlock. OK sashan@
2022-03-13	Hrvoje has hit a crash with IPsec acquire while testing the parallel	Alexander Bluhm
	IP forwarding diff. Add mutex and refcount to make memory management of struct ipsec_acquire MP safe. testing Hrvoje Popovski; input sashan@; OK mvs@
2022-03-10	Use atomic load and store functions to access refcnt and wait	Alexander Bluhm
	variables. Although not necessary everywhere, using atomic functions exclusively for variables marked as atomic is clearer. OK mvs@ visa@
2022-03-08	In IPsec policy replace integer refcount with atomic refcount.	Alexander Bluhm
	OK tobhe@ mvs@
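
For illustration, an atomic reference count can be sketched with C11 `<stdatomic.h>`. The kernel uses its own refcnt API, so `struct toy_policy` and the `toy_policy_*` functions below are assumptions made for this example only.

```c
#include <assert.h>
#include <stdatomic.h>

struct toy_policy {
	atomic_uint	refs;
	int		freed;	/* set once the last reference is dropped */
};

static void
toy_policy_init(struct toy_policy *p)
{
	atomic_init(&p->refs, 1);	/* the creator holds one reference */
	p->freed = 0;
}

static void
toy_policy_ref(struct toy_policy *p)
{
	atomic_fetch_add(&p->refs, 1);
}

/* Returns 1 if this call dropped the last reference. */
static int
toy_policy_unref(struct toy_policy *p)
{
	unsigned int old = atomic_fetch_sub(&p->refs, 1);

	assert(old > 0);	/* dropping below zero is a bug */
	if (old == 1) {
		p->freed = 1;	/* a real implementation would free here */
		return 1;
	}
	return 0;
}
```

Because `atomic_fetch_sub()` returns the previous value, exactly one caller sees `old == 1`, so the cleanup runs once even when several CPUs drop references concurrently.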
2022-03-06	Usually we check ipsec_in_use as shortcut to avoid IPsec lookups,	Alexander Bluhm
	but that does not work when coming from tcp_output() as inp != NULL. This seems to be done to block packets from sockets with options in inp_seclevel. But instead of doing the route lookup, go directly to ipsp_spd_inp() where the socket policy checks are done. Calling rtable_l2() before the shortcut also costs a bit, do it when needed. OK tobhe@
2022-03-04	in_addmulti() is only called from ioctl(2) or setsockopt(2). Wait	Alexander Bluhm
	for malloc(9) to make the system call reliable. OK mvs@
2022-03-04	in_pcbinit() is called during boot. There malloc(9) cannot fail,	Alexander Bluhm
	but would panic instead of waiting. Remove needless error handling. OK mvs@
2022-03-02	Use NULL instead of 0 for pointer.	Alexander Bluhm

2022-03-02	Merge two comments describing the locks into one.	Alexander Bluhm

2022-03-02	The return value of in6_pcbnotify() is never used. Make it a void	Alexander Bluhm
	function. OK gnezdo@ mvs@ florian@ sashan@
2022-03-01	Remove outdated comment about v4-mapped v6 addresses. They are not	Alexander Bluhm
	supported anymore.
2022-02-25	Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com	Philip Guenther
	Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
2022-02-25	Move pr_attach and pr_detach to a new structure pr_usrreqs that can	Philip Guenther
	then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this. Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts. ok mvs@ bluhm@
2022-02-22	Delete unnecessary #includes of <netinet6/ip6protosw.h>: some never	Philip Guenther
	needed it and some no longer need it after moving the externs from there to <sys/protosw.h> ok jsg@
2022-02-22	Delete unnecessary #includes of <sys/domain.h> and/or <sys/protosw.h>	Philip Guenther
	net/if_pppx.c pointed out by jsg@ ok gnezdo@ deraadt@ jsg@ mpi@ millert@
2022-02-16	rewrite vxlan to better fit the current kernel infrastructure.	David Gwynne
	the big change is removing the integration with and reliance on bridge(4) for learning vxlan endpoints. we have the etherbridge layer now (which is used by veb, nvgre, bpe, etc) so vxlan can operate independently of bridge(4) (or any other driver) while still dynamically learning about other endpoints. vxlan now uses the udp socket upcall mechanism to receive packets. this means it actually creates and binds udp sockets to use rather than adding code in the udp layer for stealing packets from the udp layer. i think it's also important to note that this adds loop prevention to the code. this stops a vxlan interface being used to transmit a packet that was encapsulated in itself. i want to clear this out of my tree where it's been sitting for nearly a year. no one seems too concerned with the change either way. ok claudio@
2022-02-01	When a struct ipovly needs to be computed and checksummed in in4_cksum(),	Miod Vallat
	do not bother operating on its first 8 bytes, which will always be zero. ok visa@
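
The reason the leading bytes can be skipped is that zero words add nothing to a one's-complement sum. A small sketch, using a hypothetical `toy_cksum()` rather than the in4_cksum() implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Internet checksum over a byte buffer, treated as big-endian 16-bit words. */
static uint16_t
toy_cksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(buf != NULL);
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)			/* odd trailing byte */
		sum += (uint32_t)buf[i] << 8;
	while (sum >> 16)		/* fold the carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

Summing a buffer whose first 8 bytes are zero gives the same result as starting at offset 8. Skipping an even number of bytes keeps the 16-bit word boundaries unchanged.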
2022-01-25	Capture a repeated pattern into sysctl_securelevel_int function	Greg Steuck
	A few variables in the kernel are only writeable before securelevel is raised. It makes sense to handle them with less code. OK sthen@ bluhm@
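
The repeated pattern amounts to "reads always work, writes only before securelevel is raised". A hedged userland sketch follows. `toy_securelevel` and `toy_securelevel_int()` are hypothetical and do not reproduce the kernel function's signature.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

static int toy_securelevel = 0;	/* assumption: 0 means "not yet raised" */

/*
 * Update *valp from *newp if a new value is supplied and the
 * securelevel still permits writes. Returns 0 or an errno value.
 */
static int
toy_securelevel_int(int *valp, const int *newp)
{
	assert(valp != NULL);
	if (newp == NULL)
		return 0;		/* plain read, nothing to change */
	if (toy_securelevel > 0)
		return EPERM;		/* variable is now read-only */
	*valp = *newp;
	return 0;
}
```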
2022-01-23	Define all TCP TF_ flags as unsigned numbers. They are stored in	Alexander Bluhm
	u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
2022-01-20	Shifting signed integers left by 31 is undefined behavior in C.	Alexander Bluhm
	found by kubsan; joint work with tobhe@; OK miod@
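
As a sketch of the fix, building flag bits from an unsigned constant keeps a shift by 31 well defined. This assumes a 32-bit `unsigned int`, and the `TOY_FLAG` macro and helpers are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* 1U, not 1: shifting a signed 1 into bit 31 is undefined behavior. */
#define TOY_FLAG(n)	(1U << (n))

static uint32_t
toy_set_flag(uint32_t flags, unsigned int n)
{
	assert(n < 32);		/* the shift count must stay below the width */
	return flags | TOY_FLAG(n);
}

static int
toy_has_flag(uint32_t flags, unsigned int n)
{
	assert(n < 32);
	return (flags & TOY_FLAG(n)) != 0;
}
```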
2022-01-04	Add `ipsec_flows_mtx' mutex(9) to protect `ipsp_ids_*' list and	YASUOKA Masahiko
	trees. ipsp_ids_lookup() returns `ids' with bumped reference counter. original diff from mvs ok mvs
2022-01-02	spelling	Jonathan Gray
	ok jmc@ reads ok tb@
2021-12-23	Remove unused variables and assignments in ah and esp output.	Alexander Bluhm
	found by clang 13; OK tobhe@
2021-12-23	IPsec is not MP safe yet. To allow forwarding in parallel without	Alexander Bluhm
	dirty hacks, it is better to protect IPsec input and output with kernel lock. Not much is lost as crypto needs the kernel lock anyway. From here we can refine the lock later. Note that there is no kernel lock in the SPD lookup path. Goal is to keep that lock free to allow fast forwarding with non IPsec traffic. tested by Hrvoje Popovski; OK tobhe@
2021-12-22	Consolidate enc_getif() lookups in IPsec input path to save one lookup	Tobias Heider
	per packet and improve readability. ok bluhm@
2021-12-20	Remove unused variable 'clen'.	Tobias Heider
	ok bluhm@
2021-12-20	Use per-CPU counters for tunnel descriptor block (TDB) statistics.	Vitaliy Makkoveev
	'tdb_data' struct became unused and was removed. Tested by Hrvoje Popovski. ok bluhm@
2021-12-20	Fix function name in panic string.	Alexander Bluhm

2021-12-19	There are occasions where the walker function in tdb_walk() might	Alexander Bluhm
	sleep. So holding the tdb_sadb_mtx() when calling walker() is not allowed. Move the TDB from the TDB-Hash to a temporary list that is protected by netlock. Then unlock tdb_sadb_mtx and traverse the list to call the walker. OK mvs@
2021-12-16	Fix a tiny race in tdb_delete() between TDBF_DELETED, tdb_unlink()	Alexander Bluhm
	and tdb_cleanspd(). gettdb...() can return a TDB before tdb_unlink(). Then ipsp_spd_lookup() could add it to tdb_policy_head after tdb_cleanspd(). There it would stay until it hits the kassert in tdb_free(). OK tobhe@
2021-12-15	structure pads can leak uninitialized memory to userland via copyout,	Theo de Raadt
	therefore the mandatory idiom is completely clearing structs before building them for copyout -- that means ALMOST ALL STRUCTS, because we never know when some architecture will pad a struct. In two more cases, the clearing wasn't performed. from Reno Robert ZDI ok millert bluhm
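
The idiom can be sketched as follows. `struct toy_reply` and `toy_build_reply()` are hypothetical, and the sketch only shows the clearing step, not copyout itself.

```c
#include <assert.h>
#include <string.h>

struct toy_reply {
	char	kind;	/* most ABIs insert padding after this member */
	long	value;
};

static void
toy_build_reply(struct toy_reply *r, char kind, long value)
{
	assert(r != NULL);
	/* Clear the whole object first so padding holds no stale data. */
	memset(r, 0, sizeof(*r));
	r->kind = kind;
	r->value = value;
}
```

Assigning every member one by one is not enough, because the padding between `kind` and `value` is never written that way and would carry whatever was previously in that memory.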
2021-12-15	Syzkaller found a dereference in igmp_leavegroup() where inm->inm_rti	Alexander Bluhm
	is NULL. It should be set in rti_fill(), but is not if malloc(9) fails. There is no rollback after malloc failure so the field stays uninitialized. The code is only called from ioctl, setsockopt or a task. Malloc should wait instead of failing, otherwise syscalls would be unreliable. While there also put an M_WAIT in the init code. During init malloc must not fail. OK mvs@ Reported-by: syzbot+e22326057ccf34908d78@syzkaller.appspotmail.com
2021-12-14	Correct value for IPTOS_DSCP_LE since it needs to allow for the preceding	Darren Tucker
	two ECN bits. From daisuke.higashi at gmail.com via OpenSSH bz#3373, ok claudio@, job@, djm@.
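
The arithmetic behind the correction: the DSCP occupies the upper six bits of the TOS byte and the ECN field the lower two, so a codepoint has to be shifted left by two. The Lower Effort codepoint is 000001, which becomes 0x04 in the TOS byte. The macros and helpers below are hypothetical names for this sketch, not the definitions from the system headers.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_ECN_BITS	2	/* low two bits of the TOS byte */
#define TOY_ECN_MASK	0x03
#define TOY_DSCP_LE	0x01	/* Lower Effort codepoint, 000001 */

/* Place a 6-bit DSCP codepoint into the TOS byte. */
static uint8_t
toy_dscp_to_tos(uint8_t dscp)
{
	assert(dscp < 64);
	return (uint8_t)(dscp << TOY_ECN_BITS);
}

/* Recover the codepoint, discarding the ECN bits. */
static uint8_t
toy_tos_to_dscp(uint8_t tos)
{
	return (uint8_t)(tos >> TOY_ECN_BITS);
}
```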
2021-12-14	To cache lookups, the policy ipo is linked to its SA tdb. There	Alexander Bluhm
	is also a list of SAs that belong to a policy. To make it MP safe, protect these pointers with a mutex. tested by Hrvoje Popovski; OK mvs@
2021-12-11	Protect the write access to the TDB flags field with a mutex per	Alexander Bluhm
	TDB. Clearing the timeout flags just before pool put in tdb_free() does not make sense. Move this to tdb_delete(). While there make the parentheses in the flag check consistent. tested by Hrvoje Popovski; OK tobhe@
2021-12-08	Start documenting the locking strategy of struct tdb fields. Note	Alexander Bluhm
	that gettdb_dir() is MP safe now. Add the tdb_sadb_mtx mutex in udpencap_ctlinput() to protect the access to tdb_snext. Make the braces consistent for all these TDB loops. Move NET_ASSERT_LOCKED() into the functions where the read access happens. OK mvs@
2021-12-07	In ipo_tdb the flow contains a reference counted TDB cache. This	Alexander Bluhm
	may prevent tdb_free() from being called. It is not a real leak as ipsecctl -F or termination of iked flush this cache when they remove the IPsec policy. Move the code from tdb_free() to tdb_delete(), then the kernel does the cleanup itself. OK mvs@ tobhe@
2021-12-03	Add tdb_delete_locked() to replace duplicate tdb deletion code in	Tobias Heider
	pfkey_flush(). ok bluhm@ mvs@
2021-12-03	Add TDB reference counting to ipsp_spd_lookup(). If an output	Alexander Bluhm
	pointer is passed to the function, it will return a refcounted TDB. The ref happens when ipsp_spd_inp() copies the pointer from ipo->ipo_tdb. The caller of ipsp_spd_lookup() has to unref after using it. tested by Hrvoje Popovski; OK mvs@ tobhe@
2021-12-02	ipsec_common_input_cb() extracted the inner IP header of IPsec	Alexander Bluhm
	tunnels. It is never used, so this is useless code. Remove ipn and ip6n IP header variables and the m_copydata() to fill them. OK mvs@ kn@ sthen@
2021-12-02	Allow building the kernel without IPSEC or INET6 defines.	Alexander Bluhm
	OK mpi@ mvs@