src - OpenBSD base system

Age	Commit message	Author
2024-04-17	Use struct ipsec_level within inpcb.	Alexander Bluhm
	Instead of passing around u_char[4], introduce struct ipsec_level that contains 4 ipsec levels. This provides better type safety. The embedding struct inpcb is globally visible for netstat(1), so put struct ipsec_level outside of #ifdef _KERNEL. OK deraadt@ mvs@
2023-11-26	Remove inp parameter from ip_output().	Alexander Bluhm
	ip_output() received inp as parameter. This is only used to lookup the IPsec level of the socket. Reasoning about MP locking is much easier if only relevant data is passed around. Convert ip_output() to receive constant inp_seclevel as argument and mark it as protected by net lock. OK mvs@
2023-10-11	Prevent deref-after-free when tdb_timeout() fires on invalid new tdb.	Tobias Heider
	When receiving a pfkeyv2 SADB_ADD message, a newly created tdb can fail in tdb_init(), which causes the tdb to not get added to the global tdb list and an immediate dereference. If a lifetime timeout triggers on this tdb, it will unconditionally try to remove it from the list and in the process deref once more than allowed, causing a one bit corruption in the already freed up slot in the tdb pool. We resolve this issue by moving timeout_add() after tdb_init() just before puttdb(). This means tdbs failing initialization get discarded immediately as they only hold a single reference. Valid tdbs get their timeouts activated just before we add them to the tdb list, meaning the timeout can safely assume they are linked. Feedback from mvs@ and millert@ ok mvs@ mbuhl@
2023-08-07	start adding support for route-based ipsec vpns.	David Gwynne
	rather than use ipsec flows (aka, entries in the ipsec security policy database) to decide which traffic should be encapsulated in ipsec and sent to a peer, this tweaks security associations (SAs) so they can refer to a tunnel interface. when traffic is routed over that tunnel interface, an ipsec SA is looked up and used to encapsulate traffic before being sent to the peer on the SA. When traffic is received from a peer using an interface SA, the specified interface is looked up and the packet is handed to it so it looks like packets come out of the tunnel. to support this, SAs get a TDBF_IFACE flag and iface and iface_dir fields. When TDBF_IFACE is set the iface and dir fields are considered valid, and the tdb/SA should be used with the tunnel interface instead of the SPD. support from many including markus@ tobhe@ claudio@ sthen@ patrick@ now is a good time deraadt@
2023-07-06	big update to pfsync to try and clean up locking in particular.	David Gwynne
	moving pf forward has been a real struggle, and pfsync has been a constant source of pain. we have been papering over the problems for a while now, but it reached the point that it needed a fundamental restructure, which is what this diff is.
	the big headliner changes in this diff are:
	- pfsync specific locks
	  this is the whole reason for this diff. rather than rely on NET_LOCK or KERNEL_LOCK or whatever, pfsync now has its own locks to protect its internal data structures. this is important because pfsync runs a bunch of timeouts and tasks to push pfsync packets out on the wire, or when it's handling requests generated by incoming pfsync packets, both of which happen outside pf itself running. having pfsync specific locks around pfsync data structures makes the mutations of these data structures a lot more explicit and auditable.
	- partitioning
	  to enable future parallelisation of the network stack, this rewrite includes support for pfsync to partition states into different "slices". these slices run independently, ie, the states collected by one slice are serialised into a separate packet to the states collected and serialised by another slice. states are mapped to pfsync slices based on the pf state hash, which is the same hash that the rest of the network stack and multiq hardware uses.
	- no more pfsync called from netisr
	  pfsync used to be called from netisr to try and bundle packets, but now that there's multiple pfsync slices this doesn't make sense. instead it uses tasks in softnet tqs.
	- improved bulk transfer handling
	  there's shiny new state machines around both the bulk transmit and receive handling. pfsync used to do horrible things to carp demotion counters, but now it is very predictable and returns the counters back where they started.
	- better tdb handling
	  the tdb handling was pretty hairy, but hrvoje has kicked this around a lot with ipsec and sasyncd and we've found and fixed a bunch of issues as a result of that testing.
	- mpsafe pf state purges
	  this was committed previously, but because the locks pfsync relied on weren't clear this just caused a ton of bugs. as part of this diff it's now reliable, and moves a big chunk of work out from under KERNEL_LOCK, which in turn improves the responsiveness and throughput of a firewall even if you're not using pfsync.
	there's a bunch of other little changes along the way, but the above are the big ones. hrvoje has done performance testing with this diff and notes a big improvement when pfsync is not in use. performance when pfsync is enabled is about the same, but i'm hoping the slices means we can scale along with pf as it improves.
	lots (months) of testing by me and hrvoje on pfsync boxes
	tests and ok sashan@
	deraadt@ says this is a good time to put it in
2022-07-14	Use capital letters for global ipsec(4) locks description. Use 'D'	Vitaliy Makkoveev
	instead of 's' for `tdb_sadb_mtx' mutex(9) because this is 'D'atabase. No functional changes. ok bluhm@
2022-04-30	When performing ipsp_ids_free(), grab `ipsec_flows_mtx' mutex(9) before doing	Vitaliy Makkoveev
	the `id_refcount' decrement. This should be consistent with `ipsp_ids_gc_list' list modifications, otherwise a concurrent ipsp_ids_insert() could remove this dying `ids' from the list before it was placed there by ipsp_ids_free(). This makes atomic operations with `id_refcount' useless. Also prevent ipsp_ids_lookup() from returning dying `ids'. ok bluhm@
2022-04-21	Introduce dedicated link entries for snapshots in pfsync(4). The purpose	Alexandr Nedvedicky
	of snapshots is to allow pfsync(4) to move items from global lists to local lists (a.k.a. snapshots) under mutex protection. Snapshots are then processed without holding any mutexes. Such an idea does not fly well if a link entry is currently used for global lists as well as snapshots. Feedback by bluhm@ Credit also goes to hrvoje@ for extensive testing. OK bluhm@
2022-03-13	Hrvoje has hit a crash with IPsec acquire while testing the parallel	Alexander Bluhm
	IP forwarding diff. Add mutex and refcount to make memory management of struct ipsec_acquire MP safe. testing Hrvoje Popovski; input sashan@; OK mvs@
2022-03-08	In IPsec policy replace integer refcount with atomic refcount.	Alexander Bluhm
	OK tobhe@ mvs@
2022-03-02	Merge two comments describing the locks into one.	Alexander Bluhm

2022-01-04	Add `ipsec_flows_mtx' mutex(9) to protect `ipsp_ids_*' list and	YASUOKA Masahiko
	trees. ipsp_ids_lookup() returns `ids' with bumped reference counter. original diff from mvs ok mvs
2021-12-20	Use per-CPU counters for tunnel descriptor block (TDB) statistics.	Vitaliy Makkoveev
	'tdb_data' struct became unused and was removed. Tested by Hrvoje Popovski. ok bluhm@
2021-12-19	There are occasions where the walker function in tdb_walk() might	Alexander Bluhm
	sleep. So holding the tdb_sadb_mtx when calling walker() is not allowed. Move the TDB from the TDB-Hash to a temporary list that is protected by netlock. Then unlock tdb_sadb_mtx and traverse the list to call the walker. OK mvs@
2021-12-14	To cache lookups, the policy ipo is linked to its SA tdb. There	Alexander Bluhm
	is also a list of SAs that belong to a policy. To make it MP safe, protect these pointers with a mutex. tested by Hrvoje Popovski; OK mvs@
2021-12-11	Protect the write access to the TDB flags field with a mutex per	Alexander Bluhm
	TDB. Clearing the timeout flags just before pool put in tdb_free() does not make sense. Move this to tdb_delete(). While there make the parentheses in the flag check consistent. tested by Hrvoje Popovski; OK tobhe@
2021-12-08	Start documenting the locking strategy of struct tdb fields. Note	Alexander Bluhm
	that gettdb_dir() is MP safe now. Add the tdb_sadb_mtx mutex in udpencap_ctlinput() to protect the access to tdb_snext. Make the braces consistent for all these TDB loops. Move NET_ASSERT_LOCKED() into the functions where the read access happens. OK mvs@
2021-12-07	In ipo_tdb the flow contains a reference counted TDB cache. This	Alexander Bluhm
	may prevent tdb_free() from being called. It is not a real leak as ipsecctl -F or termination of iked flush this cache when they remove the IPsec policy. Move the code from tdb_free() to tdb_delete(), then the kernel does the cleanup itself. OK mvs@ tobhe@
2021-12-03	Add tdb_delete_locked() to replace duplicate tdb deletion code in	Tobias Heider
	pfkey_flush(). ok bluhm@ mvs@
2021-12-01	Reintroduce the TDBF_DELETED flag. Checking next pointer to figure	Alexander Bluhm
	out whether the TDB is linked to the hash bucket does not work. This fixes removal of SAs that could not be flushed with ipsecctl -F. OK tobhe@
2021-12-01	Let ipsp_spd_lookup() return an error instead of a TDB. The TDB	Alexander Bluhm
	is not always needed, but the error value is necessary for the caller. As TDB should be refcounted, it makes no sense to always return it. Pass an output pointer for the TDB which can be NULL. OK mvs@ tobhe@
2021-11-30	Remove unused parameter from ipsp_spd_inp().	Alexander Bluhm
	OK mvs@ yasuoka@
2021-11-26	Replace TDBF_DELETED flag with check if tdb was already unlinked.	Tobias Heider
	Protect tdb_unlink() and puttdb() for SADB_UPDATE with tdb_sadb_mtx. Tested by Hrvoje Popovski ok bluhm@ mvs@
2021-11-25	Implement reference counting for IPsec tdbs. Not all cases are	Alexander Bluhm
	covered yet, more ref counts to come. The timeouts are protected, so the racy tdb_reaper() gets retired. The tdb_policy_head, onext and inext lists are protected. All gettdb...() functions return a tdb that is ref counted and has to be unrefed later. A flag ensures that tdb_delete() is called only once. Tested by Hrvoje Popovski; OK sthen@ mvs@ tobhe@
2021-11-21	Add the new `ipsec_exctdb' ipsec(4) counter to count and expose to the	Vitaliy Makkoveev
	userland the TDBs which exceeded the hard limit. Also the `ipsec_notdb' counter description in the header doesn't match the netstat(1) description. We never count `ipsec_notdb' and the netstat(1) description looks more appropriate so it's used to avoid confusion with the new counter. ok bluhm@
2021-11-16	To debug IPsec and tdb refcounting it is useful to have "show tdb"	Alexander Bluhm
	and "show all tdbs" in ddb. tested by Hrvoje Popovski; OK mvs@
2021-10-25	Call a locked variant of tdb_unlink() from tdb_walk(). Fixes a	Alexander Bluhm
	mutex locking against myself panic introduced by my previous commit. OK beck@ patrick@
2021-10-24	Merge esp_input_cb() into esp_input().	Tobias Heider
	ok bluhm@
2021-10-24	Remove code duplication by merging the v4 and v6 input functions	Alexander Bluhm
	for ah, esp, and ipcomp. Move common code into ipsec_protoff() which finds the offset of the next protocol field in the previous header. OK tobhe@
2021-10-24	Refactor ah_input() and ah_output() for new crypto API.	Tobias Heider
	ok bluhm@
2021-10-24	Refactor ipcomp_input() and ipcomp_output(). Remove obsolete code related	Tobias Heider
	to old crypto API. ok bluhm@
2021-10-24	There are more m_pullup() in IPsec input. Pass down the pointer	Alexander Bluhm
	to the mbuf to update it globally. At the end it will reach ip_deliver() which expects a pointer to an mbuf. OK sashan@
2021-10-24	Remove 'struct tdb_crypto' allocations from esp_input() and esp_output().	Tobias Heider
	This was needed to pass arguments to the callback function, but is no longer necessary after the API makeover. ok bluhm@
2021-10-23	There is an m_pullup() down in AH input. As it may free or change	Alexander Bluhm
	the mbuf, the callers must be careful. Although there is no bug, use the common pattern to handle this. Pass down an mbuf pointer mp and let m_pullup() update the pointer in all callers. It looks like the tcp signature functions should not be called. Avoid an mbuf leak and return an error. OK mvs@
2021-10-23	Retire asynchronous crypto API as it is no longer required by any driver and	Tobias Heider
	adds unnecessary complexity. Dedicated crypto offloading devices are not common anymore. Modern CPU crypto acceleration works synchronously, eliminating the need for callbacks. Replace all occurrences of crypto_dispatch() with crypto_invoke(), which is blocking and only returns after the operation has completed or an error occurred. Invoke callback functions directly from the consumer (e.g. IPsec, softraid) instead of relying on the crypto driver to call crypto_done(). ok bluhm@ mvs@ patrick@
2021-10-13	The function ipip_output() was registered as .xf_output() xform	Alexander Bluhm
	function. But it was never called via this pointer. It would have immediately crashed as mp is always NULL when called via .xf_output(). Do not set .xf_output to ipip_output. This allows passing only the parameters which are actually needed and the control flow is clearer. OK mpi@
2021-10-05	Cleanup the error handling in ipsec ipip_output() and consistently	Alexander Bluhm
	goto drop instead of return. An ENOBUFS should be EINVAL in IPv6 case. Also use combined packet and byte counter. OK sthen@ dlg@
2021-10-05	Move setting ipsec mtu into a function. The NULL and invalid check	Alexander Bluhm
	in ipsec_common_ctlinput() is not necessary, the loop in ipsec_set_mtu() does that anyway. udpencap_ctlinput() did not work for bundled SA, this also needs the loop in ipsec_set_mtu(). OK sthen@
2021-09-29	Global variables to track initialisation behave poorly with MP.	Alexander Bluhm
	Move the tdb pool init into an init function. OK mvs@
2021-08-10	Remove unused `ipa_pcb' from 'ipsec_acquire' structure.	mvs
	ok gnezdo@
2021-07-27	Revert "Use per-CPU counters for tunnel descriptor block" diff.	mvs
	Panic reported by Hrvoje Popovski.
2021-07-26	Use per-CPU counters for tunnel descriptor block (tdb) statistics.	mvs
	'tdb_data' struct became unused and was removed. ok bluhm@
2021-07-18	Introduce and use garbage collector for 'ipsec_ids' struct entities	mvs
	destruction instead of using per-entity timeout. This fixes the races between ipsp_ids_insert(), ipsp_ids_free() and ipsp_ids_timeout(). ipsp_ids_insert() can't stop the ipsp_ids_timeout() timeout handler which is already running and awaiting netlock to be released, so reused `ids' will be silently removed in this case. ipsp_ids_free() can't determine whether the ipsp_ids_timeout() timeout handler is running because timeout_del(9) called by ipsp_ids_insert() clears its triggered state. So ipsp_ids_timeout() could be scheduled to run twice in this case. Also hrvoje@ reported that ipsec(4) throughput increased with this diff so it seems we caught a significant number of ipsp_ids_insert() races. tests and feedback by hrvoje@ ok bluhm@
2021-07-18	The IPsec authentication before decryption used a different replay	Alexander Bluhm
	counter than after decryption. This could result in "esp_input_cb: authentication failed for packet in SA" errors. As we run crypto operations async, thousands of packets are stored in the crypto task. During the queueing the replay counter of the tdb can change. Then the higher 32 bits may increment although the lower 32 bits did not wrap. checkreplaywindow() must be called twice per packet with the same replay counter. Store the value in struct tdb_crypto while dangling in the task queue and doing crypto operations. tested by Hrvoje Popovski; joint work with tobhe@
2021-07-13	Remove unused `PolicyHead' from 'sockaddr_encap' structure.	mvs
	ok tobhe@
2021-07-08	The xformsw array never changes. Declare struct xformsw constant	Alexander Bluhm
	and map data read only. OK deraadt@ mvs@ mpi@
2021-07-08	The properties of the crypto algorithms never change. Declare them	Alexander Bluhm
	constant. Then they are mapped as read only. OK deraadt@ dlg@
2021-07-07	Fix whitespaces in IPsec code.	Alexander Bluhm

2021-05-04	Initialize `ipsec_policy_pool' within pfkey_init() instead of doing that	mvs
	at runtime within pfkeyv2_send(). Also set its interrupt protection level to IPL_SOFTNET. ok bluhm@ mpi@
2020-11-05	Enable support for ASN1_DN ipsec identifiers.	Peter Hessler
	Tested with multiple Windows 10 Pro (ver 2004) clients, and OpenBSD+iked as the server. OK tobhe@ sthen@ kn@