src - OpenBSD base system

Age	Commit message	Author
4 days	provide ifq_deq_set_oactive.	David Gwynne
	ifq_deq_set_oactive is a variation on ifq_set_oactive that can be called inside an ifq_deq_begin "transaction". afresh@ found de(4) was calling ifq_set_oactive while holding the ifq mutex via ifq_deq_begin, which led to a panic because ifq_set_oactive also tries to take the ifq mutex. ifq_deq_set_oactive assumes the caller is already holding the mutex. de(4) is confusing, so it seemed simpler to add a small tweak to ifqs than to try and do major surgery on such a hairy driver. tested by afresh@
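The locking problem described above can be shown with a toy model. This is a minimal userland sketch, not the kernel code: `toy_ifq` and every other `toy_` name is hypothetical, and the "mutex" is a plain flag that records whether it is held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an interface queue; not the kernel struct. */
struct toy_ifq {
	bool	locked;		/* models the ifq mutex being held */
	bool	oactive;	/* models the "output active" flag */
};

/* Returns false if the lock is already held; a real mutex would panic or hang. */
bool
toy_lock(struct toy_ifq *q)
{
	if (q->locked)
		return false;
	q->locked = true;
	return true;
}

void
toy_unlock(struct toy_ifq *q)
{
	assert(q->locked);
	q->locked = false;
}

/* Takes the lock itself, so it fails when the caller already holds it. */
bool
toy_set_oactive(struct toy_ifq *q)
{
	if (!toy_lock(q))
		return false;
	q->oactive = true;
	toy_unlock(q);
	return true;
}

/* Variant for callers that are already inside the locked section. */
void
toy_deq_set_oactive(struct toy_ifq *q)
{
	assert(q->locked);
	q->oactive = true;
}
```

Calling `toy_set_oactive()` between `toy_lock()` and `toy_unlock()` fails in this model, while `toy_deq_set_oactive()` succeeds because it relies on the caller's lock.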
4 days	use a tailq for the global list of bpf_if structs.	David Gwynne
	this replaces a hand rolled list that's been here since 1.1. ok claudio@ kn@ tb@
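For readers unfamiliar with the queue(3) macros, here is a minimal sketch of a tail queue. `toy_if` and its helpers are hypothetical and do not reflect the fields of the real struct bpf_if; only the `TAILQ_*` macros from `<sys/queue.h>` are real.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical element type; the real struct bpf_if has different fields. */
struct toy_if {
	int			id;
	TAILQ_ENTRY(toy_if)	entry;	/* list linkage */
};

TAILQ_HEAD(toy_if_list, toy_if);

/* Append an element to the end of the list. */
void
toy_attach(struct toy_if_list *list, struct toy_if *tif)
{
	TAILQ_INSERT_TAIL(list, tif, entry);
}

/* Unlink an element; no walk is needed to find its predecessor. */
void
toy_detach(struct toy_if_list *list, struct toy_if *tif)
{
	TAILQ_REMOVE(list, tif, entry);
}

/* Count the elements by walking the list. */
int
toy_count(struct toy_if_list *list)
{
	struct toy_if *tif;
	int n = 0;

	TAILQ_FOREACH(tif, list, entry)
		n++;
	return n;
}
```

The head must be set up with `TAILQ_INIT()` (or `TAILQ_HEAD_INITIALIZER`) before first use.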
5 days	fix tcpdump on pfsync interfaces.	David Gwynne
	after the last rewrite i was showing bpf ip packets, not the pfsync payload like the PFSYNC DLT expected. this also lets bpf see packets being processed by pfsync input handling, so if you want to see only what's being sent you'll need to filter by direction. reported by Marc Boisis
6 days	bump the "mru" up to MAXMCLBYTES.	David Gwynne
	there's no reason to limit tun/tap to small packets. ok claudio@
6 days	include tun_hdr in the length reported by FIONREAD and kq if it's enabled.	David Gwynne

6 days	make sure bpfsdetach is holding a bpf_d ref when invalidating stuff.	David Gwynne
	when bpfsdetach is called by an interface being destroyed, it iterates over the bpf descriptors using the interface and calls vdevgone and klist_invalidate against them. however, i'm not sure the reference the interface holds against the bpf_d is accounted for properly, so vdevgone might drop it to 0 and free it, which makes the klist_invalidate a use after free. avoid this by taking a bpf_d ref before calling vdevgone and klist_invalidate so the memory can't be freed out from under the feet of bpfsdetach. Reported-by: syzbot+b3927f8ad162452a2f39@syzkaller.appspotmail.com i wasn't able to reproduce whatever syzkaller did. it's possible this is a double free, but we'll wait and see if it pops up again. ok mpi@
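The fix follows a common pattern: hold your own reference across any call that might drop the last one. The sketch below models that pattern only. All `toy_` names are hypothetical, and `freed` is a flag standing in for an actual free so the effect can be observed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical refcounted object; "freed" stands in for a real free(). */
struct toy_desc {
	int	refs;
	bool	freed;
	bool	invalidated;
};

void
toy_ref(struct toy_desc *d)
{
	d->refs++;
}

void
toy_unref(struct toy_desc *d)
{
	assert(d->refs > 0);
	if (--d->refs == 0)
		d->freed = true;
}

/* Models a teardown step that drops a reference as a side effect. */
void
toy_gone(struct toy_desc *d)
{
	toy_unref(d);
}

/* Models a later step that touches the object; unsafe once it is freed. */
void
toy_invalidate(struct toy_desc *d)
{
	assert(!d->freed);
	d->invalidated = true;
}

/* Hold our own reference across both steps, then release it. */
void
toy_detach(struct toy_desc *d)
{
	toy_ref(d);
	toy_gone(d);
	toy_invalidate(d);
	toy_unref(d);
}
```

Without the `toy_ref()` at the top of `toy_detach()`, an object that starts with one reference would be marked freed by `toy_gone()` and the assertion in `toy_invalidate()` would trip.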
7 days	provide network offloads between the kernel and userland again	David Gwynne
	userland can request that network packets that are read from or written to the device special file get prepended with a "tun_hdr" struct. this struct contains bits which say what offloads are requested for the packet, including things like ip/tcp/udp/icmp checksums, tcp segmentation offloads, or ethernet vlan tags. userland can write a packet with any of these offloads requested into the kernel at any time, but has to request which ones it's able to handle coming from the kernel. enabling the tun_hdr struct and selecting which offloads userland can handle is done with a new TUNSCAP ioctl. this is based on the virtio_net_hdr in linux, which jan@ actually implemented and had working with vmd. however, claudio@ and i strongly opposed what feels like a layer violation by pulling virtio structures into the tun driver, and then trying to emulate virtio/linux semantics in our network stack, and playing catch up when the "upstream" projects decide to change the shape or meaning of these bits. tun_hdr is specific to the openbsd network stack and its semantics, which simplifies our kernel implementation. jan has been pretty gracious about the extra work on the vmd side of things. tested by and ok jan@ ok claudio@ sthen@ backed this out cos of confusion with the ioctl numbers i picked to control this feature. i've picked new numbers that don't conflict this time.
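The asymmetry in the negotiation (anything may be written, but only declared offloads come back on reads) can be sketched with a bitmask. Everything here is hypothetical: the `TOY_OFF_*` bits, `struct toy_tun`, and the helpers are illustrative and do not reproduce the real tun_hdr layout or the TUNSCAP ioctl interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical offload bits; the real tun_hdr flags are not reproduced here. */
#define TOY_OFF_CSUM	0x01u	/* checksum offload */
#define TOY_OFF_TSO	0x02u	/* tcp segmentation offload */
#define TOY_OFF_VLAN	0x04u	/* vlan tag offload */

struct toy_tun {
	uint32_t	caps;	/* offloads userland said it can handle */
};

/* Userland declares what it can handle on packets read from the kernel. */
void
toy_set_caps(struct toy_tun *t, uint32_t caps)
{
	t->caps = caps;
}

/* Kernel to userland: only negotiated offloads may be left pending. */
uint32_t
toy_read_offloads(const struct toy_tun *t, uint32_t wanted)
{
	return wanted & t->caps;
}

/* Userland to kernel: any offload may be requested at any time. */
uint32_t
toy_write_offloads(const struct toy_tun *t, uint32_t requested)
{
	(void)t;	/* the negotiated set does not restrict writes */
	return requested;
}
```

In this model, offloads masked out on the read side would have to be completed before the packet is handed over; how the kernel does that is outside the sketch.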
9 days	revert tun(4) changes for now, breaks in kdump build (TUNSCAP/TIOCEXT clash)	Stuart Henderson
	tb@ agrees
10 days	provide a way to negotiate network offloads between the kernel and userland.	David Gwynne
	userland can request that network packets that are read from or written to the device special file get prepended with a "tun_hdr" struct. this struct contains bits which say what offloads are requested for the packet, including things like ip/tcp/udp/icmp checksums, tcp segmentation offloads, or ethernet vlan tags. userland can write a packet with any of these offloads requested into the kernel at any time, but has to request which ones it's able to handle coming from the kernel. enabling the tun_hdr struct and selecting which offloads userland can handle is done with a new TUNSCAP ioctl. this is based on the virtio_net_hdr in linux, which jan@ actually implemented and had working with vmd. however, claudio@ and i strongly opposed what feels like a layer violation by pulling virtio structures into the tun driver, and then trying to emulate virtio/linux semantics in our network stack, and playing catch up when the "upstream" projects decide to change the shape or meaning of these bits. tun_hdr is specific to the openbsd network stack and its semantics, which simplifies our kernel implementation. jan has been pretty gracious about the extra work on the vmd side of things. tested by and ok jan@ ok claudio@
12 days	bump the type used to specify traffic queue bandwidth to 64bit.	David Gwynne
	this should let people specify interface and queue bandwidths greater than ~4Gbit. this changes the pf ioctls used to specify queues, so if you want to try this you'll need a new kernel, new headers, and a new pfctl (and systat). or upgrade using a snapshot. the effort and benefit of providing compat isn't worth it. putting it in now so people can kick it around.
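The "~4Gbit" ceiling is the largest value an unsigned 32-bit integer can hold, 4294967295. A short sketch, assuming the bandwidth is stored in bits per second (the commit does not state the unit); the function names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a bandwidth in bits per second fits in a 32-bit unsigned value. */
bool
fits_in_u32(uint64_t bps)
{
	return bps <= UINT32_MAX;
}

/* What a 32-bit field would hold after truncation. */
uint32_t
truncate_to_u32(uint64_t bps)
{
	return (uint32_t)bps;
}
```

A 10 Gbit/s link (10000000000) does not fit, and truncating it to 32 bits leaves 1410065408, roughly 1.4 Gbit/s.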
2024-11-09	remove unused ifq_is_serialized()	Jonathan Gray
	missed when the prototype was removed in ifq.h rev 1.25 ok dlg@
2024-11-08	pf(4) when doing af-to translation for ICMP protocol sends packets	Alexandr Nedvedicky
	with the TTL field set to zero. To fix it, pf_test_state_icmp() must initialize the ttl field in the pf_pdesc structure for the inner packet. feedback from bluhm@ OK bluhm@
2024-11-04	remove unused inline function; ok dlg@	Jonathan Gray

2024-11-01	remove unused local variable	Jonathan Gray

2024-10-31	Rewrite mbuf handling in wg(4).	Claudio Jeker
	. Use m_align() to ensure that mbufs are packed towards the end so that additional headers don't require costly m_prepends. . Stop using m_copyback(), the way it was used there was actually wrong, instead just use memcpy since this is just a single mbuf. . Kill all usage of m_calchdrlen(), again this is not needed or can simply be m->m_pkthdr.len = m->m_len since all this code uses a single buffer. . In wg_encap() remove the min() with t->t_mtu when calculating plaintext_len and out_len. The code does not correctly cope with this min() at all with severe consequences. Initial diff by dhill@ who found the m_prepend() issue. Tested by various people. OK dhill@ mvs@ bluhm@ sthen@
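The first point, packing data towards the end of a buffer so that headers can be prepended in place, can be sketched without mbufs. `struct toy_pkt` and its helpers are hypothetical, and unlike the real m_align() this sketch does no alignment rounding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUFSZ 64

/* Hypothetical single-buffer packet; not struct mbuf. */
struct toy_pkt {
	unsigned char	 buf[TOY_BUFSZ];
	unsigned char	*data;	/* start of valid bytes */
	size_t		 len;	/* number of valid bytes */
};

/* Place len bytes of payload at the end of the buffer. */
void
toy_align_end(struct toy_pkt *p, const void *payload, size_t len)
{
	assert(len <= TOY_BUFSZ);
	p->data = p->buf + TOY_BUFSZ - len;
	p->len = len;
	memcpy(p->data, payload, len);
}

/* Bytes free in front of the data. */
size_t
toy_leading_space(const struct toy_pkt *p)
{
	return (size_t)(p->data - p->buf);
}

/* Prepend a header in place; returns -1 if there is no room. */
int
toy_prepend(struct toy_pkt *p, const void *hdr, size_t hlen)
{
	if (hlen > toy_leading_space(p))
		return -1;
	p->data -= hlen;
	p->len += hlen;
	memcpy(p->data, hdr, hlen);
	return 0;
}
```

Because the payload sits at the tail, each prepend only moves the data pointer back; no second buffer or copy of the payload is needed.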
2024-10-31	Drop forgotten backslashes within vxlan_input(). Seems they are left over	Vitaliy Makkoveev
	from macro copy-paste. No functional changes. ok mpi dlg
2024-10-29	move hfsc to using nanoseconds for keeping times.	David Gwynne
	before it was using 256000000 things per second, so this isn't a huge change, but it can use nsecuptime() to get the time. kjc and cheloa like it ok claudio@
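To see why this "isn't a huge change": one old unit is 1000000000 / 256000000 = 3.90625 ns, which is exactly 125/32. A small sketch of the conversion; the function name is illustrative and is not taken from the hfsc code.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a count of 1/256000000 s units to nanoseconds (factor 125/32). */
uint64_t
old_units_to_nsec(uint64_t units)
{
	return units * 125 / 32;
}
```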
2024-10-29	use nsecuptime instead of using nanouptime and doing a bunch of maths.	David Gwynne
	ok claudio@
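The "bunch of maths" being avoided is presumably the usual flattening of a seconds/nanoseconds pair into one 64-bit count. A userland sketch of that conversion; `timespec_to_nsec` is an illustrative name, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Collapse a timespec into a single count of nanoseconds. */
uint64_t
timespec_to_nsec(const struct timespec *ts)
{
	assert(ts->tv_sec >= 0 && ts->tv_nsec >= 0);
	return (uint64_t)ts->tv_sec * 1000000000ULL + (uint64_t)ts->tv_nsec;
}
```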
2024-10-22	correct argument to klist_free(); ok visa@ mvs@	Jonathan Gray

2024-10-17	remove unneeded if_wg.h and pfsync.h includes	Jonathan Gray

2024-10-16	cut tun_init() out, it does pointless work.	David Gwynne
	tun_init turns interface/stack config into a set of flags that tun(4) keeps in tun_softc sc_flags, but never uses. ok miod@ kn@
2024-10-16	remove SIOCSIFDSTADDR from the network ioctls.	David Gwynne
	netintro says it's deprecated, and most of our other drivers are doing fine without it. ok miod@ kn@ patrick@
2024-10-15	remove struct arpreq from net/if_arp.h	Jonathan Gray
	unused since "rewrite to merge arp and routing tables" in CSRG if_ether.c 7.14 (Berkeley) 06/25/91 used by SIOCSARP, SIOCGARP, SIOCDARP, OSIOCGARP ioctls in Net/2 which were removed before 4.4BSD-Lite ok sthen@ who tested this with a ports build
2024-10-13	remove unneeded limits.h and errno.h includes	Jonathan Gray

2024-10-12	remove unneeded rwlock.h include	Jonathan Gray

2024-10-12	remove unneeded time.h include	Jonathan Gray

2024-10-12	remove unneeded percpu.h include	Jonathan Gray

2024-10-10	neuter the tun/tap ioctls that try and modify interface flags.	David Gwynne
	historically there was just tun(4) that supported both layer 3 p2p and ethernet modes, but had to be reconfigured at runtime by userland to properly change the interface type and interface flags. this is obviously not a great idea, mostly because a lot of stack behaviour around address management makes assumptions based on these parameters, and changing them at runtime confuses things. splitting tun so ethernet was handled by a specific tap(4) driver was a first step at locking this down. this takes a further step by restricting userland's ability to reconfigure the interface flags, specifically IFF_BROADCAST, IFF_MULTICAST, and IFF_POINTOPOINT. this change lets userland pass those values via the ioctls, but only if they match the current set of flags on the interface. these flags are set appropriately for the type of interface when it's created, but should not be changed afterwards. nothing in base uses these ioctls, so the only fall out will be from ports doing weird things. ok claudio@ kn@
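The "only if they match" rule amounts to comparing the protected bits of the request against the interface's current flags. A minimal sketch with hypothetical `TOY_IFF_*` values (the real IFF_* constants live in `<net/if.h>`); letting other bits through unchanged is an assumption of the sketch, not something the commit states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values; the real IFF_* constants live in <net/if.h>. */
#define TOY_IFF_BROADCAST	0x1u
#define TOY_IFF_POINTOPOINT	0x2u
#define TOY_IFF_MULTICAST	0x4u
#define TOY_IFF_LOCKED \
	(TOY_IFF_BROADCAST | TOY_IFF_POINTOPOINT | TOY_IFF_MULTICAST)

/* Accept a request only if it leaves the locked flags as they are. */
bool
toy_flags_ok(unsigned int current, unsigned int requested)
{
	return (requested & TOY_IFF_LOCKED) == (current & TOY_IFF_LOCKED);
}
```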
2024-09-27	Previous pipex.c,v 1.155 was broken if the client was not behind a NAT.	YASUOKA Masahiko
	ok mvs
2024-09-20	remove unneeded semicolons; checked by millert@	Jonathan Gray

2024-09-09	Don't take netlock while setting `if_description'.	Vitaliy Makkoveev
	net/if_pppx.c is the only place where `if_description' is accessed outside the ifioctl() path and there is no reason to take netlock here. The SIOCSIFDESCR case of ifioctl() modifies `if_description' with only the kernel lock held. ok bluhm
2024-09-07	fix RBT_ENTRY in pf_state and pf_state_key	aisha
	ok sashan@
2024-09-04	Fix some spelling.	Marcus Glocker
	Input and ok jmc@, jsg@
2024-09-01	spelling; checked by jmc@, ok miod@ mglocker@ krw@	Jonathan Gray

2024-08-31	add rport(4) for p2p l3 connectivity between route domains.	David Gwynne
	you can basically plug rdomains together and route between them over rport interfaces. people keep asking me if this is so you can leak routes between rdomains, and the answer is yes. this is like pair(4) but cheaper because it avoids all the mucking around with putting an ethernet header on the mbuf just to take it off again later, and is more efficient with address space because it's a p2p ip interface. it has a small tweak from mvs@ ok denis@ claudio@
2024-08-27	remove some dead code that wasn't cleaned up	aisha
	ok sashan
2024-08-20	Unlock etherip_sysctl().	Vitaliy Makkoveev
	- ETHERIPCTL_ALLOW - atomically accessed integer; - ETHERIPCTL_STATS - per-CPU counters ok bluhm
2024-08-17	Allow PPP interface to run in an rdomain and get a default route installed	Denis Fondras
	in the same routing domain Input and OK claudio@
2024-08-15	add BIOCSETFNR, which is like BIOCSETF but doesn't reset the buffer or stats.	David Gwynne
	from Matthew Luckie <mjl@luckie.org.nz> via tech@ deraadt@ likes it.
2024-08-12	Prepare bpf_sysctl() for upcoming net_sysctl() unlocking.	Vitaliy Makkoveev
	Both NET_BPF_MAXBUFSIZE and NET_BPF_BUFSIZE (`bpf_maxbufsize' and `bpf_bufsize' respectively) are atomically accessed integers. No locks required to modify them. ok bluhm
2024-08-06	Unlock sysctl net.inet.ip.directed-broadcast.	Alexander Bluhm
	ip_directedbcast is read once in either ip_input() or pf_test() during packet processing. So writing the variable does not need net lock. OK mvs@
2024-08-05	restrict the maximum wait time you can set via BIOCSWTIMEOUT to 5 minutes.	David Gwynne
	this avoids passing excessively large values to timeout_add_nsec. Reported-by: syzbot+f650785d4f2b3fe28284@syzkaller.appspotmail.com
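A bound like this is typically enforced by validating the requested timeout before it is converted and handed to the timer code. The sketch below assumes the timeout arrives as a struct timeval and that out-of-range values are rejected rather than clamped; both are assumptions, and `toy_timeout_ok` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/time.h>

#define TOY_MAX_WAIT_SECS	(5 * 60)	/* 5 minutes, from the commit message */

/* True if tv is a well-formed, non-negative timeout of at most 5 minutes. */
bool
toy_timeout_ok(const struct timeval *tv)
{
	if (tv->tv_sec < 0 || tv->tv_usec < 0 || tv->tv_usec >= 1000000)
		return false;
	if (tv->tv_sec > TOY_MAX_WAIT_SECS)
		return false;
	if (tv->tv_sec == TOY_MAX_WAIT_SECS && tv->tv_usec > 0)
		return false;
	return true;
}
```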
2024-08-05	Fix bridging IPv6 fragments with pf reassembly.	Alexander Bluhm
	Sending IPv6 fragments over a bridge with pf did not work. During input pf reassembles the packet, and at bridge output it should be refragmented. This is only done for PF_FWD direction, but bridge(4) and veb(4) called pf_test() with PF_OUT argument. OK sashan@
2024-07-30	Exports the statistics when PIPEXDSESSION. Found by ymatsui at iij.	YASUOKA Masahiko
	ok mvs
2024-07-26	Mark ipsecflowinfo immutable.	YASUOKA Masahiko
	ok mvs
2024-07-26	In pipex_l2tp_input(), check if ipsecflowinfo is not changed instead	YASUOKA Masahiko
	of updating it blindly. ok mvs
2024-07-23	Accept and ignore SADB_X_EXT_REPLAY and SADB_X_EXT_COUNTER payloads for	Tobias Heider
	incoming SADB_ADD and SADB_UPDATE messages. Since we send them as part of the SADB_GET reply we must also accept them on SADB_ADD/UPDATE as sasyncd will forward payloads previously received in SADB_GET. Fixes a bug where sasyncd can't restore SAs because pfkey returns EINVAL. From Rafał Ramocki ok bluhm@
2024-07-18	In pfattach() pass malloc type instead of flags to cpumem_malloc().	Alexander Bluhm
	from markus@
2024-07-14	Unlock IPv6 sysctl net.inet6.ip6.forwarding from net lock.	Alexander Bluhm
	Use atomic operations to read ip6_forwarding while processing packets in the network stack. To make clear where actually the router property is needed, use the i_am_router variable based on ip6_forwarding. It already existed in nd6_nbr. Move i_am_router setting up the call stack until all users are independent. The forwarding decisions in pf_test, pf_refragment6, ip6_input do also not interfere. Use a new array ipv6ctl_vars_unlocked to make transition of all the integer sysctls easier. Adapt IPv4 to the new style. OK mvs@
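The idea of reading the variable once and passing the result down the call stack can be sketched in userland with C11 atomics. This is not the kernel code: `toy_forwarding` and the `toy_` functions are hypothetical, and the kernel uses its own atomic primitives rather than `<stdatomic.h>`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a sysctl-controlled integer. */
atomic_int toy_forwarding;

/* Sysctl side: publish a new value without taking any lock. */
void
toy_set_forwarding(int v)
{
	atomic_store(&toy_forwarding, v);
}

/* Lower layer: takes the decision as an argument instead of re-reading. */
bool
toy_should_forward(bool i_am_router, bool for_us)
{
	return i_am_router && !for_us;
}

/* Packet path: read the variable once and pass the result down. */
bool
toy_input(bool for_us)
{
	bool i_am_router = atomic_load(&toy_forwarding) != 0;

	return toy_should_forward(i_am_router, for_us);
}
```

Since every callee sees the same snapshot, a concurrent write from the sysctl side cannot make one packet observe two different answers.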
2024-07-12	Switch `so_snd' of udp(4) sockets to the new locking scheme.	Vitaliy Makkoveev
	udp_send() and following udp{,6}_output() do not append packets to `so_snd' socket buffer. This means the sosend() and sosplice() sending paths are dummy pru_send() and there are no problems to simultaneously run them on the same socket. Push shared solock() deep down to sosend() and take it only around pru_send(), but keep somove() running under exclusive solock(). Since sosend() doesn't modify `so_snd' the unlocked `so_snd' space checks within somove() are safe. Corresponding `sb_state' and `sb_flags' modifications are protected by `sb_mtx' mutex(9). Tested and OK bluhm.