src - OpenBSD base system

Age	Commit message	Author
2020-09-01	Convert *_sysctl in ipsec_input.c to sysctl_bounded_arr	gnezdo
	The best-guessed limits will be tested by trial.
2020-09-01	Convert icmp6_sysctl to sysctl_bounded_args	gnezdo
	The best-guessed limits will be tested by trial.
2020-08-24	Convert divert*_sysctl to sysctl_bounded_args	gnezdo
	OK sashan
2020-08-22	Convert icmp_sysctl to sysctl_bounded_args	gnezdo
	... these all look fine, deraadt@
2020-08-22	Convert ip_sysctl to sysctl_bounded_args	gnezdo

2020-08-22	Convert udp_sysctl to sysctl_bounded_args	gnezdo

2020-08-18	Style fixups from hurried commits	gnezdo
	Thanks kettenis@ for pointing out. ok kettenis@
2020-08-18	Convert tcp_sysctl to sysctl_bounded_args	gnezdo
	This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn. ok deraadt@
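The bounds checking mentioned above can be pictured with a minimal sketch. This is not the kernel's sysctl_bounded_args; the struct, the function name, and the choice of EINVAL as the error are assumptions made for illustration only.

```c
#include <errno.h>

/* Hypothetical bounded variable; not the kernel's sysctl_bounded_args. */
struct bounded_var {
	int *var;	/* variable to update */
	int  minimum;	/* smallest accepted value */
	int  maximum;	/* largest accepted value */
};

/*
 * Store `val` if it lies within [minimum, maximum].
 * Returns 0 on success; EINVAL (an assumed error code) otherwise,
 * leaving the variable untouched.
 */
int
bounded_var_set(const struct bounded_var *bv, int val)
{
	if (val < bv->minimum || val > bv->maximum)
		return EINVAL;
	*bv->var = val;
	return 0;
}
```

With a table of such entries, each variable's limits sit next to its declaration instead of being scattered through a switch statement.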
2020-08-17	Simplify igmp_sysctl to directly return error in default case	gnezdo
	This replaces a piece of observationally identical code which was much more complicated. ok mpi@
2020-08-08	No longer prevent TCP connections to IPv6 anycast addresses.	Florian Obser
	RFC 4291 dropped this requirement from RFC 3513: "An anycast address must not be used as the source address of an IPv6 packet." And from that requirement draft-itojun-ipv6-tcp-to-anycast rightly concluded that TCP connections must be prevented. The draft also states: "The proposed method MUST be removed when one of the following events happens in the future: Restriction imposed on IPv6 anycast address is loosened, so that anycast address can be placed into source address field of the IPv6 header[...]" OK jca
2020-08-05	Don't compare pointers against zero.	Marcus Glocker
	Reported by Peter J. Philipp. ok mvs@ deraadt@
2020-08-01	Move range check inside sysctl_int_arr	gnezdo
	Range violations are now consistently reported as EOPNOTSUPP. Previously they were mixed with ENOPROTOOPT. OK kn@
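A rough sketch of the idea, assuming the "range" here is the index into a table of integer variables. The function below is hypothetical and is not the kernel's sysctl_int_arr; it only shows the lookup helper itself rejecting an out-of-range index with one consistent error, rather than leaving that check to each caller.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: read entry `idx` from a table of `n` int pointers.
 * The range check lives inside the helper, so every caller reports an
 * out-of-range index the same way (EOPNOTSUPP).
 */
int
int_arr_get(int *const *tab, size_t n, size_t idx, int *out)
{
	if (idx >= n || tab[idx] == NULL)
		return EOPNOTSUPP;
	*out = *tab[idx];
	return 0;
}
```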
2020-07-28	Don't treat it as an error if carppeer is a unicast address and the peer is down.	YASUOKA Masahiko
	ok kn
2020-07-28	After the previous commit, src/regress/sys/netinet/carp triggered	Alexander Bluhm
	a uvm fault. Check that ifp0 is not NULL. OK sashan@ mvs@
2020-07-24	netinet: tcp_close(): delay reaper timeout by one tick	cheloha
	Zero-tick timeouts rely on implicit behavior in the timeout layer that inhibits optimizations in softclock(). bluhm@ says waiting a tick for the reaper shouldn't break anything. ok bluhm@
2020-07-24	Use interface index instead of pointer to `ifnet' in carp(4).	mvs
	ok sashan@
2020-07-22	deprecate interface input handler lists, just use one input function.	David Gwynne
	the interface input handler lists were originally set up to help us during the initial mpsafe network stack work. at the time not all the virtual ethernet interfaces (vlan, svlan, bridge, trunk, etc) were mpsafe, so we wanted a way to avoid them by default, and only take the kernel lock hit when they were specifically enabled on the interface. since then, they have been fixed up to be mpsafe. i could leave the list in place, but it has some semantic problems. because virtual interfaces filter packets based on the order they were attached to the parent interface, you can get packets taken away in surprising ways, especially when you reboot and netstart does something different to what you did by hand. by hardcoding the order that things like vlan and bridge get to look at packets, we can document the behaviour and get consistency. it also means we can get rid of a use of SRPs which were difficult to replace with SMRs. the interface input handler list is an SRPL, which we would like to deprecate. it turns out that you can sleep during stack processing, which you're not supposed to do with SRPs or SMRs, but SRPs are a lot more forgiving and it worked. lastly, it turns out that this code is faster than the input list handling, so lots of winning all around. special thanks to hrvoje popovski and aaron bieber for testing. this has been in snaps as part of a larger diff for over a week.
2020-07-22	move carp_input into ether_input, instead of via an input handler.	David Gwynne
	carp_input is only tried after vlan and bridge handling is done, and after the ethernet packet doesn't match the parent interface's mac address. this has been in snaps as part of a larger diff for over a week.
2020-07-22	add code to coordinate how bridges attach to ethernet interfaces.	David Gwynne
	this is the first step in refactoring how ethernet frames are demuxed by virtual interfaces, and also in deprecating interface input list handling. we now have drivers for three types of virtual bridges, bridge(4), switch(4), and tpmr(4), and it doesn't make sense for any of them to be enabled on the same "port" interfaces at the same time. currently you can add a port interface to multiple types of bridge, but which one gets to steal the packets depends on the order in which they were attached. this creates an ether_brport structure that holds an input function for the bridge, and optionally some per port state that the bridge can use. arpcom has a single pointer to one of these structs that will be used during normal ether_input processing to see if a packet should be passed to a bridge, and will be used instead of an if input handler. because it is a single pointer, it will make sure only one bridge of any type is attached to a port at any one time. this has been in snaps as part of a larger diff for over a week.
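The single-pointer arrangement described above can be sketched as follows. All names here (struct brport, struct port, port_set_brport and so on) are hypothetical stand-ins rather than the kernel's ether_brport or arpcom definitions, and returning EBUSY for an occupied port is an assumption.

```c
#include <errno.h>
#include <stddef.h>

struct packet;	/* opaque for this sketch */

/* Hypothetical stand-in for the bridge port structure described above. */
struct brport {
	/* Returns NULL if the bridge consumed the packet. */
	struct packet *(*input)(struct packet *, void *);
	void *arg;	/* optional per-port bridge state */
};

/* Hypothetical stand-in for the per-interface ethernet state. */
struct port {
	const struct brport *brport;	/* at most one bridge per port */
};

/* Attach a bridge; fails if any bridge already owns the port. */
int
port_set_brport(struct port *p, const struct brport *eb)
{
	if (p->brport != NULL)
		return EBUSY;	/* assumed error code */
	p->brport = eb;
	return 0;
}

/* Detach whichever bridge currently owns the port. */
void
port_clear_brport(struct port *p)
{
	p->brport = NULL;
}

/* Input path: offer the packet to the bridge, if one is attached. */
struct packet *
port_input(struct port *p, struct packet *m)
{
	if (p->brport != NULL)
		return p->brport->input(m, p->brport->arg);
	return m;
}
```

Because there is only one slot, attach order can no longer decide which bridge sees a packet: a second attach simply fails.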
2020-06-24	kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)	cheloha
	time_second(9) and time_uptime(9) are widely used in the kernel to quickly get the system UTC or system uptime as a time_t. However, time_t is 64-bit everywhere, so it is not generally safe to use them on 32-bit platforms: you have a split-read problem if your hardware cannot perform atomic 64-bit reads. This patch replaces time_second(9) with gettime(9), a safer successor interface, throughout the kernel. Similarly, time_uptime(9) is replaced with getuptime(9). There is a performance cost on 32-bit platforms in exchange for eliminating the split-read problem: instead of two register reads you now have a lockless read loop to pull the values from the timehands. This is really not too bad in the grand scheme of things, but compared to what we were doing before it is several times slower. There is no performance cost on 64-bit (__LP64__) platforms. With input from visa@, dlg@, and tedu@. Several bugs squashed by visa@. ok kettenis@
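To make the split-read problem and the lockless read loop concrete, here is a minimal sketch using C11 atomics and a generation counter. It assumes a single writer, and the struct and function names are hypothetical; it is not the kernel's timehands code.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared clock state; not the kernel's timehands layout. */
struct clock_snapshot {
	atomic_uint gen;	/* odd while the writer is mid-update */
	atomic_uint lo;		/* low 32 bits of the 64-bit time */
	atomic_uint hi;		/* high 32 bits of the 64-bit time */
};

/* Single writer: make gen odd, store both halves, make gen even again. */
void
clock_store(struct clock_snapshot *cs, int64_t t)
{
	unsigned int g = atomic_load(&cs->gen);

	atomic_store(&cs->gen, g + 1);
	atomic_store(&cs->lo, (uint32_t)t);
	atomic_store(&cs->hi, (uint32_t)((uint64_t)t >> 32));
	atomic_store(&cs->gen, g + 2);
}

/* Reader: retry until both halves were read under one even generation. */
int64_t
clock_load(struct clock_snapshot *cs)
{
	unsigned int g;
	uint32_t lo, hi;

	do {
		g = atomic_load(&cs->gen);
		lo = atomic_load(&cs->lo);
		hi = atomic_load(&cs->hi);
	} while ((g & 1) != 0 || g != atomic_load(&cs->gen));

	return (int64_t)(((uint64_t)hi << 32) | lo);
}
```

A plain two-word read could pair a new low half with an old high half; the retry loop trades a few extra loads for a consistent value, which matches the cost described above for 32-bit platforms.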
2020-06-21	wrap a long line. no functional change.	David Gwynne

2020-06-21	if an inp_upcall is set, let it look at and maybe steal the udp packet.	David Gwynne
	i wrote the original version of this, but it was tweaked by Matt Dunwoodie and Jason A. Donenfeld for use with wireguard.
2020-06-21	knf: the inp_upcall line was too long.	David Gwynne

2020-06-21	add an inp_upcall function pointer and inp_upcall_arg to struct in_pcb.	David Gwynne
	this is so protocols (eg, udp) can let things (eg, kernel support for wireguard or vxlan or geneve) look at and possibly steal packets before they get added to a socket buffer. i wrote the original version of this, but it was tweaked by Matt Dunwoodie and Jason A. Donenfeld for use with wireguard.
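The upcall pattern described above might look roughly like this. The struct layout, the function names, and the "return NULL to steal" convention are assumptions for illustration, not the actual struct in_pcb interface.

```c
#include <stddef.h>

struct pkt {
	size_t len;
};

/* Hypothetical PCB with an optional upcall, loosely modelled on the text. */
struct pcb {
	/* Returns NULL if the upcall took ownership of the packet. */
	struct pkt *(*upcall)(void *arg, struct pkt *p);
	void *upcall_arg;
	size_t queued;	/* packets appended to the socket buffer */
};

/* Example upcall: steal every packet and count it in *arg. */
struct pkt *
count_and_steal(void *arg, struct pkt *p)
{
	(void)p;
	(*(size_t *)arg)++;
	return NULL;
}

/* Deliver a packet: the upcall gets first refusal, then it is queued. */
void
pcb_deliver(struct pcb *inp, struct pkt *p)
{
	if (inp->upcall != NULL) {
		p = inp->upcall(inp->upcall_arg, p);
		if (p == NULL)
			return;	/* stolen before reaching the socket buffer */
	}
	inp->queued++;
}
```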
2020-06-19	Break a glass ceiling on cwnd due to integer division during congestion	Richard Procter
	avoidance. The problem and fix is noted in RFC5681 section 3.1, page 7. Report, diff and testing from Brian Brombacher, thanks! Testing and a cosmetic tweak by myself. ok claudio
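For reference, the RFC 5681 section 3.1 rule can be sketched as below. The per-ACK increment is SMSS*SMSS/cwnd, and with integer division it becomes 0 once cwnd exceeds SMSS*SMSS, so the RFC says to round the result up to 1 byte. The function name and the cwnd == 0 guard are assumptions of this sketch, not the committed code.

```c
#include <stdint.h>

/*
 * One congestion-avoidance increment per ACK, following RFC 5681
 * section 3.1: cwnd += SMSS*SMSS/cwnd, rounded up to 1 byte when the
 * integer division yields 0.
 */
uint32_t
cwnd_ca_increment(uint32_t cwnd, uint32_t smss)
{
	uint64_t incr;

	if (cwnd == 0)
		return smss;	/* assumption: avoid dividing by zero */
	incr = (uint64_t)smss * smss / cwnd;
	if (incr == 0)
		incr = 1;	/* without this, cwnd stops growing */
	return cwnd + (uint32_t)incr;
}
```

With SMSS = 1460 the ceiling would sit at 1460 * 1460 = 2131600 bytes; past that point the unrounded increment is always 0.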
2020-06-18	Refuse to set 0 or a negative value for net.inet.tcp.synbucketlimit.	Martin Pieuchot
	Prevent a panic in syn_cache_insert() found by syzbot. Reported-by: syzbot+aee24ad9b7bf5665912d@syzkaller.appspotmail.com ok sashan@, anton@, millert@
2020-05-27	Connectionless sockets like UDP can be re-connected to a different	Alexander Bluhm
	address. In that case, the linking to the pf state must be dissolved as the latter still contains the old address. If it is a divert state, also remove the state as any divert state must be associated with a matching socket. Call pf_remove_divert_state() and pf_inp_unlink() from in_pcbconnect(). reported by Tim Kuijsten; OK sashan@ claudio@
2020-05-27	Document the various flavors of NET_LOCK() and rename the reader version.	Martin Pieuchot
	Since our last concurrency mistake, only the ioctl(2) and sysctl(2) code paths take the reader lock. This is mostly for documentation purposes until the softnet thread is converted back to use a read lock. dlg@ said that comments should be good enough. ok sashan@
2020-05-21	don't count packets in the carp protocol handling against an interface.	David Gwynne
	these packets have generally already been counted on the interface because that's where they were sent or received from. the protocol handling side of things already counts things like packets, which you see with netstat -sp carp.
2020-05-21	implement a carp_transmit that bypasses the ifq on output.	David Gwynne
	this is modelled on vlan_transmit, and basically enqueues the packet directly on the parent interface. even though carp is generally not used to transmit packets, we run dhcp relays on it at work and hit a situation where we unnecessarily dropped packets because its ifq maxlen was 1. i've been running this for a month in production. ok jmatthew@
2020-04-29	remove some trailing whitespace. no functional change.	David Gwynne

2020-04-23	Add support for automatically moving traffic between rdomains on ipsec(4)	tobhe
	encryption or decryption. This allows us to keep plaintext and encrypted network traffic separated and reduces the attack surface for network sidechannel attacks. The only way to reach the inner rdomain from outside is by successful decryption and integrity verification through the responsible Security Association (SA). The only way for internal traffic to get out is getting encrypted and moved through the outgoing SA. Multiple plaintext rdomains can share the same encrypted rdomain while the unencrypted packets are still kept separate. The encrypted and unencrypted rdomains can have different default routes. The rdomains can be configured with the new SADB_X_EXT_RDOMAIN pfkey extension. Each SA (tdb) gets a new attribute 'tdb_rdomain_post'. If this differs from 'tdb_rdomain' then the packet is moved to 'tdb_rdomain_post' after IPsec processing. Flows and outgoing IPsec SAs are installed in the plaintext rdomain, incoming IPsec SAs are installed in the encrypted rdomain. IPCOMP SAs are always installed in the plaintext rdomain. They can be viewed with 'route -T X exec ipsecctl -sa' where X is the rdomain ID. As the kernel does not create encX devices automatically when creating rdomains they have to be added by hand with ifconfig for IPsec to work in non-default rdomains. discussed with chris@ and kn@ ok markus@, patrick@
2020-04-12	Stop processing packets under non-exclusive (read) netlock.	Martin Pieuchot
	Prevent concurrency in the socket layer which is not ready for that. Two recent data corruptions in pfsync(4) and the socket layer pointed out that, at least, tun(4) was incorrectly using NET_RUNLOCK(). Until we find a way in software to avoid future mistakes and to make sure that only the softnet thread and some ioctls are safe to use a read version of the lock, put everything back to the exclusive version. ok stsp@, visa@
2020-03-15	Guard SIOCDELMULTI if_ioctl calls with KERNEL_LOCK() where the call is	Visa Hankala
	made from socket close path. Most device drivers are not MP-safe yet, and the closing of AF_INET and AF_INET6 sockets is no longer under the kernel lock. This fixes a panic seen by jcs@. OK mpi@
2020-03-06	Fix uninitialized use of variable 'len'.	tobhe
	ok bluhm@
2020-01-26	add define for IPTOS_DSCP_LE; "low effort" DSCP codepoint standardised	Damien Miller
	in RFC8622; ok job@
2019-12-23	rdr-to with loopback destination should work even though	Alexandr Nedvedicky
	IP forwarding is disabled. Issue reported by Daniel Jakots (danj@) OK bluhm@
2019-12-10	Make bundled IPcomp/ESP policies work with IPSEC_LEVEL_REQUIRE.	tobhe
	We only install flows for IPcomp. When processing an incoming ESP SA, look for a bundled IPcomp SA and use that in the policy check. ok bluhm@
2019-12-09	always pull in if_types.h, to unbreak ramdisks	Theo de Raadt

2019-12-08	Make sure packet destination address matches interface address,	Alexandr Nedvedicky
	where such packet is bound to. This check is enforced if and only if IP forwarding is disabled. Change discussed with bluhm@, claudio@, deraadt@, markus@, tobhe@ OK bluhm@, claudio@, tobhe@
2019-12-06	Checking the IPsec policy is expensive. Check only when IPsec is used.	tobhe
	ok bluhm@
2019-12-01	Don't require a valid sa_len for a bunch of IPv4 "get" ioctls	Jeremie Courreges-Anglas
	Same fix as for the IPv6 case. Fixes a regression in ports/net/openvpn spotted by landry@, ok bluhm@
2019-11-29	Change the default security level for incoming IPsec flows from	tobhe
	isakmpd and iked to REQUIRE. Filter policy violations earlier. ok sashan@ bluhm@
2019-11-28	Although ifconfig(8) checks it already, enforce contiguous inet	Alexander Bluhm
	netmask in the kernel. OK visa@
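One common way to test for a contiguous netmask is shown below as a sketch; the function name is hypothetical and the kernel's actual check may be written differently. It assumes the mask is already in host byte order.

```c
#include <stdint.h>

/*
 * Return 1 if `mask` (host byte order) is a run of one bits followed
 * only by zero bits, 0 otherwise.
 */
int
netmask_is_contiguous(uint32_t mask)
{
	uint32_t inv = ~mask;	/* contiguous mask -> inv has the form 2^k - 1 */

	/* Adding 1 to 2^k - 1 clears all set bits, so the AND is 0. */
	return (inv & (inv + 1)) == 0;
}
```

For example, 255.255.255.0 (0xffffff00) passes, while 255.0.255.0 (0xff00ff00) is rejected.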
2019-11-13	Add DoT 853 to DEFBADDYNAMICPORTS_TCP. This port will be increasingly	Theo de Raadt
	unfiltered in the future, so this prevents rresvport_af(3) from randomly exposing a service intended for local visibility only. ok florian
2019-11-11	Prevent underflows in tp->snd_wnd if the remote side ACKs more than	Alexander Bluhm
	tp->snd_wnd. This can happen, for example, when the remote side responds to a window probe by ACKing the one byte it contains. from FreeBSD; via markus@; OK sashan@ tobhe@
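The underflow arises because the send window is an unsigned quantity: subtracting more than it holds wraps around to a huge value. A minimal sketch of a clamped update follows; the function name is hypothetical and this is not the committed diff.

```c
#include <stdint.h>

/* Shrink an unsigned send window by `acked` bytes without wrapping. */
uint32_t
snd_wnd_consume(uint32_t snd_wnd, uint32_t acked)
{
	if (acked > snd_wnd)
		return 0;	/* e.g. the ACK of a 1-byte window probe */
	return snd_wnd - acked;
}
```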
2019-11-08	avoid being too clever about setting/clearing ifpromisc on the parent.	David Gwynne
	ifpromisc() already refcounts, so carp doesn't have to do it implicitly with the carpdev list. there's no functional change, the code just gets a bit simpler.
2019-11-08	convert interface address change hooks to tasks and a task_list.	David Gwynne
	this follows what's been done for detach and link state hooks, and makes handling of hooks generally more robust. address hooks are a bit different to detach/link state hooks in that there's only a few things that register hooks (carp, pf, vxlan), but a lot of places to run the hooks (lots of ipv4 and ipv6 address configuration). an address hook cookie was in struct pfi_kif, which is part of the pf abi. rather than break pfctl -sI, this maintains the void * used for the cookie and uses it to store a task, which is then used as intended with the new api.
2019-11-07	Do proper kernel input validation for in_control() ioctl(2)	Alexander Bluhm
	SIOCGIFADDR, SIOCGIFNETMASK, SIOCGIFDSTADDR, SIOCGIFBRDADDR, SIOCSIFADDR, SIOCSIFNETMASK, SIOCSIFDSTADDR, and SIOCSIFBRDADDR. Name in_ioctl_set_ifaddr() consistently. Use in_sa2sin() to validate inet address. Combine if_addrlist loops and add comment. Although netmask is not an inet address, length must be valid. Reported-by: syzbot+5fc6da002fc4e8d994be@syzkaller.appspotmail.com OK visa@
2019-11-07	Avoid NULL dereference in arpinvalidate() and nd6_invalidate() by	Kenneth R Westerback
	making RTM_INVALIDATE code path perform same check as RTM_DELETE does. ok mpi@