src - OpenBSD base system

Age	Commit message	Author
2022-07-14	Protect all writers to ifm_cur with a mutex. ifmedia_match() does	Alexander Bluhm
	not return any pointers without lock anymore. OK mvs@ mbuhl@
2022-07-14	Turn pppoe(4) back to kernel lock. We can't predict netlock state within	Vitaliy Makkoveev
	pppoe_start(), so we can't use it for pppoe(4) data protection. Except for the input path, pppoe(4) is always accessed with the kernel lock held, so grab it around pppoeintr() too. Interfaces should not use netlock for their data protection. They should rely on the kernel lock or implement their own. ok bluhm@ bket@
2022-07-14	Replace tabs by spaces after "#define". No functional changes, just	Vitaliy Makkoveev
	prevent future diffs from being ugly. ok bluhm@
2022-07-12	Use __func__ in interface media debug printf().	Alexander Bluhm

2022-07-12	Protect interface media list with a mutex. This is just a start	Alexander Bluhm
	to make media structures MP safe. OK mvs@
2022-07-12	Remove PIPEXCSESSION pipex(4) ioctl(2) command from kernel and man page.	Vitaliy Makkoveev
	A long time ago a pipex(4) session couldn't be deleted until both the pipex(4) input and output queues became empty. Dead sessions were linked to the stack and the `ip_forward' flag was used to prevent packet forwarding. npppd(8) marked such sessions by doing a PIPEXCSESSION ioctl(2) call. But since we started to unlink closed sessions from the stack, this logic became unnecessary. Also a pipex(4) session can be closed just after the close request. npppd(8) was the only userland program which did the PIPEXCSESSION ioctl(2) call, and we removed it a week ago. It's time to remove the remains. Now the `flags' member of the 'pipex_session' structure has become immutable. ok yasuoka@
2022-07-10	Add missing `pipex_list_mtx' mutex(9) around all sessions loop within	Vitaliy Makkoveev
	pipex_ip_output(). The all sessions loop was reworked to make it possible to drop the lock within it. ok bluhm@ yasuoka@.
2022-07-10	if_detach() should wait until concurrent (*if_qstart)() interface start	Vitaliy Makkoveev
	routines have finished. Call ifq_barrier(9) just after we unlink the dying interface from the stack. From this point it is not accessible by if_get(9) and if_unit(9), and all concurrent threads owning the interface pointer have finished. It is also detached from pseudo drivers like bridge(4). We could only have concurrent (*if_qstart)() handlers running, so wait for them and then continue destruction. Reported and tested by Hrvoje Popovski. ok bluhm@
2022-07-10	Add _cb suffix to callback fields in struct ifmedia. Makes code	Alexander Bluhm
	easier to read and grep as ifm_status was used in both structs ifmediareq and ifmedia with different meaning. OK mvs@
2022-07-09	Fix the error path of the 'SIOCSIFMTU' pppoe_ioctl() case. Return error	Vitaliy Makkoveev
	value if `error' is set instead of continuing to sppp_ioctl(). ok bluhm@
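The error-path pattern described in the entry above can be sketched in userland C. This is only an illustration under stated assumptions: set_mtu_sketch(), generic_ioctl_sketch(), the call counter, and the MTU limit are hypothetical stand-ins, not the pppoe(4) code.

```c
#include <errno.h>

#define MTU_MAX_SKETCH	1492	/* hypothetical limit, for illustration only */

static int generic_calls;	/* counts calls to the generic handler */

/* Hypothetical stand-in for the generic handler reached on success. */
static int
generic_ioctl_sketch(int mtu)
{
	(void)mtu;
	generic_calls++;
	return (0);
}

/* Validate first; hand over to the generic handler only without error. */
static int
set_mtu_sketch(int mtu)
{
	int error = 0;

	if (mtu > MTU_MAX_SKETCH)
		error = EINVAL;
	if (error)
		return (error);	/* stop here instead of falling through */
	return (generic_ioctl_sketch(mtu));
}
```

With this shape a rejected request never reaches the generic handler, so the error cannot be overwritten by a later success.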
2022-07-09	Unwrap klist from struct selinfo as this code no longer uses selwakeup().	Visa Hankala
	OK jsg@
2022-07-05	Remove old poll/select wakeup mechanism.	Visa Hankala
	Also remove unneeded seltrue() and selfalse(). OK mpi@ jsg@
2022-07-02	Remove unused device poll functions.	Visa Hankala
	Also remove unneeded includes of <sys/poll.h> and <sys/select.h>. Some addenda from jsg@. OK miod@ mpi@
2022-06-29	Between the calls to art_match() and SRPL_FIRST() another CPU may	Alexander Bluhm
	remove the route from the list. In rtable_match() check if the route entry is NULL. discussed with mpi@ jmatthew@ claudio@; OK mpi@
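A minimal single-threaded sketch of the check described above: the lookup succeeds, but the first list entry may already be gone, so it has to be tested before use. All names here (node_sketch, route_sketch, match_route()) are hypothetical; the real code uses art_match() and SRPL_FIRST(), which are not reproduced.

```c
#include <stddef.h>

struct route_sketch {
	struct route_sketch *next;
	int id;
	int refs;	/* references handed out to callers */
};

/* A matched node; another CPU may have emptied its list after the match. */
struct node_sketch {
	struct route_sketch *first;
};

static struct route_sketch *
match_route(struct node_sketch *node)
{
	struct route_sketch *rt;

	if (node == NULL)
		return (NULL);
	rt = node->first;
	if (rt == NULL)		/* route vanished after the match */
		return (NULL);
	rt->refs++;		/* safe only because rt was checked */
	return (rt);
}
```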
2022-06-29	Remove switch(4) remains.	Vitaliy Makkoveev
	ok claudio@ mpi@
2022-06-29	ether_input() called with shared netlock, but pppoe(4) wants it to be	Vitaliy Makkoveev
	exclusive. Do the pppoe(4) input within the netisr handler with exclusive netlock held and remove the kernel lock hack from ether_input(). This is a step back, but it makes the ether_input() path better than it is now. Tested by Hrvoje Popovski. ok bluhm@ claudio@
2022-06-28	Don't call pipex_rele_session() when `session' is NULL.	Vitaliy Makkoveev
	Reported by Hrvoje Popovski. ok bluhm@
2022-06-28	fix syncookies in conjunction with tcp fast port reuse.	Henning Brauer
	This really pointed out that the place syncookies were hooked in was almost, but not completely right. The way it was the special case for tcp fast port reuse in pf_test_state wasn't hit, because the first packet hitting that was the ACK from the peer finishing the 3WHS, and the reconstructed SYN came after. We're now doing pf_find_state (and only that) first, then syncookies, then going on so that the old state is thrown away properly and we get a new one with the sequence number modulator set up correctly. Bonus: -11 lines of code. tracked down (that took a while) + fixed under contract with Hush Communications Canada; special thanks to Lyndon. ok sashan
2022-06-28	Use refcnt API for struct rtentry instead of hand-crafted atomic	Alexander Bluhm
	operations. OK mvs@
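The idea of wrapping reference counting in a small API instead of open-coding atomic operations at each call site can be sketched with C11 atomics. This is a userland analogue under assumptions: the *_sketch names are hypothetical and the code is not the kernel's refcnt implementation.

```c
#include <stdatomic.h>

/* Hypothetical userland analogue of a reference counter API. */
struct refcnt_sketch {
	atomic_uint r_refs;
};

static void
refcnt_sketch_init(struct refcnt_sketch *r)
{
	atomic_init(&r->r_refs, 1);	/* creator holds the first reference */
}

static void
refcnt_sketch_take(struct refcnt_sketch *r)
{
	atomic_fetch_add(&r->r_refs, 1);
}

/* Drop one reference; returns 1 when the caller released the last one. */
static int
refcnt_sketch_rele(struct refcnt_sketch *r)
{
	return (atomic_fetch_sub(&r->r_refs, 1) == 1);
}

/* Hypothetical route entry embedding the counter. */
struct rtentry_sketch {
	struct refcnt_sketch rt_refcnt;
	/* ... routing data would live here ... */
};
```

Callers then only see take and release operations, and the release result tells them when the object may be freed.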
2022-06-28	ifconfig(8) returns "Not supported" if you try to configure tso on a non-tso	Jan Klemkow
	supported interface. pointed out by bluhm@ OK bluhm@
2022-06-28	Introduce `pipexoutq' mbuf(9) queue, and put outgoing pipex(4) related	Vitaliy Makkoveev
	PPPOE packets within. Do (if_output)() calls within netisr handler with netlock held. We can't predict netlock state when pipex(4) related (if_qstart)() handlers called. This means we can't use netlock within pppac_qstart() and pppx_if_qstart() handlers. ok bluhm@
2022-06-27	Rework the rttimer code. Instead of a global queue and a global timeout	Claudio Jeker
	use a per rttimer struct timeout. On enqueue the struct rttimer belongs to the timeout, in case the route is removed before the timer fires cleanup based on the timeout_del() return value. If the timeout currently running then just clear the rtt_rt pointer and let the timeout handle the cleanup. This should hopefully fix the icmp_pmtu_timeout crashes reported by some people. OK bluhm@
2022-06-27	Push the kernel lock down into arpresolve(). We still need it to	Alexander Bluhm
	prevent concurrent access to rt_llinfo from rtrequest_delete(). But the common case, when the MAC address is already known, works without lock. tested by Hrvoje Popovski; OK mvs@
2022-06-27	Fix white space and wrap long lines.	Alexander Bluhm

2022-06-27	Introduce Large Receive Offloading of TCP segment offloading for ix(4). It is	Jan Klemkow
	disabled by default. Also add a tso option to ifconfig(8) to enable and disable this feature. ok deraadt
2022-06-27	Don't copy more than sa_len from the sockaddr to the sysctl / rt msg buffer.	Claudio Jeker
	In the rt msg buffer the size of the full buffer is calculated first then filled out after allocating the mbuf. In the sysctl code this is not needed since the buffer is already provided. OK mvs@
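A minimal sketch of copying no more than sa_len bytes of an address into a caller-provided buffer. The struct and the helper copy_sockaddr_bounded() are hypothetical; the sketch assumes sa_len never exceeds the size of the object it describes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical BSD-style address with a leading length byte. */
struct sockaddr_sketch {
	unsigned char sa_len;		/* valid bytes in this address */
	unsigned char sa_family;
	char sa_data[14];
};

/*
 * Copy at most sa_len bytes of `sa' into `buf', and never more than the
 * buffer holds. Returns the number of bytes copied. The caller must
 * ensure sa_len does not exceed the real size of the object.
 */
static size_t
copy_sockaddr_bounded(void *buf, size_t buflen,
    const struct sockaddr_sketch *sa)
{
	size_t len = sa->sa_len;

	if (len > buflen)
		len = buflen;
	memcpy(buf, sa, len);
	return (len);
}
```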
2022-06-26	Mark `pipex_enable' as atomic. We never check `pipex_enable' within	Vitaliy Makkoveev
	(*if_qstart)() and we don't worry that it's not serialized with the rest of the output path. Also we will process already enqueued pipex(4) packets regardless of the `pipex_enable' state. Use the local copy of `pipex_enable' within pppx_if_output(), otherwise we lose consistency. pointed out and ok by bluhm@
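The "local copy" idea from the entry above can be sketched as follows: read the atomic flag once and base every later decision in the function on that snapshot. The names enable_sketch, out_sketch and output_sketch() are hypothetical and the code is a userland illustration with C11 atomics, not pppx_if_output().

```c
#include <stdatomic.h>

static atomic_int enable_sketch;	/* hypothetical global switch */

struct out_sketch {
	int used_fast_path;
	int counted_fast;
};

static struct out_sketch
output_sketch(void)
{
	struct out_sketch o = { 0, 0 };
	int enabled = atomic_load(&enable_sketch);	/* read exactly once */

	if (enabled)
		o.used_fast_path = 1;
	/* ... the global may change here; the snapshot does not ... */
	if (enabled)
		o.counted_fast = 1;
	return (o);
}
```

Both decisions always agree because they test the same local value rather than reloading the global.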
2022-06-26	Don't reset `idle_time' timeout on closed pipex(4) sessions in packet	Vitaliy Makkoveev
	processing path. Such sessions have already reached their time to live timeout, and the garbage collector waits a little before killing them. Otherwise we could make a session's life time longer than PIPEX_CLOSE_TIMEOUT. ok bluhm@
2022-06-26	Don't take kernel lock on pipex(4) pppoe input. This extra serialization	Vitaliy Makkoveev
	is not required. In packet processing path we have shared netlock held, but we do read-only access on per session `flags' and `ifindex'. We always modify them from ioctl(2) path with exclusive netlock held. The rest of pipex(4) session is immutable or uses per-session locks. ok bluhm@
2022-06-26	Fix spacing.	Vitaliy Makkoveev

2022-06-26	Switch walkargs for the buffer size to size_t and change the overflow	Claudio Jeker
	check to the less awkward w->w_needed <= w->w_given. OK bluhm@
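The check named in the entry above can be sketched in isolation. Only the field names w_needed and w_given come from the commit message; the struct layout and the helper walkarg_fits() are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in; only the two field names come from the log. */
struct walkarg_sketch {
	size_t w_needed;	/* bytes the reply requires so far */
	size_t w_given;		/* bytes the caller's buffer provides */
};

/* Account for `len' more bytes and report whether the reply still fits. */
static int
walkarg_fits(struct walkarg_sketch *w, size_t len)
{
	w->w_needed += len;
	return (w->w_needed <= w->w_given);
}
```

Because the needed size keeps accumulating even when the data no longer fits, the total required size remains available to report back to the caller.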
2022-06-26	The "ifq_set_maxlen(..., 1);" hack we use to enforce pipex(4) related	Vitaliy Makkoveev
	(*if_qstart)() be always called with netlock held doesn't work anymore with PPPOE sessions. Introduce `pipex_list_mtx' mutex(9) and use it to protect global pipex(4) lists and radix trees. Protect pipex(4) `session' dereference with reference counters, because we could sleep when accessing pipex(4) from ioctl(2) path, and this is not possible with mutex(9) held. ok bluhm@
2022-06-26	'pipex_mppe' and 'pipex_session' structures have uint16_t bit fields	Vitaliy Makkoveev
	which represent flags. We mix unlocked access to immutable flags with protected access to mutable ones. This could be not MP independent on some architectures, so convert these fields to u_int `flags' variables. ok bluhm@
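A sketch of the representation change only, not of the locking: flags kept as masks in one unsigned int word instead of uint16_t bit fields. The flag names and helpers are hypothetical.

```c
/* Hypothetical flag bits stored in a single unsigned int word. */
#define SFLAG_A_SKETCH	0x01u
#define SFLAG_B_SKETCH	0x02u

struct session_sketch {
	unsigned int flags;	/* replaces a group of 1-bit fields */
};

static void
session_set(struct session_sketch *s, unsigned int mask)
{
	s->flags |= mask;
}

static void
session_clear(struct session_sketch *s, unsigned int mask)
{
	s->flags &= ~mask;
}

static int
session_isset(const struct session_sketch *s, unsigned int mask)
{
	return ((s->flags & mask) != 0);
}
```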
2022-06-26	Allow waiting during ktable allocation in pf_ioctl.	mbuhl
	OK bluhm Reported-by: syzbot+50ea4f33ed5dd9264918@syzkaller.appspotmail.com Reported-by: syzbot+df65f8b7ee8c0089e885@syzkaller.appspotmail.com
2022-06-16	pfctl reports existing table as being added. glitch has	Alexandr Nedvedicky
	been spotted and reported by jmc@ OK kn@
2022-06-16	Mark routes sent via sysctl(2) with RTF_DONE like it is done on the	Claudio Jeker
	route socket. All messages passed are by definition done. This may allow to share more code between sysctl and route socket parsers. OK mpi@
2022-06-13	fix logic bug in pf_find_state()	Henning Brauer
	a state in PFTM_PURGE could potentially hide another state on the same state key that is active and we'd incorrectly block the packet. I believe that cannot happen as things are now. ok sashan
2022-06-07	fixes potential memory leak. if_vinput() should always consume packet	Alexandr Nedvedicky
	by either passing it further or releasing it. OK mvs@
2022-06-07	fixes NULL pointer dereference panic triggered by relayd.	Alexandr Nedvedicky
	same panic can be triggered when address table is part of anchor loaded by 'load anchor ... from ..,' statement. pf_find_or_create_ruleset() function called by pfr_add_tables() must receive ruleset name which comes from pre-allocated root table. OK claudio@ dlg@
2022-06-06	Simplify solock() and sounlock(). There is no reason to return a value	Claudio Jeker
	for the lock operation and to pass a value to the unlock operation. sofree() still needs an extra flag to know if sounlock() should be called or not. But sofree() is called less often and mostly without keeping the lock. OK mpi@ mvs@
2022-06-01	callers to pf(4) must continue to run with packet as returned	Alexandr Nedvedicky
	by firewall. OK dlg@
2022-05-23	In pf the kernel panicked if IP options in packet within ICMP payload	Alexander Bluhm
	were truncated. Drop such packets instead. Reported-by: syzbot+91abd3aa2fdfe900f9ce@syzkaller.appspotmail.com OK sashan@ claudio@
2022-05-23	Fix white space.	Alexander Bluhm

2022-05-18	Remove #ifdef DDB specific includes, added in 1.968 but related code bits	Miod Vallat
	removed in 1.970. ok bluhm@
2022-05-16	pfi_kif_alloc() may be called with M_NOWAIT. Add NULL check to	Alexander Bluhm
	handle malloc(9) failure. from markus@; OK sashan@
2022-05-15	Use strncmp() and IFNAMSIZ for if_xname in veb(4) consistently.	Alexander Bluhm
	OK dlg@
2022-05-15	gcc insists the decl for veb_ports_free also use inline	Theo de Raadt

2022-05-15	avoid calling if_enqueue from an smr critical section.	David Gwynne
	claudio@ is right that as a rule of thumb it is a bad idea to call arbitrary code from an smr crit section because the scope of what is called is very hard to keep in your head. in this particular case sashan@ points out that if_enqueue can call vport handlers, which calls if_vinput, which will push a packet into the network stack, which will call pf and try to take an rwlock. you can't sleep in an smr crit section. SMRs in this situation are protecting references to ports in the list of span and actual ports attached to a veb. when we needed to send an unknown unicast, broadcast, or multicast packet the code would SMR_TAILQ_FOREACH over all the ports, duplicating the mbuf and calling if_enqueue against the port. span port handling is basically the same, but we unconditionally send to them. this replaces the SMR_TAILQ with maps (arrays) of ports. the veb port map data structure contains a struct refcnt and the number of ports. the forwarding paths use an SMR crit section to get a reference to the map, increase the refcnt, and then leave the smr crit section before iterating over the array of ports in the map. after the iteration it releases the refcnt. this does add a couple of atomic ops in the forwarding path, but only in the uncommon case (most packets are (should be) to known unicast addresses), and it's only one set of ops for all ports instead of ops per port. the known unicast case follows this pattern too. reported by Barbaros Bilek on bugs@. fix tested by me and hrvoje popovski. ok claudio@ sashan@ bluhm@ (who also did a lot of the initial analysis)
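The port map described above (a reference counter, a port count, and an array of ports) can be sketched in userland C. This is a single-threaded illustration under assumptions: all names are hypothetical, ports are reduced to integer ids, and the SMR critical section is only marked by comments because it has no userland equivalent here.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical map: one refcount and one count for the whole array. */
struct port_map_sketch {
	atomic_uint refs;
	unsigned int nports;
	int ports[];		/* port ids stand in for port pointers */
};

static struct port_map_sketch *
map_create(const int *ids, unsigned int n)
{
	struct port_map_sketch *m;
	unsigned int i;

	m = malloc(sizeof(*m) + n * sizeof(m->ports[0]));
	if (m == NULL)
		return (NULL);
	atomic_init(&m->refs, 1);	/* reference held by the owner */
	m->nports = n;
	for (i = 0; i < n; i++)
		m->ports[i] = ids[i];
	return (m);
}

/* Drop one reference and free the map with the last one. */
static void
map_rele(struct port_map_sketch *m)
{
	if (atomic_fetch_sub(&m->refs, 1) == 1)
		free(m);
}

/* Flood to every port in the map; returns the number of sends. */
static unsigned int
flood_sketch(struct port_map_sketch *m)
{
	unsigned int i, sent = 0;

	/* kernel: the reference is taken inside the SMR section */
	atomic_fetch_add(&m->refs, 1);
	/* kernel: the SMR section ends here, so the loop may sleep */
	for (i = 0; i < m->nports; i++)
		sent++;		/* stand-in for mbuf duplication and enqueue */
	map_rele(m);		/* one release for all ports */
	return (sent);
}
```

The cost is one take and one release per flooded packet, independent of the number of ports.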
2022-05-14	When receiving a PADO offer, clear stored tags from previous PADO packets.	Tobias Heider
	Keeping and combining tags from multiple previous packets could result in a single accumulated reply overrunning mbuf size limits. Also make sure the tag size fields are reset to 0 if allocation fails. Add size check on mbuf cluster allocation and fail if more than MCLBYTES are requested. From NetBSD. tested by naddy@ ok bluhm@
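The three safeguards named in the entry above (drop the previously stored tag, reject oversized requests, leave the size at 0 on failure) can be sketched with plain malloc(3). The names tag_sketch and tag_store() are hypothetical, and MCLBYTES_SKETCH is an assumed value for illustration, not taken from the source.

```c
#include <stddef.h>
#include <stdlib.h>

#define MCLBYTES_SKETCH	2048	/* assumed cluster size, illustration only */

struct tag_sketch {
	void *data;
	size_t len;
};

/* Replace the stored tag with a fresh buffer of `len' bytes. */
static int
tag_store(struct tag_sketch *t, size_t len)
{
	/* forget whatever an earlier packet left behind */
	free(t->data);
	t->data = NULL;
	t->len = 0;

	if (len > MCLBYTES_SKETCH)
		return (-1);	/* would not fit into one cluster */
	t->data = malloc(len);
	if (t->data == NULL)
		return (-1);	/* len stays 0 on allocation failure */
	t->len = len;
	return (0);
}
```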
2022-05-10	move memory allocations in pfr_add_tables() out of	Alexandr Nedvedicky
	NET_LOCK()/PF_LOCK() scope. bluhm@ helped a lot to put this diff into shape. OK bluhm@