src - OpenBSD base system

Age	Commit message	Author
2021-02-08	Start refcounting interface groups with 1. if_creategroup() returns	Alexander Bluhm
	a new object that is already refcounted, so carp attach does not reach into internal structures. Add kasserts to detect counter overflow or underflow. OK mvs@
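The "start at 1" convention described above can be sketched in userland C. This is a minimal model under stated assumptions, not the kernel code: `struct group`, `group_create()`, `group_ref()` and `group_unref()` are hypothetical names, and plain assert() stands in for the kernel assertions.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical stand-in for an interface group; not the kernel struct. */
struct group {
	unsigned int refcnt;
};

/* The creator gets back an object that already holds one reference. */
struct group *
group_create(void)
{
	struct group *g = calloc(1, sizeof(*g));

	if (g != NULL)
		g->refcnt = 1;
	return g;
}

/* Take an additional reference; the assert catches counter overflow. */
void
group_ref(struct group *g)
{
	assert(g->refcnt < UINT_MAX);
	g->refcnt++;
}

/*
 * Drop a reference; the assert catches underflow.
 * Returns 1 if the last reference went away and the object was freed.
 */
int
group_unref(struct group *g)
{
	assert(g->refcnt > 0);
	if (--g->refcnt == 0) {
		free(g);
		return 1;
	}
	return 0;
}
```

With this shape a caller never has to set or inspect the counter itself: it owns one reference from creation and gives it up with a single unref.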
2021-02-06	Simplex interface sends packet back without hardware checksum	Alexander Bluhm
	offloading. The checksum must be calculated in software. Use the same condition in ether_resolve() to send the broadcast packet back to the stack and in in_ifcap_cksum() to force software checksumming. This fixes regress/sys/kern/sosplice/loop. OK procter@
2021-02-05	Fix whitespace.	Alexander Bluhm

2021-02-04	make if_pfsync.c a better friend with PF_LOCK	Alexandr Nedvedicky
	The code delivered in this change is currently disabled. Brave souls may enable the code by adding -DWITH_PF_LOCK when building customized kernel. Big thanks goes to Hrvoje@ for providing test equipment and testing. As soon as we enter the next release cycle, the WITH_PF_LOCK will be defined as default option for MP kernels. OK dlg@
2021-02-03	change pf_route so pf only runs when packets enter and leave the stack.	David Gwynne
	before this change pf_route operated on the semantic that pf runs when packets go over an interface, so when pf_route changed which interface the packet was on it would run pf_test again. this change changes (restores) the semantic that pf is only supposed to run when packets go in or out of the network stack, even if route-to is responsible for short circuiting past the network stack. just to be clear, for normal packets (ie, those not touched by route-to/reply-to/dup-to), there isn't a difference between running pf when packets enter or leave the stack, or having pf run when a packet goes over an interface. the main reason for this change is that running the same packet through pf multiple times creates confusion for the state table. by default, pf states are floating, meaning that packets are matched to states regardless of which interface they're going over. if a packet leaving on em0 is rerouted out em1, both traversals will end up using the same state, which at best will make the accounting look weird, or at worst fail some checks in the state and get dropped. another reason for this commit is to make handling of the changes that route-to makes consistent with other changes that are made to the packet. eg, when nat is applied to a packet, we don't run pf_test again with the new addresses. the main caveat with this diff is you can't have one rule that pushes a packet out a different interface, and then have a rule on that second interface that NATs the packet. i'm not convinced this ever worked reliably or was used much anyway, so we don't think it's a big concern. discussed with many, with special thanks to bluhm@, sashan@ and sthen@ for weathering most of that pain. ok claudio@ sashan@ jmatthew@
2021-02-01	Netlock should be grabbed before pppx_if_find() call in pppxwrite().	mvs
	Otherwise this `pxi' can be killed by concurrent thread after context switch caused by following netlock. ok yasuoka@
2021-02-01	Remove dummy TUNSIFMODE ioctl(2) call from pppac(4) and npppd(8). Since	mvs
	OpenBSD 6.7 npppd(8) can't work over tun(4). ok yasuoka@
2021-02-01	ifunit() was fully replaced by if_unit(9) and should go away.	mvs
	ok bluhm@ dlg@
2021-02-01	change route-to so it sends packets to IPs instead of interfaces.	David Gwynne
	this is a significant (and breaking) reworking of the policy based routing that pf can do. the intention is to make it as easy as nat/rdr to use, and more robust when it's operating.
	the main reasons for this change are:
	- route-to, reply-to, and dup-to do not work with pfsync. this is because the information about where to route-to is stored in rules, and it is hard to have a ruleset synced between firewalls, and impossible to have them synced 100% of the time.
	- i can make my boxes panic in certain situations using route-to. yeah...
	- the configuration and syntax for route-to rules are confusing.
	the argument to route-to and co is an interface name with an optional ip address. there are several problems with this. one is that people tend to think about routing as sending packets to peers by their address, not by the interface they're reachable on. another is that we currently have no way to synchronise interface topology information between firewalls, so using an interface to say where packets go means we can't do failover of these states with pfsync. another is that a change in routing topology means a host may become reachable over a different interface. tying routing policy to interfaces gets in the way of failover and load balancing.
	this change does the following:
	- stores the route info in the state instead of the pf rule. this allows route-to to keep working when the ruleset changes, and allows route-to info to be sent over pfsync. there's enough spare bits in pfsync messages that the protocol doesn't break. the caveat is that route-to becomes tied to pass rules that create state, like rdr-to and nat-to.
	- the argument to route-to etc is a destination ip address. it's not limited to a next-hop address (though a next-hop can be a destination address). this allows for the failover and load balancing referred to above.
	- deprecates the address@interface host syntax in pfctl. because routing is done entirely by IPs, the interface is derived from the route lookup, not pf. any attempt to use the @interface syntax will fail now in all contexts.
	there's enthusiasm from proctor@ jmatthew@ and others
	ok sashan@ bluhm@
2021-01-28	bridge(4): convert ifunit() to if_unit(9)	mvs
	ok bluhm@ sashan@
2021-01-28	trunk(4): convert ifunit to if_unit(9)	mvs
	ok bluhm@
2021-01-28	handle "once" rules before letting pfsync defer tx of a packet.	David Gwynne
	pfsync may want to defer the transmission of a packet. it does this so it can try and get a state over to a peer firewall before a host may send a reply to the peer, which would get dropped cos there's no matching state. i think the once rule processing should happen before that. the state is created from the rule, whether the packet the state is for goes out immediately or not shouldn't matter. ok sashan@
2021-01-27	if the route resolved in pf_route is invalid, generate an icmp error.	David Gwynne
	of course this is limited to the !dup-to case. ok sashan@ bluhm@
2021-01-27	have pf_route{,6} clear the pf_pdesc mbuf ref early for route-to/reply-to.	David Gwynne
	pf_route and pf_route6 are called to take over delivery of the packet with route-to and reply-to instead of letting it get processed normally. for the dup-to handling, it copies the mbuf but leaves the original mbuf in place. pf_route takes over the packet by clearing the mbuf pointer in the pf_pdesc struct. this diff moves the clearing of that pointer to the start of the function, rather than checking for dup-to again on the way out of the function. i think this is better because it means that it's more robust in the face of future code changes. even if that's not true, it's still shorter code in a forwarding path. ok sashan@ jmatthew@
2021-01-27	don't run copies of packets made by dup-to through pf_test.	David Gwynne
	dup-to is kind of like what you do with a span port, but is a bit more fine grained. it copies packets in a connection out an interface so that connection can be monitored. it doesn't make sense for pf to see the copied packets and try to match or create new states for them either. at best it needs config to stop pf seeing the copies (eg, set skip on $dup_to_tgt_if). at worst it breaks the connections you're monitoring because the states in pf get confused. found while discussing larger route-to changes on tech@. ok bluhm@ sashan@
2021-01-25	We have this sequence in bridge(4) ioctl(2) path:	mvs
		ifs = ifunit(req->ifbr_ifsname);
		if (ifs == NULL) {
			error = ENOENT;
			break;
		}
		if (ifs->if_bridgeidx != ifp->if_index) {
			error = ESRCH;
			break;
		}
		bif = bridge_getbif(ifs);
	This sequence repeats 8 times. Also, we don't check the value returned by bridge_getbif() before use. Newly introduced bridge_getbig() function replaces this sequence. This not only reduces duplicated code but also makes the `bif' dereference safe. ok bluhm@
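The idea of folding the repeated lookup, membership check and NULL check into one helper can be sketched in userland C. Everything here is a hypothetical model: `struct iface`, `struct port`, the `ifaces` table and `bridge_port_lookup()` are invented for illustration, and returning ESRCH for a missing port is an assumption of this sketch, not something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the bridge port and interface descriptors. */
struct port {
	int dummy;
};

struct iface {
	const char *name;
	int bridgeidx;		/* index of the bridge this interface belongs to */
	struct port *port;	/* may be NULL even for a member interface */
};

static struct port em0_port;
static struct iface ifaces[] = {
	{ "em0", 10, &em0_port },	/* member of bridge 10 with a port */
	{ "em1", 20, NULL },		/* member of a different bridge */
	{ "em2", 10, NULL },		/* member of bridge 10, no port */
};

/*
 * One helper instead of the same three checks at every call site.
 * On success stores the port in *pp and returns 0; otherwise returns
 * an errno value and leaves *pp untouched.
 */
int
bridge_port_lookup(int bridge_index, const char *name, struct port **pp)
{
	size_t i;

	assert(pp != NULL);
	for (i = 0; i < sizeof(ifaces) / sizeof(ifaces[0]); i++) {
		if (strcmp(ifaces[i].name, name) != 0)
			continue;
		if (ifaces[i].bridgeidx != bridge_index)
			return ESRCH;
		if (ifaces[i].port == NULL)
			return ESRCH;	/* assumption: treated like "not a member" */
		*pp = ifaces[i].port;
		return 0;
	}
	return ENOENT;
}
```

Callers then only test one return value, and a successful return guarantees a non-NULL port to dereference.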
2021-01-25	Fix wg(4) ioctl to be able to handle multiple wgpeers.	YASUOKA Masahiko
	Diff from Yuichiro NAITO. ok procter
2021-01-21	vlan(4): convert ifunit() to if_unit(9)	mvs
	ok dlg@ kn@
2021-01-21	let vfs keep track of nonblocking state for us.	David Gwynne
	ok claudio@ mvs@
2021-01-20	An invalid packet may not have set src and dst in packet descriptor.	Alexander Bluhm
	Add a NULL check to prevent crash in pflog(4) introduced in previous commit. Reported-by: syzbot+c6d2f2ad34b822bce98a@syzkaller.appspotmail.com
2021-01-20	Print rewritten addresses in tcpdump(8) logged with pflog(4) for	Alexander Bluhm
	rdr-to, nat-to, af-to rules. The kernel uses the information from the packet description and fills it into the fields in the pflog header. While doing this, it is trivial to figure out whether the packet has been rewritten. OK sashan@
2021-01-19	pflog(4) tried to log the translated packet with rdr-to, nat-to,	Alexander Bluhm
	and af-to addresses and ports applied. Therefore it created a mbuf chain on the stack with a partial copy. This is too complicated for IP options, extension header, NAT46 af-to, and fragmented mbuf chains. It even caused a crash in syzkaller. Usually the length checks in pf_setup_pdesc() rejected the faked mbuf and the goto copy logged the packet unmodified. Remove the pflog_mtap() function and call bpf_mtap_hdr() directly. As the old buggy code was bypassed in most cases, tcpdump(8) output of pflog does not change. Unconditionally log the unmodified packet. Reported-by: syzbot+947e89e06ac3fec187d0@syzkaller.appspotmail.com OK sashan@
2021-01-19	pipex(4): convert ifunit() to if_unit(9)	mvs
	ok dlg@
2021-01-19	switch(4): convert ifunit to if_unit(9)	mvs
	ok dlg@
2021-01-19	pppoe(4): convert ifunit() to if_unit(9)	mvs
	ok dlg@ kn@
2021-01-19	pipex(4): convert ifunit() to if_unit(9)	mvs
	ok dlg@
2021-01-19	gre(4): convert ifunit() to if_unit(9)	mvs
	ok dlg@
2021-01-19	tpmr(4): convert ifunit() to if_unit(9)	mvs
	ok dlg@
2021-01-19	bpe(4): convert ifunit() to if_unit(9)	mvs
	ok dlg@
2021-01-19	aggr(4): convert ifunit() to if_unit(9)	mvs
	ok dlg@
2021-01-18	Convert ifunit() to if_unit(9).	mvs
	ok sashan@
2021-01-18	Introduce new function if_unit(9). This function returns a pointer to the	mvs
	interface descriptor corresponding to the unique name. This descriptor is guaranteed to be valid until if_put(9) is called on the returned pointer. if_unit(9) should replace the existing ifunit(), which returns a descriptor that is not safe to dereference after a context switch. This allows us to avoid some use-after-free issues in the ioctl(2) path. It also unifies interface descriptor usage. ok claudio@ sashan@
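The lookup-then-release contract described above can be modelled in userland C. This is only a sketch: `struct iface`, `iface_table`, `iface_unit()` and `iface_put()` are hypothetical names that imitate the shape of if_unit(9) and if_put(9) and do not reproduce the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userland model of an interface descriptor. */
struct iface {
	const char *name;
	unsigned int refcnt;	/* 1 == held only by the table itself */
};

static struct iface iface_table[] = {
	{ "em0", 1 },
	{ "em1", 1 },
};

/*
 * Look up a descriptor by name and take a reference on behalf of the
 * caller, so the descriptor stays valid until iface_put() is called.
 */
struct iface *
iface_unit(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(iface_table) / sizeof(iface_table[0]); i++) {
		if (strcmp(iface_table[i].name, name) == 0) {
			iface_table[i].refcnt++;
			return &iface_table[i];
		}
	}
	return NULL;
}

/* Release a reference taken by iface_unit(); this model accepts NULL. */
void
iface_put(struct iface *ifp)
{
	if (ifp == NULL)
		return;
	assert(ifp->refcnt > 1);	/* the table's own reference must remain */
	ifp->refcnt--;
}
```

A caller pairs every successful `iface_unit()` with exactly one `iface_put()`, including on error paths, which is what makes the pointer safe to use across a sleep in this model.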
2021-01-17	don't encode the mbuf prio as part of the vlan tag in bpf_mtap_ether.	David Gwynne
	the vlan tag we're injecting into the mbuf chain is either straight off the wire and therefore already has the vlan priority encoded, or is straight after it's been set up by vlan(4), which also has the prio already encoded. ok kn@ visa@ mvs@
2021-01-16	The sysctl variable net.inet.ip.forwarding is checked before	Alexander Bluhm
	ip_input() passes the packet to ip_forward(). But with an af-to rule, pf(4) calls ip_forward() directly. Check the forwarding sysctl also in pf to get consistent behavior. This requires setting both ip and ip6 forwarding to get packet flow in both directions over af-to rules. OK kn@
2021-01-15	Remove a check that bypasses pf state tests. It dates back to 2003	Alexander Bluhm
	when NAT was implemented differently. Now it does not seem to make sense anymore. sashan@ has identified cases where it does harm. dlg@ wants to remove it to simplify route-to code. from dlg@; OK sashan@
2021-01-14	Fix build without carp: ifp0 is only used within #if NCARP > 0.	Theo Buehler
	ok kn mvs
2021-01-13	Link pflog(4) instances to `pflog_ifs' list instead of allocating	mvs
	`pflogifs' array. This was done to prevent panics caused by the internal malloc(9) limit. It also avoids the case where a single pflog(4) interface with a high index allocates an array for all indices below it and eats up kernel memory. Since there are very few pflog(4) interfaces, the linear search does not have a performance impact. ok bluhm@ claudio@ kn@
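The trade-off between an array indexed by unit number and a linked list can be sketched in userland C. The names `struct logif`, `logif_list`, `logif_insert()` and `logif_lookup()` are hypothetical and only illustrate the data structure choice, not the pflog(4) code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one logging interface instance. */
struct logif {
	int unit;		/* interface unit number, e.g. 0 for "pflog0" */
	struct logif *next;
};

/* Head of the singly linked list of instances. */
static struct logif *logif_list;

/* Link a new instance at the head; cost is independent of its unit number. */
void
logif_insert(struct logif *lif)
{
	assert(lif != NULL);
	lif->next = logif_list;
	logif_list = lif;
}

/* Linear search by unit number; returns NULL if no such instance exists. */
struct logif *
logif_lookup(int unit)
{
	struct logif *lif;

	for (lif = logif_list; lif != NULL; lif = lif->next) {
		if (lif->unit == unit)
			return lif;
	}
	return NULL;
}
```

In this model an instance with unit 1000000 costs one list node, whereas an array indexed by unit would need a slot for every index below it.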
2021-01-13	Send without kernel lock	kn
	The output path can run without kernel lock just fine as is. Looking at CVS log, it seems this was not done during import because IFXF_MPSAFE only became a thing afterwards. OK mvs
2021-01-12	Sometimes a user ID was logged in pflog(4) although the logopt of	Alexander Bluhm
	the rule did not specify it. Check the option again for the log rule in case another rule has triggered a socket lookup. Remove logopt group, it is not documented and cannot work as struct pfloghdr does not contain a gid. Rename PF_LOG_SOCKET_LOOKUP to PF_LOG_USER to express what it does. The lookup involved is only an implementation detail. OK kn@ sashan@ mvs@
2021-01-11	Remove unused start routine	kn
	pflog(4) does not send or generate packets by design. OK mvs sashan
2021-01-09	Enforce range with sysctl_int_bounded in etherip_sysctl	gnezdo
	OK millert@
2021-01-09	Enforce range with sysctl_int_bounded in pipex_sysctl	gnezdo
	OK millert@
2021-01-09	Syzkaller has found a stack overflow in socket splicing. Broadcast	Alexander Bluhm
	packets were resent through simplex broadcast delivery and socket splicing. Although there is an M_LOOP check in somove(9), it did not take effect. if_input_local() cleared the M_BCAST and M_MCAST flags with m_resethdr(). As if_input_local() is used for broadcast and multicast delivery, it was a mistake to delete them. Keep the M_BCAST and M_MCAST mbuf flags when packets are reinjected into the network stack. Reported-by: syzbot+a43ace363f1b663238f8@syzkaller.appspotmail.com OK anton@; discussed with claudio@
2021-01-08	don't check local carp addresses as part of the antispoof checks.	David Gwynne
	bridge(4) drops packets coming from somewhere else that have a source MAC address that's owned by one of the interfaces that's a member of the bridge. because this check was done with bridge_ourether, it included the addresses of active carp interfaces hanging off these member interfaces. this meant if the local machine is the carp master while another machine is trying to preempt it by sending hellos, the packets from the other machine were dropped because the local one is already the master. carp roles are supposed to move around a l2 network, so another host sending a packet with a carp mac address is actually normal and necessary. found by and fix tested by stsp@ ok stsp@ claudio@
2021-01-05	pppoeintr() is no more	kn

2021-01-04	Process pppoe(4) packets directly, do not queue through netis	kn
	Less scheduling, lock contention and queues. Previously, if_netisr() handled the net lock around those calls, now if_input_process() does it before calling ether_input(), so no need to add or remove NET_*LOCK() anywhere. OK mvs claudio
2021-01-04	Remove kernel lock from pppoe(4) input path	kn
	"struct pppoe_softc" documents no member being protected by the kernel lock (alone); further review of the code paths starting from pppoeintr() shows no sleeping points which must be avoided in the softnet thread. Everything is fine as is to run without the big lock, so remove it. Tests sthen Feedback mpi mvs OK mvs claudio
2021-01-04	Minor refactoring in pf(4). Note that struct pfsync_state is no	Alexander Bluhm
	longer memcopied but assigned. Alignment should not be an issue as it is __packed. Part of a larger diff from dlg@; OK dlg@ sashan@
2021-01-04	Remove unused `pipex_iface_context' struct.	mvs
	ok ok@ yasuoka@
2021-01-02	Don't call if_deactivate() in switch_clone_destroy(). Following	mvs
	if_detach() will do this. ok kn@