src - OpenBSD base system

Age	Commit message	Author
2009-11-03	rtables are stacked on rdomains (it is possible to have multiple routing	Claudio Jeker
	tables on top of a rdomain) but until now our code was a crazy mix so that it was impossible to correctly use rtables in that case. Additionally pf(4) only knows about rtables and not about rdomains. This is especially bad when tracking (possibly conflicting) states in various domains. This diff fixes all or most of these issues. It adds a lookup function to get the rdomain id based on a rtable id. Makes pf understand rdomains and allows pf to move packets between rdomains (it is similar to NAT). Because pf states now track the rdomain id as well it is necessary to modify the pfsync wire format. So old and new systems will not sync up. A lot of help by dlg@, tested by sthen@, jsg@ and probably more OK dlg@, mpf@, deraadt@
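The lookup described above (rdomain id from rtable id) can be pictured as a small per-table mapping. This is a minimal sketch only: `MAX_RTABLES`, `rtable_attach` and `rtable_to_rdomain` are hypothetical names for illustration, not the functions added by the commit.

```c
#define MAX_RTABLES 8	/* hypothetical limit, for illustration only */

/* rdomain each routing table is stacked on; everything starts in rdomain 0 */
static unsigned int rtable_rdomain[MAX_RTABLES];

/* Stack routing table `rtableid` on top of `rdomain`; returns 0 on success. */
static int
rtable_attach(unsigned int rtableid, unsigned int rdomain)
{
	if (rtableid >= MAX_RTABLES || rdomain >= MAX_RTABLES)
		return -1;
	rtable_rdomain[rtableid] = rdomain;
	return 0;
}

/* Look up the rdomain id for an rtable id; unknown ids fall back to 0. */
static unsigned int
rtable_to_rdomain(unsigned int rtableid)
{
	if (rtableid >= MAX_RTABLES)
		return 0;
	return rtable_rdomain[rtableid];
}
```

Several rtables may map to the same rdomain, which is why the lookup goes from table to domain and not the other way round.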
2009-10-28	*NULL store in IP_AUTH_LEVEL, IP_ESP_TRANS_LEVEL, IP_ESP_NETWORK_LEVEL,	Theo de Raadt
	IP_IPCOMP_LEVEL found by Clement LECIGNE, localhost root exploitable on userland/kernel shared vm machines (ie. i386, amd64, arm, sparc (but not sparc64), sh, ...) on OpenBSD 4.3 or older ok claudio
2009-10-25	Get rid of unused macro `la_timer'.	Michael Knudsen
	`if it is unused nuke it' claudio
2009-10-17	Allow us to accept gratuitous ARP requests in cases where the	Marco Pfatschbacher
	link-route points over the carp interface. (IP-less carpdev) The decision whether to drop an ARP query is now expressed with a goto out; rather than a second check later, which prevented the carpdev case from working. Also add some comments to make in_arpinput() easier to understand. OK henning, markus.
2009-10-06	Redo the route lookup in the output (and IPv6 forwarding) path if the	Claudio Jeker
	destination of a packet was changed by pf. This allows for some evil games with rdr-to or nat-to but is mostly needed for better rdomain/rtable support. This is a first step and more work and cleanup is needed. Here is a list of what works and what does not (needs a patched pfctl):
	pass out rdr-to:
	  from local rdr-to local addr works (if state tracking on lo0 is done)
	  from remote rdr-to local addr does NOT work
	  from local rdr-to remote works
	  from remote rdr-to remote works
	pass in nat-to:
	  from remote nat-to local addr does NOT work
	  from remote nat-to non-local addr works
	non-local is an IP that is routed to the FW but is not assigned on the FW. The non-working cases need some magic to correctly rewrite the incoming packet since the rewriting would happen outbound which is too late. "time to get it in" deraadt@
2009-10-04	Add (again) support for divert sockets. They allow you to:	Michele Marchetto
	- queue packets from pf(4) to a userspace application
	- reinject packets from the application into the kernel stack.
	The divert socket can be bound to a special "divert port" and will receive every packet diverted to that port by pf(4). The pf syntax is pretty simple, e.g.:
	pass on em0 inet proto tcp from any to any port 80 divert-packet port 1
	A lot of discussion has happened since my last commit that resulted in many changes and improvements. I would really like to thank everyone who took part in the discussion, especially canacar@ who pointed out the limitations of this approach. OpenBSD divert(4) is meant to be compatible with software running on top of FreeBSD's divert sockets even though they are pretty different and will become even more so with time. discussed with many, but mainly reyk@ canacar@ deraadt@ dlg@ claudio@ beck@ tested by reyk@ and myself ok reyk@ claudio@ beck@ manpage help and ok by jmc@
2009-09-08	I had not enough oks to commit this diff.	Michele Marchetto
	Sorry.
2009-09-08	Add support for divert sockets. They allow you to:	Michele Marchetto
	- queue packets from pf(4) to a userspace application
	- reinject packets from the application into the kernel stack.
	The divert socket can be bound to a special "divert port" and will receive every packet diverted to that port by pf(4). The pf syntax is pretty simple, e.g.:
	pass on em0 inet proto tcp from any to any port 80 divert-packet port 8000
	test, bugfix and ok by reyk@ manpage help and ok by jmc@ no objections from many others.
2009-08-23	revert the icmp error diff again (r1.167-1.169)	David Krause
	seems to be causing some kind of memory corruption after several hours of heavy IPsec traffic. connections start becoming very slow eventually leading to all IPsec packets being lost. a reboot solves the issue for several more hours before it appears again.
2009-08-20	fix indentation	Alexander Bluhm
	no binary change; ok grunk@
2009-08-12	don't confuse chars with strings; ok oga@	Martynas Venckus

2009-08-10	7 years of	Henning Brauer
	#if 1
	reasonable
	#else
	bullshit required by some committee
	#endif
	are enough. theo ok
2009-08-10	we need to null mcopy, gotos bite. theo and i both missed them, theo ok	Henning Brauer

2009-08-10	fix previous:	Henning Brauer
	- m_copydata instead of straight bcopy. noticed by damien
	- handle the pretty much impossible case that the packet header grows so much that MHLEN < 68. i bet this had been the least of our worries, in that case, but code oughta be correct anyway.
	ok theo and dlg
2009-08-10	this is basically a fixed version of r1.165, avoid m_copym of each and every	Henning Brauer
	forwarded packet in case ip_output returns an error and we have to quote some of it back in an icmp error message. this implementation done from scratch: place an mbuf on the stack. copy the pkthdr from the forwarded packet and the first 68 bytes of payload. if we need to send an icmp error, just m_copym our mbuf-on-the-stack into a real one that icmp_error can fuck with and eat as it desires. ok theo dlg
2009-08-10	sockets created via a listening socket lose the rdomain and fail to work	Claudio Jeker
	therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
2009-08-09	once again ipsec tries to be clever and plays fast, this time by	Henning Brauer
	recycling an mbuf tag and changing its type. just always get a new one. theo ok
2009-08-01	timeout_add -> timeout_add_msec	Bret Lambert
	ok michele@ claudio@
2009-07-28	revert the avoidance of the mbuf copy for the icmp errors (r1.165)	David Gwynne
	some greater care must be taken to ensure the mbuf generated for icmp errors is a good copy.
2009-07-27	Define the common DiffServ Codepoints so pf(4) can use them.	Claudio Jeker
	Agreed by mcbride@, sthen@ and henning@
2009-07-26	no need to cast the return value of m_freem() to void	Thordur I. Bjornsson
	as it's a void function. ok claudio@
2009-07-24	for every packet we forwarded, we copied the first 68 bytes of it in case	David Gwynne
	ip_output failed and we had to generate an icmp packet. since ip_output frees the mbuf we give it, we copied the original into a new mbuf. if ip_output succeeded, we threw the copy away. the problem with this is that copying the mbuf is about a third of the cost of ip_forward. this diff copies the data we might need onto the stack, and only builds the mbuf for the icmp error if it actually needs it, ie, if ip_output fails. this gives a noticeable improvement in pps for forwarded traffic. ok claudio@ markus@ henning@ tested by markus@ and by me in production for several days at work
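The approach described above (save the first 68 bytes on the stack, build the error packet only on failure) can be sketched in plain userland C. Everything here is an assumption for illustration: `struct packet`, `send_packet` and `forward` are hypothetical stand-ins, not the kernel's mbuf API or the actual ip_forward() code.

```c
#include <stddef.h>
#include <string.h>

#define QUOTE_LEN 68	/* bytes of the original packet kept for an error reply */

/* Hypothetical stand-in for a packet buffer; not the kernel's struct mbuf. */
struct packet {
	size_t len;
	unsigned char data[1500];
};

/* Hypothetical output routine: always consumes the packet, 0 on success. */
static int
send_packet(struct packet *p, int fail)
{
	p->len = 0;		/* the caller may not use the contents afterwards */
	return fail ? -1 : 0;
}

/*
 * Forward a packet. On failure, fill `err` with the quoted bytes of the
 * original packet. The expensive copy into `err` happens only on failure.
 */
static int
forward(struct packet *p, int fail, struct packet *err)
{
	unsigned char saved[QUOTE_LEN];
	size_t n = p->len < QUOTE_LEN ? p->len : QUOTE_LEN;

	memcpy(saved, p->data, n);	/* cheap copy onto the stack */
	if (send_packet(p, fail) == 0)
		return 0;		/* common case: nothing else to do */

	memcpy(err->data, saved, n);	/* rare case: build the error payload */
	err->len = n;
	return -1;
}
```

The point of the design is that the successful path pays only for a small fixed-size copy, while the allocation-like work is deferred to the failure path.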
2009-07-13	Get rid of the token bucket filter.	Michele Marchetto
	Traffic shaping code should not be inside routing code. If you want to rate-limit use altq instead. ok claudio@ henning@ dlg@
2009-07-09	Use MAXTTL instead of the hardcoded value.	Michele Marchetto

2009-06-17	Correctly handle the carp demote counter in all input cases.	Marco Pfatschbacher
	E.g. give up the MASTER status if there's a host with a lower demote count, even if it has a higher advskew. At the moment this shouldn't cause any change, but this is a first step towards the removal of the "bump the advskew to 240 in case of errors" hack, without breaking backward compatibility. OK henning@
2009-06-09	By default, don't accept IPv4 ICMP redirects. This behaviour can be	Stuart Henderson
	changed with a sysctl, so note it in sysctl.conf. v6 needs further testing following discussions on the tech mailing list; rainer@ points out possible interactions with neighbour discovery which need to be investigated first. "go ahead on the v4 part" deraadt@
2009-06-08	remove stray * from comment, probably a rewrapping artefact	Stuart Henderson

2009-06-05	Initial support for routing domains. This allows binding interfaces to	Claudio Jeker
	an alternate routing table and separating them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
2009-06-04	the decision on whether a packet is to be delivered locally or forwarded	Henning Brauer
	is pretty expensive, and more so the more addresses are configured locally, since we walk a list. when pf is on and we have a state key pointer, and that state key is linked to another state key, we know for sure this is not local. when it has a link to a pcb, it certainly goes to the local codepath. on a box with 1000 addresses forwarding is 3 times as fast as before. theo ok
2009-06-03	add the basic infrastructure to take advantage of TCP and UDP receive	Christian Weisgerber
	checksum offload over IPv6; ok deraadt@
2009-06-02	Shuffle function declarations a bit; ipsp_kern doesn't actually exist,	Bret Lambert
	and tdb_hash is only used in ip_ipsp.c, so there's no need to declare it as extern in ip_ipsp.h ok claudio@ henning@
2009-06-02	do the pf_pkt_addr_changed(m) magic just like gif etc	Henning Brauer
	tested by Manuel Rodriguez Morales <marodriguez at grupogdt.com>
2009-06-02	satosin was already defined in in.h, no need to redefine it here	Bret Lambert
	ok claudio@
2009-06-02	0 -> NULL	Bret Lambert
	ok claudio@
2009-06-02	Fix an off-by-one in the ddb-only debugging function tdb_hashstats.	Owain Ainsworth
	when we check if a hash chain is over 15 long, we would access one past the end of the array. change the static array size to a define because it makes this checking easier to verify. Found by Parfait. ok deraadt@.
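The class of bug described above can be illustrated with a small histogram of chain lengths whose last bucket collects every chain of 15 or more entries. This is a sketch under assumptions: `NBUCKETS` and `chain_histogram` are hypothetical names, and the bucket layout is a guess at the general shape, not the actual tdb_hashstats code.

```c
#include <stddef.h>

/* Buckets 0..14 count exact chain lengths; the last bucket is "15 or more". */
#define NBUCKETS 16

static void
chain_histogram(const unsigned int *chain_len, size_t nchains,
    unsigned int buckets[NBUCKETS])
{
	size_t i;

	for (i = 0; i < NBUCKETS; i++)
		buckets[i] = 0;

	for (i = 0; i < nchains; i++) {
		unsigned int len = chain_len[i];

		/*
		 * Clamp against the same define that sizes the array, so the
		 * largest index used is NBUCKETS - 1 and never NBUCKETS.
		 */
		if (len >= NBUCKETS)
			len = NBUCKETS - 1;
		buckets[len]++;
	}
}
```

Using one define for both the array size and the clamp is what makes the bound easy to verify by inspection.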
2009-05-18	The routing table index rtableid has type unsigned int in the routing	Alexander Bluhm
	code. In pf rtableid == -1 means don't change the rtableid because of this rule. So it has to be signed int there. Before the value is passed from pf to route it is always checked to be >= 0. Change the type to int in pf and to u_int in netinet and netinet6 to make the checks work. Otherwise -1 may be used as an array index and the kernel crashes. ok henning@
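The signed/unsigned split described above can be sketched as follows. The names `RTABLE_UNCHANGED` and `rule_rtableid` are hypothetical and only illustrate the check; they are not taken from pf.

```c
#include <stdbool.h>

/* Hypothetical name for the "rule does not change the table" marker. */
#define RTABLE_UNCHANGED (-1)

/*
 * Convert the signed id stored in a rule into the unsigned id used by the
 * routing code. Returns false when the rule leaves the table alone, in
 * which case *rtableid is not touched.
 */
static bool
rule_rtableid(int rule_id, unsigned int *rtableid)
{
	/*
	 * This test only works because rule_id is signed. With an unsigned
	 * type, -1 would wrap to a huge value, pass a ">= 0" check and end
	 * up being used as an array index.
	 */
	if (rule_id < 0)
		return false;
	*rtableid = (unsigned int)rule_id;
	return true;
}
```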
2009-03-15	Introduce splsoftassert(), similar to splassert() but for soft interrupt	Miod Vallat
	levels. This will allow for platforms where soft interrupt levels do not map to real hardware interrupt levels to have soft ipl values overlapping hard ipl values without breaking spl asserts.
2009-02-16	pfsync v5, mostly written at n2k9, but based on work done at n2k8.	David Gwynne
	WARNING: THIS BREAKS COMPATIBILITY WITH THE PREVIOUS VERSION OF PFSYNC this is a new variant of the protocol and a large reworking of the pfsync code to address some performance issues. the single largest benefit comes from having multiple pfsync messages of different types handled in a single packet. pfsyncs handling of pf states is highly optimised now, along with packet parsing and construction. huggz for beck@ for testing. huge thanks to mcbride@ for his help during development and for finding all the bugs during the initial tests. thanks to peter sutton for letting me get credit for this work. ok beck@ mcbride@ "good." deraadt@
2009-01-30	When don't-fragment packets need to get fragmented some code tries to	Claudio Jeker
	update the route specific MTU from the interface (because it could have changed in between). This only makes sense if we actually have a valid route but e.g. multicast traffic does no route lookup and so there is no route at all and we don't need to update anything. Hit by dlg@'s pfsync rewrite which already found 3 other bugs in the network stack and slowly makes us wonder how it worked in the first place. OK mcbride@ dlg@
2009-01-29	Always zero the IP checksum field for packets and packet fragments	Christian Weisgerber
	being passed down if using HW checksum offload. From Brad, inspired by NetBSD/FreeBSD. ok markus@
2009-01-27	In IPsec acquire mode, if the flow was configured for the "any"	Alexander Bluhm
	network 0.0.0.0/0 or ::/0, the SA was established for the IP address in the packet instead of the network in the flow. That means the SA was not negotiated for the network 0.0.0.0 with mask 0 but for the remote IP with mask 255.255.255.255. This SA did not match the flow and did not work. To differentiate between general flows that are used to trigger specific host-to-host SAs and flows for matching network SAs, the if condition only uses the ipo->ipo_dst field now. For a flow without peer, an SA must be negotiated for each host-to-host combination. Otherwise, if a peer exists at the flow, the kernel acquires one SA for the whole network. tested by todd@, ok hshoexer@, angelos@, todd@
2008-12-24	Fix two mbuf leaks in arpresolve. The first one happens on IFF_NOARP	Claudio Jeker
	interfaces and is probably never hit. The other one happens when the number of packets on the arp hold queue is exceeded. If arpresolve() returns NULL the mbuf must be on the hold queue or freed. Fixes the mbuf leak seen by dlg@. Found with dlg@'s insane mbuf leak diff. OK dlg@
2008-12-24	report the number of packets that arp resolution is holding onto until it	David Gwynne
	gets a mac addr for an ip under net.inet.ip.arpqueued. ok deraadt@
2008-11-26	call pf_pkt_addr_changed() when we do encapsulate	Henning Brauer
	fixes v6-over-v4 gifs wrt pf chatter about state linking mismatches ok jsing claudio, tested by Ant La Porte <ant at ukbsd.org>
2008-11-08	fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom	David Gwynne
	ok deraadt@ otto@
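For readers unfamiliar with the idiom named above: wrapping a multi-statement macro in do { } while (0) makes it behave as a single statement, so it stays correct inside an unbraced if/else. `SWAP_INT` and `order_desc` below are made-up examples, not macros from the tree.

```c
/*
 * Multi-statement macro wrapped so that "SWAP_INT(a, b);" is exactly one
 * statement. The CONSTCOND comment tells lint the constant condition is
 * intentional.
 */
#define SWAP_INT(a, b) do {						\
	int swap_tmp_ = (a);						\
	(a) = (b);							\
	(b) = swap_tmp_;						\
} while (/* CONSTCOND */ 0)

/* Put the larger value first; returns 1 if a swap happened, 0 otherwise. */
static int
order_desc(int *x, int *y)
{
	if (*x < *y)
		SWAP_INT(*x, *y);	/* with bare { } the else would not parse */
	else
		return 0;
	return 1;
}
```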
2008-11-02	Remove the M_ANYCAST6 mbuf flag by doing the detection all in ip6_input().	Claudio Jeker
	M_ANYCAST6 was only used to signal tcp6_input() that it should drop the packet and send back icmp error. This can be done in ip6_input() without the need for a mbuf flag. Gives us back one slot in m_flags for possible future need. Looked at and some input by naddy@ and henning@. OK dlg@
2008-10-31	Be way more strict in the number of packets allowed to be queued in the	Claudio Jeker
	arp layer. With a lot of input from deraadt@. OK dlg@, looks good gollo@ + deraadt@
2008-10-30	Arpresolve could lose a few packets while resolving an ethernet	Joerg Goltermann
	address. This cvs commit introduces a queue that buffers a small burst of packets and resends the packets in correct order when the ethernet address is resolved. Code written by Armin Wolfermann <aw@osn.de>. OK: claudio@ henning@
2008-10-28	Do not keep retrying to send advertisements if there is	Marco Pfatschbacher
	no carpdev configured. I don't see how we can run into this at all, but let's leave this test for a little extra safety. OK henning@
2008-10-23	use the correct idiom for NFOO things which come from "foo.h" files	Theo de Raadt
	ok dlg