summaryrefslogtreecommitdiff
path: root/sys/netinet
AgeCommit message (Collapse)Author
2009-12-07do not forward and drop packets with M_MCAST flag set in ip_forward()Joerg Goltermann
ok henning@, claudio@ "I think this should go in"
2009-11-27Add setrdomain() and getrdomain() system calls. Committing now toPhilip Guenthe
catch the libc major bump per request from deraadt@ Diff by reyk. ok guenther@
2009-11-21Add a way to bind the tunnel endpoint of a gif/gre interface into aClaudio Jeker
different rdomain than the default one. This allows to do MPLS VPNs without the MPLS madness. OK deraadt@, henning@
2009-11-20NULL dereference in IPV6_PORTRANGE and IP_IPSEC_*, found by Clement LECIGNE,Philip Guenthe
localhost DoS everywhere. To help minimize further issues, make the mbuf != NULL test explicit instead of implicit in a length test. Suggestions and initial work by mpf@ and miod@ ok henning@, mpf@, claudio@,
2009-11-19avoid overflow since protos > IPPROTO_MAX exist. From FreeBSD withOtto Moerbeek
a twist; ok millert@ kettenis@
2009-11-13Extend the protosw pr_ctlinput function to include the rdomain. This isClaudio Jeker
needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain save and no longer leaks single packets into the main domain. Looks good markus@, henning@
2009-11-13Packets generated by ip_fragment() need to inherit the rdomain from theClaudio Jeker
original packet or they will trigger the diagnostic check in the interface output routines. OK jsg@
2009-11-03rtables are stacked on rdomains (it is possible to have multiple routingClaudio Jeker
tables on top of a rdomain) but until now our code was a crazy mix so that it was impossible to correctly use rtables in that case. Additionally pf(4) only knows about rtables and not about rdomains. This is especially bad when tracking (possibly conflicting) states in various domains. This diff fixes all or most of these issues. It adds a lookup function to get the rdomain id based on a rtable id. Makes pf understand rdomains and allows pf to move packets between rdomains (it is similar to NAT). Because pf states now track the rdomain id as well it is necessary to modify the pfsync wire format. So old and new systems will not sync up. A lot of help by dlg@, tested by sthen@, jsg@ and probably more OK dlg@, mpf@, deraadt@
2009-10-28*NULL store in IP_AUTH_LEVEL, IP_ESP_TRANS_LEVEL, IP_ESP_NETWORK_LEVEL,Theo de Raadt
IP_IPCOMP_LEVEL found by Clement LECIGNE, localhost root exploitable on userland/kernel shared vm machines (ie. i386, amd64, arm, sparc (but not sparc64), sh, ...) on OpenBSD 4.3 or older ok claudio
2009-10-25Get rid of unused macro `la_timer'.Michael Knudsen
`if it is unused nuke it' claudio
2009-10-17Allow us to accept gratuitous ARP requests in cases where theMarco Pfatschbacher
link-route points over the carp interface. (IP-less carpdev) The descision whether to drop an ARP query is now expressed with a goto out; rather than a second check later, which prevented the carpdev case to work. Also add some comments to make in_arpinput() easier to understand. OK henning, markus.
2009-10-06Redo the route lookup in the output (and IPv6 forwarding) path if theClaudio Jeker
destination of a packet was changed by pf. This allows for some evil games with rdr-to or nat-to but is mostly needed for better rdomain/rtable support. This is a first step and more work and cleanup is needed. Here a list of what works and what does not (needs a patched pfctl): pass out rdr-to: from local rdr-to local addr works (if state tracking on lo0 is done) from remote rdr-to local addr does NOT work from local rdr-to remote works from remote rdr-to remote works pass in nat-to: from remote nat-to local addr does NOT work from remote nat-to non-local addr works non-local is an IP that is routed to the FW but is not assigned on the FW. The non working cases need some magic to correctly rewrite the incomming packet since the rewriting would happen outbound which is too late. "time to get it in" deraadt@
2009-10-04Add (again) support for divert sockets. They allow you to:Michele Marchetto
- queue packets from pf(4) to a userspace application - reinject packets from the application into the kernel stack. The divert socket can be bound to a special "divert port" and will receive every packet diverted to that port by pf(4). The pf syntax is pretty simple, e.g.: pass on em0 inet proto tcp from any to any port 80 divert-packet port 1 A lot of discussion have happened since my last commit that resulted in many changes and improvements. I would *really* like to thank everyone who took part in the discussion especially canacar@ who spotted out which are the limitations of this approach. OpenBSD divert(4) is meant to be compatible with software running on top of FreeBSD's divert sockets even though they are pretty different and will become even more with time. discusses with many, but mainly reyk@ canacar@ deraadt@ dlg@ claudio@ beck@ tested by reyk@ and myself ok reyk@ claudio@ beck@ manpage help and ok by jmc@
2009-09-08I had not enough oks to commit this diff.Michele Marchetto
Sorry.
2009-09-08Add support for divert sockets. They allow you to:Michele Marchetto
- queue packets from pf(4) to a userspace application - reinject packets from the application into the kernel stack. The divert socket can be bound to a special "divert port" and will receive every packet diverted to that port by pf(4). The pf syntax is pretty simple, e.g.: pass on em0 inet proto tcp from any to any port 80 divert-packet port 8000 test, bugfix and ok by reyk@ manpage help and ok by jmc@ no objections from many others.
2009-08-23revert the icmp error diff again (r1.167-1.169)David Krause
seems to be causing some kind of memory corruption after several hours of heavy IPsec traffic. connections start becoming very slow eventually leading to all IPsec packets being lost. a reboot solves the issue for several more hours before it appears again.
2009-08-20fix indentationAlexander Bluhm
no binary change; ok grunk@
2009-08-12don't confuse chars with strings; ok oga@Martynas Venckus
2009-08-107 years ofHenning Brauer
#if 1 reasonable #else bullshit required by some committee #endif are enough. theo ok
2009-08-10we need to null mcopy, gotos bite. theo and i both missed them, theo okHenning Brauer
2009-08-10fix previous:Henning Brauer
-m_copydata istead of straight bcopy. noticed by damien -handle the pretty much impossible case that the packet header grows so much that MHLEN < 68. i bet this had been the least of our worries, in that case, but code oughta be correct anyway. ok theo and dlg
2009-08-10this is basically a fixed version of r1.165, avoid m_copym of each and everyHenning Brauer
forwarded packet in case ip_output returns an error and we have to quote some of it back in an icmp error message. this implementation done from scratch: place an mbuf on the stack. copy the pkthdr from the forwarded packet and the first 68 bytes of payload. if we need to send an icmp error, just m_copym our mbuf-on-the-stack into a real one that icmp_error can fuck with and eat as it desires. ok theo dlg
2009-08-10sockets created via a listening socket lose the rdomain and fail to workClaudio Jeker
therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
2009-08-09once again ipsec tries to be clever and plays fast, this time byHenning Brauer
recycling an mbuf tag and changing its type. just always get a new one. theo ok
2009-08-01timeout_add -> timeout_add_msecBret Lambert
ok michele@ claudio@
2009-07-28revert the avoidance of the mbuf copy for the icmp errors (r1.165)David Gwynne
some greater care must be taken to ensure the mbuf generated for icmp errors is a good copy.
2009-07-27Define the common DiffServ Codepoints so pf(4) can use them.Claudio Jeker
Agreed by mcbride@, sthen@ and henning@
2009-07-26no need to cast the return value of m_freem() to voidThordur I. Bjornsson
as its a void function. ok claudio@
2009-07-24for every packet we forwarded, we copied the first 68 bytes of it in caseDavid Gwynne
ip_output failed and we had to generate an icmp packet. since ip_output frees the mbuf we give it, we copied the original into a new mbuf. if ip_output succeeded, we threw the copy away. the problem with this is that copying the mbuf is about a third of the cost of ip_forward. this diff copies the data we might need onto the stack, and only builds the mbuf for the icmp error if it actually needs it, ie, if ip_output fails. this gives a noticable improvement in pps for forwarded traffic. ok claudio@ markus@ henning@ tested by markus@ and by me in production for several days at work
2009-07-13Get rid of the token bucket filter.Michele Marchetto
Traffic shaping code should not be inside routing code. If you want to rate-limit use altq instead. ok claudio@ henning@ dlg@
2009-07-09Use MAXTTL instead of the hardcoded value.Michele Marchetto
2009-06-17Correctly handle the carp demote counter in all input cases.Marco Pfatschbacher
E.g. give up the MASTER status if there's a host with a lower demote count, even if it has a higher advskew. At the moment this shouldn't cause any change, but this is a first step towards the removal of the "bump the advskew to 240 in case of errors" hack, without breaking backward compatibility. OK henning@
2009-06-09By default, don't accept IPv4 ICMP redirects. This behaviour can beStuart Henderson
changed with a sysctl, so note it in sysctl.conf. v6 needs further testing following discussions on the tech mailing list; rainer@ points out possible interactions with neighbour discovery which need to be investigated first. "go ahead on the v4 part" deraadt@
2009-06-08remove stray * from comment, probably a rewrapping artefactStuart Henderson
2009-06-05Initial support for routing domains. This allows to bind interfaces toClaudio Jeker
alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any doamin at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks accross net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
2009-06-04the decision on wether a packet is to be delivered locally or forwardedHenning Brauer
is pretty expensive, the more the more addresses are configured locally, since we walk a list. when pf is on and we have a state key pointer, and that state key is linked to another state key, we know for sure this is not local. when it has a link to a pcb, it certainly goes to the local codepath. on a box with 1000 adresses forwarding 3 times as fast as before. theo ok
2009-06-03add the basic infrastructure to take advantage of TCP and UDP receiveChristian Weisgerber
checksum offload over IPv6; ok deraadt@
2009-06-02Shuffle function declarations a bit; ipsp_kern doesn't actually exist,Bret Lambert
and tdb_hash is only used in ip_ipsp.c, so there's no need to declare it as extern in ip_ipsp.h ok claudio@ henning@
2009-06-02do the pf_pkt_addr_changed(m) magic just like gif etcHenning Brauer
tested by Manuel Rodriguez Morales <marodriguez at grupogdt.com>
2009-06-02satosin was already defined in in.h, no need to redefine it hereBret Lambert
ok claudio@
2009-06-020 -> NULLBret Lambert
ok claudio@
2009-06-02Fix an off-by-one in the ddb-only debugging function tdb_hashstats.Owain Ainsworth
when we check if a hash chain is over 15 long, we would access one past the end of the array. change the static array size to a define because it makes this checking easier to verify. Found by Parfait. ok deraadt@.
2009-05-18The routing table index rtableid has type unsigned int in the routingAlexander Bluhm
code. In pf rtableid == -1 means don't change the rtableid because of this rule. So it has to be signed int there. Before the value is passed from pf to route it is always checked to be >= 0. Change the type to int in pf and to u_int in netinet and netinet6 to make the checks work. Otherwise -1 may be used as an array index and the kernel crashes. ok henning@
2009-03-15Introduce splsoftassert(), similar to splassert() but for soft interruptMiod Vallat
levels. This will allow for platforms where soft interrupt levels do not map to real hardware interrupt levels to have soft ipl values overlapping hard ipl values without breaking spl asserts.
2009-02-16pfsync v5, mostly written at n2k9, but based on work done at n2k8.David Gwynne
WARNING: THIS BREAKS COMPATIBILITY WITH THE PREVIOUS VERSION OF PFSYNC this is a new variant of the protocol and a large reworking of the pfsync code to address some performance issues. the single largest benefit comes from having multiple pfsync messages of different types handled in a single packet. pfsyncs handling of pf states is highly optimised now, along with packet parsing and construction. huggz for beck@ for testing. huge thanks to mcbride@ for his help during development and for finding all the bugs during the initial tests. thanks to peter sutton for letting me get credit for this work. ok beck@ mcbride@ "good." deraadt@
2009-01-30When don't-fragment packets need to get fragemnted some code tries toClaudio Jeker
update the route specific MTU from the interface (because it could have changed in between). This only makes sense if we actually have a valid route but e.g. multicast traffic does no route lookup and so there is no route at all and we don't need to update anything. Hit by dlg@'s pfsync rewrite which already found 3 other bugs in the network stack and slowly makes us wonder how it worked in the first place. OK mcbride@ dlg@
2009-01-29Always zero the IP checksum field for packets and packet fragmentsChristian Weisgerber
being passed down if using HW checksum offload. From Brad, inspired by NetBSD/FreeBSD. ok markus@
2009-01-27In IPsec acquire mode, if the flow was configured for the "any"Alexander Bluhm
network 0.0.0.0/0 or ::/0, the SA was established for the IP address in the packet instead of the network in the flow. That means the SA was not negotiated for the network 0.0.0.0 with mask 0 but for the remote IP with mask 255.255.255.255. This SA did not match the flow and did not work. To differentiate between general flows that are used to trigger specific host-to-host SAs and flows for matching network SAs, the if condition only uses the ipo->ipo_dst field now. For a flow without peer, an SA must be negotiated for each host-to-host combination. Otherwise, if a peer exists at the flow, the kernel acquires one SA for the whole network. tested by todd@, ok hshoexer@, angelos@, todd@
2008-12-24Fix two mbuf leaks in arpresolve. The first one happens on IFF_NOARPClaudio Jeker
interfaces and is probably never hit. The other one happens when the number of packets on the arp hold queue is exceeded. If arpresolve() returns NULL the mbuf must be on the hold queue or freed. Fixes the mbuf leak seen by dlg@. Found with dlg@'s insane mbuf leak diff. OK dlg@
2008-12-24report the number of packets that arp resolution is holding onto until itDavid Gwynne
gets a mac addr for an ip under net.inet.ip.arpqueued. ok deraadt@