summaryrefslogtreecommitdiff
path: root/sys/netinet/ip_input.c
AgeCommit message (Collapse)Author
2011-07-06cosnistently use IFQ_SET_MAXLEN, surfaced in a discussion with + ok bluhmHenning Brauer
2011-07-05ansifyDavid Hill
ok claudio@
2011-07-04Bye bye pf_test6(). Only one pf_test function for both IPv4 and v6.Claudio Jeker
The functions were 95% identical anyway. While there use struct pf_addr in struct pf_divert instead of some union which is the same. OK bluhm@ mcbride@ and most probably henning@ as well
2011-06-15Add IP_RECVRTABLE socket option to be used with a IPPROTO_IPMike Belopuhov
level that allows one to retrieve the original routing domain of UDP datagrams diverted by the pf via "divert-to" with a recvmsg(2). ok claudio
2011-04-19reintroduce using the RB tree for local address lookups. this isDavid Gwynne
confusing because both addresses and broadcast addresses are put into the tree. there are two types of local address lookup. the first is when the socket layer wants a local address, the second is in ip_input when the kernel is figuring out the packet is for it to process or forward. ip_input considers local addresses and broadcast addresses as local, however, the handling of broadcast addresses is different depending on whether ip_directedbcast is set. if if ip_directbcast is unset then a packet coming in on any interface to any of the systems broadcast addresses is considered local, otherwise the broadcast packet must exist on the interface it was received on. the code also needs to consider classful broadcast addresses so we can continue some legacy applications (eg, netbooting old sparcs that use rarp and bootparam requests to classful broadcast addresses as per PR6382). this diff maintains that support, but restricts it to packets that are broadcast on the link layer (eg, ethernet broadcasted packets), and it only looks up addresses on the local interface. we now only support classful broadcast addresses on local interfaces to avoid weird side effects with packets routed to us. the ip4 socket layer does lookups for local addresses with a wrapper around the global address tree that rejects matches against broadcast addresses. we now no longer support bind sockets to broadcast addresses, no matter what the value of ip_directedbcast is. ok henning@ testing (and possibly ok) claudio@
2011-04-14Backout the in_iawithaddr() -> ifa_ifwithaddr() change.Claudio Jeker
There is a massive issue with broadcast addrs because ifa_ifwithaddr() handles them differently then in_iawithaddr().
2011-04-04The forced IP header pullup in the multicast case is only needed whenClaudio Jeker
the system is a multicast forwarder so move the code into that block and save a few unneeded m_pullups. Found by dlg a long time ago. OK dlg@
2011-04-04make in_iawithaddr a wrapper for ifa_ifwithaddr plus a hack for old ancientHenning Brauer
classful broadcast so we can still netboot sparc and the like. compat hack untested, i will deal with the fallout if there is any later at the same time stop exporting in_iawithaddr, everything but ip_input should (and now does) use ifa_ifwithaddr directly ok dlg sthen and agreement from many
2011-04-02rmeove the link1 hack, it is in the way, it is only half-baked and doesn'tHenning Brauer
work as you think it does, and the same can easily be achieved using pf ok claudio dlg sthen theo
2011-02-11In ip_forward() free the mbuf chain mcopy with m_freem() insteadAlexander Bluhm
of m_free(). The was no leak before as m_copym() and m_pullup() are always called with the same length. But it is better to use the correct function anyway. ok henning@ mpf@ markus@
2011-02-03ip_ttl is u_int8_t, not u_char so adjust sizeof for consistency.Todd C. Miller
No binary change. OK otto@
2010-09-08Return EACCES when pf_test() blocks a packet in ip_output(). This allowsClaudio Jeker
ip_forward() to know the difference between blocked packets and those that can't be forwarded (EHOSTUNREACH). Only in the latter case an ICMP should be sent. In the other callers of ip_output() change the error back to EHOSTUNREACH since userland may not expect EACCES on a sendto(). OK henning@, markus@
2010-08-20white space fixDavid Gwynne
2010-07-09Add support for using IPsec in multiple rdomains.Reyk Floeter
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1. Test by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain. ok claudio@ naddy@
2010-06-07unfortunately classful routing isn't 100% dead, mostly thanks to ancientHenning Brauer
netboot methods using rarp, thus only learning their IP address without mask. And of course the next step is a broadcast - which goes to the broadcast address calculated classful. *sigh*. PR6382 instead of storing a second broadcast address per ifaddr as we used to figure out wether we're dealing with a classful broadcast on the fly. the math is extremely cheap and all my previous profilings showed that cpu cycles are basically free, we're constrained by memory access. excellent analysis by Pascal Lalonde <plalonde at overnet.qc.ca> who also submitted the PR. claudio ok
2010-06-04Missed this file in previous commit; previous commit message was:Bret Lambert
rt_timer_queue_destroy() did not actually destroy, leading to a potential memory leak due to misleading nomenclature. Change it to actually destroy, not just clean, the the rt_timer_queue passed to it and adjust the correct caller accordingly (i.e., no need to free the mem on our own now). As a bonus, this gets rid of one of the ridiculous R_Malloc/Bzero/Free cycles, and lets us sneak another bzero -> M_ZERO conversion in. ok claudio@
2010-05-07Start cleaning up the mess called rtalloc*. Kill rtalloc2, make rtalloc1Claudio Jeker
accept flags for report and nocloning. Move the rtableid into struct route (with a minor twist for now) and make a few more codepathes rdomain aware. Appart from the pf.c and route.c bits the diff is mostly mechanical. More to come... OK michele, henning
2010-04-20remove proc.h include from uvm_map.h. This has far reaching effects, asTed Unangst
sysctl.h was reliant on this particular include, and many drivers included sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed. ok deraadt
2010-01-13no point in looking for the old "all host bits zero" broadcast address anyHenning Brauer
more here either
2010-01-13we don't need broadcast for the classful network AND broadcast for theHenning Brauer
subnet of the classful network. at least, not since 1992. ok mpf dlg bob
2010-01-13let's admit it's not 1992 any more. CIDR is around for a long time, evenHenning Brauer
that router vendor doesn't default to classful routing any more, and there really is no point in having a classful netmask and a subnetmask to split it. we still do classful guesses on the netmask if it isn't supplied by userland, but that's about it. i decided to keep ia_netmask and kill ia_subnetmask which makes this diff bigish, the classful ia_netmask wasn't really used all that much. the real changes are in in.c, the rest is mostly s/ia_subnetmask/ia_netmask. ok claudio dlg ryan
2009-12-07do not forward and drop packets with M_MCAST flag set in ip_forward()Joerg Goltermann
ok henning@, claudio@ "I think this should go in"
2009-11-19avoid overflow since protos > IPPROTO_MAX exist. From FreeBSD withOtto Moerbeek
a twist; ok millert@ kettenis@
2009-11-03rtables are stacked on rdomains (it is possible to have multiple routingClaudio Jeker
tables on top of a rdomain) but until now our code was a crazy mix so that it was impossible to correctly use rtables in that case. Additionally pf(4) only knows about rtables and not about rdomains. This is especially bad when tracking (possibly conflicting) states in various domains. This diff fixes all or most of these issues. It adds a lookup function to get the rdomain id based on a rtable id. Makes pf understand rdomains and allows pf to move packets between rdomains (it is similar to NAT). Because pf states now track the rdomain id as well it is necessary to modify the pfsync wire format. So old and new systems will not sync up. A lot of help by dlg@, tested by sthen@, jsg@ and probably more OK dlg@, mpf@, deraadt@
2009-08-23revert the icmp error diff again (r1.167-1.169)David Krause
seems to be causing some kind of memory corruption after several hours of heavy IPsec traffic. connections start becoming very slow eventually leading to all IPsec packets being lost. a reboot solves the issue for several more hours before it appears again.
2009-08-107 years ofHenning Brauer
#if 1 reasonable #else bullshit required by some committee #endif are enough. theo ok
2009-08-10we need to null mcopy, gotos bite. theo and i both missed them, theo okHenning Brauer
2009-08-10fix previous:Henning Brauer
-m_copydata istead of straight bcopy. noticed by damien -handle the pretty much impossible case that the packet header grows so much that MHLEN < 68. i bet this had been the least of our worries, in that case, but code oughta be correct anyway. ok theo and dlg
2009-08-10this is basically a fixed version of r1.165, avoid m_copym of each and everyHenning Brauer
forwarded packet in case ip_output returns an error and we have to quote some of it back in an icmp error message. this implementation done from scratch: place an mbuf on the stack. copy the pkthdr from the forwarded packet and the first 68 bytes of payload. if we need to send an icmp error, just m_copym our mbuf-on-the-stack into a real one that icmp_error can fuck with and eat as it desires. ok theo dlg
2009-07-28revert the avoidance of the mbuf copy for the icmp errors (r1.165)David Gwynne
some greater care must be taken to ensure the mbuf generated for icmp errors is a good copy.
2009-07-24for every packet we forwarded, we copied the first 68 bytes of it in caseDavid Gwynne
ip_output failed and we had to generate an icmp packet. since ip_output frees the mbuf we give it, we copied the original into a new mbuf. if ip_output succeeded, we threw the copy away. the problem with this is that copying the mbuf is about a third of the cost of ip_forward. this diff copies the data we might need onto the stack, and only builds the mbuf for the icmp error if it actually needs it, ie, if ip_output fails. this gives a noticable improvement in pps for forwarded traffic. ok claudio@ markus@ henning@ tested by markus@ and by me in production for several days at work
2009-06-05Initial support for routing domains. This allows to bind interfaces toClaudio Jeker
alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any doamin at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks accross net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
2009-06-04the decision on wether a packet is to be delivered locally or forwardedHenning Brauer
is pretty expensive, the more the more addresses are configured locally, since we walk a list. when pf is on and we have a state key pointer, and that state key is linked to another state key, we know for sure this is not local. when it has a link to a pcb, it certainly goes to the local codepath. on a box with 1000 adresses forwarding 3 times as fast as before. theo ok
2009-05-18The routing table index rtableid has type unsigned int in the routingAlexander Bluhm
code. In pf rtableid == -1 means don't change the rtableid because of this rule. So it has to be signed int there. Before the value is passed from pf to route it is always checked to be >= 0. Change the type to int in pf and to u_int in netinet and netinet6 to make the checks work. Otherwise -1 may be used as an array index and the kernel crashes. ok henning@
2008-12-24report the number of packets that arp resolution is holding onto until itDavid Gwynne
gets a mac addr for an ip under net.inet.ip.arpqueued. ok deraadt@
2008-06-08alloc ipq's for fragment reassembly from a pool instead of usingThordur I. Bjornsson
malloc(); ok henning@ some time ago
2008-05-09divert packets to local socket without modifying the ip header;Markus Friedl
makes transparent proxies much easier; ok beck@, feedback claudio@
2008-04-24the softnet intr handlers check if the input queue has packets onDavid Gwynne
it by reading the queues head pointer. if that pointer is not null then it takes splnet and dequeues a packet for handling. this is bad because the ifqueue head is modified at splnet and the sofnet handlers read it without holding splnet. this removes that check of the head pointer and simply checks if the dequeue gave us a packet or not before proceeding. found while reading mpls code. discussed with norby@ and henning@ ok mcbride@ henning@
2008-02-05Move carp load balancing (ARP/IP) to a simpler configuration scheme.Marco Pfatschbacher
Instead of using the same IP on multiple interfaces, carp has to be configured with the new "carpnodes" and "balancing" options. # ifconfig carp0 carpnodes 1:0,2:100,3:100 balancing ip carpdev sis0 192.168.5.50 Please note, that this is a flag day for anyone using carp balancing. You'll need to adjust your configuration accordingly. Addititionally this diff adds IPv6 NDP balancing support. Tested and OK mcbride@, reyk@. Manpage help by jmc@.
2007-12-14add sysctl entry points into various network layers, in particular toTheo de Raadt
provide netstat(1) with data it needs; ok claudio reyk
2007-12-13implement sysctls to report IP, TCP, UDP, and ICMP statistics andReyk Floeter
change netstat to use them instead of accessing kvm for it. more protocols will be added later. discussed with deraadt@ claudio@ gilles@ ok deraadt@
2007-10-29MALLOC/FREE -> malloc/freeCharles Longeau
ok krw@
2007-09-10Remove the ipq locking, it isn't strictly needed right nowThordur I. Bjornsson
and is actually wrong in some cases, since we can enter functions without taking the lock because the return value of ipq_lock() isn't checked properly. However, this needs to be revisited when we start calling ip_drain() from the pool code when we are running out of memory, but this isn't done currently. OK art@, henning@
2007-09-01since theHenning Brauer
MGET* macros were changed to function calls, there wasn't any need for the pool declarations and the inclusion of pool.h From: tbert <bret.lambert@gmail.com>
2007-05-30no need to declare extern ipsec_in_use, we get it via ip_ipsp.hHenning Brauer
found by itojun
2007-05-29gain another 5+% in ip forwarding performance.Henning Brauer
boring details: skip looking for ipsec tags and descending into ip_spd_lookup if there are no ipsec flows, except in one case in ip_output (spotted by markus) where we have to if we have a pcb. ip_spd_lookup has the shortcut already, but there is enough work done before so that skipping that gains us about 5%. ok theo, markus
2007-05-28double pf performance.Henning Brauer
boring details: pf used to use an mbuf tag to keep track of route-to etc, altq, tags, routing table IDs, packets redirected to localhost etc. so each and every packet going through pf got an mbuf tag. mbuf tags use malloc'd memory, and that is knda slow. instead, stuff the information into the mbuf header directly. bridging soekris with just "pass" as ruleset went from 29 MBit/s to 58 MBit/s with that (before ryan's randomness fix, now it is even betterer) thanks to chris for the test setup! ok ryan ryan ckuethe reyk
2007-05-27-static on appropriate functionsDavid Gwynne
2007-03-18Add IP load balancing support for carp(4).Marco Pfatschbacher
This provides a similar functionality as ARP balancing, but also works for traffic that comes across routers. IPv6 is supported as well. The configuration scheme will change as soon we have sth better. Also add support for changing the MAC address on carp(4) interfaces. (code from mcbride) Tested by pyr@ and reyk@ OK mcbride@
2006-12-28check if ifqueue has anything queued before doing the dance ofTheo de Raadt
splnet/IF_DEQUEUE/splx; ok various people