path: root/sys/netinet/tcp_input.c
Age  Commit message  Author
2014-04-14  "struct pkthdr" holds a routing table ID, not a routing domain one.  (Martin Pieuchot)
Avoid the confusion by using an appropriate name for the variable. Note that since routing domain IDs are a subset of the set of routing table IDs, the following idiom is correct:
    rtableid = rdomain
But to get the routing domain ID corresponding to a given routing table ID, you must call rtable_l2(9). claudio@ likes it, ok mikeb@
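The subset relationship described in this message can be modelled in a few lines of userland C. This is only a sketch under stated assumptions: the table `rtable_to_rdomain`, the helper `rtable_l2_sketch()` and the predicate `is_rdomain_sketch()` are hypothetical names invented for illustration, and returning 0 for an unknown ID is a choice made for this sketch. It is not the kernel's rtable_l2(9).

```c
#include <stddef.h>

/*
 * Hypothetical mapping: index = routing table ID, value = ID of the
 * routing domain that table belongs to.  Tables 0 and 1 are routing
 * domains themselves; tables 2 and 3 are stacked on domain 1.
 */
static const unsigned int rtable_to_rdomain[] = { 0, 1, 1, 1 };
#define SKETCH_NRTABLES \
	(sizeof(rtable_to_rdomain) / sizeof(rtable_to_rdomain[0]))

/* Map a routing table ID to its routing domain ID (0 if unknown here). */
unsigned int
rtable_l2_sketch(unsigned int rtableid)
{
	if (rtableid >= SKETCH_NRTABLES)
		return 0;
	return rtable_to_rdomain[rtableid];
}

/* A routing domain ID is a table ID that maps to itself. */
int
is_rdomain_sketch(unsigned int rtableid)
{
	return rtableid < SKETCH_NRTABLES &&
	    rtable_to_rdomain[rtableid] == rtableid;
}
```

In this model, assigning a domain ID to a table ID variable is always valid, while the reverse direction needs the lookup: table 3 resolves to domain 1.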
2014-01-24  clearing the _CSUM_IN_OK flags is now utterly pointless, was only done for  (Henning Brauer)
statistics side effects before. ok lteo naddy
2014-01-23  since the cksum rewrite the counters for hardware checksummed packets  (Henning Brauer)
are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
2014-01-07  Propagate an rdomain number to the nd6_lookup independently from  (Mike Belopuhov)
the ifp pointer which can be NULL. This prevents a crash reported by David Hill <dhill at mindcry ! org>. OK bluhm
2013-10-20  Put a large chunk of the IPv6 rdomain support in-tree.  (Peter Hessler)
Still some important missing pieces, and this is not yet enabled. OK bluhm@
2013-09-06  In one core dump the pointers to socket, inpcb, tcpcb on the stack  (Alexander Bluhm)
of tcp_input() and tcp_output() were very inconsistent. Especially the so->so_pcb is NULL which can only happen after the inp has been detached. The whole issue looks similar to the old panic: pool_do_get(inpcbpl): free list modified. http://marc.info/?l=openbsd-bugs&m=132630237316970&w=2 To get more information, add some asserts that guarantee the consistency of the socket, inpcb, tcpcb linking. They should trigger when an inp is taken from the pcb hashes after it has been freed. OK henning@
2013-08-13  When net.inet.ip.sourceroute is enabled, store the source route  (Martin Pieuchot)
of incoming IPv4 packets with the SSRR or LSRR header option in a m_tag rather than in a single static entry. Use a new m_tag type, PACKET_TAG_SRCROUTE, for this and bump PACKET_TAG_MAXSIZE accordingly. Adapted from FreeBSD r135274 with inputs from bluhm@. ok bluhm@, mikeb@
2013-07-31  Move bridge_broadcast and subsequently all IPsec SPD lookup code out  (Mike Belopuhov)
of the IPL_NET. pf_test should be no longer called under IPL_NET as well. The problem became evident after the related issue was brought up by David Hill <dhill at mindcry ! org>. With input from and OK mpi. Tested by David and me.
2013-07-01  The reverse parameter of in_pcblookup_listen() is a boolean and not  (Alexander Bluhm)
a flag. Rename the variable inpl_flags in tcp_input() to inpl_reverse like in udp_input(). No binary change. OK mikeb@
2013-06-20  Always make sure that the temporary TCP protocol control block  (Mike Belopuhov)
structure is zeroed out before use. From David Hill <dhill at mindcry ! org>; ok blambert claudio henning
2013-06-09  Increment udpstat.udps_nosec and tcpstat.tcps_rcvnosec in case packet is  (YASUOKA Masahiko)
dropped by IPsec security policy. input from and ok mikeb
2013-06-03  Link pf states and socket inpcbs together more tightly. The linking  (Alexander Bluhm)
was only done when a packet traveled up the stack from pf to tcp_input(). Now also link the state and inpcb when the packet is going down from tcp_output() to pf. As a consequence, divert-reply states where the initial SYN does not get an answer, can be handled more correctly. This change is part of a larger diff that has been backed out in 2011. Bring the feature back in small steps to see when bad things start to happen. OK henning deraadt
2013-06-03  Merge the duplicate IPv4 and IPv6 checksum checking code in tcp_input()  (Alexander Bluhm)
into one block. OK mpi@
2013-04-10  Remove various external variable declarations from source files and  (Martin Pieuchot)
move them to the corresponding header with an appropriate comment if necessary. ok guenther@
2013-04-02  Use macros sotoinpcb() and intotcpcb() instead of casts. Use NULL  (Alexander Bluhm)
instead of 0 for pointers. No binary change. OK mpi@
2013-03-29  Declare struct pf_state_key in the mbuf and in_pcb header files to  (Alexander Bluhm)
avoid ugly casts. OK krw@ tedu@
2013-03-28  code that calls timeout functions should include timeout.h  (Ted Unangst)
slipped by on i386, but the zaurus doesn't automagically pick it up. spotted by patrick
2013-03-14  tedu faith(4), suggested by todd@ some weeks ago after a submission by  (Martin Pieuchot)
dhill. ok krw@, mikeb@, tedu@ (implicit)
2013-01-17  After finding the socket's inp by using the pf's statekey, reset  (Alexander Bluhm)
the pointer to the statekey in the mbuf. When an UDP socket is spliced, pf would use this key during ip_output() although the packet went through two sockets in the meantime. Reset the mbuf's statekey in tcp_input() and udp_input() to eliminate the pointer to pf lingering in the socket buffers. OK claudio@
2013-01-17  first or second coming, commie or not commie, one m in coming is sufficient  (Henning Brauer)
ok claudio
2012-07-16  add IP_IPSECFLOWINFO option to sendmsg() and recvmsg(), so npppd(4)  (Markus Friedl)
can use this to select the IPsec tunnel for sending L2TP packets. this fixes Windows (always binding to 1701) and Android clients (negotiating wildcard flows); feedback mpf@ and yasuoka@; ok henning@ and yasuoka@; ok jmc@ for the manpage
2012-03-10  Increase TCP's initial window to 10 * MSS or 14600 bytes as proposed in  (Claudio Jeker)
draft-ietf-tcpm-initcwnd. net.inet.tcp.rfc3390 defaults to 2 now which uses the 10*MSS, setting it back to 1 brings back the old default of 4*MSS. OK sperreault@, henning@, sthen@, markus@
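The two window sizes mentioned here follow the formulas from RFC 3390 and from the initcwnd draft (later RFC 6928). The sketch below only illustrates that arithmetic. The function name `initial_window_sketch()` is hypothetical, the `rfc3390` parameter merely mirrors the sysctl values named in the message, and the fallback of one segment for any other value is an assumption of this sketch, not a statement about the kernel.

```c
static unsigned int
sk_min(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

static unsigned int
sk_max(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/*
 * Initial window in bytes for a given MSS.
 *   rfc3390 == 2: min(10 * MSS, max(2 * MSS, 14600))  (initcwnd draft)
 *   rfc3390 == 1: min( 4 * MSS, max(2 * MSS,  4380))  (RFC 3390)
 *   otherwise   : one segment (assumption made for this sketch)
 */
unsigned int
initial_window_sketch(unsigned int mss, int rfc3390)
{
	switch (rfc3390) {
	case 2:
		return sk_min(10 * mss, sk_max(2 * mss, 14600));
	case 1:
		return sk_min(4 * mss, sk_max(2 * mss, 4380));
	default:
		return mss;
	}
}
```

With a typical Ethernet MSS of 1460 bytes, the larger setting yields 14600 bytes and the older one 4380 bytes, which is where the "10 * MSS or 14600 bytes" wording comes from.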
2011-10-15  Respect the ToS setting in tcp syn+ack for IPv4, still need to fix for  (Christiano F. Haesbaert)
IPv6. ok claudio@
2011-05-13  Revert the pf->socket linking diff.  (Owain Ainsworth)
at least krw@, pirofti@ and todd@ have been seeing panics (todd and krw with xxxterm not sure about pirofti) involving pool corruption while using this commit. krw and todd confirm that this backout fixes the problem. ok blambert@ krw@, todd@ henning@ and kettenis@
Double link between pf states and sockets. Henning has already implemented half of it. The additional part is:
- The pf state lookup for outgoing packets is optimized by using mbuf->inp->state.
- For incoming tcp, udp, raw, raw6 packets the socket lookup always is optimized by using mbuf->state->inp.
- All protocols establish the link for incoming packets.
- All protocols set the inp in the mbuf for outgoing packets. This allows the linkage beginning with the first packet for outgoing connections.
- In case of divert states, delete the state when the socket closes. Otherwise new connections could match on old states instead of being diverted to the listen socket.
ok henning@
2011-05-04  Clean up gotos for listening sockets to make it obvious when packets  (Bret Lambert)
are dropped and when normal program flow occurs. Change error return value of syn_cache_add() from 0 to -1 in order to clearly communicate intent. ok claudio@
2011-04-29  In certain failure cases, a RST would be sent out on rdomain 0,  (Bret Lambert)
regardless of the rdomain the packet was received on. Explicitly pass the rdomain to the tcp_respond() monstrosity to compensate for said monstricism which led to this behavior. ok claudio@
2011-04-28  Make in_broadcast() rdomain aware. Mostly mechanical change.  (Claudio Jeker)
This fixes the problem of binding sockets to broadcast IPs in other rdomains. OK henning@
2011-04-24  Double link between pf states and sockets. Henning has already  (Alexander Bluhm)
implemented half of it. The additional part is:
- The pf state lookup for outgoing packets is optimized by using mbuf->inp->state.
- For incoming tcp, udp, raw, raw6 packets the socket lookup always is optimized by using mbuf->state->inp.
- All protocols establish the link for incoming packets.
- All protocols set the inp in the mbuf for outgoing packets. This allows the linkage beginning with the first packet for outgoing connections.
- In case of divert states, delete the state when the socket closes. Otherwise new connections could match on old states instead of being diverted to the listen socket.
ok henning@
2011-04-12  put the accepted socket of a diverted connection into the routing domain  (Mike Belopuhov)
of a connection originator. this allows one to query the source rdomain with a SO_RTABLE socket option. figured out with reyk, ok claudio.
2011-04-05  Replace if/else ladder with much more legible switch statement for  (Bret Lambert)
testing tcp flags. ok henning@ claudio@
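The kind of rewrite this message describes might look roughly like the following. The log does not show the actual cases in tcp_input(), so everything here is a hypothetical illustration: the `SK_`-prefixed names, the `sk_action` enum and the three-way dispatch are invented for the sketch. Only the flag bit values match the usual TH_SYN, TH_RST and TH_ACK definitions.

```c
/* Flag bits with the usual TH_* values, renamed to avoid clashes. */
#define SK_TH_SYN 0x02
#define SK_TH_RST 0x04
#define SK_TH_ACK 0x10

enum sk_action { SK_DROP, SK_NEW_CONN, SK_COMPLETE_HANDSHAKE };

/*
 * Hypothetical dispatch for a segment arriving on a listening socket.
 * Masking first and switching on the result replaces a chain of
 * "if (flags & A) ... else if (flags & B) ..." tests.
 */
enum sk_action
sk_listen_dispatch(unsigned int tiflags)
{
	switch (tiflags & (SK_TH_SYN | SK_TH_ACK | SK_TH_RST)) {
	case SK_TH_SYN:
		return SK_NEW_CONN;		/* plain SYN */
	case SK_TH_ACK:
		return SK_COMPLETE_HANDSHAKE;	/* plain ACK */
	default:
		return SK_DROP;			/* any other combination */
	}
}
```

Each flag combination appears exactly once as a case label, so the handled and dropped combinations are visible at a glance.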
2011-04-04  turn some macros into functions; saves 1400+ bytes from the kernel  (Bret Lambert)
on amd64 ok claudio@
2011-04-04  Instead of calling tcp_reass (tcp reassembly) with magic arguments  (Bret Lambert)
in order to skip most of the reassembly logic and try to flush available tcp segments to the socket, just split it off into its own function and use it where appropriate. ok claudio@ henning@
2011-04-04  change an if statement to a switch to reduce eye bleedage  (Bret Lambert)
no change in .o md5 "ok gcc" claudio@
2011-01-07  Add socket option SO_SPLICE to splice together two TCP sockets.  (Alexander Bluhm)
The data received on the source socket will automatically be sent on the drain socket. This allows writing relay daemons with zero data copy. ok markus@
2010-09-29  Initialize the ts_recent (received timestamp) field in the newly created  (Claudio Jeker)
socket from the information we have in the syncache. Also bzero() the tcpcb that is passed to tcp_dooptions() just to be sure.
2010-09-29  It is not allowed to recalculate the window scale after the initial SYN.  (Claudio Jeker)
A session must stick to the rscale factor sent out in the SYN packet. Remove the bogus tcp_rscale() call which is done after a fully established session is returned from the syncache.
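The rule stated here, that the scale factor is fixed once the SYN has gone out, can be sketched as follows. This is a userland illustration, not the kernel's tcp_rscale(): the struct `sk_conn`, its fields and both function names are hypothetical. The constants 65535 and 14 are the unscaled maximum window and the maximum shift allowed by the window scale option.

```c
#define SK_TCP_MAXWIN		65535UL	/* largest unscaled window */
#define SK_TCP_MAX_WINSHIFT	14	/* largest permitted shift */

/* Smallest shift whose scaled window covers the receive buffer. */
unsigned int
sk_pick_rscale(unsigned long rcvbuf)
{
	unsigned int shift = 0;

	while (shift < SK_TCP_MAX_WINSHIFT &&
	    (SK_TCP_MAXWIN << shift) < rcvbuf)
		shift++;
	return shift;
}

/* Hypothetical per-connection state. */
struct sk_conn {
	unsigned int	request_r_scale;	/* shift advertised in our SYN */
	int		syn_sent;		/* non-zero once the SYN is out */
};

/*
 * Choose the scale only when the first SYN is sent.  Later calls, for
 * example after the receive buffer has grown, leave it untouched.
 */
void
sk_conn_send_syn(struct sk_conn *c, unsigned long rcvbuf)
{
	if (!c->syn_sent) {
		c->request_r_scale = sk_pick_rscale(rcvbuf);
		c->syn_sent = 1;
	}
}
```

For a 256 KB receive buffer the sketch picks a shift of 3, and a second call with a larger buffer does not change it.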
2010-09-29  Do not delay ACKs on connections using loopback interfaces. There is no  (Claudio Jeker)
reason to reduce the amount of ACKs sent and delayed ACKs have a very bad interaction with the large MTU of lo(4) and the fairly small socketbuffer size. In collaboration with andre@freebsd. OK deraadt@
2010-09-24  TCP send and recv buffer scaling.  (Claudio Jeker)
Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similarly to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org. Based on work by markus@ and djm@. OK dlg@, henning@, put it in deraadt@
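The receive-side idea (size the buffer from the delay * bandwidth product, but stop growing under memory pressure) can be illustrated with a little arithmetic. This is a sketch under assumptions: the function names, the units (bytes per second, microseconds) and the grow-only policy are choices made for the example, and only the 80% threshold comes from the message.

```c
/* Delay * bandwidth product in bytes. */
unsigned long
sk_bdp_bytes(unsigned long bytes_per_sec, unsigned long rtt_usec)
{
	/* Widen before multiplying so large rates do not overflow. */
	return (unsigned long)(((unsigned long long)bytes_per_sec *
	    rtt_usec) / 1000000ULL);
}

/* Back pressure: true once 80% or more of the pool is in use. */
int
sk_under_pressure(unsigned long used, unsigned long total)
{
	return (unsigned long long)used * 100 >=
	    (unsigned long long)total * 80;
}

/*
 * Next receive buffer size: grow towards the measured product, never
 * beyond max, and not at all while the pool is under pressure.
 */
unsigned long
sk_next_rcvbuf(unsigned long cur, unsigned long bdp, unsigned long max,
    int pressure)
{
	if (pressure || bdp <= cur)
		return cur;
	return bdp < max ? bdp : max;
}
```

A 100 Mbit/s path (12500000 bytes per second) with a 40 ms round trip gives a product of 500000 bytes, far above a fixed 64 KB buffer, which is consistent with the kind of throughput gain the message reports.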
2010-07-20  Switch some obvious network stack MAC comparisons from bcmp() to  (Matthew Dempsky)
timingsafe_bcmp(). ok deraadt@; committed over WPA.
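The point of a timing-safe comparison is that its running time depends only on the length, not on where the first differing byte sits. A minimal sketch of that technique is shown below. The name `sk_ct_bcmp()` is hypothetical and chosen to avoid clashing with the real timingsafe_bcmp(); it illustrates the idea and is not a copy of the library routine.

```c
#include <stddef.h>

/*
 * Compare n bytes without an early exit.  Differences are OR-ed into
 * an accumulator, so every byte is always examined.
 * Returns 0 if the buffers are equal, 1 otherwise.
 */
int
sk_ct_bcmp(const void *b1, const void *b2, size_t n)
{
	const unsigned char *p1 = b1, *p2 = b2;
	unsigned char diff = 0;

	for (; n > 0; n--)
		diff |= *p1++ ^ *p2++;
	return diff != 0;
}
```

A plain bcmp() or memcmp() may return as soon as a mismatch is found, which can leak how many leading bytes of a MAC were correct.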
2010-07-09  Add support for using IPsec in multiple rdomains.  (Reyk Floeter)
This allows running isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pick up the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) interface in addition to enc0 is required per rdomain, eg. enc1 rdomain 1. Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain. ok claudio@ naddy@
2010-07-03  Fix the naming of interfaces and variables for rdomains and rtables  (Philip Guenthe)
and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0. Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped. Written by claudio@, criticized^Wcritiqued by me
2010-03-11  unbreak the build with a custom kernel config including "pseudo-device  (Stuart Henderson)
faith 1", noticed by Andris Kadar. ok kettenis@ beck@
2010-01-15  Replace pool_get() + bzero() with pool_get(..., PR_ZERO).  (Charles Longeau)
With input from oga@ and krw@ ok oga@ krw@ thib@ markus@ mk@
2009-11-13  Extend the protosw pr_ctlinput function to include the rdomain. This is  (Claudio Jeker)
needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
2009-11-03  rtables are stacked on rdomains (it is possible to have multiple routing  (Claudio Jeker)
tables on top of a rdomain) but until now our code was a crazy mix so that it was impossible to correctly use rtables in that case. Additionally pf(4) only knows about rtables and not about rdomains. This is especially bad when tracking (possibly conflicting) states in various domains. This diff fixes all or most of these issues. It adds a lookup function to get the rdomain id based on a rtable id. Makes pf understand rdomains and allows pf to move packets between rdomains (it is similar to NAT). Because pf states now track the rdomain id as well it is necessary to modify the pfsync wire format. So old and new systems will not sync up. A lot of help by dlg@, tested by sthen@, jsg@ and probably more OK dlg@, mpf@, deraadt@
2009-08-20  fix indentation  (Alexander Bluhm)
no binary change; ok grunk@
2009-08-10  sockets created via a listening socket lose the rdomain and fail to work  (Claudio Jeker)
therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
2009-06-05  Initial support for routing domains. This allows to bind interfaces to  (Claudio Jeker)
alternate routing tables and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered, still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
2009-06-03  add the basic infrastructure to take advantage of TCP and UDP receive  (Christian Weisgerber)
checksum offload over IPv6; ok deraadt@
2008-11-02  Remove the M_ANYCAST6 mbuf flag by doing the detection all in ip6_input().  (Claudio Jeker)
M_ANYCAST6 was only used to signal tcp6_input() that it should drop the packet and send back icmp error. This can be done in ip6_input() without the need for a mbuf flag. Gives us back one slot in m_flags for possible future need. Looked at and some input by naddy@ and henning@. OK dlg@