Age | Commit message (Collapse) | Author |
|
Technically the whole point of the stoeplitz API is that it's symmetric,
meaning that the order of addresses and ports doesn't matter and will produce
the same hash value.
Coverity CID 1501717
ok dlg@
|
|
could get stuck in an endless recursion during TCP path MTU discovery.
Create a dynamic host route in ip_output() that can be used by
tcp_mtudisc() to store the MTU.
Reported by Peter Mueller and Sebastian Sturm
OK claudio@
|
|
OK bluhm@, claudio@, deraadt@
|
|
a new object that is already refcounted, so carp attach does not
reach into internal structures. Add kasserts to detect counter
overflow or underflow.
OK mvs@
|
|
offloading. The checksum must be calculated in software. Use the
same condition in ether_resolve() to send the broadcast packet back
to the stack and in in_ifcap_cksum() to force software checksumming.
This fixes regress/sys/kern/sosplice/loop.
OK procter@
|
|
The kernel uses a huge amount of processing time for sending ACKs to the sender
on the receiving interface. After receiving a data segment, we send out two
ACKs. The first one in tcp_input() direct after receiving. The second ACK is
send out, after the userland or the sosplice task read some data out of the
socket buffer. Thus, we save some processing time and improve network
performance.
Longer tested by sthen@
OK claudio@
|
|
kernel make sure that the rdomain of that interface is the same as
the rdomain of the inpcb.
Problem spotted and fix tested by semarie@
OK bluhm@ mvs@
|
|
short TCP segments or fragments encapsulated in ESP instead of
fragmented ESP packets. Pass the don't fragment flag down along
the stack so that dynamic routes with MTU are created eventually.
with and OK markus@; OK tobhe@
|
|
OK deraadt@
|
|
drivers that implement rss and multiple rings depend on the symmetric
toeplitz code, and use it to generate a key that decides with rx
ring a packet lands on. if the toeplitz code is enabled, this diff
has the pcb and tcp layer use the toeplitz code to generate a flowid
for packets they send, which in turn is used to pick a tx ring.
because the nic and the stack use the same key, the tx and rx sides
end up with the same hash/flowid. at the very least this means that
the same rx and tx queue pair on a particular nic are used for both
sides of the connection. as the stack becomes more parallel, it
will also help keep both sides of the tcp connection processing in
the one place.
|
|
ok dlg@ bluhm@
|
|
|
|
struct ip_mreq or a struct ip_mreqn. Using struct ip_mreqn allows to
pass a interface index instead of specifying the multicast interface
via its IP address. This is also the API implemented by Linux and
FreeBSD and should help porting software.
OK bluhm@ phessler@ robert@
|
|
Relax input validation and use integer comparison.
OK kn@ mvs@ sthen@
|
|
code is copied from IPv4 and adapted. Some things are changed in
v4 to make it look similar.
- ip6_forward increases the noroute error counter, do that in
ip_forward, too.
- Pass more specific sockaddr_in6 to icmp6_mtudisc_clone().
- IPv6 may also use reject routes for IPsec PMTU clones.
- To pass a route_in6 to ip6_output_ipsec_send() introduce one in
ip6_forward(). That is the same what IPv4 does. Note
that dst and sin6 switch roles.
- Copy comments from ip_output_ipsec_send() to ip6_output_ipsec_send()
to make code similar.
- Implement dynamic IPv6 IPsec PMTU routes.
OK tobhe@
|
|
OK millert@
|
|
One case uses the explicit range from the code and the other was
inferred from reading the usage.
OK millert@
|
|
struct ip_mreqn allows to use the interface index to select the
interface for multicast packets which makes it possible to use
this with unnumbered interfaces.
OK dlg@ robert@
|
|
patch submitted by Ralf Horstmann from ackstorm.de
OK dlg@
|
|
Since revision 1.87 of ip_icmp.c icmp_mtudisc_clone() ignored reject
routes. Otherwise TCP would clone these routes for PMTU discovery.
They will not work, even after dynamic routing has found a better
route than the reject route.
With IPsec the use case is different. First you need a route, but
then the flow handles the packet without routing. Usually this
route should be a reject route to avoid sending unencrypted traffic
if the flow is missing. But IPsec needs this route for PMTU
discovery, so use it for that.
OK claudio@ tobhe@
|
|
RFC 4302 and RFC 4303). It seems this was changed by accident when support
for 64 bit sequence numbers was added.
ok bluhm@ patrick@
|
|
This eliminates the risk for IV reuse because of random collisions
and increases performance a little.
ok patrick@ markus@
|
|
|
|
divert_sysctl and divert6_sysctl get a tiny bit slimmer.
|
|
- Move most of the processing out of rtable.c (reasonnable tb@, ok bluhm@)
- Remove memory allocation, store pointer to existing ifaddr
- Fix tunnel interface handling
looks fine mpi@
|
|
Tested with multiple Window 10 Pro (ver 2004) clients, and OpenBSD+iked
as the server.
OK tobhe@ sthen@ kn@
|
|
Advised by bluhm@
|
|
OK deraadt
|
|
Based/previous work on an idea from deraadt@
Input from claudio@, djm@, deraadt@, sthen@
OK deraadt@
|
|
could use mbuf memory after freeing it. If m_pullup() allocates a
new mbuf, the caller uses the old pointer.
found and reported by Maxime Villard, thanks
OK claudio@ markus@ denis@
|
|
|
|
The best-guessed limits will be tested by trial.
|
|
The best-guessed limits will be tested by trial.
|
|
OK sashan
|
|
... these all look fine, derradt@
|
|
|
|
|
|
Thanks kettenis@ for pointing out.
ok kettenis@
|
|
This introduces bounds checks for many net.inet.tcp sysctl variables.
Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok derradt@
|
|
This replaces a piece of observationally identical code which was much
more complicated.
ok mpi@
|
|
RFC 4291 dropped this requirement from RFC 3513:
o An anycast address must not be used as the source address of an
IPv6 packet.
And from that requirement draft-itojun-ipv6-tcp-to-anycast rightly
concluded that TCP connections must be prevented.
The draft also states:
The proposed method MUST be removed when one of the following events
happens in the future:
o Restriction imposed on IPv6 anycast address is loosened, so that
anycast address can be placed into source address field of the IPv6
header[...]
OK jca
|
|
Reported by Peter J. Philipp.
ok mvs@ deraadt@
|
|
Range violations are now consistently reported as EOPNOTSUPP.
Previously they were mixed with ENOPROTOOPT.
OK kn@
|
|
ok kn
|
|
an uvm fault. Check that ifp0 is not NULL.
OK sashan@ mvs@
|
|
Zero-tick timeouts rely on implicit behavior in the timeout layer that
inhibits optimizations in softclock().
bluhm@ says waiting a tick for the reaper shouldn't break anything.
ok bluhm@
|
|
ok sashan@
|
|
the interface input handler lists were originally set up to help
us during the intial mpsafe network stack work. at the time not all
the virtual ethernet interfaces (vlan, svlan, bridge, trunk, etc)
were mpsafe, so we wanted a way to avoid them by default, and only
take the kernel lock hit when they were specifically enabled on the
interface. since then, they have been fixed up to be mpsafe.
i could leave the list in place, but it has some semantic problems.
because virtual interfaces filter packets based on the order they
were attached to the parent interface, you can get packets taken
away in surprising ways, especially when you reboot and netstart
does something different to what you did by hand. by hardcoding the
order that things like vlan and bridge get to look at packets, we
can document the behaviour and get consistency.
it also means we can get rid of a use of SRPs which were difficult
to replace with SMRs. the interface input handler list is an SRPL,
which we would like to deprecate. it turns out that you can sleep
during stack processing, which you're not supposed to do with SRPs
or SMRs, but SRPs are a lot more forgiving and it worked.
lastly, it turns out that this code is faster than the input list
handling, so lots of winning all around.
special thanks to hrvoje popovski and aaron bieber for testing.
this has been in snaps as part of a larger diff for over a week.
|
|
carp_input is only tried after vlan and bridge handling is done,
and after the ethernet packet doesnt match the parent interfaces
mac address.
this has been in snaps as part of a larger diff for over a week.
|
|
this is the first step in refactoring how ethernet frames are demuxed
by virtual interfaces, and also in deprecating interface input list
handling.
we now have drivers for three types of virtual bridges, bridge(4),
switch(4), and tpmr(4), and it doesn't make sense for any of them
to be enabled on the same "port" interfaces at the same time.
currently you can add a port interface to multiple types of bridge,
but which one gets to steal the packets depends on the order in
which they were attached.
this creates an ether_brport structure that holds an input function
for the bridge, and optionally some per port state that the bridge
can use. arpcom has a single pointer to one of these structs that
will be used during normal ether_input processing to see if a packet
should be passed to a bridge, and will be used instead of an if
input handler. because it is a single pointer, it will make sure
only one bridge of any type is attached to a port at any one time.
this has been in snaps as part of a larger diff for over a week.
|