Age | Commit message (Collapse) | Author |
|
produced ugly output. Move the function name and the newline into
the DPRINTF macro. This simplifies the debug statements.
OK tobhe@
|
|
i did test this, but i guess i was lucky. very lucky.
Coverity CID 1505114
|
|
this is in ip6_input_if just before ipv6_check returns the pointer
we end up using.
pointed out by bluhm@
|
|
this will allow these checks to be reused for ip packet inspection
in bridge, veb, and tpmr.
ok bluhm@ sashan@
|
|
are constant. Having more const makes MP review easier. More
pointers are mapped read-only in the kernel image.
OK deraadt@ mvs@
|
|
Because of this large ping packets where fragmented even if the MTU did
not indicate the need for it. This causes some trouble when system do
not expect to receive a fragmented answer from a system. One such case
is the automated link test from google routers before allowing to establish
a BGP peering session with them. In general PMTU problems should be an
issue from the past and if not it may be better to also break on ping
packets and not only for UDP and TCP. ICMP ping is normaly the first
tool in the admins toolbox to figure out network issues.
OK phessler@ florian@ bluhm@
|
|
`ps_rtableid' as atomic. This allows us to unlock setrtable(2).
ok claudio@ mpi@
|
|
functions are sysctl_int() and sysctl_rdint(). This brings us back
the 4.4BSD implementation. Then sysctl_int_bounded() builds the
magic for range checks on top. sysctl_bounded_arr() is a wrapper
around it to support multiple variables.
Introduce macros that describe the meaning of the magic boundary
values. Use these macros in obvious places.
input and OK gnezdo@ mvs@
|
|
|
|
ok gnezdo@ semarie@ mpi@
|
|
|
|
simplify the handling of the fragment list. Now the functions
ip_fragment() and ip6_fragment() always consume the mbuf. They
free the mbuf and mbuf list in case of an error and take care about
the counter. Adjust the code a bit to make v4 and v6 look similar.
Fixes a potential mbuf leak when pf_route6() called pf_refragment6()
and it failed. Now the mbuf is always freed by ip6_fragment().
OK dlg@ mvs@
|
|
the first cut of this diff was made with coccinelle using this spatch:
@rule@
type caddr_t;
expression m, off, len, cp;
@@
-m_copydata(m, off, len, (caddr_t)cp)
+m_copydata(m, off, len, cp)
i had fix it's opinionated idea of formatting by hand though, so
i'm not sure it was worth it.
ok deraadt@ bluhm@
|
|
ok deraadt@ dlg@
|
|
Technically the whole point of the stoeplitz API is that it's symmetric,
meaning that the order of addresses and ports doesn't matter and will produce
the same hash value.
Coverity CID 1501717
ok dlg@
|
|
via index is actually in the right rdomain for the socket.
OK bluhm@ mvs@
|
|
|
|
short TCP segments or fragments encapsulated in ESP instead of
fragmented ESP packets. Pass the don't fragment flag down along
the stack so that dynamic routes with MTU are created eventually.
with and OK markus@; OK tobhe@
|
|
|
|
drivers that implement rss and multiple rings depend on the symmetric
toeplitz code, and use it to generate a key that decides with rx
ring a packet lands on. if the toeplitz code is enabled, this diff
has the pcb and tcp layer use the toeplitz code to generate a flowid
for packets they send, which in turn is used to pick a tx ring.
because the nic and the stack use the same key, the tx and rx sides
end up with the same hash/flowid. at the very least this means that
the same rx and tx queue pair on a particular nic are used for both
sides of the connection. as the stack becomes more parallel, it
will also help keep both sides of the tcp connection processing in
the one place.
|
|
code is copied from IPv4 and adapted. Some things are changed in
v4 to make it look similar.
- ip6_forward increases the noroute error counter, do that in
ip_forward, too.
- Pass more specific sockaddr_in6 to icmp6_mtudisc_clone().
- IPv6 may also use reject routes for IPsec PMTU clones.
- To pass a route_in6 to ip6_output_ipsec_send() introduce one in
ip6_forward(). That is the same what IPv4 does. Note
that dst and sin6 switch roles.
- Copy comments from ip_output_ipsec_send() to ip6_output_ipsec_send()
to make code similar.
- Implement dynamic IPv6 IPsec PMTU routes.
OK tobhe@
|
|
associated IPv6 NDP entry is invalidated. Otherwise we end up with an
INCOMPLETE entry that can't be updated to STALE and REACHABLE by
neighbor advertisements and thus interrupting communication.
This is the same as arpinvalidate() for IPv4.
Guidance bluhm & claudio, fix proposed by claudio
OK claudio
|
|
packet. IPv6 still had the old EHOSTUNREACH code. Use the same
errno for dropped IPv6 packets as in IPv4.
OK kn@ phessler@ claudio@ florian@ sashan@
|
|
ok florian
|
|
|
|
divert_sysctl and divert6_sysctl get a tiny bit slimmer.
|
|
- Move most of the processing out of rtable.c (reasonnable tb@, ok bluhm@)
- Remove memory allocation, store pointer to existing ifaddr
- Fix tunnel interface handling
looks fine mpi@
|
|
Advised by bluhm@
|
|
Based/previous work on an idea from deraadt@
Input from claudio@, djm@, deraadt@, sthen@
OK deraadt@
|
|
could use mbuf memory after freeing it. If m_pullup() allocates a
new mbuf, the caller uses the old pointer.
found and reported by Maxime Villard, thanks
OK claudio@ markus@ denis@
|
|
The best-guessed limits will be tested by trial.
|
|
Tighter limits and OK by sashan
|
|
OK sashan
|
|
RFC 4291 dropped this requirement from RFC 3513:
o An anycast address must not be used as the source address of an
IPv6 packet.
And from that requirement draft-itojun-ipv6-tcp-to-anycast rightly
concluded that TCP connections must be prevented.
The draft also states:
The proposed method MUST be removed when one of the following events
happens in the future:
o Restriction imposed on IPv6 anycast address is loosened, so that
anycast address can be placed into source address field of the IPv6
header[...]
OK jca
|
|
to the previous behavior: In case of a tie the new implementation
would keep the current best address while the old implementation
replaced the best address.
Since IPv6 addresses are stored in a TAILQ this meant that the rewrite
would use the "oldest" address while the previous behavior was to use
the "newest".
RFC 6724 section 5 has no opinion which one is better and leaves the
tie break up to implementers.
naddy found out the hard way that this breaks his IPv6 connectivity in
case of flash renumbering events when the link on his cpe flaps and a
new prefix is used since we would always pick an old address.
While we could pick the newest address in a tie break this feels too
much like an implementation detail, a solution much more in the spirit
of IPv6 is to pick the address with the highest preferred lifetime (or
valid lifetime in case of another tie).
very patient testing naddy@
|
|
Fixes a bunch of panics reported by syzkaller.
ok florian@
Reported-by: syzbot+02f2e07964a89ab65ea4@syzkaller.appspotmail.com
Reported-by: syzbot+c26b058a499ce38f689f@syzkaller.appspotmail.com
Reported-by: syzbot+62af76d8cb7c09ac017c@syzkaller.appspotmail.com
Reported-by: syzbot+d70144b3ae2ec068e318@syzkaller.appspotmail.com
Reported-by: syzbot+3c87ca9873bfd0492f5c@syzkaller.appspotmail.com
Reported-by: syzbot+323549177062adb80f84@syzkaller.appspotmail.com
Reported-by: syzbot+e745c1c29d960337ce14@syzkaller.appspotmail.com
Reported-by: syzbot+91da988a445013baf925@syzkaller.appspotmail.com
Reported-by: syzbot+747cbcbbed6318542061@syzkaller.appspotmail.com
Reported-by: syzbot+ca5efa23e00130bc8000@syzkaller.appspotmail.com
Reported-by: syzbot+731ab8c9a0342ace4189@syzkaller.appspotmail.com
Reported-by: syzbot+6c80b815a0ff8f09be69@syzkaller.appspotmail.com
Reported-by: syzbot+7939d2c4bc9a5dfa707a@syzkaller.appspotmail.com
Reported-by: syzbot+e893fb0259640a314d06@syzkaller.appspotmail.com
Reported-by: syzbot+b6a3447070ae8ffcb125@syzkaller.appspotmail.com
Reported-by: syzbot+23c0824b688f28c79c1b@syzkaller.appspotmail.com
Reported-by: syzbot+6cc72412d8ddcf87f8a1@syzkaller.appspotmail.com
|
|
Copied over from sys/netinet/raw_ip.c:rip_input() where it appeared with
initial support for multiple routing tables.
This enforces separation between multiple raw sockets in different routing
tables, i.e. one must not see packets from the other if the rtable differs.
Observed with ping6(8)'s "-v" showing all ICMPv6 packets on its raw socket
including those produced by another ping6 with "-V1".
florian reported IPv6 route advertisments in one routing table appearing on
raw sockets in other routing tables as well.
OK claudio florian
|
|
Range violations are now consistently reported as EOPNOTSUPP.
Previously they were mixed with ENOPROTOOPT.
OK kn@
|
|
RFC 6724 section 5.
This simplifies the code considerably while extensive testing shows no
change in behaviour. It is time to volunteer some more testers.
OK denis@ some time ago.
|
|
r1.146 "Enable IPv6 routing domain support" adapted the mtod() line from the
IPV6_PIPEX case which was bogus since introduction in r1.118.
Issue found by florian, who came up with the same partial diff for SO_RTABLE
while working on rdomain aware slaacd(8).
Taken from sys/netinet/ip_output.c which does it correctly.
OK florian millert
|
|
This is the name the other BSDs use for this, there is no reason to
be different, the IPv6 RFCs call these addresses temporary, and some
software in ports wants to use this as well.
Most recently pointed out for firefox by landry.
OK claudio, sthen
|
|
ok claudio@
|
|
time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.
This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).
There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.
There is no performance cost on 64-bit (__LP64__) platforms.
With input from visa@, dlg@, and tedu@.
Several bugs squashed by visa@.
ok kettenis@
|
|
ip6_hopopts(). The value is tested and non-zero values could cause a
packet to be discarded.
Initialize the pointed at variable to 0, tweaking variable names and
associated comments.
COVERITY 1453098
ok deraadt@ mpi@
|
|
i feel like i should add IFT_L3IPVLAN here so mgre(4) can take
advantage of this too.
from Matt Dunwoodie and Jason A. Donenfeld
ok deraadt@
|
|
Since our last concurrency mistake only ioctl(2) ans sysctl(2) code path
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.
dlg@ said that comments should be good enough.
ok sashan@
|
|
An invalid/corrupted hop6 option in rip6_input()/ip6_savecontrol() could
lead m_copydata(9)s' check to trigger a panic.
Fix from maxv@NetBSD where the problem was also reported by syzkaller.
Reported-by: syzbot+3b07b3511b4ceb8bf1e2@syzkaller.appspotmail.com
Reported-by: syzbot+7ee0eb2691d507fcad2e@syzkaller.appspotmail.com
ok sashan@, dlg@, claudio@, deraadt@
|
|
Such routes have a valid link-local entry that should not be overwritten.
The current assert in the timeout routine doesn't give enough information
to know where the bug is, if there is still one.
This should play better with syzkaller.
ok claudio@, visa@ as part of a larger diff
|
|
Such route correspond to a locally configured address and the ND6
subsystem expect its link-local address to be always present.
Fix an issue reported by Julian Brost.
ok claudio@, visa@
|
|
Prevent concurrency in the socket layer which is not ready for that.
Two recent data corruptions in pfsync(4) and the socket layer pointed
out that, at least, tun(4) was incorrectly using NET_RUNLOCK(). Until
we find a way in software to avoid future mistakes and to make sure that
only the softnet thread and some ioctls are safe to use a read version
of the lock, put everything back to the exclusive version.
ok stsp@, visa@
|