Age | Commit message | Author |
|
divert-packet rules pf calls directly from IP layer to protocol
layer. As the former has only shared net lock, additional protection
against parallel access is needed. Kernel lock is a temporary
workaround until the socket layer is MP safe.
discussed with kettenis@ mvs@
|
|
void. Introduce mutex and refcounting for inp like in the other
PCB functions.
OK sashan@
|
|
allocate them.
Currently there are 6 rttimer_queues and not many more will follow. So
change rt_timer_queue_create() to rt_timer_queue_init() which now takes
a struct rttimer_queue * as argument which will be initialized.
Since this changes the global vars from pointer to struct, adjust other
callers as well.
OK bluhm@
|
|
All users use the same callback per queue so that makes sense.
Also replace rt_timer_queue_destroy() with rt_timer_queue_flush().
OK bluhm@
|
|
The callback only needs to know the rtableid; all the other info from
struct rtableid is not needed.
Also change the default rttimer callback to only delete routes that are
RTF_HOST and RTF_DYNAMIC. This way 2 of the ICMP handlers can use NULL
as the callback.
OK bluhm@
|
|
rdomain. The rttimer API is rtable/rdomain aware and so there is no need
to have so many queues.
Also init the two queues (one for IPv4 and one for IPv6) early on. This
will allow the rttable code to become simpler.
OK bluhm@
|
|
to have parallel IP processing while the upper layers are still not
MP safe. Introduce ip_ours() that enqueues the packets and ipintr()
that dequeues and processes them with an exclusive netlock.
Note that we still have only one softnet task. Running IP processing
on multiple CPUs will be the next step.
lots of testing Hrvoje Popovski; OK sashan@
|
|
for timeout, add sysctl bounds checking between 0 and max int, and
use time_t for absolute times.
Some code assumes that the route timeout queue can be NULL and at
some places this was checked. Better make sure that all queues
always exist. The pool_get for struct rttimer_queue is only called
from initialization and from syscall, so PR_WAITOK is possible.
Keep the special hack when ip_mtudisc is set to 0. Destroy the
queue and generate an empty one.
If redirect timeout is 0, it should not time out. Check the value
in IPv6 to make the behavior like IPv4.
Sysctl net.inet6.icmp6.redirtimeout had no effect as the queue
timeout was not modified. Make icmp6_sysctl() look like icmp_sysctl().
OK claudio@
|
|
While it makes sense to limit bind(2) of unicast addresses that overlap
each other to be all from the same UID (like 0.0.0.0:53 and 127.0.0.1:53)
it makes little sense for multicast. Multicast is delivered to all sockets
that match so there is no risk of someone stealing traffic from someone
else. This should hopefully help with mDNS as reported by robert@
OK deraadt@ bluhm@
|
|
From there it calls sbappendaddr() while holding the raw6 table
mutex. This ends in sorwakeup() where we finally grab the kernel
lock while holding a mutex. Witness detects this misuse.
Use the same solution as for PCB notify. Collect the affected PCBs
in a temporary list. The list is protected by exclusive net lock.
Reported-by: syzbot+5b2679ee9be0895d26f9@syzkaller.appspotmail.com
OK claudio@
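
The "collect the affected PCBs in a temporary list" approach described
above can be sketched in userland C. This is only an illustration under
stated assumptions: the names pcb_sketch, pcb_table_sketch and
notify_matching() are hypothetical, a pthread mutex stands in for the
kernel mutex, and the reference counting and the exclusive net lock that
protect the temporary list in the kernel are left out.

```c
#include <pthread.h>
#include <stddef.h>
#include <assert.h>

/* Hypothetical stand-in for a PCB; not the kernel's struct inpcb. */
struct pcb_sketch {
	int match;                      /* does this PCB want the notification? */
	int notified;                   /* set once the slow path ran */
	struct pcb_sketch *next;        /* table list, protected by the mutex */
	struct pcb_sketch *notify_next; /* temporary list link */
};

struct pcb_table_sketch {
	pthread_mutex_t mtx;
	struct pcb_sketch *head;
};

/*
 * Phase 1: walk the table under the mutex and only link matching PCBs
 * into a temporary list. Phase 2: drop the mutex, then do the work that
 * may sleep or take other locks.
 */
static int
notify_matching(struct pcb_table_sketch *t)
{
	struct pcb_sketch *p, *list = NULL;
	int n = 0;

	pthread_mutex_lock(&t->mtx);
	for (p = t->head; p != NULL; p = p->next) {
		if (p->match) {
			p->notify_next = list;
			list = p;
		}
	}
	pthread_mutex_unlock(&t->mtx);

	/* The mutex is no longer held here. */
	while (list != NULL) {
		p = list;
		list = p->notify_next;
		p->notify_next = NULL;
		p->notified = 1;        /* placeholder for the real wakeup */
		n++;
	}
	return n;
}
```

The design choice is that nothing that could sleep or take another lock
runs between the lock and unlock calls.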
|
|
PCBs. This makes mutex and error handling easier.
OK claudio@
|
|
for PCB tables. It does not break userland build anymore.
pf_socket_lookup() calls in_pcbhashlookup() in the PCB layer. To
run pf in parallel, make parts of the stack MP safe. Protect the
list and hashes in the PCB tables with a mutex.
Note that the protocol notify functions may call pf via tcp_output().
As the pf lock is a sleeping rw_lock, we must not hold a mutex. To
solve this for now, collect these PCBs in inp_notify list and protect
it with exclusive netlock.
OK sashan@
|
|
This reverts the commit protecting the list and hashes in the PCB tables
with a mutex since the build of sysctl(8) breaks, as found by kettenis.
ok sthen
|
|
run pf in parallel, make parts of the stack MP safe. Protect the
list and hashes in the PCB tables with a mutex.
Note that the protocol notify functions may call pf via tcp_output().
As the pf lock is a sleeping rw_lock, we must not hold a mutex. To
solve this for now, collect these PCBs in inp_notify list and protect
it with exclusive netlock.
OK sashan@
|
|
function.
OK gnezdo@ mvs@ florian@ sashan@
|
|
Revert the pr_usrreqs move: syzkaller found a NULL pointer deref
and I won't be available to monitor for followup issues for a bit
|
|
then be shared among protosw structures, following the same basic
direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the
proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
|
don't declare it again in the .c file
ok dlg@ mvs@ bluhm@
|
|
needed it and some no longer need it after moving the externs from
there to <sys/protosw.h>
ok jsg@
|
|
net/if_pppx.c pointed out by jsg@
ok gnezdo@ deraadt@ jsg@ mpi@ millert@
|
|
where the IPv4 versions have been forever
ok gnezdo@ deraadt@ jsg@ mpi@ millert@
|
|
as ifaddr ia_ifa is the first field of in6_ifaddr. So the pointers
are the same, and one NULL check works for both. But in ISO C NULL
has some kind of type and this is undefined behavior. So add a
second NULL check that the compiler can optimize away. The resulting
assembler is the same.
found by kubsan; OK tobhe@
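
A minimal sketch of the first-field situation described above, assuming
simplified stand-in types (ifaddr_sketch and in6_ifaddr_sketch are
hypothetical, not the kernel structures). Taking the address of a member
through a NULL pointer is what ISO C leaves undefined, so the outer
pointer is checked explicitly before the member is touched.

```c
#include <stddef.h>
#include <assert.h>

/* Hypothetical stand-ins for struct ifaddr and struct in6_ifaddr. */
struct ifaddr_sketch {
	int ifa_metric;
};

struct in6_ifaddr_sketch {
	struct ifaddr_sketch ia_ifa;    /* first member, same address as outer */
	int ia6_flags;
};

/*
 * Return the embedded ifaddr, or NULL when there is no outer struct.
 * Without the check, &ia6->ia_ifa would be evaluated on a NULL pointer.
 */
static struct ifaddr_sketch *
ia6_to_ifa(struct in6_ifaddr_sketch *ia6)
{
	if (ia6 == NULL)
		return NULL;
	return &ia6->ia_ifa;
}
```

Because the member sits at offset 0, a compiler is free to fold the
check away, which matches the remark that the resulting assembler is
the same.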
|
|
trees. ipsp_ids_lookup() returns `ids' with bumped reference
counter. original diff from mvs
ok mvs
|
|
ok jmc@ reads ok tb@
|
|
without kernel lock. Unlock the two callers in ip6_input_if() that
have been forgotten.
OK mvs@ kn@
|
|
dirty hacks, it is better to protect IPsec input and output with
kernel lock. Not much is lost as crypto needs the kernel lock
anyway. From here we can refine the lock later.
Note that there is no kernel lock in the SPD lookup path. Goal is
to keep that lock free to allow fast forwarding with non IPsec
traffic.
tested by Hrvoje Popovski; OK tobhe@
|
|
'tdb_data' struct became unused and was removed.
Tested by Hrvoje Popovski.
ok bluhm@
|
|
therefore the mandatory idiom is completely clearing structs before
building them for copyout -- that means ALMOST ALL STRUCTS, because
we never know when some architecture will pad a struct. In two more
cases, the clearing wasn't performed.
from Reno Robert ZDI
ok millert bluhm
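
The clearing idiom mentioned above can be sketched like this. The struct
reply_sketch and build_reply() are hypothetical, and the sketch stops
before the actual copy to userland. It assumes the usual case where
uint32_t is aligned to 4 bytes, so padding follows the one-byte member.

```c
#include <stdint.h>
#include <string.h>
#include <assert.h>

/* Hypothetical reply struct; padding is likely between the two members. */
struct reply_sketch {
	uint8_t kind;
	uint32_t value;
};

/*
 * Clear the whole object first, padding included, then fill in the
 * members. Assigning members alone would leave padding bytes holding
 * whatever was in that memory before.
 */
static void
build_reply(struct reply_sketch *r, uint8_t kind, uint32_t value)
{
	memset(r, 0, sizeof(*r));
	r->kind = kind;
	r->value = value;
}
```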
|
|
It checks dp in two of three places. One check got lost in revision
1.83. Do a dp == NULL once at the beginning.
OK jsg@
Reported-by: syzbot+88c0ce914a0b10b7e1c8@syzkaller.appspotmail.com
|
|
pointer is passed to the function, it will return a refcounted TDB.
The ref happens when ipsp_spd_inp() copies the pointer from
ipo->ipo_tdb. The caller of ipsp_spd_lookup() has to unref after
using it.
tested by Hrvoje Popovski; OK mvs@ tobhe@
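
A rough sketch of the calling convention described above: the function
returns an error value, the output pointer is optional, and a reference
is taken only when the caller asks for the TDB. All names
(tdb_sketch, policy_sketch, spd_lookup_sketch) are hypothetical, the
counter is a plain int rather than an atomic refcount, and the real
lookup logic is omitted.

```c
#include <errno.h>
#include <stddef.h>
#include <assert.h>

/* Hypothetical stand-ins for struct tdb and struct ipsec_policy. */
struct tdb_sketch {
	int refs;
};

struct policy_sketch {
	struct tdb_sketch *ipo_tdb;
};

/* Drop a reference obtained from spd_lookup_sketch(). */
static void
tdb_sketch_unref(struct tdb_sketch *t)
{
	if (t != NULL)
		t->refs--;
}

/*
 * Return 0 or an errno value. If tdbout is not NULL and the policy has
 * a TDB, copy the pointer and take a reference for the caller.
 */
static int
spd_lookup_sketch(struct policy_sketch *ipo, struct tdb_sketch **tdbout)
{
	if (tdbout != NULL)
		*tdbout = NULL;
	if (ipo == NULL)
		return ENOENT;
	if (tdbout != NULL && ipo->ipo_tdb != NULL) {
		ipo->ipo_tdb->refs++;
		*tdbout = ipo->ipo_tdb;
	}
	return 0;
}
```

A caller that only needs the error passes NULL and has nothing to
release. A caller that passes a pointer must call tdb_sketch_unref()
when it is done.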
|
|
is not always needed, but the error value is necessary for the
caller. As TDB should be refcounted, it makes no sense to always
return it. Pass an output pointer for the TDB which can be NULL.
OK mvs@ tobhe@
|
|
ICMP packet could be wrong. The mtu was taken from the loopback
interface as the tdb mtu was copied to the route too late. Without
crypto task, ipsp_process_packet() returns the EMSGSIZE error
earlier. Immediately update tdb and route mtu.
IPv4 part from markus@; OK tobhe@
|
|
in IPsec IPv6 tunnel. Implement sending ICMP6 packet too big
messages. Also implement the pf error case in ip6_forward(). While
there, do some cleanup and make the IPv4 and IPv6 code look similar.
OK tobhe@
|
|
ok phessler@
|
|
for ah, esp, and ipcomp. Move common code into ipsec_protoff()
which finds the offset of the next protocol field in the previous
header.
OK tobhe@
|
|
during path MTU discovery. ip6_forward() has to update its rt
variable to the new route in ro. Otherwise it could operate on a
freed route.
from markus@
|
|
Panic reported by Hrvoje Popovski.
|
|
'tdb_data' struct became unused and was removed.
ok bluhm@
|
|
strict. ICMP error packets generated by pf were not passed
immediately, but could be blocked. Preserve PF_TAG_GENERATED flag
in icmp_reflect() and icmp6_reflect().
reported by sf@; OK patrick@ kn@
|
|
produced ugly output. Move the function name and the newline into
the DPRINTF macro. This simplifies the debug statements.
OK tobhe@
|
|
i did test this, but i guess i was lucky. very lucky.
Coverity CID 1505114
|
|
this is in ip6_input_if just before ipv6_check returns the pointer
we end up using.
pointed out by bluhm@
|
|
this will allow these checks to be reused for ip packet inspection
in bridge, veb, and tpmr.
ok bluhm@ sashan@
|
|
are constant. Having more const makes MP review easier. More
pointers are mapped read-only in the kernel image.
OK deraadt@ mvs@
|
|
Because of this large ping packets were fragmented even if the MTU did
not indicate the need for it. This causes some trouble when systems do
not expect to receive a fragmented answer from a system. One such case
is the automated link test from Google routers before allowing to establish
a BGP peering session with them. In general PMTU problems should be an
issue from the past and if not it may be better to also break on ping
packets and not only for UDP and TCP. ICMP ping is normally the first
tool in the admin's toolbox to figure out network issues.
OK phessler@ florian@ bluhm@
|
|
`ps_rtableid' as atomic. This allows us to unlock setrtable(2).
ok claudio@ mpi@
|
|
functions are sysctl_int() and sysctl_rdint(). This brings us back
the 4.4BSD implementation. Then sysctl_int_bounded() builds the
magic for range checks on top. sysctl_bounded_arr() is a wrapper
around it to support multiple variables.
Introduce macros that describe the meaning of the magic boundary
values. Use these macros in obvious places.
input and OK gnezdo@ mvs@
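
The idea of layering a range check on top of a plain integer setter can
be sketched as below. This is not the kernel's sysctl_int_bounded()
signature. set_int_bounded() and its parameters are hypothetical and
only show the check-then-store order.

```c
#include <errno.h>
#include <stddef.h>
#include <assert.h>

/*
 * Store *newp into *valp only if it lies within [minimum, maximum].
 * A NULL newp means "read only" and leaves the value untouched.
 */
static int
set_int_bounded(int *valp, const int *newp, int minimum, int maximum)
{
	if (newp == NULL)
		return 0;
	if (*newp < minimum || *newp > maximum)
		return EINVAL;
	*valp = *newp;
	return 0;
}
```

Rejecting the value before the store means a failed request never
leaves a partially updated variable behind.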
|
|
ok gnezdo@ semarie@ mpi@
|