Age | Commit message | Author |
|
This logic was introduced in 2013 when pf checksum fixup was
temporarily removed. After restoring the pf behavior in 2016, it
should not be necessary anymore.
OK claudio@
|
|
introduction of tcp_send.
OK mvs@, bluhm@, gnezdo@
Reported-by: syzbot+e859fd353c90eeac26f8@syzkaller.appspotmail.com
|
|
ok bluhm@
|
|
There was a crash due to use after free of the ifa although it is
ref counted. As ifa_refcnt was a simple integer increment, there
may be a path where multiple CPUs access it concurrently. So change
to struct refcnt which is MP safe and provides dt(4) leak debugging.
Link level address for IPsec enc(4) and various MPLS interfaces is
special. There ifa is part of struct sc. Use refcount anyway and
add a panic to detect use after free.
bug report stsp@; OK mvs@
|
|
ok bluhm@
|
|
We abort only the sockets which are linked to `so_q' or `so_q0' queues of
listening socket. Such sockets have no corresponding file descriptor and
are not accessed from userland, so PRU_ABORT used to destroy them on
listening socket destruction.
Currently all our sockets support PRU_ABORT request, but actually it is
required only for tcp(4) and unix(4) sockets, so it should be optional.
However, they will be removed with separate diff, and this time PRU_ABORT
requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and
key management sockets leave it alive. This was also converted as is,
because this wrong code is never called.
ok bluhm@
|
|
The former PRU_SEND error path of gre_usrreq() had a `control' mbuf(9)
leak. It was fixed in the new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
|
ok bluhm@
|
|
ok bluhm@
|
|
are protected by netlock. They are only used as shortcut in fast
timer.
Common prefix in mld6.c is mld6.
OK mvs@
|
|
ok bluhm@
|
|
function may sleep, so holding a mutex is not possible. The same
list entry and rwlock is used for UDP multicast and raw IP delivery.
By adding a write lock, exclusive netlock is no longer necessary
for PCB notify and UDP and raw IP input.
OK mvs@
|
|
ok bluhm@
|
|
are status variables that can be used to avoid locking if timers
are not running. This should reduce contention on exclusive netlock.
OK kn@ mvs@
|
|
ok bluhm@
|
|
ok bluhm@
|
|
ok bluhm@
|
|
reassembly and IPv6 hop-by-hop header chain processing out of
ip_local() and ip6_local(), they are almost empty stubs. The check
for local deliver loop in ip_ours() and ip6_ours() is sufficient.
Recover mbuf offset and next protocol directly in ipintr() and
ip6intr().
OK mvs@
|
|
buffer. Later it may be used to protect more of the PCB or socket.
In divert input replace the kernel lock with this mutex.
OK mvs@
|
|
For the protocols which don't support request, leave handler NULL. Do the
NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in
such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
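
The NULL-handler convention described above can be sketched as follows.
This is a minimal userland illustration, not the kernel code: the
struct layout and the names pr_usrreqs_sketch, pru_bind_sketch() and
sketch_bind() are hypothetical stand-ins chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct socket;

/* Hypothetical per-protocol handler table; NULL means "not supported". */
struct pr_usrreqs_sketch {
	int	(*pru_bind)(struct socket *);
};

/* Hypothetical, heavily simplified socket. */
struct socket {
	const struct pr_usrreqs_sketch	*so_reqs;
	int				 so_bound;
};

/* Example handler for a protocol that does support the request. */
static int
sketch_bind(struct socket *so)
{
	so->so_bound = 1;
	return 0;
}

/*
 * Wrapper: the NULL check lives here, so protocols that do not
 * support the request simply leave the handler pointer NULL.
 */
static int
pru_bind_sketch(struct socket *so)
{
	assert(so != NULL && so->so_reqs != NULL);
	if (so->so_reqs->pru_bind == NULL)
		return EOPNOTSUPP;
	return (*so->so_reqs->pru_bind)(so);
}
```

Callers only ever go through the wrapper, so they need no per-protocol
knowledge of which requests are implemented.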
|
|
code is MP safe and moves from ip6_local() to ip6_ours(). If there
are any options, store the chain offset and next protocol in a mbuf
tag. When dequeuing without tag, it is a regular IPv6 header. As
mbuf tags degrade performance, use them only if a hop-by-hop header
is present. Such packets are rare and pf drops them by default.
OK mvs@
|
|
This function will help to avoid code duplication when tcp_usrreq() is
divided into multiple handlers.
ok bluhm@
|
|
handlers into it. We want to split the existing (*pr_usrreq)() into
multiple short handlers, one for each PRU_ request, as was already done
for PRU_ATTACH and PRU_DETACH. This is the preparation step; the
(*pr_usrreq)() split will be done in the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
|
OK claudio@
|
|
do nearly the same thing, so they should look similar.
OK sashan@
|
|
to out of memory. Use a generic idropped counter for those.
OK mvs@
|
|
TCP_INFO provides a lot of information about the TCP session of this socket.
Many processes like to peek at the rtt of a connection, but this also
provides a lot more specialized info for use by e.g. tcpbench(1).
While the basic minimal info is available all the time, the more specific
data is only populated for privileged processes. This is done to avoid
sharing data with userland that could be used to attack a session.
TCP_INFO is available to pledge "inet" since pledged processes like chrome
tend to use TCP_INFO when available.
OK bluhm@
|
|
Use their reference counter in more places.
The in_pcb lookup functions hold the PCBs in hash tables protected
by table->inpt_mtx mutex. Whenever a result is returned, increment
the ref count before releasing the mutex. Then the inp can be used
as long as necessary. Unref it at the end of all functions that
call in_pcb lookup.
As a shortcut, pf may also hold a reference to the PCB. When
pf_inp_lookup() returns it, it also increments the ref count and the
caller can handle it like the inp from table lookup.
OK sashan@
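
The lookup-then-reference pattern described above can be sketched in
userland C as follows. This is only an illustration under stated
assumptions: pcb_sketch, table_sketch, pcb_lookup_sketch() and
pcb_unref_sketch() are hypothetical names, a pthread mutex stands in
for the table mutex, and a C11 atomic stands in for the kernel's
reference counter.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical PCB with a reference counter. */
struct pcb_sketch {
	int			 port;
	atomic_int		 refcnt;
	struct pcb_sketch	*next;
};

/* Hypothetical table; the mutex protects the list linkage. */
struct table_sketch {
	pthread_mutex_t		 mtx;
	struct pcb_sketch	*head;
};

/*
 * Look up a PCB and take a reference before the mutex is released,
 * so the result stays valid after the table lock is dropped.
 */
static struct pcb_sketch *
pcb_lookup_sketch(struct table_sketch *t, int port)
{
	struct pcb_sketch *p;

	assert(t != NULL);
	pthread_mutex_lock(&t->mtx);
	for (p = t->head; p != NULL; p = p->next) {
		if (p->port == port) {
			atomic_fetch_add(&p->refcnt, 1);
			break;
		}
	}
	pthread_mutex_unlock(&t->mtx);
	return p;
}

/* Drop a reference; returns 1 if this was the last one. */
static int
pcb_unref_sketch(struct pcb_sketch *p)
{
	if (p == NULL)
		return 0;
	return atomic_fetch_sub(&p->refcnt, 1) == 1;
}
```

Every caller of the lookup pairs it with one unref at the end of the
function, including on the path where the lookup returned NULL.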
|
|
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@
|
|
of significant bits on big endian machines. Bug has been introduced
in previous commit by removing the != 0 check.
OK mvs@
|
|
the mutex for the fragment list. Move this code before the critical
section. Use ISSET() to make clear which flags are checked.
OK mvs@
|
|
Note that ip_ours() runs with shared netlock, while ip_local() has
exclusive netlock after queuing. Move the existing code into
function ip_fragcheck() and call it from ip_ours().
OK mvs@
|
|
shared net lock. ip_deliver() needs exclusive net lock. Instead
of calling ip_deliver() directly, use ip6_ours() to queue the packet.
Move the write lock assertion into ip_deliver() to catch such bugs
earlier.
The assertion was only triggered with IPv6 multicast forwarding or
router alert hop by hop option. Found by regress test.
OK kn@ mvs@
|
|
field of the route with a mutex. Keep rt_llinfo being non-NULL consistent
with the RTF_LLINFO flag being set. Also do not put the mutex in the fast
path.
OK mpi@
|
|
instead of 's' for `tdb_sadb_mtx' mutex(9) because this is 'D'atabase.
No functional changes.
ok bluhm@
|
|
Otherwise we use `ipsecflowinfo' obtained from previous packet.
ok claudio@
|
|
tracepoint for each type of refcnt we have. As a start, add inpcb
and tdb refcnt. When the counter changes, btrace may print the
actual object, the current counter, the change value and optionally
the stack trace.
discussed with visa@; OK mpi@
|
|
prevent concurrent access to rt_llinfo from rtrequest_delete().
But the common case, when the MAC address is already known, works
without lock.
tested by Hrvoje Popovski; OK mvs@
|
|
once per function. This gives a more consistent time value.
OK claudio@ miod@ mvs@
|
|
(*if_qstart)() be always called with netlock held doesn't work anymore
with PPPOE sessions.
Introduce `pipex_list_mtx' mutex(9) and use it to protect global pipex(4)
lists and radix trees.
Protect pipex(4) `session' dereference with reference counters, because we
could sleep when accessing pipex(4) from ioctl(2) path, and this is not
possible with mutex(9) held.
ok bluhm@
|
|
counter to 0 properly. We have one reference count for the lists,
and one for the timeout handler. When the timeout fires, it has to
decrement the reference to itself. Then the ipa is removed from
the lists and decremented again.
from Stefan Butz; OK tobhe@ mvs@
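
The two-reference scheme described above (one reference held by the
lists, one by the timeout handler) can be sketched as follows. The
names ipa_sketch, ipa_unref_sketch() and ipa_timeout_sketch() are
hypothetical, the `freed' flag merely stands in for releasing the
memory, and the sketch assumes the object is still on the lists when
the timeout fires.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical object: one ref for the lists, one for the timeout. */
struct ipa_sketch {
	atomic_int	refcnt;
	int		on_list;
	int		freed;	/* stands in for the actual free */
};

/* Drop one reference; mark the object freed when the count hits 0. */
static void
ipa_unref_sketch(struct ipa_sketch *ipa)
{
	int old = atomic_fetch_sub(&ipa->refcnt, 1);

	assert(old > 0);
	if (old == 1)
		ipa->freed = 1;
}

/* Timeout handler, following the order given in the text. */
static void
ipa_timeout_sketch(struct ipa_sketch *ipa)
{
	assert(ipa->on_list);	/* assumption of this sketch */

	/* The timeout drops the reference it holds on itself. */
	ipa_unref_sketch(ipa);

	/* Then remove from the lists and drop the lists' reference. */
	ipa->on_list = 0;
	ipa_unref_sketch(ipa);
}
```

If the handler skipped the first unref, the counter would stop at 1
and the object would never be released.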
|
|
for the lock operation and to pass a value to the unlock operation.
sofree() still needs an extra flag to know if sounlock() should be called
or not. But sofree() is called less often and mostly without keeping the lock.
OK mpi@ mvs@
|
|
An if_put(9) call means we are finished with `ifp' and it could be
destroyed. `ia' points to `in_ifaddr' data that belongs to `ifp', so we
need to release the corresponding `ifp' after we finish dealing with `ia'.
`if_addrlist' list destruction and ip_getmoptions() are serialized with
kernel and net locks so this is not critical, but looks inconsistent.
ok bluhm@
|
|
having it return a pointer to something that has a lifetime managed
by a lock without accounting for it or taking a reference count or
anything like that is asking for trouble. copying the address to
caller provided memory while still inside the lock is a lot safer.
discussed with visa@
ok bluhm@ claudio@
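
The "copy out while still inside the lock" approach described above
can be sketched as follows. This is a userland illustration only:
entry_sketch, entry_get_lladdr_sketch() and LLADDR_LEN are
hypothetical names, and a pthread mutex stands in for whatever lock
manages the lifetime of the stored address.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define LLADDR_LEN	6	/* assumed address length for the sketch */

/* Hypothetical entry whose address is protected by a lock. */
struct entry_sketch {
	pthread_mutex_t	mtx;
	int		valid;
	unsigned char	lladdr[LLADDR_LEN];
};

/*
 * Copy the address into caller provided memory while the lock is
 * held, instead of returning a pointer into the locked object.
 * Returns 1 if an address was copied, 0 otherwise.
 */
static int
entry_get_lladdr_sketch(struct entry_sketch *e, unsigned char *dst)
{
	int found = 0;

	assert(e != NULL && dst != NULL);
	pthread_mutex_lock(&e->mtx);
	if (e->valid) {
		memcpy(dst, e->lladdr, LLADDR_LEN);
		found = 1;
	}
	pthread_mutex_unlock(&e->mtx);
	return found;
}
```

The caller owns the copy, so nothing it holds can be invalidated once
the lock is released.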
|
|
divert-packet rules pf calls directly from IP layer to protocol
layer. As the former has only shared net lock, additional protection
against parallel access is needed. Kernel lock is a temporary
workaround until the socket layer is MP safe.
discussed with kettenis@ mvs@
|
|
void. Introduce mutex and refcounting for inp like in the other
PCB functions.
OK sashan@
|
|
allocate them.
Currently there are 6 rttimer_queues and not many more will follow. So
change rt_timer_queue_create() to rt_timer_queue_init() which now takes
a struct rttimer_queue * as argument which will be initialized.
Since this changes the global vars from pointer to struct, adjust other
callers as well.
OK bluhm@
|
|
We already allow 240/4 in and out so let's allow it through as well.
One of many steps to make 240/4 usable.
Diff by Seth David Schoen (schoen at loyalty.org)
OK bluhm@ djm@
|
|
All users use the same callback per queue so that makes sense.
Also replace rt_timer_queue_destroy() with rt_timer_queue_flush().
OK bluhm@
|
|
always the incoming TDB that has to be checked.
from markus@
|