Age | Commit message | Author |
|
access to netlock protected data.
ok kn@ bluhm@
|
|
and shared netlock respectively.
OK kn@ mvs@
|
|
The netlock protects `if_list', `ifa_list' and the dereference of the
returned `ifa', so put a netlock assertion within.
Please note, rtable_setsource() doesn't destroy the data pointed to by
`ar_source'. This is the `ifa_addr' data belonging to `ifa', and the
exclusive netlock is required to destroy it. So the kernel lock is not
required within rt_setsource(). Take the netlock in the rt_setsource()
caller to make the `ifa' dereference safe.
Suggestions and ok by bluhm@
|
|
access to netlock protected data. Please note, kernel lock is still
taken, as required by rtable_getsource() or BFD subsystem.
ok kn@ bluhm@
|
|
to netlock protected data.
ok kn@ bluhm@
|
|
to netlock protected data.
ok kn@ bluhm@
|
|
receive buffer. As was done for the SS_CANTSENDMORE bit, the definition
is kept as is, but now these bits belong to the `sb_state' of the receive
buffer. `sb_state' is ORed with `so_state' when socket data is exported
to userland.
ok bluhm@
|
|
optional.
We have no interest in the pru_abort() return value. We call it only from
soabort(), which is a dummy pru_abort() wrapper and has no return value.
Only connection oriented sockets need to implement the (*pru_abort)()
handler. Such sockets are tcp(4) and unix(4) sockets, so remove the
existing code for all others; it is never called.
ok guenther@
|
|
malloc(9) or pool_get(9).
Pass down a wait flag to pru_attach(). During syscall socket(2)
it is ok to wait, this logic was missing for internet pcb. Pfkey
and route sockets were already waiting.
sonewconn() must not wait when called during TCP 3-way handshake.
This logic has been preserved. Unix domain stream socket connect(2)
can wait until the other side has created the socket to accept.
OK mvs@
|
|
on pru_rcvd() return value.
Drop the "pru_rcvd != NULL" check within the pru_rcvd() wrapper. We only
call it if the socket's protocol has the PR_WANTRCVD flag set. Such
sockets are route domain, tcp(4) and unix(4) sockets.
ok guenther@ bluhm@
|
|
Naming the list like the struct itself makes for awful grepping.
Call the global variable "ifnetlist" from now on.
There used to be kvm(3) consumers in base picking up this symbol, but those
have long been converted to other interfaces.
A few potential ports users remain, same deal as sys/net/if_var.h r1.116
"Remove struct ifnet's unused if_switchport member": they get bumped.
Previous users pointed out by deraadt
OK bluhm
|
|
|
|
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets,
except for the tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
|
Introduce in{,6}_sockaddr() functions, and use them for all inet sockets
except tcp(4). For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets return an EINVAL error for
the PRU_SOCKADDR request, so keep this behaviour for a while instead of
making the pru_sockaddr handler optional and returning EOPNOTSUPP.
ok bluhm@
|
|
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from
pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for
inet6 case.
ok guenther@ bluhm@
|
|
ok bluhm@
|
|
The PRU_SENDOOB request always consumes the passed `top' and `control'
mbufs. To avoid dummy m_freem(9) handlers for all protocols, release the
passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix a `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
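
The ownership rule described above can be sketched in userland C. This is
only an illustration: struct mbuf, m_get_demo(), m_freem_demo() and
pru_sendoob_demo() are hypothetical stand-ins, not the kernel's definitions
or signatures.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel mbuf; only ownership matters here. */
struct mbuf { int m_dummy; };

static int mbufs_live;	/* counts buffers not yet freed */

static struct mbuf *
m_get_demo(void)
{
	struct mbuf *m = malloc(sizeof(*m));

	if (m != NULL)
		mbufs_live++;
	return (m);
}

static void
m_freem_demo(struct mbuf *m)
{
	if (m != NULL) {
		free(m);
		mbufs_live--;
	}
}

struct socket;

/* Simplified handler table; a NULL entry means "request not supported". */
struct pr_usrreqs {
	int (*pru_sendoob)(struct socket *, struct mbuf *, struct mbuf *);
};

struct socket {
	const struct pr_usrreqs *so_usrreqs;
};

/*
 * The request always consumes `top' and `control'. When the protocol has
 * no handler, the wrapper frees both so that no protocol needs a dummy
 * handler just to release them.
 */
static int
pru_sendoob_demo(struct socket *so, struct mbuf *top, struct mbuf *control)
{
	assert(so != NULL && so->so_usrreqs != NULL);

	if (so->so_usrreqs->pru_sendoob == NULL) {
		m_freem_demo(top);
		m_freem_demo(control);
		return (EOPNOTSUPP);
	}
	return ((*so->so_usrreqs->pru_sendoob)(so, top, control));
}
```

With this shape the caller never frees the mbufs itself, whichever path is
taken.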
|
|
ok bluhm@
|
|
There was a crash due to use after free of the ifa although it is
ref counted. As ifa_refcnt was a simple integer increment, there
may be a path where multiple CPUs access it concurrently. So change
to struct refcnt which is MP safe and provides dt(4) leak debugging.
Link level address for IPsec enc(4) and various MPLS interfaces is
special. There ifa is part of struct sc. Use refcount anyway and
add a panic to detect use after free.
bug report stsp@; OK mvs@
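
A minimal userland sketch of the idea, using C11 atomics. It is not the
kernel's struct refcnt implementation; the refcnt_demo names are
hypothetical, and assert() stands in for the panic mentioned above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counter; the real kernel type and API differ. */
struct refcnt_demo {
	atomic_uint r_refs;
};

/* A new object starts with one reference held by its creator. */
static void
refcnt_demo_init(struct refcnt_demo *r)
{
	atomic_init(&r->r_refs, 1);
}

/* Atomic increment: safe when several CPUs take references at once. */
static void
refcnt_demo_take(struct refcnt_demo *r)
{
	unsigned int old = atomic_fetch_add(&r->r_refs, 1);

	/* A previous value of 0 means the object was already released. */
	assert(old != 0);
}

/* Returns true when the caller dropped the last reference. */
static bool
refcnt_demo_rele(struct refcnt_demo *r)
{
	unsigned int old = atomic_fetch_sub(&r->r_refs, 1);

	assert(old != 0);
	return (old == 1);
}
```

A plain `refs++' is a read, an add and a write; two CPUs interleaving those
steps can lose an increment, which later shows up as a premature free.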
|
|
ok bluhm@
|
|
pfkeyv2 and route can call their output functions directly.
OK mvs@
|
|
We abort only the sockets which are linked to the `so_q' or `so_q0' queues
of a listening socket. Such sockets have no corresponding file descriptor
and are not accessed from userland, so PRU_ABORT is used to destroy them on
listening socket destruction.
Currently all our sockets support the PRU_ABORT request, but actually it is
required only for tcp(4) and unix(4) sockets, so it should be optional.
However, the others will be removed with a separate diff, and this time
the PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on a PRU_ABORT request, but route and
key management sockets leave it alive. This was also converted as is,
because this wrong code is never called.
ok bluhm@
|
|
The former PRU_SEND error path of gre_usrreq() had a `control' mbuf(9)
leak. It was fixed in the new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
|
ok bluhm@
|
|
ok bluhm@
|
|
ok bluhm@
|
|
ok bluhm@
|
|
ok bluhm@
|
|
ok bluhm@
|
|
For the protocols which don't support a request, leave the handler NULL.
Do the NULL check within the corresponding pru_() wrapper and return
EOPNOTSUPP in such a case. This will be done for all upcoming user request
handlers.
ok bluhm@ guenther@
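
The wrapper pattern can be sketched as follows. The request name pru_demo
and the simplified argument list are hypothetical; the kernel handlers take
different arguments.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct socket;

/* Simplified handler table; NULL marks an unsupported request. */
struct pr_usrreqs {
	int (*pru_demo)(struct socket *, int);
};

struct socket {
	const struct pr_usrreqs *so_usrreqs;
};

/* Wrapper: the NULL check lives here, not in every protocol. */
static inline int
pru_demo(struct socket *so, int arg)
{
	assert(so != NULL && so->so_usrreqs != NULL);

	if (so->so_usrreqs->pru_demo == NULL)
		return (EOPNOTSUPP);
	return ((*so->so_usrreqs->pru_demo)(so, arg));
}

/* A protocol that supports the request provides a real handler. */
static int
demo_handler(struct socket *so, int arg)
{
	(void)so;
	return (arg < 0 ? EINVAL : 0);
}

/* One protocol implements the request, the other leaves it NULL. */
static const struct pr_usrreqs demo_usrreqs = { .pru_demo = demo_handler };
static const struct pr_usrreqs none_usrreqs = { .pru_demo = NULL };
```

Callers always go through pru_demo(), so a protocol without the handler
needs no stub function at all.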
|
|
handlers into it. We want to split the existing (*pr_usrreq)() into
multiple short handlers, one for each PRU_ request, as was already done
for PRU_ATTACH and PRU_DETACH. This is the preparation step; the
(*pr_usrreq)() split will be done in the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
|
operations.
OK mvs@
|
|
use a per rttimer struct timeout. On enqueue the struct rttimer belongs
to the timeout. In case the route is removed before the timer fires,
clean up based on the timeout_del() return value. If the timeout is
currently running, then just clear the rtt_rt pointer and let the timeout
handle the cleanup. This should hopefully fix the icmp_pmtu_timeout
crashes reported by some people.
OK bluhm@
|
|
|
|
In the rt msg buffer the size of the full buffer is calculated first then
filled out after allocating the mbuf. In the sysctl code this is not needed
since the buffer is already provided.
OK mvs@
|
|
check to the less awkward w->w_needed <= w->w_given.
OK bluhm@
|
|
route socket. All messages passed are by definition done. This may
allow sharing more code between the sysctl and route socket parsers.
OK mpi@
|
|
for the lock operation and to pass a value to the unlock operation.
sofree() still needs an extra flag to know if sounlock() should be called
or not. But sofree() is called less often and mostly without keeping the lock.
OK mpi@ mvs@
|
|
the network state can change between the two sysctl calls. Adding 10%
extra works for larger routing tables but can be too little on smaller
tables to hold even a single extra message. Instead of that add at least
1024 bytes or 10% (whichever is bigger) and round the size up to the next
page. With this there are no more sporadic errors in the bgpd integration
tests.
OK sthen@
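
The sizing rule above works out as in this sketch. The function name and
the 4096 byte page size are assumptions for illustration, and a size that is
already page aligned is kept as is here, which may differ from the kernel
code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size; the kernel uses its own constant. */
#define DEMO_PAGE_SIZE	4096u

/*
 * Size estimate for the second sysctl call: the needed size plus
 * max(1024, 10%) of slack, rounded up to a page boundary.
 */
static size_t
route_buf_size(size_t needed)
{
	size_t extra, total;

	/* Keep the additions below far away from wrapping size_t. */
	assert(needed <= SIZE_MAX / 2);

	extra = needed / 10;
	if (extra < 1024)
		extra = 1024;
	total = needed + extra;

	return ((total + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE *
	    DEMO_PAGE_SIZE);
}
```

For a small table of 100 bytes the plain 10% rule would add only 10 bytes;
with the 1024 byte floor the estimate becomes one full page.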
|
|
Revert the pr_usrreqs move: syzkaller found a NULL pointer deref
and I won't be available to monitor for followup issues for a bit
|
|
then be shared among protosw structures, following the same basic
direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the
proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
|
found by kubsan; joint work with tobhe@; OK miod@
|
|
that is less likely to overflow the int type used. A BGP fullfeed is
now so big that this calculation overflowed and then got sign extended.
The result was for example 'route -n show' failures.
Problem identified with deraadt@
OK deraadt@ (more cleanup needed but this fix is a good start)
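
The exact expressions are not shown in this log. As a hedged illustration of
why the order of operations matters for an int, assume a 10% growth computed
either as multiply then divide or as divide then add; both helper names are
hypothetical and a 32-bit int is assumed.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * True if the intermediate product needed * 11 does not fit into an int.
 * Checked in wider arithmetic, because signed overflow itself is
 * undefined behaviour in C.
 */
static bool
mul_first_overflows(int needed)
{
	return ((long long)needed * 11 > INT_MAX);
}

/*
 * Divide first: the largest intermediate value is the result itself,
 * so this stays in range for much larger inputs.
 */
static int
grow_div_first(int needed)
{
	assert(needed >= 0);
	return (needed + needed / 10);
}
```

With needed at 200000000 the product 2200000000 exceeds INT_MAX, while the
divide first form yields 220000000 without trouble.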
|
|
Also bfdset() calls pool_get(9) with PR_WAITOK flag so it should be done
before we check the existence of this `bfd', otherwise it could be added
multiple times.
We have BFD disabled in the default kernel so this diff is for
consistency mostly.
ok mpi@
|
|
Reported-by: syzbot+684597dbbb9b516e76ae@syzkaller.appspotmail.com
ok mpi@
|
|
When a dying network interface descriptor has an if_get(9) obtained
reference owned by a foreign thread, the if_detach() thread will sleep
just after it has removed this interface from the interface index map.
The data related to this interface is still in the routing table, so
if_get(9) called by a concurrent rtm_output() thread will return NULL and
the following "ifp != NULL" assertion will be triggered.
So remove the "ifp != NULL" assertions from rtm_output() and try to grab
`ifp' as early as possible, then hold it until we finish the work. In the
case we won the race and have a non-NULL `ifp', the concurrent if_detach()
thread will wait for us. In the case we lost, we just return ESRCH.
The problem reported by danj@.
Diff tested by danj@.
ok mpi@
|
|
|
|
SS_CANTRCVMORE bits are set.
The first check is required to prevent timeout_add(9) from rescheduling
`rop_timeout', otherwise timeout_del_barrier(9) can't help us.
The second check is for the case when shutdown(2) with the SHUT_RD argument
occurred on this socket and we should not receive anything, including
RTM_DESYNC packets.
ok claudio@
|
|
OK mvs@
|
|
are constant. Having more const makes MP review easier. More
pointers are mapped read-only in the kernel image.
OK deraadt@ mvs@
|