Age | Commit message (Collapse) | Author |
|
If you set FIONBIO on a bpf(4) descriptor you enable non-blocking mode
and also clobber any read timeout set for the descriptor. The reverse
is also true: do BIOCSRTIMEOUT and you'll set a timeout and
simultaneously disable non-blocking status. The two are mutually
exclusive.
This relationship is undocumented and might cause a bug. At the
very least it makes reasoning about the code difficult.
This patch adds a new member to bpf_d, bd_rnonblock, to store the
non-blocking status of the descriptor. The read timeout is still
kept in bd_rtout.
With this in place, non-blocking status and the read timeout can
coexist. Setting one state does not clear the other.
Separating the two states also clears the way for changing the bpf(4)
read timeout to use the system clock instead of ticks. More on that
in a later patch.
With insight from dlg@ regarding the purpose of the read timeout.
ok dlg@
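
A minimal userland sketch of the separation described above, not the
kernel diff itself. The struct and function names (bpf_d_model,
model_set_nonblock, model_set_rtimeout) are hypothetical; only the
member names bd_rnonblock and bd_rtout come from the message. Treating
a timeout of 0 as "no timeout" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two independent pieces of descriptor state. */
struct bpf_d_model {
	bool		bd_rnonblock;	/* non-blocking status (FIONBIO) */
	unsigned int	bd_rtout;	/* read timeout in ticks; 0 assumed "none" */
};

/* Models FIONBIO: touches only the non-blocking flag. */
static void
model_set_nonblock(struct bpf_d_model *d, bool on)
{
	assert(d != NULL);
	d->bd_rnonblock = on;
}

/* Models BIOCSRTIMEOUT: touches only the read timeout. */
static void
model_set_rtimeout(struct bpf_d_model *d, unsigned int ticks)
{
	assert(d != NULL);
	d->bd_rtout = ticks;
}
```

With the two fields kept apart, calling either setter leaves the other
field as it was, which is the property the patch is after.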
|
|
|
|
descriptors runs below the low watermark.
The em(4) firmware seems not to work properly with just a few descriptors in
the receive ring. Thus, we use the low water mark as an indicator instead of
zero descriptors, which causes deadlocks.
ok kettenis@
|
|
encryption or decryption. This allows us to keep plaintext and encrypted
network traffic separated and reduces the attack surface for network
sidechannel attacks.
The only way to reach the inner rdomain from outside is by successful
decryption and integrity verification through the responsible Security
Association (SA).
The only way for internal traffic to get out is getting encrypted and
moved through the outgoing SA.
Multiple plaintext rdomains can share the same encrypted rdomain while
the unencrypted packets are still kept separate.
The encrypted and unencrypted rdomains can have different default routes.
The rdomains can be configured with the new SADB_X_EXT_RDOMAIN pfkey
extension. Each SA (tdb) gets a new attribute 'tdb_rdomain_post'.
If this differs from 'tdb_rdomain' then the packet is moved to
'tdb_rdomain_post' after IPsec processing.
Flows and outgoing IPsec SAs are installed in the plaintext rdomain,
incoming IPsec SAs are installed in the encrypted rdomain.
IPCOMP SAs are always installed in the plaintext rdomain.
They can be viewed with 'route -T X exec ipsecctl -sa' where X is the
rdomain ID.
As the kernel does not create encX devices automatically when creating
rdomains they have to be added by hand with ifconfig for IPsec to work
in non-default rdomains.
discussed with chris@ and kn@
ok markus@, patrick@
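
The rule "if tdb_rdomain_post differs from tdb_rdomain, move the packet"
can be sketched as below. This is a simplified userland model, not the
kernel code: tdb_model, pkt_model and model_ipsec_post are hypothetical
names, and a packet is reduced to just its rdomain ID.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two rdomain attributes of an SA (tdb). */
struct tdb_model {
	unsigned int	tdb_rdomain;		/* rdomain the SA is installed in */
	unsigned int	tdb_rdomain_post;	/* rdomain after IPsec processing */
};

/* Hypothetical stand-in for a packet; only its rdomain is modelled. */
struct pkt_model {
	unsigned int	rdomain;
};

/*
 * Called after encryption or decryption has succeeded.
 * Returns 1 if the packet was moved to another rdomain, 0 otherwise.
 */
static int
model_ipsec_post(const struct tdb_model *tdb, struct pkt_model *pkt)
{
	assert(tdb != NULL && pkt != NULL);
	if (tdb->tdb_rdomain_post == tdb->tdb_rdomain)
		return 0;	/* same rdomain on both sides, nothing to do */
	pkt->rdomain = tdb->tdb_rdomain_post;
	return 1;
}
```

For an incoming SA installed in encrypted rdomain 1 with
tdb_rdomain_post set to 2, a decrypted packet would end up in rdomain 2.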
|
|
ignored. Initialize 'error' to 0.
CID 1483380
ok mpi@
|
|
Reported-by: syzbot+d0639632a0affe0a690e@syzkaller.appspotmail.com
Reported-by: syzbot+ae5e359d7f82688edd6a@syzkaller.appspotmail.com
OK anton@
|
|
when using pppac without pipex.
ok dlg
|
|
address as the one trying to be inserted.
Such an entry must stay in the table as long as its parent route exists. If
a code path tries to re-insert a route with the same destination address
on the same interface it is a bug.
Avoid the "route contains no arp information" problem reported by sthen@
and Laurent Salle.
ok claudio@
|
|
Prevent concurrency in the socket layer which is not ready for that.
Two recent data corruptions in pfsync(4) and the socket layer pointed
out that, at least, tun(4) was incorrectly using NET_RUNLOCK(). Until
we find a way in software to avoid future mistakes and to make sure that
only the softnet thread and some ioctls are safe to use a read version
of the lock, put everything back to the exclusive version.
ok stsp@, visa@
|
|
it needs NET_LOCK because it modifies if_flags and if_pcount.
ok visa@
|
|
if_pcount is only touched in ifpromisc(), and ifpromisc() needs
NET_LOCK anyway because it also modifies if_flags.
suggested by mpi@
ok visa@
|
|
aggr_p_dtor() calls ifpromisc(), and ifpromisc() callers need to
be holding NET_LOCK to make changes to if_flags and if_pcount, and
before calling the interface's ioctl to apply the flag change.
i found this while reading code with my eyes, and was able to trigger
the NET_ASSERT_LOCKED in the vlan_ioctl path.
ok visa@
|
|
tpmr_p_dtor() calls ifpromisc(), and ifpromisc() callers need to
be holding NET_LOCK to make changes to if_flags and if_pcount, and
before calling the interface's ioctl to apply the flag change.
found by hrvoje popovski who was testing tpmr with vlan interfaces.
vlan(4) asserts that the net lock is held in its ioctl path, which
started this whole bug hunt.
ok visa@ (who came up with a similar diff, which hrvoje tested)
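
The locking rule behind the ifpromisc() fixes above can be sketched in
userland as follows. Everything here is a hypothetical model: a plain
flag stands in for NET_LOCK, assert() stands in for NET_ASSERT_LOCKED(),
MODEL_IFF_PROMISC is a made-up flag value, and the reference-counting
behaviour of if_pcount is an assumption about how ifpromisc() works.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_IFF_PROMISC	0x100	/* hypothetical flag bit */

static bool model_net_locked;		/* stands in for NET_LOCK state */

struct ifnet_model {
	int	if_flags;
	int	if_pcount;	/* number of promiscuous mode requests */
};

static void model_net_lock(void)   { model_net_locked = true; }
static void model_net_unlock(void) { model_net_locked = false; }

/* Modifies if_flags and if_pcount, so the caller must hold the lock. */
static void
model_ifpromisc(struct ifnet_model *ifp, bool on)
{
	assert(model_net_locked);	/* analogous to NET_ASSERT_LOCKED() */
	assert(ifp != NULL);
	if (on) {
		if (ifp->if_pcount++ == 0)
			ifp->if_flags |= MODEL_IFF_PROMISC;
	} else {
		assert(ifp->if_pcount > 0);
		if (--ifp->if_pcount == 0)
			ifp->if_flags &= ~MODEL_IFF_PROMISC;
	}
}

/* A port destructor takes the lock around the ifpromisc() call. */
static void
model_port_dtor(struct ifnet_model *ifp)
{
	model_net_lock();
	model_ifpromisc(ifp, false);
	model_net_unlock();
}
```

Calling model_ifpromisc() without the lock trips the assertion, which
mirrors how the missing NET_LOCK showed up in the vlan ioctl path.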
|
|
promiscuous mode from bridge(4). This fixes a regression of r1.332
of sys/net/if_bridge.c.
splassert with bridge(4) and vlan(4) reported by David Hill
OK mpi@, dlg@
|
|
|
|
ok mpi@
|
|
Prevent a data corruption on a UDP receive socket buffer reported by
procter@ who triggered it with wireguard-go.
The symptoms are underflow of sb_cc/sb_datacc/sb_mcnt.
ok visa@
|
|
|
|
|
|
Removing a malloc(9) with M_WAITOK reduces possible context switches which
helps when dealing with parallelism issues.
From Vitaliy Makkoveev.
|
|
for example, with locking assertions.
OK mpi@, anton@
|
|
From Vitaliy Makkoveev
OK yasuoka@
|
|
From Vitaliy Makkoveev
|
|
From Vitaliy Makkoveev
|
|
This way pppx(4) and pppac(4) can be further unified. This is an
intermediary step that does not introduce any behaviour change.
From Vitaliy Makkoveev
|
|
Issue reported by and fix from Vitaliy Makkoveev.
|
|
The timeout code currently assumes that the `session' descriptor it deals
with is independently allocated. This isn't true for pppx(4) and results
in memory corruption. So disable the feature until the code is fixed.
Bug reported and fix provided by Vitaliy Makkoveev.
|
|
This makes a pattern emerge that should help when starting to protect
the global `session' list with something other than the KERNEL_LOCK().
from Vitaliy Makkoveev.
|
|
This function calls pipex_destroy_session() which requires the lock and
pipex_ioctl() already calls it with the NET_LOCK() held.
From Vitaliy Makkoveev.
|
|
CID 1453252
ok claudio@ mpi@
|
|
each dereference. r1.275 added a check at the top of the function,
with an immediate "return (-1)" if src == NULL. Thus making the
repeated checks in the body superfluous.
CID 1452932.
ok millert@ mpi@
|
|
From Benjamin Baier, ok tobhe@
|
|
coverity CID 1486819
pointed out by and ok tobhe@
|
|
returning a (possibly uninitialized) value.
CID 1483466.
ok millert@
|
|
Feedback from and ok dlg@
ok kn@ todd@
|
|
ok dlg@ deraadt@
|
|
adding more filter properties without cluttering the struct.
OK mpi@, anton@
|
|
|
|
Do not include <sys/kthread.h> where it is not needed and stop including
<sys/proc.h> in it.
ok visa@, anton@
|
|
Diff from Jan Stary.
ok claudio
|
|
The 3 subsystems: signal, poll/select and kqueue can now be addressed
separately.
Note that bpf(4) and audio(4) currently delay the wakeups to a separate
context in order to respect the KERNEL_LOCK() requirement. Sockets (UDP,
TCP) and pipes spin to grab the lock for the same reasons.
ok anton@, visa@
|
|
As vlan instances obtained from the lists are passed to if_vinput(), which
may sleep (with PF locking enabled), we only traverse the vlan lists inside
the SMR critical section, and keep the existing reference counting in place.
ok visa@ sashan@
|
|
this means a route message is sent when the interface is closed and
goes down, but also causes another route message to be sent when
the interface comes up on the next open. this is important for
things like ospfd and the ospfd regress test because they want to
know when link comes up.
the regression was pointed out by bluhm, who also helped me isolate
the problem.
|
|
this restores returning POLLERR when the device is gone.
ENXIO doesn't make much sense as part of a pollfd revents field.
|
|
klist_invalidate() detaches knotes from the list and rewires them
synchronously so that the original filterops routines do not get
called after the invalidation.
OK anton@, mpi@
|
|
like USB only use the former and bpf_iflist was otherwise retaining
references to a freed bpf_if.
ok sashan
|
|
|
|
without this the tun_softc is still available on the list for the
syscalls to get to, even though the device is dead and should no
longer be referenced. by leaving it in the list after the
refcnt_finalize, it could still be found and used.
found by claudio@
jmatthew@ agrees with the change
|
|
the stack puts an mbuf on the tun ifq, and ifqs protect themselves
with a mutex. rather than invent another lock that tun can wrap
these ifq ops with and also coordinate its conditionals (reading
and dying) with, try and reuse the ifq mtx for the tun stuff too.
because ifqs are more special than tun, this adds a special
ifq_deq_sleep to ifq code that tun can call. tun just passes the
reading and dying variables to ifq to check, but the tricky stuff
about ifqs is kept in the right place.
with this, tun_dev_read should be callable without the kernel lock.
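
The idea of letting the queue's own mutex cover both the dequeue and
the "dying" check can be sketched in userland with pthreads. This is a
loose model and does not reproduce the real ifq_deq_sleep signature:
ifq_model, model_ifq_init, model_enqueue and model_deq_sleep are
hypothetical names, the queue holds ints instead of mbufs, and only the
dying condition is modelled.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#define MODEL_QLEN	8	/* hypothetical queue capacity */

struct ifq_model {
	pthread_mutex_t	mtx;	/* protects the queue and the sleep */
	pthread_cond_t	cv;
	int		items[MODEL_QLEN];
	unsigned int	head;
	unsigned int	len;
};

static void
model_ifq_init(struct ifq_model *q)
{
	assert(q != NULL);
	pthread_mutex_init(&q->mtx, NULL);
	pthread_cond_init(&q->cv, NULL);
	q->head = q->len = 0;
}

/* Producer side: the stack putting a packet on the queue. */
static int
model_enqueue(struct ifq_model *q, int item)
{
	int error = 0;

	pthread_mutex_lock(&q->mtx);
	if (q->len == MODEL_QLEN)
		error = ENOBUFS;
	else {
		q->items[(q->head + q->len) % MODEL_QLEN] = item;
		q->len++;
		pthread_cond_signal(&q->cv);	/* wake a sleeping reader */
	}
	pthread_mutex_unlock(&q->mtx);
	return error;
}

/*
 * Consumer side: dequeue one item, sleeping under the queue mutex
 * until one arrives. The caller's dying flag is checked under the
 * same mutex, so no second lock is needed.
 */
static int
model_deq_sleep(struct ifq_model *q, int *item, int nbio, const int *dying)
{
	int error = 0;

	pthread_mutex_lock(&q->mtx);
	for (;;) {
		if (*dying) {
			error = ENXIO;
			break;
		}
		if (q->len > 0) {
			*item = q->items[q->head];
			q->head = (q->head + 1) % MODEL_QLEN;
			q->len--;
			break;
		}
		if (nbio) {
			error = EWOULDBLOCK;
			break;
		}
		pthread_cond_wait(&q->cv, &q->mtx);
	}
	pthread_mutex_unlock(&q->mtx);
	return error;
}
```

In this model, code that sets the dying flag would take the same mutex
and signal the condition variable so that a sleeping reader notices it.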
|
|
|