Age | Commit message (Collapse) | Author |
|
Each ruleset's rules are stored in a TAILQ called "ptr" with "rcount"
representing the number of rules in the ruleset; "ptr_array" points to an
array of the same length.
"ptr" is backed by pool_get(9) and may change in size as "expired" rules
get removed from the ruleset - see "once" in pf.conf(5).
"ptr_array" is allocated momentarily through mallocarray(9) and gets filled
with the TAILQ entries, so that the sole user pfsync(4) can access the list
of rules by index to pick the n-th rule during state insertion.
Remove "ptr_array" and make pfsync iterate over the TAILQ instead to get the
matching rule's index. This simplifies both code and data structures and
avoids duplicate memory management.
OK sashan
|
|
Most clonable interface drivers (except bridge, enc, loop, pppx,
switch, trunk and vlan) initialise the send queue's length to IFQ_MAXLEN
during *_clone_create() even though ifq_init(), which is eventually called
through if_attach(), does the same.
Remove all early "ifq_set_maxlen(&ifq->if_snd, IFQ_MAXLEN);" lines to leave
it to ifq_init() and have clonable drivers a tad more in sync.
OK mvs
|
|
pfsyncstart() does not require the big lock, make it use the ifq API.
OK mvs
|
|
pointer obtained by ifunit() and it's reference counter is not bumped.
This can cause use after free issue. Replace this pointer by interface
index.
ok dlg@ sashan@
|
|
ok dlg@ tobhe@
|
|
ok dlg@ tobhe@
|
|
Reported-by: syzbot+6fef0091252d57113bfb@syzkaller.appspotmail.com
ok kn@
|
|
time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.
This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).
There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.
There is no performance cost on 64-bit (__LP64__) platforms.
With input from visa@, dlg@, and tedu@.
Several bugs squashed by visa@.
ok kettenis@
|
|
Prevent concurrency in the socket layer which is not ready for that.
Two recent data corruptions in pfsync(4) and the socket layer pointed
out that, at least, tun(4) was incorrectly using NET_RUNLOCK(). Until
we find a way in software to avoid future mistakes and to make sure that
only the softnet thread and some ioctls are safe to use a read version
of the lock, put everything back to the exclusive version.
ok stsp@, visa@
|
|
ok mpi@
|
|
|
|
i think this is a fix for a real bug. pfsync leaked the hooks it
had on a parent^Wsyncdev when the parent went away. now there's
KASSERTs to make sure all hooks are removed before an interface
goes away, the leak caused the KASSERTs to fire and made the bug
obvious.
found by hrvoje popovski
|
|
this is largely mechanical, except for carp. this moves the addition
of the carp link state hook after we're committed to using the new
interface as a carpdev. because the add can't fail, we avoid a
complicated unwind dance. also, this tweaks the carp linkstate hook
so it only updates the relevant carp interface, not all of the
carpdevs on the parent.
hrvoje popovski has tested an early version of this diff and it's
generally ok, but there's some splasserts that this diff fires that
i'll fix in an upcoming diff.
ok claudio@
|
|
the main semantic change is that things registering detach hooks
have to allocate and set a task structure that then gets added to
the list. this means if the task is allocated up front (eg, as part
of carps softc or bridges port structure), it avoids the possibility
that adding a hook can fail. a lot of drivers weren't checking for
failure, and unwinding state in the event of failure in other parts
was error prone.
while doing this i discovered that the list operations have to be
in a particular order, but drivers weren't doing that consistently
either. this diff wraps the list ops up so you have to seriously
go out of your way to screw them up.
ive also sprinkled some NET_ASSERT_LOCKED around the list operations
so we can make sure there's no potential for the list to be corrupted,
especially while it's being run.
hrvoje popovski has tested this a bit, and some issues he discovered
have been fixed.
ok sashan@
|
|
ok semarie@, visa@
|
|
OK mpi@
|
|
instead
From Pamela Mosiejczuk, many thanks!
OK phessler@ deraadt@
|
|
When a pfsync interface is being deleted, all its timeout handlers and
pfsync_send_dispatch() have to stop accessing the software context
before the context is freed. Ensure sufficient synchronization by
acquiring NET_LOCK() and clearing `pfsyncif' inside the critical
section in pfsync_clone_destroy(). When a timeout handler has entered
the critical section, it has to check `pfsyncif' and bail out if the
value is NULL. pfsync_send_dispatch() already does this check.
Issue reported and fix tested by Hrvoje Popovski.
OK mpi@ bluhm@
|
|
OK bluhm@
|
|
this change adds a pf_state_lock rw-lock, which protects consistency
of state table in PF. The code delivered in this change is guarded
by 'WITH_PF_LOCK', which is still undefined. People, who are willing
to experiment and want to run it must do two things:
- compile kernel with -DWITH_PF_LOCK
- bump NET_TASKQ from 1 to ... sky is the limit,
(just select some sensible value for number of tasks your
system is able to handle)
OK bluhm@
|
|
OK bluhm@, OK mpi@, henning@, jca@
|
|
The account flag `ASU' will no longer be set but that makes suser()
mpsafe since it no longer mess with a per-process field.
No objection from millert@, ok tedu@, bluhm@
|
|
memory shortage. As it is invoked from a system call, it should
not fail and wait instead.
OK visa@ mpi@
|
|
pr_input handlers without KERNEL_LOCK().
ok visa@
|
|
Tested by Hrvoje Popovski, ok bluhm@
|
|
reported and patch tested by Hrvoje Popovski
O.K. bluhm@
|
|
pfsyncioctl() is executed with the NET_LOCK() held which is enough.
ok sashan@
|
|
ok visa@
|
|
allows to simplify code used for both IPv4 and IPv6.
OK mikeb@ deraadt@
|
|
constants.
The consensus is that if both operands are constant, we don't need
mallocarray. Reminded by tedu@
ok deraadt@
|
|
ok deraadt@
|
|
zero the buffers first. All the current objects appear to be safe,
however future changes might introduce structure pads.
Discussed with guenther, ok bluhm
|
|
Fixes a panic observed by douple-p (aka pb@) when destroying the syncdev.
tweak & ok mpi@
|
|
ok florian@
|
|
to get rid of struct ip6protosw and some wrapper functions. It is
more consistent to have less different structures. The divert_input
functions cannot be called anyway, so remove them.
OK visa@ mpi@
|
|
make the variable parameters of the protocol input functions fixed.
Also add the proto to make it similar to IPv6.
OK mpi@ guenther@ millert@
|
|
only once per packet.
Fix a regression introduced when if_input() started to be called by
every pseudo-driver.
ok claudio@, dlg@
|
|
splsofnet()/splx() dance.
Tested by Hrvoje Popovski, ok visa@
|
|
them.
ok claudio@
|
|
ok bluhm@
|
|
Prevent pf_socket_lookup() reading uninitialised header buffers on fragments.
OK blum@ sashan@
|
|
just use pd->m. Then pf_test() can also operate on pd.m and set
the *m0 value in the caller just before it returns.
OK sashan@
|
|
pf functions. That means less parameters, more consistency and
later we can call functions that need a pd from pf_route().
OK sashan@
|
|
The current reason is that rtalloc_mpath(9) inside ip_output() might
end up inserting a RTF_CLONED route and that require a write lock.
ok kettenis@, bluhm@
|
|
|
|
|
|
|
|
the ioff argument to pool_init() is unused and has been for many
years, so this replaces it with an ipl argument. because the ipl
will be set on init we no longer need pool_setipl.
most of these changes have been done with coccinelle using the spatch
below. cocci sucks at formatting code though, so i fixed that by hand.
the manpage and subr_pool.c bits i did myself.
ok tedu@ jmatthew@
@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);
|
|
|
|
and pretending the output succeeded. Packets are still dropped!
Idea from jsg@ following same change to bridge(4). ok mpi@
|