summaryrefslogtreecommitdiff
path: root/sys/net/if_pfsync.c
AgeCommit message (Collapse)Author
2020-08-24Remove ptr_array from struct pf_rulesetkn
Each ruleset's rules are stored in a TAILQ called "ptr" with "rcount" representing the number of rules in the ruleset; "ptr_array" points to an array of the same length. "ptr" is backed by pool_get(9) and may change in size as "expired" rules get removed from the ruleset - see "once" in pf.conf(5). "ptr_array" is allocated momentarily through mallocarray(9) and gets filled with the TAILQ entries, so that the sole user pfsync(4) can access the list of rules by index to pick the n-th rule during state insertion. Remove "ptr_array" and make pfsync iterate over the TAILQ instead to get the matching rule's index. This simplifies both code and data structures and avoids duplicate memory management. OK sashan
2020-08-21Leave default ifq_maxlen handling to ifq_init()kn
Most clonable interface drivers (except bridge, enc, loop, pppx, switch, trunk and vlan) initialise the send queue's length to IFQ_MAXLEN during *_clone_create() even though ifq_init(), which is eventually called through if_attach(), does the same. Remove all early "ifq_set_maxlen(&ifq->if_snd, IFQ_MAXLEN);" lines to leave it to ifq_init() and have clonable drivers a tad more in sync. OK mvs
2020-08-11Run start routing without KERNEL_LOCK()kn
pfsyncstart() does not require the big lock, make it use the ifq API. OK mvs
2020-07-29pfsync(4) holds pointer to corresponding `ifnet' as `sc_sync_if'. Thismvs
pointer obtained by ifunit() and it's reference counter is not bumped. This can cause use after free issue. Replace this pointer by interface index. ok dlg@ sashan@
2020-07-10Change users of IFQ_SET_MAXLEN() and IFQ_IS_EMPTY() to use the "new" API.Patrick Wildt
ok dlg@ tobhe@
2020-07-10Change users of IFQ_PURGE() to use the "new" API.Patrick Wildt
ok dlg@ tobhe@
2020-06-28state import should accept AF_INET/AF_INET6 onlyAlexandr Nedvedicky
Reported-by: syzbot+6fef0091252d57113bfb@syzkaller.appspotmail.com ok kn@
2020-06-24kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)cheloha
time_second(9) and time_uptime(9) are widely used in the kernel to quickly get the system UTC or system uptime as a time_t. However, time_t is 64-bit everywhere, so it is not generally safe to use them on 32-bit platforms: you have a split-read problem if your hardware cannot perform atomic 64-bit reads. This patch replaces time_second(9) with gettime(9), a safer successor interface, throughout the kernel. Similarly, time_uptime(9) is replaced with getuptime(9). There is a performance cost on 32-bit platforms in exchange for eliminating the split-read problem: instead of two register reads you now have a lockless read loop to pull the values from the timehands. This is really not *too* bad in the grand scheme of things, but compared to what we were doing before it is several times slower. There is no performance cost on 64-bit (__LP64__) platforms. With input from visa@, dlg@, and tedu@. Several bugs squashed by visa@. ok kettenis@
2020-04-12Stop processing packets under non-exclusive (read) netlock.Martin Pieuchot
Prevent concurrency in the socket layer which is not ready for that. Two recent data corruptions in pfsync(4) and the socket layer pointed out that, at least, tun(4) was incorrectly using NET_RUNLOCK(). Until we find a way in software to avoid future mistakes and to make sure that only the softnet thread and some ioctls are safe to use a read version of the lock, put everything back to the exclusive version. ok stsp@, visa@
2020-04-11Avoid triggering KASSERT for bogus reason in pfsync_sendout with PFSYNC_DEBUG.Stefan Sperling
ok mpi@
2020-04-11fix build with PFSYNC_DEBUG by switching a format string from %d to %zdStefan Sperling
2019-11-07remove the detach and linkstate hooks when the parent is going away.David Gwynne
i think this is a fix for a real bug. pfsync leaked the hooks it had on a parent^Wsyncdev when the parent went away. now there's KASSERTs to make sure all hooks are removed before an interface goes away, the leak caused the KASSERTs to fire and made the bug obvious. found by hrvoje popovski
2019-11-07turn the linkstate hooks into a task list, like the detach hooks.David Gwynne
this is largely mechanical, except for carp. this moves the addition of the carp link state hook after we're committed to using the new interface as a carpdev. because the add can't fail, we avoid a complicated unwind dance. also, this tweaks the carp linkstate hook so it only updates the relevant carp interface, not all of the carpdevs on the parent. hrvoje popovski has tested an early version of this diff and it's generally ok, but there's some splasserts that this diff fires that i'll fix in an upcoming diff. ok claudio@
2019-11-06replace the hooks used with if_detachhooks with a task list.David Gwynne
the main semantic change is that things registering detach hooks have to allocate and set a task structure that then gets added to the list. this means if the task is allocated up front (eg, as part of carps softc or bridges port structure), it avoids the possibility that adding a hook can fail. a lot of drivers weren't checking for failure, and unwinding state in the event of failure in other parts was error prone. while doing this i discovered that the list operations have to be in a particular order, but drivers weren't doing that consistently either. this diff wraps the list ops up so you have to seriously go out of your way to screw them up. ive also sprinkled some NET_ASSERT_LOCKED around the list operations so we can make sure there's no potential for the list to be corrupted, especially while it's being run. hrvoje popovski has tested this a bit, and some issues he discovered have been fixed. ok sashan@
2019-06-10Use mallocarray(9) & put some free(9) sizes for M_IPMOPTS allocations.Martin Pieuchot
ok semarie@, visa@
2019-06-04pfsync_sendout() requires PF_LOCK()Alexandr Nedvedicky
OK mpi@
2019-02-10"non-existant" is one of those words that don't exist, so use "non-existent"Peter Hessler
instead From Pamela Mosiejczuk, many thanks! OK phessler@ deraadt@
2018-10-03Fix a race condition that affects pfsync interface deletion.Visa Hankala
When a pfsync interface is being deleted, all its timeout handlers and pfsync_send_dispatch() have to stop accessing the software context before the context is freed. Ensure sufficient synchronization by acquiring NET_LOCK() and clearing `pfsyncif' inside the critical section in pfsync_clone_destroy(). When a timeout handler has entered the critical section, it has to check `pfsyncif' and bail out if the value is NULL. pfsync_send_dispatch() already does this check. Issue reported and fix tested by Hrvoje Popovski. OK mpi@ bluhm@
2018-10-02- pfsync: avoid a recursion on PF_LOCKAlexandr Nedvedicky
OK bluhm@
2018-09-11- moving state look up outside of PF_LOCK()Alexandr Nedvedicky
this change adds a pf_state_lock rw-lock, which protects consistency of state table in PF. The code delivered in this change is guarded by 'WITH_PF_LOCK', which is still undefined. People, who are willing to experiment and want to run it must do two things: - compile kernel with -DWITH_PF_LOCK - bump NET_TASKQ from 1 to ... sky is the limit, (just select some sensible value for number of tasks your system is able to handle) OK bluhm@
2018-08-24- cosmetic tweak to if_pfsync.cAlexandr Nedvedicky
OK bluhm@, OK mpi@, henning@, jca@
2018-02-19Remove almost unused `flags' argument of suser().Martin Pieuchot
The account flag `ASU' will no longer be set but that makes suser() mpsafe since it no longer mess with a per-process field. No objection from millert@, ok tedu@, bluhm@
2018-01-09Creating a cloned interface could return ENOMEM due to temporaryAlexander Bluhm
memory shortage. As it is invoked from a system call, it should not fail and wait instead. OK visa@ mpi@
2017-11-20Sprinkle some NET_ASSERT_LOCKED(), const and co to prepare runningMartin Pieuchot
pr_input handlers without KERNEL_LOCK(). ok visa@
2017-08-11Remove NET_LOCK()'s argument.Martin Pieuchot
Tested by Hrvoje Popovski, ok bluhm@
2017-06-09- pfsync_input() must grab PF_LOCKAlexandr Nedvedicky
reported and patch tested by Hrvoje Popovski O.K. bluhm@
2017-05-27Remove useless splnet()/splx() dances.Martin Pieuchot
pfsyncioctl() is executed with the NET_LOCK() held which is enough. ok sashan@
2017-05-16Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().Martin Pieuchot
ok visa@
2017-04-14Pass down the address family through the pr_input calls. ThisAlexander Bluhm
allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
2017-04-11Partially revert previous mallocarray conversions that containDavid Hill
constants. The consensus is that if both operands are constant, we don't need mallocarray. Reminded by tedu@ ok deraadt@
2017-04-09Use mallocarray to allocate multicast group memberships.David Hill
ok deraadt@
2017-04-05When building counter memory in preparation to copy to userland, alwaysTheo de Raadt
zero the buffers first. All the current objects appear to be safe, however future changes might introduce structure pads. Discussed with guenther, ok bluhm
2017-03-11Add a detachhook to pfsync(4) which deals with the syncdev going away.Stefan Sperling
Fixes a panic observed by douple-p (aka pb@) when destroying the syncdev. tweak & ok mpi@
2017-02-20pfsync(4) percpu countersJeremie Courreges-Anglas
ok florian@
2017-01-29Change the IPv4 pr_input function to the way IPv6 is implemented,Alexander Bluhm
to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
2017-01-25Since raw_input() and route_input() are gone from pr_input, we canAlexander Bluhm
make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
2017-01-23Flag pseudo-interfaces as such in order to call add_net_randomness()Martin Pieuchot
only once per packet. Fix a regression introduced when if_input() started to be called by every pseudo-driver. ok claudio@, dlg@
2017-01-20pfsync_update_net_tdb() is only called at IPL_SOFTNET, no need for aMartin Pieuchot
splsofnet()/splx() dance. Tested by Hrvoje Popovski, ok visa@
2017-01-20No need to handle SIOCAIFADDR in drivers, it's never passed down toMartin Pieuchot
them. ok claudio@
2016-12-19Timer sending packets need to grab the NET_LOCK().Martin Pieuchot
ok bluhm@
2016-11-22Fold union pf_headers buffer into struct pf_pdesc (enabled by pfvar_priv.h).Richard Procter
Prevent pf_socket_lookup() reading uninitialised header buffers on fragments. OK blum@ sashan@
2016-11-14Instead of passing an extra mbuf pointer to pf_route(), it shouldAlexander Bluhm
just use pd->m. Then pf_test() can also operate on pd.m and set the *m0 value in the caller just before it returns. OK sashan@
2016-10-27Pass a struct pf_pdesc to pf_route() like it is done in the otherAlexander Bluhm
pf functions. That means less parameters, more consistency and later we can call functions that need a pd from pf_route(). OK sashan@
2016-10-04Convert timeouts that need a process context to timeout_set_proc(9).Martin Pieuchot
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that require a write lock. ok kettenis@, bluhm@
2016-09-27roll back turning RB into RBT until i get better at this process.David Gwynne
2016-09-27move pf from the RB macros to the RBT functions.David Gwynne
2016-09-21Remove recursive splsoftnet() calls, from David Hill.Martin Pieuchot
2016-09-15all pools have their ipl set via pool_setipl, so fold it into pool_init.David Gwynne
the ioff argument to pool_init() is unused and has been for many years, so this replaces it with an ipl argument. because the ipl will be set on init we no longer need pool_setipl. most of these changes have been done with coccinelle using the spatch below. cocci sucks at formatting code though, so i fixed that by hand. the manpage and subr_pool.c bits i did myself. ok tedu@ jmatthew@ @ipl@ expression pp; expression ipl; expression s, a, o, f, m, p; @@ -pool_init(pp, s, a, o, f, m, p); -pool_setipl(pp, ipl); +pool_init(pp, s, a, ipl, f, m, p);
2016-08-23pool_setiplDavid Gwynne
2016-04-29Make if_output() return EAFNOSUPPORT instead of just dropping packetsKenneth R Westerback
and pretending the output succeeded. Packets are still dropped! Idea from jsg@ following same change to bridge(4). ok mpi@