src - OpenBSD base system

Age	Commit message	Author
2021-11-16	move memory allocations in pfr_add_addrs() outside of NET_LOCK()/PF_LOCK()	Alexandr Nedvedicky
	scope. feedback by bluhm@ OK bluhm@
2021-11-11	Allow pfi_kif_get() callers to pre-allocate buffer for new kif. If kif	Alexandr Nedvedicky
	object exists already, then caller must free the pre-allocated buffer. If the caller does not pre-allocate a buffer, pfi_kif_get() gets memory from the pool using the M_NOWAIT flag. The commit also polishes pfi_initialize() a bit so it uses an M_WAITOK allocation for pfi_all. There is no change in current behaviour.
	feedback by bluhm@ OK bluhm@
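
The pre-allocation contract described in this commit can be sketched in userland C. This is a minimal illustration, not the kernel code: `struct kif`, `kif_get()` and the linked-list lookup are hypothetical stand-ins, and `malloc()` stands in for a non-sleeping pool allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a kif; not the kernel's struct pfi_kif. */
struct kif {
	char		 name[16];
	struct kif	*next;
};

static struct kif *kif_list;	/* toy lookup table: a linked list */

/*
 * Look up `name`. If it is missing, consume *prealloc when the caller
 * supplied one, otherwise fall back to an allocation that must not sleep.
 * On return *prealloc is NULL if the buffer was consumed; if it is still
 * set, the caller owns it and must free it.
 */
struct kif *
kif_get(const char *name, struct kif **prealloc)
{
	struct kif *k;

	assert(name != NULL);

	for (k = kif_list; k != NULL; k = k->next)
		if (strcmp(k->name, name) == 0)
			return k;	/* exists already: buffer untouched */

	if (prealloc != NULL && *prealloc != NULL) {
		k = *prealloc;
		*prealloc = NULL;	/* ownership moves to the list */
	} else if ((k = malloc(sizeof(*k))) == NULL) {
		return NULL;		/* stands in for an M_NOWAIT failure */
	}

	memset(k, 0, sizeof(*k));
	strncpy(k->name, name, sizeof(k->name) - 1);
	k->next = kif_list;
	kif_list = k;
	return k;
}
```

A caller that may sleep would allocate the buffer before taking any locks, call `kif_get()`, and free the buffer afterwards if it is still non-NULL.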
2021-06-23	augment the global pf state list with its own locks.	David Gwynne
	before this, things that iterated over the global list of pf states had to take the net, pf, or pf state locks. in particular, the ioctls that dump the state table took the net and pf state locks before iterating over the states and using copyout to export them to userland. when we tried replacing the use of rwlocks with mutexes under the pf locks, this blew up because you can't sleep when holding a mutex and there's a sleeping lock used inside copyout.
	this diff introduces two locks around the global state list: a mutex that protects the head and tail of the list, and an rwlock that protects the links between elements in the list. inserts on the state list only occur during packet handling and can be done by taking the mutex and putting the state on the tail before releasing the mutex. iterating over states is only done from thread/process contexts, so we can take a read lock, then the mutex to get a snapshot of the head and tail pointers, and then keep the read lock to iterate between the head and tail pointers. because it's a read lock we can then take other sleeping locks (eg, the one inside copyout) without (further) gymnastics.
	the pf state purge code takes the rwlock exclusively and the mutex to remove elements from the list.
	this allows the ioctls and purge code to loop over the list concurrently and largely without blocking the creation of states when pf is processing packets.
	pfsync also iterates over the state list when doing bulk sends, which the state purge code needs to be careful around.
	ok sashan@
2021-06-23	rework pf_state_expires to avoid confusion around state->timeout.	David Gwynne
	i'm going to make it so pf_purge_expired_states() can gather states largely without sharing a lock with pfsync or actual packet processing in pf. if pf or pfsync unlink a state while pf_purge_expired_states is looking at it, we can race with some checks and fall over a KASSERT.
	i'm fixing this by having the caller of pf_state_expires read state->timeout first, do its checks, and then pass the value as an argument into pf_state_expires. this means there's a consistent view of the state->timeout variable across all the checks that pf_purge_expired_states in particular does. if pf/pfsync does change the timeout while pf_purge_expired_states is looking at it, the worst thing that happens is that it doesn't get picked as a candidate for purging in this pass and will have to wait for the next sweep.
	ok sashan@ as part of a bigger diff
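
The "read once, then pass the value" pattern from this commit might look roughly like the following. This is a simplified sketch: the `TMO_*` constants, the `tmo_secs` table and both function names are hypothetical, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

#define TMO_TCP_CLOSED	0	/* hypothetical timeout classes */
#define TMO_TCP_ESTAB	1
#define TMO_MAX		2
#define TMO_UNLINKED	0xff	/* hypothetical marker for an unlinked state */

/* Hypothetical lifetime of each timeout class, in seconds. */
static const int64_t tmo_secs[TMO_MAX] = { 30, 86400 };

struct state {
	volatile uint8_t timeout;	/* may be changed by other contexts */
	int64_t		 expire;	/* base time, in seconds */
};

/* Expiry is computed from the value the caller read, not from st->timeout. */
int64_t
state_expires(const struct state *st, uint8_t timeout)
{
	assert(timeout < TMO_MAX);
	return st->expire + tmo_secs[timeout];
}

/* Returns 1 if the state is a candidate for purging at time `now`. */
int
purge_candidate(const struct state *st, int64_t now)
{
	uint8_t timeout = st->timeout;	/* the only read of st->timeout */

	if (timeout == TMO_UNLINKED)
		return 1;
	if (timeout >= TMO_MAX)
		return 0;	/* unknown value: leave it for the next sweep */
	return state_expires(st, timeout) <= now;
}
```

Every check in `purge_candidate()` sees the same snapshot of the timeout, so a concurrent change cannot make the range check pass and the assertion in `state_expires()` fail.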
2021-03-10	spelling	Jonathan Gray
	ok gnezdo@ semarie@ mpi@
2021-02-01	change route-to so it sends packets to IPs instead of interfaces.	David Gwynne
	this is a significant (and breaking) reworking of the policy based routing that pf can do. the intention is to make it as easy as nat/rdr to use, and more robust when it's operating.
	the main reasons for this change are:
	- route-to, reply-to, and dup-to do not work with pfsync
	  this is because the information about where to route-to is stored in rules, and it is hard to have a ruleset synced between firewalls, and impossible to have them synced 100% of the time.
	- i can make my boxes panic in certain situations using route-to
	  yeah...
	- the configuration and syntax for route-to rules are confusing.
	  the argument to route-to and co is an interface name with an optional ip address. there are several problems with this. one is that people tend to think about routing as sending packets to peers by their address, not by the interface they're reachable on. another is that we currently have no way to synchronise interface topology information between firewalls, so using an interface to say where packets go means we can't do failover of these states with pfsync. another is that a change in routing topology means a host may become reachable over a different interface. tying routing policy to interfaces gets in the way of failover and load balancing.
	this change does the following:
	- stores the route info in the state instead of the pf rule
	  this allows route-to to keep working when the ruleset changes, and allows route-to info to be sent over pfsync. there's enough spare bits in pfsync messages that the protocol doesn't break. the caveat is that route-to becomes tied to pass rules that create state, like rdr-to and nat-to.
	- the argument to route-to etc is a destination ip address
	  it's not limited to a next-hop address (though a next-hop can be a destination address). this allows for the failover and load balancing referred to above.
	- deprecates the address@interface host syntax in pfctl
	  because routing is done entirely by IPs, the interface is derived from the route lookup, not pf. any attempt to use the @interface syntax will fail now in all contexts.
	there's enthusiasm from proctor@ jmatthew@ and others
	ok sashan@ bluhm@
2021-01-12	Sometimes a user ID was logged in pflog(4) although the logopt of	Alexander Bluhm
	the rule did not specify it. Check the option again for the log rule in case another rule has triggered a socket lookup. Remove logopt group, it is not documented and cannot work as struct pfloghdr does not contain a gid. Rename PF_LOG_SOCKET_LOOKUP to PF_LOG_USER to express what it does. The lookup involved is only an implementation detail.
	OK kn@ sashan@ mvs@
2020-10-14	replace a MAXPATHLEN that slipped back in with PATH_MAX so userland won't	Christian Weisgerber
	have to pull in <sys/param.h> ok kn@ sashan@ deraadt@
2020-08-24	Remove ptr_array from struct pf_ruleset	kn
	Each ruleset's rules are stored in a TAILQ called "ptr" with "rcount" representing the number of rules in the ruleset; "ptr_array" points to an array of the same length. "ptr" is backed by pool_get(9) and may change in size as "expired" rules get removed from the ruleset - see "once" in pf.conf(5). "ptr_array" is allocated momentarily through mallocarray(9) and gets filled with the TAILQ entries, so that the sole user pfsync(4) can access the list of rules by index to pick the n-th rule during state insertion. Remove "ptr_array" and make pfsync iterate over the TAILQ instead to get the matching rule's index. This simplifies both code and data structures and avoids duplicate memory management. OK sashan
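
Picking the n-th rule by walking the queue, rather than indexing a parallel array, can be sketched with the `TAILQ` macros from `<sys/queue.h>`. The `struct rule` layout and the `rule_by_index()` helper are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct rule {
	int			nr;
	TAILQ_ENTRY(rule)	entries;
};
TAILQ_HEAD(rulequeue, rule);

/* Return the n-th rule (0-based) by walking the queue, or NULL. */
struct rule *
rule_by_index(struct rulequeue *q, unsigned int n)
{
	struct rule *r;

	assert(q != NULL);
	TAILQ_FOREACH(r, q, entries)
		if (n-- == 0)
			return r;
	return NULL;
}
```

The walk is linear in the rule index, which is the trade the commit makes in exchange for no longer allocating and filling a second array.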
2020-07-28	Use the table on root always if current table is not active.	YASUOKA Masahiko
	ok sashan
2020-07-21	rename PF_OPT_TABLE_PREFIX to PF_OPTIMIZER_TABLE_PFX and move it to pfvar.h	Henning Brauer
	OPT is misleading and usually refers to command line arguments to pfctl ok sashan kn
2019-11-17	"set delay" never worked as committed: the delay field was not copied	Otto Moerbeek
	in and the pf_pktdelay struct was not declared and initialized properly.
	ok rob@ kn@
2019-07-09	Fix previous commit which made src-node have a reference for the kif.	YASUOKA Masahiko
	Src-node should use the reference counter since it might live longer than its table entry, rule or the associated states. OK sashan
2019-07-02	When source address tracking record is used for "route-to", the next	YASUOKA Masahiko
	hop interface configured with "route-to" was not used. Keep the interface within the pf_src_node and use it when the record is used. OK sashan
2019-02-18	Change ps_len of struct pfioc_states and psn_len of struct	Alexander Bluhm
	pfioc_src_nodes to size_t. This avoids integer truncation by casts to unsigned. As the types of DIOCGETSTATES and DIOCGETSRCNODES ioctl(2) arguments change, pfctl(8) and systat(1) should be updated together with the kernel. Calculate number of pf(4) states as size_t in userland. OK sashan@ deraadt@
2018-12-17	Rename pf_anchor_remove() to pf_remove_anchor()	kn
	For semantic consistency with pf_{create,find,remove}_{anchor,ruleset}(). Simplify logic by squashing the if/else block while here. No functional change. Feedback jca and mikeb, OK mikeb
2018-12-10	Remove useless macros	kn
	These are just unhelpful case conversion. OK sashan henning
2018-12-09	Zap duplicate signatures	kn
	Redundant under _KERNEL since introduction in r1.260 from 2006. OK jca
2018-09-13	Add reference counting for inet pcb, this will be needed when we	Alexander Bluhm
	start locking the socket. An inp can be referenced by the PCB queue and hashes, by a pf mbuf header, or by a pf state key. OK visa@
2018-09-11	- moving state look up outside of PF_LOCK()	Alexandr Nedvedicky
	this change adds a pf_state_lock rw-lock, which protects consistency of the state table in PF. The code delivered in this change is guarded by 'WITH_PF_LOCK', which is still undefined. People who are willing to experiment and want to run it must do two things:
	- compile kernel with -DWITH_PF_LOCK
	- bump NET_TASKQ from 1 to ... sky is the limit, (just select some sensible value for number of tasks your system is able to handle)
	OK bluhm@
2018-09-10	Limit the fragment entry queue length to 64 per bucket. So we have	Alexander Bluhm
	a global limit of 1024 fragments, but it is fine grained to the region of the packet. Smaller packets may have fewer fragments. This costs another 16 bytes of memory per reassembly and divides the worst case for searching by 8.
	requested by claudio@; OK sashan@ claudio@
2018-09-08	Split the pf(4) fragment reassembly queue into smaller parts.	Alexander Bluhm
	Remember 16 entry points based on the fragment offset. Instead of a worst case of 8196 list traversals we now check a maximum of 512 list entries or 16 array elements. discussed with claudio@ and sashan@; OK sashan@
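
The arithmetic behind the 16 entry points can be illustrated with a small sketch. It assumes the regions are equal slices of the 64 KB offset space, which would give 4096 bytes per region and, with 8-byte fragment granularity, at most 512 fragments per region. The function name and the exact mapping are assumptions, not a copy of the kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define FRAG_ENTRY_POINTS	16	/* number of entry points, from the commit */

/* Map a fragment's byte offset (0..65535) to one of the 16 regions. */
unsigned int
frag_index(uint16_t off)
{
	unsigned int idx = off / (0x10000 / FRAG_ENTRY_POINTS);

	assert(idx < FRAG_ENTRY_POINTS);
	return idx;
}
```

A lookup can then start at the entry point for the fragment's region instead of traversing the list from its head.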
2018-07-22	Fix arguments of pf_purge_expired_{src_nodes,rules}()	Stefan Fritsch
	Due to the missing "void", this
	  extern void pf_purge_expired_src_nodes();
	is no prototype but a declaration. It is enough to suppress the 'implicit declaration' warning but it does not allow the compiler to check the arguments passed to the calls of the function. Fix the prototypes and don't pass the waslocked argument anymore. It was removed a year ago.
	ok sashan henning
2018-07-11	provide pfi_group_addmember(), which makes the new member interface inherit	Henning Brauer
	set flags from the group. ok phessler benno
2018-07-10	The year is 2018.	Henning Brauer
	Mercury, Bowie, Cash, Motorola and DEC all left us. Just pf still has a default state table limit of 10000. Had! Now it's a tiny little bit more, 100k. lead guitar: me ok chorus: phessler theo claudio benno background school girl laughing: bob
2018-07-10	provide a generic packet delay functionality. packets to be delayed are marked	Henning Brauer
	by pf in the packet header. pf_delay_pkt reads the delay value from the packet header, schedules a timeout and re-queues the packet when the timeout fires. ok benno sashan
2018-06-18	Refactor the six ways to find TCP options into one new function. As a result:	Richard Procter
	- MSS and WSCALE option candidates must now meet their min type length.
	- 'max-mss' is now more tolerant of malformed option lists.
	These changes were immaterial to the live traffic I've examined.
	OK sashan@ mpi@
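
A single option finder with a minimum-length check, as this commit describes, could look roughly like the sketch below. The name `find_tcpopt()`, its signature and its handling of malformed lists are assumptions for illustration. Only the option kind numbers (EOL 0, NOP 1, MSS 2) come from the TCP specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCPOPT_EOL	0
#define TCPOPT_NOP	1
#define TCPOPT_MAXSEG	2	/* MSS; total option length is 4 */

/*
 * Return a pointer to the first option of `type` that is at least
 * `min_len` bytes long, or NULL. The walk stops at EOL or at an
 * option whose length byte is malformed.
 */
const uint8_t *
find_tcpopt(const uint8_t *opts, size_t optlen, uint8_t type, uint8_t min_len)
{
	const uint8_t *p = opts, *end = opts + optlen;

	assert(opts != NULL);

	while (p < end) {
		uint8_t len;

		if (*p == TCPOPT_EOL)
			return NULL;
		if (*p == TCPOPT_NOP) {
			p++;		/* single-byte padding option */
			continue;
		}
		if (end - p < 2)
			return NULL;	/* no room for a length byte */
		len = p[1];
		if (len < 2 || len > end - p)
			return NULL;	/* malformed option list */
		if (*p == type && len >= min_len)
			return p;
		p += len;
	}
	return NULL;
}
```

Callers pass the minimum length they need, for example 4 for MSS, so a candidate that is too short is skipped rather than parsed.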
2018-04-05	Zap the obsolete PF_TRANS_ALTQ.	Lawrence Teo
	Note: Remember to "make includes" and recompile the following programs together with the kernel:
	  sbin/pfctl
	  usr.sbin/authpf
	  usr.sbin/ftp-proxy
	  usr.sbin/relayd
	  usr.sbin/tftp-proxy
	Thanks to sthen@ for checking the ports tree.
	ok bluhm@ sashan@ visa@
2018-02-09	oh carp - i didnt mean to commit these	David Gwynne

2018-02-09	use struct in_addr to represent an address.	David Gwynne

2018-02-08	make the watermarks/thresholds for entering and leaving syncookie mode when	Henning Brauer
	syncookies are set to adaptive tunable, ok claudio benno
2018-02-08	add DIOCGETSYNFLWATS to get current synflood detection watermarks,	Henning Brauer
	ok claudio benno procter
2018-02-07	provide counters for # of synfloods detected, # of syncookies sent,	Henning Brauer
	# of syncookies successfully validated, ok phessler
2018-02-06	syncookies for pf.	Henning Brauer
	when syncookies are on, pf will blindly answer each and every SYN with a syncookie-SYNACK. Upon reception of the ACK completing the 3WHS, pf will reconstruct the original SYN, shove it through pf_test, where state will be created if the ruleset permits it. Then massage the freshly created state (we won't see the SYNACK), set up the sequence number modulator, and call into the existing synproxy code to start the 3WHS with the backend host. Add an - somewhat basic for now - adaptive mode where syncookies get enabled if a certain percentage of the state table is filled up with half-open tcp connections. This makes pf firewalls resilient against large synflood attacks. syncookies are off by default until we gained more experience, considered experimental for now. see http://bulabula.org/papers/2017/bsdcan/ for more details. joint work with sashan@, widely discussed and with lots of input by many
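
The adaptive mode described here is essentially a threshold with hysteresis on the share of half-open states. The sketch below assumes separate high and low watermarks expressed as percentages of the state limit. The structure, the function name and the inclusive comparisons are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct synflood {
	unsigned int	hiwat_pct;	/* enter syncookie mode at or above this */
	unsigned int	lowat_pct;	/* leave syncookie mode at or below this */
	int		active;		/* 1 while syncookies are being sent */
};

/* Update and return the syncookie mode for the current half-open count. */
int
synflood_update(struct synflood *sf, unsigned int halfopen, unsigned int limit)
{
	unsigned int pct;

	assert(sf->lowat_pct <= sf->hiwat_pct);
	pct = limit ? (unsigned int)((uint64_t)halfopen * 100 / limit) : 0;

	if (!sf->active && pct >= sf->hiwat_pct)
		sf->active = 1;
	else if (sf->active && pct <= sf->lowat_pct)
		sf->active = 0;
	return sf->active;
}
```

Keeping the low watermark below the high one avoids flapping in and out of syncookie mode when the half-open share hovers near a single threshold.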
2017-12-29	Make the functions which link the pf state keys to mbufs, inpcbs,	Alexander Bluhm
	or other states more consistent. OK visa@ sashan@ on a previous version
2017-11-28	The divert structure was using the port number to indicate that	Alexander Bluhm
	divert-to or divert-reply was active. If the address was also set, it meant divert-to. Divert packet used a separate structure. This is confusing and makes it hard to add new features. It is better to have a divert type that explicitly says what is configured. Adapt the pf rule struct in kernel and pfctl, no functional change. Note that kernel and pfctl have to be updated together. OK sashan@
2017-11-27	The divert structure was using the port number to indicate that	Alexander Bluhm
	divert-to or divert-reply was active. If the address was also set, it meant divert-to. Divert packet used a separate structure. This is confusing and makes it hard to add new features. It is better to have a divert type that explicitly says what is configured. Convert the pfctl(8) rule parser to divert types, kernel cleanup will be the next step. OK sashan@
2017-11-13	add a generic packet rate matching filter. allows things like	Henning Brauer
	  pass in proto icmp max-pkt-rate 100/10
	all packets matching the rule in the direction the state was created are taken into consideration (typically: requests, but not replies). Just like with the other max-*, the rule stops matching if the maximum is reached, so in typical scenarios the default block rule would kick in then.
	with input from Holger Mikolon
	ok mikeb
2017-09-05	- split pf_find_or_create_ruleset() to smaller chunks.	Alexandr Nedvedicky
	tested by Hrvoje OK mpi@, OK bluhm@
2017-08-14	move pf_get_wscale + pf_get_mss prototypes to pfvar.h (diff shrinkage)	Henning Brauer

2017-08-14	add half-open tcp states accounting, road paved by sashan	Henning Brauer
	increment in pf_create_state(), decrement in pf_set_protostate(). input & ok bluhm
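
Routing every protocol-state change through one setter is what makes this kind of accounting cheap to keep correct. The sketch below is heavily simplified: it tracks a single protocol state per connection, and the `PS_*` values, `state_create()`, `set_protostate()` and `halfopen_count()` are hypothetical stand-ins rather than the pf functions named in the commit.

```c
#include <assert.h>

enum { PS_SYN_SENT = 2, PS_ESTABLISHED = 4 };	/* hypothetical values */

struct state {
	int proto_state;
};

static unsigned int halfopen;	/* number of half-open TCP states */

/* A new state starts half-open, so creation increments the counter. */
void
state_create(struct state *st)
{
	st->proto_state = PS_SYN_SENT;
	halfopen++;
}

/* The only place proto_state changes, so the counter cannot drift. */
void
set_protostate(struct state *st, int newstate)
{
	if (st->proto_state < PS_ESTABLISHED && newstate >= PS_ESTABLISHED) {
		assert(halfopen > 0);
		halfopen--;	/* the state is no longer half-open */
	}
	st->proto_state = newstate;
}

unsigned int
halfopen_count(void)
{
	return halfopen;
}
```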
2017-08-13	to change a state's state (that term is overloaded in pf, protocol state	Henning Brauer
	like ESTABLISHED for tcp here), don't do it directly, but go through a newly introduced pf_set_protostate() ok bluhm benno
2017-08-06	Reduce contention on the NET_LOCK() by moving the logic of the pfpurge	Martin Pieuchot
	thread to a task running on the `softnettq`. Tested and inputs from Hrvoje Popovski. ok visa@, sashan@
2017-07-19	Rework HFSC vs FQ-CoDel checks	Mike Belopuhov
	The selection mechanism introduced in pf_ioctl.c -r1.316 suffers from being too ambiguous and lacks robustness. Instead of relying on composition of multiple flags in the queue specification, it's easier to identify the root class (if it exists) and derive all further checks from it.
2017-06-28	Introduce a simple mechanism to select the appropriate queue manager	Mike Belopuhov
	Discussed with and OK henning@ at d2k17 as a part of a larger diff.
2017-06-28	Extend pf queueing ops to include queue manager hooks	Mike Belopuhov
	Discussed with and OK henning@ at d2k17 as a part of a larger diff.
2017-06-26	Fragments for a single connection (a combination of proto,src,dst,af)	Alexander Bluhm
	may easily reuse the fragment id as it is only 16 bit for IPv4. To avoid that pf reassembles them into the wrong packet, throw away stale fragments. With the default timeout this happens after 12,000 newer fragments have been seen.
	from markus@; OK sashan@
2017-05-30	remove XXX from the comments marking "holes" in the ioctls. I see very	Henning Brauer
	very little value in these comments at all, but the XXX is just wrong and in the way when looking for real XXXs. phessler agrees
2017-05-30	g/c DIOCCLRRULECTRS	Henning Brauer
	kinda deprecated for a decade now, nothing in base uses it, nothing in ports uses it (thanks sthen) ok phessler sashan
2017-05-30	teach pf_build_tcp() about SACK, ok & with sashan	Henning Brauer