Age | Commit message | Author |
|
the NAT rewrite and ever since then only checked in a couple of places
but never set. same for nat_src_node on pf_state.
with this the NAT rewrite made pf over 1000 lines shorter.
|
|
NAT, filter). now we only have one. no need for an array any more. simplifies
the code quite a bit.
in the process fix the abuse of PF_RULESET_* by (surprise, isn't it) the
table code.
written at the filesystem hackathon in stockholm, committed from the
hardware hackathon in portugal. ok gcc and jsing
|
|
everything just more complicated. Make sure the structs align nicely.
OK deraadt@
|
|
tables on top of a rdomain) but until now our code was a crazy mix so that
it was impossible to correctly use rtables in that case. Additionally pf(4)
only knows about rtables and not about rdomains. This is especially bad when
tracking (possibly conflicting) states in various domains.
This diff fixes all or most of these issues. It adds a lookup function to
get the rdomain id based on a rtable id. Makes pf understand rdomains and
allows pf to move packets between rdomains (it is similar to NAT).
Because pf states now track the rdomain id as well it is necessary to modify
the pfsync wire format. So old and new systems will not sync up.
A lot of help by dlg@, tested by sthen@, jsg@ and probably more
OK dlg@, mpf@, deraadt@
|
|
which unbreaks ie route-to after the recent pf changes.
With much help debugging and pointing out of missing bits from claudio@
ok claudio@ "looks good" henning@
|
|
destination of a packet was changed by pf. This allows for some evil
games with rdr-to or nat-to but is mostly needed for better rdomain/rtable
support. This is a first step and more work and cleanup is needed.
Here is a list of what works and what does not (needs a patched pfctl):
pass out rdr-to:
from local rdr-to local addr works (if state tracking on lo0 is done)
from remote rdr-to local addr does NOT work
from local rdr-to remote works
from remote rdr-to remote works
pass in nat-to:
from remote nat-to local addr does NOT work
from remote nat-to non-local addr works
non-local is an IP that is routed to the FW but is not assigned on the FW.
The non-working cases need some magic to correctly rewrite the incoming
packet since the rewriting would happen outbound which is too late.
"time to get it in" deraadt@
|
|
- queue packets from pf(4) to a userspace application
- reinject packets from the application into the kernel stack.
The divert socket can be bound to a special "divert port" and will
receive every packet diverted to that port by pf(4).
The pf syntax is pretty simple, e.g.:
pass on em0 inet proto tcp from any to any port 80 divert-packet port 1
A lot of discussion has happened since my last commit, resulting
in many changes and improvements.
I would *really* like to thank everyone who took part in the discussion,
especially canacar@, who pointed out the limitations of this approach.
OpenBSD divert(4) is meant to be compatible with software running on
top of FreeBSD's divert sockets even though they are pretty different and will
become even more so with time.
discussed with many, but mainly reyk@ canacar@ deraadt@ dlg@ claudio@ beck@
tested by reyk@ and myself
ok reyk@ claudio@ beck@
manpage help and ok by jmc@
|
|
Sorry.
|
|
- queue packets from pf(4) to a userspace application
- reinject packets from the application into the kernel stack.
The divert socket can be bound to a special "divert port" and will
receive every packet diverted to that port by pf(4).
The pf syntax is pretty simple, e.g.:
pass on em0 inet proto tcp from any to any port 80 divert-packet port 8000
test, bugfix and ok by reyk@
manpage help and ok by jmc@
no objections from many others.
|
|
rewrite of the NAT code, basically. nat and rdr become actions on regular
rules, separate nat/rdr/binat rules do not exist any more.
match in on $intf rdr-to 1.2.3.4
match out on $intf nat-to 5.6.7.8
the code is capable of doing nat and rdr in any direction, but we prevent
this in pfctl for now, there are implications that need to be documented
better.
the address rewrite happens inline, subsequent rules will see the already
changed addresses. nat / rdr can be applied multiple times as well.
match in on $intf rdr-to 1.2.3.4
match in on $intf to 1.2.3.4 rdr-to 5.6.7.8
help and ok dlg sthen claudio, reyk tested too
|
|
"reassemble tcp" state option failed to work correctly. Increasing this
to u_int16_t fixes kernel/6178. ok deraadt@ henning@
|
|
found by sthen and fixed, all other callers of these macros checked by both
of us
|
|
by backing out the macro fix. something must rely on the broken behaviour
|
|
was added in 2001. yes i got bitten by inet6 shit again.
in the ANEQ case, if af == AF_INET, (a)->addr32[0] != (b)->addr32[0]
is false when the addresses ARE equal. now it goes right in the
intended-for-v6 case and starts to compare the other addr32 fields -
in the v4 case I have garbage in them, so it reports all v4 as different
when they are in fact the same. fix by adding explicit af == INET6 test
before going on to compare the rest.
found the really hard way (many hours wasted, thought the bug was in my
new code) by me. ok sthen markus claudio
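
The failure mode described above can be reproduced with a small sketch.
Everything here is a hypothetical stand-in (struct addr, words_differ,
aneq_broken, aneq_fixed), not the actual pf macros; the only assumption is
that a v4 address lives in addr32[0] and the other words may hold garbage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>	/* AF_INET, AF_INET6 */

/* Hypothetical stand-in for a pf address: four 32-bit words. */
struct addr {
	uint32_t addr32[4];
};

/* Returns 1 if any of the four words differ. */
int
words_differ(const struct addr *a, const struct addr *b)
{
	return (a->addr32[0] != b->addr32[0] ||
	    a->addr32[1] != b->addr32[1] ||
	    a->addr32[2] != b->addr32[2] ||
	    a->addr32[3] != b->addr32[3]);
}

/* Broken: the full comparison also runs for v4, so garbage matters. */
int
aneq_broken(const struct addr *a, const struct addr *b, int af)
{
	assert(a != NULL && b != NULL);
	return ((af == AF_INET && a->addr32[0] != b->addr32[0]) ||
	    words_differ(a, b));
}

/* Fixed: the full comparison is guarded by an explicit AF_INET6 test. */
int
aneq_fixed(const struct addr *a, const struct addr *b, int af)
{
	assert(a != NULL && b != NULL);
	return ((af == AF_INET && a->addr32[0] != b->addr32[0]) ||
	    (af == AF_INET6 && words_differ(a, b)));
}
```

With two v4 addresses that agree in addr32[0] but differ in the unused
words, aneq_broken reports "not equal" while aneq_fixed does not.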
|
|
code. In pf rtableid == -1 means don't change the rtableid because
of this rule. So it has to be signed int there. Before the value
is passed from pf to route it is always checked to be >= 0. Change
the type to int in pf and to u_int in netinet and netinet6 to make
the checks work. Otherwise -1 may be used as an array index and
the kernel crashes.
ok henning@
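
A minimal sketch of the signed/unsigned split described above. The names
(RT_TABLES, struct rule, rt_lookup, apply_rule) are hypothetical and only
illustrate the idea: the pf side keeps a signed id where -1 means "leave
the rtable alone", and only values checked to be >= 0 reach the side that
uses the id as an array index.

```c
#include <assert.h>
#include <stddef.h>

#define RT_TABLES 4	/* hypothetical number of routing tables */

const char *rt_names[RT_TABLES] = { "rt0", "rt1", "rt2", "rt3" };

/* pf side: signed, -1 means "this rule does not change the rtable". */
struct rule {
	int rtableid;
};

/* route side: unsigned id, bounds-checked before it indexes the array. */
const char *
rt_lookup(unsigned int id)
{
	return (id < RT_TABLES) ? rt_names[id] : NULL;
}

/* Only a non-negative rule value replaces the current rtable id. */
unsigned int
apply_rule(const struct rule *r, unsigned int current)
{
	assert(r != NULL);
	if (r->rtableid >= 0)
		return (unsigned int)r->rtableid;
	return current;
}
```

Because the route side takes an unsigned value and checks the bound, a
stray -1 turns into a failed lookup instead of a negative array index.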
|
|
2) packet reassembly: only one method remains, full reassembly. crop
and drop-ovl are gone.
. set reassemble yes|no [no-df]
if no-df is given fragments (and only fragments!) with the df bit set
have it cleared before entering the fragment cache, and thus the
reassembled packet doesn't have df set either. it does NOT touch
non-fragmented packets.
3) regular rules can have scrub options.
. pass scrub(no-df, min-ttl 64, max-mss 1400, set-tos lowdelay)
. match scrub(reassemble tcp, random-id)
of course all options are optional. the individual options still do
what they used to do on scrub rules, but everything is stateful now.
4) match rules
"match" is a new action, just like pass and block are, and can be used
like them. as opposed to pass or block, match rules do NOT change the
pass/block state of a packet. i. e.
. pass
. match
passes the packet, and
. block
. match
blocks it.
Every time (!) a match rule matches, i. e. not only when it is the
last matching rule, the following actions are set:
-queue assignment. can be overwritten later, the last rule that set a
queue wins. note how this is different from the last matching rule
wins, if the last matching rule has no queue assignments and the
second last matching rule was a match rule with queue assignments,
these assignments are taken.
-rtable assignments. works the same as queue assignments.
-set-tos, min-ttl, max-mss, no-df, random-id, reassemble tcp, all work
like the above
-logging. every matching rule causes the packet to be logged. this
means a single packet can get logged more than once (think multiple log
interfaces with different receivers, like pflogd and spamlogd)
.
almost entirely hacked at n2k9 in basel, could not be committed close to
release. this really should have been multiple diffs, but splitting them
now is not feasible any more. input from mcbride and dlg, and frantzen
about the fragment handling.
speedup of around 7% for the common case, and more the more scrub rules
were in use.
manpage not up to date, being worked on.
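
The match semantics from point 4 can be sketched as a toy evaluator. This
is not pf's rule evaluation code; the types and the evaluate function are
hypothetical, every rule in the array is assumed to match the packet, the
default verdict is assumed to be pass, and queue assignments on pass and
block rules are left out to keep the sketch small.

```c
#include <assert.h>
#include <stddef.h>

enum action { ACT_PASS, ACT_BLOCK, ACT_MATCH };

struct rule {
	enum action action;
	int queue;		/* -1: rule makes no queue assignment */
};

struct verdict {
	enum action action;	/* ACT_PASS or ACT_BLOCK */
	int queue;		/* -1: no queue assigned */
};

struct verdict
evaluate(const struct rule *rules, size_t n)
{
	struct verdict v = { ACT_PASS, -1 };	/* assumed default: pass */
	size_t i;

	assert(rules != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (rules[i].action == ACT_MATCH) {
			/* applied on every match; last one to set it wins */
			if (rules[i].queue != -1)
				v.queue = rules[i].queue;
		} else {
			/* only pass and block change the pass/block state */
			v.action = rules[i].action;
		}
	}
	return v;
}
```

For the sequence block, match (queue 2), pass, match (no queue), the
packet is passed and keeps queue 2: the trailing match rule changes
neither the verdict nor the queue.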
|
|
transactional, closing PRs 4941 and 5910. Minor flag day, requires rebuild
of userland tools that use struct pfi_kif.
ok henning deraadt
|
|
WARNING: THIS BREAKS COMPATIBILITY WITH THE PREVIOUS VERSION OF PFSYNC
this is a new variant of the protocol and a large reworking of the
pfsync code to address some performance issues. the single largest
benefit comes from having multiple pfsync messages of different
types handled in a single packet. pfsync's handling of pf states is
highly optimised now, along with packet parsing and construction.
huggz for beck@ for testing.
huge thanks to mcbride@ for his help during development and for
finding all the bugs during the initial tests.
thanks to peter sutton for letting me get credit for this work.
ok beck@ mcbride@ "good." deraadt@
|
|
pf_lb.c. This will ease the process of adding more selection types
without bloating pf.c even more.
ok and a weird death threat, henning@
raised eyebrow, dlg@
|
|
whether we're called from the interrupt context to the functions
performing allocations.
Looked at by mpf@ and henning@, tested by mpf@ and Antti Harri,
the pr originator.
ok tedu
|
|
using the default interrupt handler for both, so there's no need to keep
table entries created in interrupt context separate.
ok henning art
|
|
It applies to state_flags, not to sync_flags.
OK henning@, gollo@
|
|
flows export data gathered from pf states.
initial implementation by Joerg Goltermann <jg@osn.de>, guidance and many
changes by me. 'put it in' theo
|
|
pf_pkt_addr_changed. atm just clears the state key pointer.
calling this is cleaner than having other parts of the stack clearing
pointers in the pf part of the mbuf packet header directly.
|
|
when we first do a pcb lookup and we have a pointer to a pf state key
in the mbuf header, store the state key pointer in the pcb and a pointer
to the pcb we just found in the state key. when either the state key
or the pcb is removed, clear the pointers.
on subsequent packets inbound we can skip the pcb lookup and just use the
pointer from the state key.
on subsequent packets outbound we can skip the state key lookup and use
the pointer from the pcb.
about 8% speedup with 100 concurrent tcp sessions, should help much more
with more tcp sessions.
ok markus ryan
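
The cross-linking described above, as a toy model. All names are
hypothetical (the real kernel structures carry far more than one pointer
each), and a single global pcb with a lookup counter stands in for the
real pcb hash lookup.

```c
#include <assert.h>
#include <stddef.h>

struct pcb;

struct state_key {
	struct pcb *inp;	/* cached pcb, NULL if not linked */
};

struct pcb {
	struct state_key *sk;	/* cached state key, NULL if not linked */
};

struct pcb the_pcb;		/* the only socket in this toy model */
int slow_lookups;		/* counts full pcb lookups */

struct pcb *
pcb_lookup_slow(void)
{
	slow_lookups++;
	return &the_pcb;
}

/* Store each pointer in the other structure. */
void
link_pcb(struct state_key *sk, struct pcb *inp)
{
	assert(sk != NULL && inp != NULL);
	sk->inp = inp;
	inp->sk = sk;
}

/* When the pcb goes away, clear both pointers. */
void
pcb_detach(struct pcb *inp)
{
	if (inp->sk != NULL) {
		inp->sk->inp = NULL;
		inp->sk = NULL;
	}
}

/* When the state key goes away, clear both pointers. */
void
state_key_detach(struct state_key *sk)
{
	if (sk->inp != NULL) {
		sk->inp->sk = NULL;
		sk->inp = NULL;
	}
}

/* Inbound: use the cached pcb if there is one, else look up and link. */
struct pcb *
inbound_find_pcb(struct state_key *sk)
{
	struct pcb *inp;

	if (sk != NULL && sk->inp != NULL)
		return sk->inp;
	inp = pcb_lookup_slow();
	if (sk != NULL && inp != NULL)
		link_pcb(sk, inp);
	return inp;
}
```

Only the first inbound packet pays for the full lookup; later packets
carrying the same state key take the pointer directly.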
|
|
and the state-related pf(4) ioctls, and make functions in state creation and
destruction paths more robust in error conditions.
All values in struct pfsync_state now in network byte order, as with pfsync.
testing by david
ok henning, systat parts ok canacar
|
|
header inbound. on the outbound side, we take that and look for the key
that is the exact opposite, and store that mapping in the state key. on
subsequent packets we don't have to do the lookup on outbound any more.
almost unable to get real benchmarks going here, we know for sure this
gives a more than 5% increase in forwarding performance.
many thanks to ckuethe for stress- and performance-testing.
ok ryan theo
|
|
Use the 'counters' table option in pf.conf if you actually need them.
If enabled, memory is not allocated until packets match an address.
This saves about 40% memory if counters are not being used, and paves the way
for some more significant cleanups coming soon.
ok henning mpf deraadt
|
|
into one 8 bit flags field.
shrinks the state structure by 4 bytes on 32bit archs
ryan ok
|
|
numbers at all. scary consequences; only to be used in very specific
situations where you don't see all packets of a connection, e. g.
asymmetric routing. ok ryan reyk theo
|
|
- Mechanical change: Use arrays for state key pointers in pf_state, and
addr/port in pf_state_key, to allow the use of indexes.
- Fix NAT, pfsync, pfctl, and tcpdump to handle the new state structures.
In struct pfsync_state, both state keys are included even when identical.
- Also fix some bugs discovered in the existing code during testing.
(in particular, "block return" for TCP packets was not returning an RST)
ok henning beck deraadt
tested by otto dlg beck laurent
Special thanks to users Manuel Pata and Emilio Perea who did enough testing
to actually find some bugs.
|
|
complete the split off of the layer 3/4 addressing information from the extra
information in the actual state. a state key holds a list of states, and a
state points to two state keys - they're only different in the NAT case.
More specifically, it deprecates the (often difficult to understand)
concept of lan, ext, and gwy addresses, replacing them with WIRE and
STACK side address tuples (af, proto, saddr, daddr, sport, dport).
Concept first brought up some years ago on a ferry ride in bc by ryan and
me, I spent some time over the last year getting closer, and finally
got it completed in japan with ryan. dlg also took part, helped a lot,
and saved us 8 bytes.
This commit removes support for any kind of NAT as well as pfsync.
It also paves the road for some code simplification and some very cool
future stuff.
ok ryan beck, tested by many
|
|
Fix printing of the state id in pfctl -ss -vv.
Remove the psnk_af hack to return the number of killed states.
OK markus, beck. "I like it" henning, deraadt.
Manpage help from jmc.
|
|
makes transparent proxies much easier; ok beck@, feedback claudio@
|
|
shows that 3 developers screwed this up. look carefully at this diff
and learn how to avoid wasting memory. on a 64 bit architecture, each
of these was using 40 bytes instead of 32.
ok henning
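
The kind of waste meant here comes from member ordering. The two structs
below are made up for illustration (they are not the structs touched by
the diff), and the 40 versus 32 byte figures assume a typical 64-bit ABI
with 8-byte pointers and natural alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit members interleaved with pointers: padding after each one. */
struct loose {
	uint32_t flags;
	void *rule;
	uint32_t refs;
	void *next;
	uint64_t bytes;
};

/* Same members, largest first: the two 32-bit members share one slot. */
struct tight {
	void *rule;
	void *next;
	uint64_t bytes;
	uint32_t flags;
	uint32_t refs;
};

/* Bytes saved per object by reordering alone. */
size_t
wasted_bytes(void)
{
	assert(sizeof(struct loose) >= sizeof(struct tight));
	return sizeof(struct loose) - sizeof(struct tight);
}
```

On such an ABI sizeof(struct loose) is 40 and sizeof(struct tight) is 32,
so reordering alone recovers 8 bytes per object.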
|
|
It shows up in pfctl verbose mode and in the 7th field of the labels
output. Also remove the label printing for scrub rules, as they
do not support labels.
OK dhartmei@ (on an earlier version), henning@, mcbride@
|
|
when it is in fact only used to delete the state key when the number of
attached states (in a tailq) drops to zero, we can as well test for the
queue being empty.
this is a leftover from some early version that did things differently.
ok ryan
|
|
copyin/out. Change the API so that the state is included in the ioctl
argument, so the ioctl wrappers take care of copying memory as appropriate.
Also change the DIOCGETSTATE API to be more useful. Instead of getting
an arbitrarily "numbered" state (using numbering that can change between
calls), search based on id and creatorid. If you want to monitor
only a particular state, you can now use the bulk functions first to find
the appropriate id/creatorid and then fetch it directly from then on.
ok dlg@ henning@
|
|
Using a group sums up the statistics of all members.
Modify pfctl(1) slightly to allow a groupname "all",
which gives us an overall pf(4) statistic.
OK henning@, markus@
|
|
ok henning@
|
|
there is a 1:1 mapping between direction and the tree the states get
attached to. there is no need to have anything outside the state insertion/
deletion/lookup routines know about these internals. so just pass the
direction to the lookup functions and let them pick the right tree.
ok dhartmei markus
|
|
criteria. ok mcbride@
|
|
keys that can map to multiple states (last but not least for ifbound) we don't
need state tables hanging off each struct kif representing an interface
any more. use two globals for the two tables. ok markus ryan
|
|
unused ifname (this information is in struct pf_state_sync now).
Also a bit of KNF on the pf_state struct.
ok mpf@ henning@
|
|
previously, we had a set of state tables attached to each interface. so for
every packet we had to do a lookup in the tables for the interface, and
afterwards in the global tables.
since we split state keys and states now, use only the global tables, and
put the actual states in a tail queue attached to the state key. sort the
list so that ifbound states come before global ones. on lookup, we only
have to compare the interface pointer on the actual states and use the
first one where either the interface matches or the state is not interface
bound. thus, if you don't actually use ifbound states, and there is only
one state per state key, the overhead is close to zero, where we had extra
lookups before. in addition to a much cleaner design (that'll allow for more
goodies later) this gives us ~12.5% more forwarding performance.
mostly hacked at c2k7, lots of help, testing and ok mcbride & markus
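
The lookup described above, as a rough sketch. The struct names and the
functions state_attach and state_find are hypothetical, and a hand-rolled
singly linked list stands in for the tail queue; the point is only the
ordering (ifbound first) and the first-fit comparison on the interface
pointer.

```c
#include <assert.h>
#include <stddef.h>

struct kif {
	const char *name;
};

struct state {
	struct kif *kif;	/* NULL: not interface bound */
	struct state *next;
};

struct state_key {
	struct state *first;	/* ifbound states first, then the rest */
};

/* Keep the list sorted: ifbound at the head, unbound at the tail. */
void
state_attach(struct state_key *sk, struct state *s)
{
	struct state **pp;

	assert(sk != NULL && s != NULL);
	if (s->kif != NULL) {
		s->next = sk->first;
		sk->first = s;
	} else {
		for (pp = &sk->first; *pp != NULL; pp = &(*pp)->next)
			;
		s->next = NULL;
		*pp = s;
	}
}

/* First state whose interface matches or which is not ifbound. */
struct state *
state_find(const struct state_key *sk, const struct kif *kif)
{
	struct state *s;

	for (s = sk->first; s != NULL; s = s->next)
		if (s->kif == NULL || s->kif == kif)
			return s;
	return NULL;
}
```

With one state per key and no ifbound states, the loop body runs once,
which is where the "close to zero" overhead comes from.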
|