Age | Commit message | Author |
|
packet in their payload that matches an existing connection. It was
not checked whether the outer ICMP packet has the same destination
IP as the source IP of the inner protocol packet. Enforce that
these addresses match, to prevent ICMP packets that do not make
sense.
Issue found by Nicolas Collignon, Corentin Bayet, Eloi Vanderbeken,
Luca Moro at Synacktiv.com
OK sashan@
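
The address check described in this message can be sketched roughly as
follows. This is a minimal illustration, not the pf code: the structs
`outer_hdr` and `inner_hdr` and the function `icmp_addrs_consistent` are
hypothetical names, and only IPv4 addresses stored as plain integers are
considered.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified views of the two headers (IPv4 only). */
struct outer_hdr {
	uint32_t src;	/* sender of the ICMP error */
	uint32_t dst;	/* receiver of the ICMP error */
};

struct inner_hdr {
	uint32_t src;	/* source of the packet quoted in the ICMP payload */
	uint32_t dst;	/* destination of the quoted packet */
};

/*
 * An ICMP error goes back to the host that sent the offending packet,
 * so the outer destination has to equal the inner source.
 */
bool
icmp_addrs_consistent(const struct outer_hdr *o, const struct inner_hdr *i)
{
	return (o->dst == i->src);
}
```

A packet for which this returns false quotes a connection that the
receiver of the ICMP error never originated, which is the case the
commit rejects.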
|
|
makes mpe consistent with mpw and mpip
|
|
this makes ifconfig print "(unset)" to show the label isn't set yet.
|
|
BIOCSFILDROP was already able to be used as a quick and dirty
firewall, which is especially useful when you want to filter
non-ip things. however, capturing the packets you're dropping is a
lot of overhead when you just want to drop stuff. this extends
fildrop so you can tell bpf not to capture the packets it drops.
ok sthen@ mikeb@ claudio@ visa@
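
The three behaviours implied by this message (capture and pass, capture
and drop, drop without capturing) can be modelled as below. This is a
sketch under assumptions: the enum values and `fildrop_verdict` are
hypothetical names, not the macros or functions from the bpf headers.

```c
#include <stdbool.h>

/* Hypothetical mode names; the real macros live in the bpf headers. */
enum fildrop_mode {
	MODE_PASS,	/* filter match: capture, let the stack see it */
	MODE_CAPTURE,	/* filter match: capture, then drop */
	MODE_DROP	/* filter match: drop without capturing */
};

struct verdict {
	bool capture;	/* copy the packet into the bpf buffer */
	bool drop;	/* keep the packet away from the network stack */
};

struct verdict
fildrop_verdict(enum fildrop_mode mode, bool filter_matched)
{
	struct verdict v = { false, false };

	if (!filter_matched)
		return (v);	/* unmatched packets are left alone */

	v.capture = (mode != MODE_DROP);
	v.drop = (mode != MODE_PASS);
	return (v);
}
```

Only the last mode avoids the capture overhead the message complains
about.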
|
|
this just provides the macros for the different values for BIOCGFILDROP
and BIOCSFILDROP, the implementation behind them is coming.
ok sthen@ mikeb@ claudio@ visa@
|
|
|
|
ok visa@
|
|
This will help for future (un)locking.
ok visa@
|
|
ok claudio@ deraadt@
Reported-by: syzbot+8e29400e09a351f17884@syzkaller.appspotmail.com
|
|
the backpressure seems to have kicked in too early, introducing a
lot of packet loss where there wasn't any before. secondly, counting
operations interacted extremely badly with pseudo-interfaces. for
example, if you have a physical interface that rxes 100 vlan
encapsulated packets, it will call ifiq_input once for all 100
packets. when the network stack is running vlan_input against these
packets, vlan_input will take the packet and call ifiq_input against
each of them. because the stack is running packets on the parent
interface, it can't run the packets on the vlan interface, so you
end up with ifiq_input being called 100 times, and we dropped packets
after 16 calls to ifiq_input without a matching run of the stack.
chris cappuccio hit some weird stuff too.
discussed with claudio@
|
|
|
|
OK phessler@ deraadt@
|
|
alignment constraints documented in RFC 2367 section 2.2.
Fixes 'ipsecctl -ss' segfault observed on i386.
with and ok deraadt@ visa@ mikeb@
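
RFC 2367 requires PF_KEY headers to be 64-bit aligned and expresses
lengths in 8-byte units. A minimal sketch of such a check follows; the
names `PFKEY_ALIGN`, `pfkey_roundup` and `pfkey_aligned` are
hypothetical and are not taken from the kernel source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PFKEY_ALIGN	8	/* 64 bits, per RFC 2367 section 2.2 */

/* Round a byte count up to the next multiple of 8. */
size_t
pfkey_roundup(size_t len)
{
	return ((len + PFKEY_ALIGN - 1) & ~(size_t)(PFKEY_ALIGN - 1));
}

/* True when both the address and the length are 64-bit aligned. */
bool
pfkey_aligned(const void *p, size_t len)
{
	return (((uintptr_t)p % PFKEY_ALIGN) == 0 &&
	    (len % PFKEY_ALIGN) == 0);
}
```

Strict-alignment rules like this matter most when a 64-bit field is
read through a pointer that is only 4-byte aligned.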
|
|
type,
and definitely don't do this to the length: (unsigned)(cplim2 - cp2)
ok claudio
|
|
previously ifiq_input used the traditional backpressure or defense
mechanism and counted packets to decide when to shed load by dropping.
it ended up waiting for 10240 packets to get queued on the
stack before it would decide to drop packets. this may be ok for
some machines, but for a lot this was too much.
this diff reworks how ifiqs measure how busy the stack is by
introducing an ifiq_pressure counter that is incremented when
ifiq_input is called, and cleared when ifiq_process calls the network
stack to process the queue. if ifiq_input is called multiple times
before ifiq_process in a net taskq runs, ifiq_pressure goes up, and
ifiq_input uses a high value to decide the stack is busy and it
should drop.
i was hoping there would be no performance impact from this change,
but hrvoje popovski notes a slight bump in forwarding performance.
my own testing shows that the ifiq input list length grows to a
fraction of the 10240 it used to get to, which means the maximum
burst of packets through the stack is smoothed out a bit. instead
of big lists of packets followed by big periods of drops, we get
relatively small bursts of packets with smaller gaps where we drop.
the follow-on from this is to make drivers implementing rx ring
moderation use the return value of ifiq_input to scale the ring
allocation down, allowing the hardware to drop packets so software
doesn't have to.
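
The pressure counter described above can be modelled in a few lines.
This is a simplified sketch, not the ifq code: `struct ifiq_model`, its
fields and both functions are hypothetical, and the threshold and the
`>` comparison are assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical model of one interface input queue. */
struct ifiq_model {
	unsigned int pressure;		/* input calls since the last run */
	unsigned int pressure_drop;	/* assumed threshold for dropping */
	unsigned int queued;		/* packets waiting for the stack */
	unsigned int dropped;		/* packets shed under pressure */
};

/*
 * Model of ifiq_input: bump the pressure counter, then either queue
 * the batch or drop it. Returns true when the caller should back off.
 */
bool
ifiq_model_input(struct ifiq_model *q, unsigned int npkts)
{
	q->pressure++;
	if (q->pressure > q->pressure_drop) {
		q->dropped += npkts;
		return (true);
	}
	q->queued += npkts;
	return (false);
}

/* Model of ifiq_process: the stack runs, so the pressure clears. */
unsigned int
ifiq_model_process(struct ifiq_model *q)
{
	unsigned int n = q->queued;

	q->queued = 0;
	q->pressure = 0;
	return (n);
}
```

With a threshold of 16, one hundred single-packet calls to
`ifiq_model_input` without an intervening `ifiq_model_process` queue the
first 16 packets and drop the remaining 84, which is the kind of
interaction with pseudo-interfaces reported elsewhere in this log.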
|
|
passed by pf or cause a panic in pf.
fix from sashan@; OK bluhm@ claudio@
bug found by Corentin Bayet, Nicolas Collignon, Luca Moro at Synacktiv
|
|
This is basically mpw(4), but it carries IP directly instead of
Ethernet. On the wire it can look the same as what IP over MPLS
looks like, but because it is a pseudowire you can configure a
control word or the FAT label to improve load balancing. It can
be used to quickly set up an IP tunnel over an MPLS fabric
without the need to configure bgpd and mpe(4) interfaces.
Because it implements the same pwe3 ioctls that mpw(4) uses, ifconfig
already supports configuration of mpip(4) interfaces. ldpd will
grow support for this in the near future.
This is not hooked up to the build yet
discussed with claudio@ at a2k19
ok claudio@
|
|
whether the mpw interface is advertising "ethernet" or "ethernet-
tagged" is something the ends of the wire agree on (ie, ldpd is
configured a certain way), it is not something that affects ethernet
encap or decap.
the MPW ioctls can still configure it and read it, but it has no
bearing on how the driver operates on packets.
|
|
|
|
the existing mpw ioctl is still available for ldpd to use for a
(short) while.
discussed with claudio@ at a2k19
ok mpi@
|
|
part of a larger diff ok mpi@
|
|
inputs & ok visa@
|
|
this basically adds a dummy mpls tag to the stack for pseudowires
and uses a flow as the label on that dummy tag. this allows
intermediate systems that hash packets onto multiple links to use
the extra tag as input to the hash, providing more entropy and
therefore better load balancing.
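
As an illustration of the extra tag, the sketch below folds a flow
identifier into a 20-bit label and packs it into an MPLS label stack
entry. The helper names and the way reserved labels (0 to 15) are
avoided are assumptions for this example, not the mpw(4) code.

```c
#include <stdint.h>

#define MPLS_LABEL_RESERVED_MAX	15		/* labels 0-15 are reserved */
#define MPLS_LABEL_MASK		0xfffffu	/* a label is 20 bits wide */

/* Fold a 32-bit flow id into a non-reserved 20-bit label. */
uint32_t
flow_to_label(uint32_t flow)
{
	uint32_t label = flow & MPLS_LABEL_MASK;

	if (label <= MPLS_LABEL_RESERVED_MAX)
		label += MPLS_LABEL_RESERVED_MAX + 1;
	return (label);
}

/*
 * Build one label stack entry in host byte order:
 * label (20 bits) | traffic class (3) | bottom-of-stack (1) | TTL (8).
 */
uint32_t
mpls_shim(uint32_t label, unsigned int tc, unsigned int bos, unsigned int ttl)
{
	return (((label & MPLS_LABEL_MASK) << 12) |
	    ((uint32_t)(tc & 0x7) << 9) |
	    ((uint32_t)(bos & 0x1) << 8) |
	    (uint32_t)(ttl & 0xff));
}
```

A caller would convert the result to network byte order before putting
it on the wire. Packets of the same flow get the same label, so
intermediate systems that hash on the label stack keep a flow on one
link.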
it's a pity there's no way to turn it on yet...
|
|
this is to prepare for flow aware transport for FAT from RFC 6391
|
|
the only flag used with sc_flags was the one to turn the control
word on and off.
this is in preparation for split ioctls for controlling pseudowire
behaviour. sc_cword can be set atomically and independently as a
separate variable.
|
|
i wrote this in mpe before porting and committing it in mpw, but
forgot to commit the mpe version.
|
|
no functional change
|
|
no functional change
|
|
this is a first step in breaking up the monolithic and redundant
SIOCSETMPWCFG ioctls
discussed with claudio@
|
|
sending an MPLS frame is weird compared to other address families.
other families figure out and pass the address on the local link
for ether_output to use for resolution, but AF_MPLS basically passes
a dummy sockaddr so ether_output can get the ethernet protocol field
right. ether_output then has to pull the route apart to figure out
which address and family to use for address resolution on the local
net. eg, MPLS tagged routes via ip addresses need to pull the route
apart and get at the AF_INET sockaddr to pass to arpresolve. that
code currently uses the destination address of the route, but if
that destination is not on the local network, we'd end up using it
for arp requests that don't work.
this change uses the rt_gateway sockaddr if RTF_GATEWAY is set.
this solves the problem in my testing and doesn't seem to break
other use cases i've tried.
reported by adrian close via bugs@
ok deraadt@ claudio@
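
The selection rule from this change can be sketched as below. The
struct, the flag constant and `resolve_target` are hypothetical
stand-ins, with addresses kept as opaque strings. The real code works
on struct rtentry and sockaddrs.

```c
#include <stddef.h>

#define RT_FLAG_GATEWAY	0x2	/* hypothetical stand-in for RTF_GATEWAY */

/* Hypothetical, simplified route: addresses are opaque strings here. */
struct route_model {
	int		 flags;
	const char	*dst;		/* final destination of the route */
	const char	*gateway;	/* next hop on the local network */
};

/* Pick the address that should be resolved on the local link. */
const char *
resolve_target(const struct route_model *rt)
{
	if ((rt->flags & RT_FLAG_GATEWAY) && rt->gateway != NULL)
		return (rt->gateway);
	return (rt->dst);
}
```

For a route whose destination is off-link, this hands the next hop to
address resolution instead of a destination that nobody on the local
network will answer for.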
|
|
pfioc_src_nodes to size_t. This avoids integer truncation by casts
to unsigned. As the types of DIOCGETSTATES and DIOCGETSRCNODES
ioctl(2) arguments change, pfctl(8) and systat(1) should be updated
together with the kernel. Calculate number of pf(4) states as
size_t in userland.
OK sashan@ deraadt@
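
The truncation this change avoids is easy to reproduce in isolation.
The sketch below assumes an LP64 platform and models size_t as 64 bits
and unsigned as 32 bits with fixed-width types. The function names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model "unsigned" as 32 bits and "size_t" as 64 bits (LP64 assumption). */
uint32_t
cast_to_unsigned(uint64_t len)
{
	return ((uint32_t)len);		/* the high 32 bits are discarded */
}

/* True when the cast changes the value. */
bool
cast_truncates(uint64_t len)
{
	return ((uint64_t)cast_to_unsigned(len) != len);
}
```

A length of 0x100000010 bytes becomes 0x10 after the cast, so a size
check performed on the truncated value would pass for a buffer that is
far too small.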
|
|
no functional change
|
|
|
|
because the rtable_l2 is modified before calling rt_ifa_del.
Triggered by regress test and reported by Moritz Buhl mbuhl at mbuhl dot me
|
|
ok dlg@
|
|
this adds an rwlock like mpe has which means only one thing can be
adding or removing a local mpls label, and those things check if
the interface is dying before doing their thing.
|
|
this is based on the changes to mpe i made yesterday. unfortunately
mpw has a monster ioctl that configures all the things, which makes
the kernel side complicated. hopefully i can split them up.
|
|
the timeout handler if the interface is running.
ok claudio@
|
|
|
|
|
|
it was using ifp->if_rdomain for the rtalloc of the mpls encapsulated
tunnel in mpw_start.
|
|
|
|
experience with mpe shows you need RTF_LOCAL everywhere for del to work.
|
|
this borrows the SIOCSLIFPHYRTABLE and SIOCGLIFPHYRTABLE that tunnel
interfaces implement to set the rdomain mpls operates in.
ifconfig tunneldomain X lets you set it, and you can see the effect
with netstat -nr -f mpls -TX, but ifconfig currently doesn't show
the tunneldomain. yet.
|
|
SIOCSIFRDOMAIN is about the routes on top of an mpe interface. the
rdomain mpls operates in is independent of that, and currently
restricted to rdomain 0.
|
|
this allows mpls interfaces (mpe, mpw) to pass the rdomain they
wish the local label to be in, rather than have it implicitly forced
to 0 by these functions. right now they'll pass 0, but it will soon
be possible to have them rx packets in other rdomains.
previously the functions used ifp->if_rdomain for the rdomain.
everything other than mpls still passes ifp->if_rdomain.
ok mpi@
|
|
fails.
|
|
|
|
|
|
mpe would try to detect label collisions itself, but wasn't
coordinating with mpw or other labels, making its solution incomplete.
this also means i won't need extra locking if i try to make the
ioctl paths mpsafe.
|