Age | Commit message | Author |
|
ok kettenis@
|
|
that represent various header fields. One place where OXMs are used is in
the set_field action, which contains one OXM representing the header field
to set, followed by padding to align the action in the OpenFlow message to
64 bits. Currently, we assume that a set_field action can contain multiple
OXMs and that they do not need to be padded.
This matches the way we handle OpenFlow messages that contain set_field
actions so that we follow the specs.
OK ori claudio
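
A minimal sketch of the padding rule described above, assuming the usual
OpenFlow 1.3 layout of a 4-byte action header followed by one OXM with a
4-byte header. The names sketch_align64 and sketch_set_field_len are
hypothetical and are not taken from the tree.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for illustration only. */
#define SKETCH_ACTION_HDR_LEN	4	/* action type (2) + action length (2) */
#define SKETCH_OXM_HDR_LEN	4	/* class (2) + field/hasmask (1) + length (1) */

/* Round len up to the next multiple of 8 bytes (64 bits). */
static size_t
sketch_align64(size_t len)
{
	return ((len + 7) & ~(size_t)7);
}

/*
 * Total length of a set_field action carrying exactly one OXM whose
 * payload is oxm_payload_len bytes, including the trailing padding.
 */
static size_t
sketch_set_field_len(uint8_t oxm_payload_len)
{
	return (sketch_align64(SKETCH_ACTION_HDR_LEN + SKETCH_OXM_HDR_LEN +
	    oxm_payload_len));
}
```

For example, a 2-byte payload gives 4 + 4 + 2 = 10 bytes, which is padded
up to 16.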
|
|
No object change
OK sashan
|
|
the problem was introduced with a "mechanical" patch, which
replaced all "break;" statements with "PF_UNLOCK(); break;". This is
wrong for the case of DIOCGETRULESETS.
issue analyzed and patch created by Joerg Goltermann <jg@osn.de>
OK tb@
|
|
no functional change
|
|
this makes the driver more like the rest of the tree. no functional change.
|
|
all interfaces. Most handlers will ignore it but at least umb(4) will
send a response back.
OK florian@
|
|
In that case the function can just return. Part of a larger diff
to use the if_rtrequest functions for RTM_PROPOSAL info.
OK florian@
|
|
3rd party software stuck with c90 will still compile. Quick fix since
RTM_PROPOSAL will most probably change later on.
Reported by naddy and aja
|
|
the size constraint to allow this to pass through the kernel.
Looks good to deraadt@
|
|
kernel. Will be used to have umb(4) inform unwind(8) about DNS changes.
OK bluhm@ tested by florian@ and deraadt@
|
|
|
|
this makes tun more consistent with more of our drivers.
|
|
|
|
it was possible for multiple tun0 interfaces to be created concurrently,
which confused the pf_if code. when concurrent tun0 interfaces were
created, the pf_if code tried to add an addrhook for each interface,
but because they shared a name the two hooks ended up on one
interface. if the interface with two addrhooks was destroyed,
KASSERT(TAILQ_EMPTY(&ifp->if_addrhooks)) would trip. before the
KASSERT existed, we'd blindly free a tailq head, which would corrupt
the list, which would cause faults in pfi_detach_ifnet() anyway.
so now we take more care to ensure multiple tun0 interfaces cannot
exist concurrently.
inserting a tun or tap interface into the list of tun or tap
interfaces now checks to ensure that an interface with the same
unit number doesn't already exist. if an existing interface is found,
insert errors with EEXIST and the callers can unwind. the tunopen
and tapopen paths cope with losing the race.
Reported-by: syzbot+2b26012b9ea93834723e@syzkaller.appspotmail.com
sashan@ made a reliable test that could produce the failures
ok sashan@
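
a rough sketch of the insert check described above, assuming a plain
queue(3) list keyed by unit number. sketch_softc and sketch_insert are
hypothetical names, and the locking the real driver needs around the
list is left out.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical per-interface state; only the unit number matters here. */
struct sketch_softc {
	int				 sc_unit;
	LIST_ENTRY(sketch_softc)	 sc_entry;
};
LIST_HEAD(sketch_list, sketch_softc);

/*
 * Insert sc unless an interface with the same unit number is already
 * on the list. The caller is expected to unwind on EEXIST.
 * In a kernel this would run with the list's lock held.
 */
static int
sketch_insert(struct sketch_list *list, struct sketch_softc *sc)
{
	struct sketch_softc *it;

	LIST_FOREACH(it, list, sc_entry) {
		if (it->sc_unit == sc->sc_unit)
			return (EEXIST);
	}

	LIST_INSERT_HEAD(list, sc, sc_entry);
	return (0);
}
```

the point of doing the check inside the insert is that the lookup and
the insert happen as one step, so two creators racing on the same unit
cannot both succeed.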
|
|
|
|
in and the pf_pktdelay struct was not declared and initialized properly.
ok rob@ kn@
|
|
if_detach passes the groupname from an ifg_list struct to if_delgroup,
if_delgroup then uses the name to find the same ifg_list struct so
it can free it, and then passes the name from the struct to
pfi_group_change(). at worst this can cause a fault if malloc(9)
actually unmaps the page the struct was on, and at best it causes
pf interfaces with garbage names to be created.
ok sashan@ bluhm@
|
|
of a network interface.
OK deraadt@ claudio@
|
|
found by Ilja Van Sprundel; OK deraadt@ dlg@
|
|
|
|
there's now a bunch of drivers that implement the bridge ioctls,
but they're inconsistent at checking privilege. doing it up front
once means less code duplication, and more consistent application
of the checks.
ok bluhm@ deraadt@
|
|
found by Ilja Van Sprundel
ok deraadt@ mpi@ bluhm@
|
|
when vxlan's parent interface has a link state change event, vxlan
reconfigures the parent to cope with things not being as it expects
when the interface comes back. it does this by removing its config
and then adding it again. part of its config removal is to take
the link state hook away, and part of putting the config on is
adding the link state hook.
if we're running an interface's link state hooks from head to tail,
and the vxlan hook adds itself back to the tail, we end up running
the vxlan hook forever cos it always ends up at the tail.
bluhm@ hit this infinite loop while running regress tests. if it turns
out we need to run link state hooks in the same order they were
added, i have a way to avoid this situation, but this is simple.
|
|
|
|
|
|
|
|
|
|
from slaacd and dhclient when it starts.
Discussed with deraadt who notes that it's a bit odd to have this as a
route priority. One idea is to have this as a dedicated route message
and not a priority.
But we want to move this forward and learn how it can be used so we
are going with this for now.
OK deraadt
|
|
|
|
|
|
this follows what's been done for detach and link state hooks, and
makes handling of hooks generally more robust.
address hooks are a bit different to detach/link state hooks in
that there's only a few things that register hooks (carp, pf, vxlan),
but a lot of places to run the hooks (lots of ipv4 and ipv6 address
configuration).
an address hook cookie was in struct pfi_kif, which is part of the
pf abi. rather than break pfctl -sI, this maintains the void * used
for the cookie and uses it to store a task, which is then used as
intended with the new api.
|
|
i think this is a fix for a real bug. pfsync leaked the hooks it
had on a parent^Wsyncdev when the parent went away. now that there's
KASSERTs to make sure all hooks are removed before an interface
goes away, the leak caused the KASSERTs to fire and made the bug
obvious.
found by hrvoje popovski
|
|
it's no longer necessary to hold NET_LOCK to call interface hook
adds or dels now, but it is necessary not to hold NET_LOCK when
calling some barrier functions.
found by hrvoje popovski
|
|
i had NET_ASSERT_LOCKED() in the hook add and remove operations,
because that's what's held when the hooks are run. some callers do
not hold the NET_LOCK when calling them though, eg, bridge(4). aggr
and tpmr used to not hold NET_LOCK while being destroyed, which
also caused the asserts to fire, so i moved the port destroys inside
NET_LOCK, but now I have deadlocks with some barrier calls.
the hooks having their own lock means callers don't have to hold
NET_LOCK and the list will stay sane. the code that runs the hooks
gives up the mutex when calling the hook, but keeps track of where
it's up to by putting a cursor in the list.
there's a single global mutex for all the interface linkstate and
detach hooks, but this stuff isn't a hot path by any stretch of the
imagination.
based on (a lot of) testing by hrvoje popovski. thank you.
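
a minimal sketch of the cursor idea described above, assuming a queue(3)
TAILQ and a stand-in for the mutex. every name here (sketch_hook,
sketch_hooks_run, sketch_lock) is hypothetical; the real code uses a
kernel mutex rather than a flag.

```c
#include <stddef.h>
#include <sys/queue.h>

struct sketch_hook {
	TAILQ_ENTRY(sketch_hook)  h_entry;
	void			(*h_fn)(void *);	/* NULL marks a cursor */
	void			 *h_arg;
};
TAILQ_HEAD(sketch_hooks, sketch_hook);

/* Stand-in for the single global mutex; a flag is enough for a sketch. */
static int sketch_locked;
static void sketch_lock(void) { sketch_locked = 1; }
static void sketch_unlock(void) { sketch_locked = 0; }

/* Example hook: counts calls and notes if it was called with the lock held. */
struct sketch_stats {
	int calls;
	int called_locked;
};

static void
sketch_count_hook(void *arg)
{
	struct sketch_stats *st = arg;

	st->calls++;
	if (sketch_locked)
		st->called_locked++;
}

/*
 * Run every hook on the list. The cursor is a dummy entry that marks
 * our position, so the lock can be dropped around each call without
 * losing our place if other entries are added or removed meanwhile.
 */
static void
sketch_hooks_run(struct sketch_hooks *hooks)
{
	struct sketch_hook cursor = { .h_fn = NULL, .h_arg = NULL };
	struct sketch_hook *h;
	void (*fn)(void *);
	void *arg;

	sketch_lock();
	TAILQ_INSERT_HEAD(hooks, &cursor, h_entry);
	while ((h = TAILQ_NEXT(&cursor, h_entry)) != NULL) {
		/* step the cursor past h before touching h */
		TAILQ_REMOVE(hooks, &cursor, h_entry);
		TAILQ_INSERT_AFTER(hooks, h, &cursor, h_entry);

		if (h->h_fn == NULL)
			continue;	/* someone else's cursor, skip it */

		/* copy out what we need; h may go away once unlocked */
		fn = h->h_fn;
		arg = h->h_arg;

		sketch_unlock();
		(*fn)(arg);
		sketch_lock();
	}
	TAILQ_REMOVE(hooks, &cursor, h_entry);
	sketch_unlock();
}
```

because the cursor is itself a list entry, a hook that removes itself
(or a neighbour) while the lock is dropped does not invalidate the
runner's position.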
|
|
this is largely mechanical, except for carp. this moves the addition
of the carp link state hook after we're committed to using the new
interface as a carpdev. because the add can't fail, we avoid a
complicated unwind dance. also, this tweaks the carp linkstate hook
so it only updates the relevant carp interface, not all of the
carpdevs on the parent.
hrvoje popovski has tested an early version of this diff and it's
generally ok, but there's some splasserts that this diff fires that
i'll fix in an upcoming diff.
ok claudio@
|
|
commit.
|
|
Do not overwrite the address family, we need to know if this is IPv4
or IPv6 to parse the message.
Nameservers are IP addresses, not NUL-terminated strings.
Check that the length is a multiple of the length of an IP address.
OK krw
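
A minimal sketch of the length check described above, assuming the
nameserver list is a packed array of binary addresses of one family.
sketch_dns_len_ok is a hypothetical helper, not the function in the tree.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Return 1 if len bytes can hold a whole number of addresses of the
 * given family, 0 otherwise. The family has to be known, which is why
 * it must not be overwritten before parsing.
 */
static int
sketch_dns_len_ok(int family, size_t len)
{
	size_t addrlen;

	switch (family) {
	case AF_INET:
		addrlen = sizeof(struct in_addr);	/* 4 bytes */
		break;
	case AF_INET6:
		addrlen = sizeof(struct in6_addr);	/* 16 bytes */
		break;
	default:
		return (0);
	}

	return (len % addrlen == 0);
}
```

Note that this sketch accepts a length of zero (an empty list); whether
that should be rejected is a separate decision.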
|
|
the main semantic change is that things registering detach hooks
have to allocate and set a task structure that then gets added to
the list. this means if the task is allocated up front (eg, as part
of carp's softc or bridge's port structure), it avoids the possibility
that adding a hook can fail. a lot of drivers weren't checking for
failure, and unwinding state in the event of failure in other parts
was error prone.
while doing this i discovered that the list operations have to be
in a particular order, but drivers weren't doing that consistently
either. this diff wraps the list ops up so you have to seriously
go out of your way to screw them up.
i've also sprinkled some NET_ASSERT_LOCKED around the list operations
so we can make sure there's no potential for the list to be corrupted,
especially while it's being run.
hrvoje popovski has tested this a bit, and some issues he discovered
have been fixed.
ok sashan@
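
a rough sketch of why preallocating the task means the add cannot fail,
assuming a queue(3) TAILQ of hooks. the ex_ names are hypothetical
stand-ins and only loosely modelled on the task and detach hook api.

```c
#include <stddef.h>
#include <sys/queue.h>

struct ex_task {
	TAILQ_ENTRY(ex_task)	  t_entry;
	void			(*t_func)(void *);
	void			 *t_arg;
};
TAILQ_HEAD(ex_hooks, ex_task);

/* Hypothetical driver state with the task embedded, so no allocation. */
struct ex_port {
	int		p_id;
	int		p_detached;
	struct ex_task	p_dtask;
};

static void
ex_port_detach(void *arg)
{
	struct ex_port *p = arg;

	p->p_detached = 1;
}

static void
ex_task_set(struct ex_task *t, void (*fn)(void *), void *arg)
{
	t->t_func = fn;
	t->t_arg = arg;
}

/* Returns void: nothing is allocated here, so there is nothing to fail. */
static void
ex_detachhook_add(struct ex_hooks *hooks, struct ex_task *t)
{
	TAILQ_INSERT_TAIL(hooks, t, t_entry);
}

static void
ex_detachhook_del(struct ex_hooks *hooks, struct ex_task *t)
{
	TAILQ_REMOVE(hooks, t, t_entry);
}

/* Attach path: no error handling or unwind needed for the hook. */
static void
ex_port_attach(struct ex_hooks *hooks, struct ex_port *p)
{
	ex_task_set(&p->p_dtask, ex_port_detach, p);
	ex_detachhook_add(hooks, &p->p_dtask);
}
```

wrapping the list operations in add and del helpers is also what keeps
callers from doing them in the wrong order.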
|
|
no one seems to use it, and we should not encourage people to use
it by having it available. it's been disabled for most of the last
release and no one's asked for it in 6.6, so i'm taking that as an
ok for this removal.
|
|
|
|
this has been reported by a bunch of people including chris@, jon
williams on bugs@, and ze loff on misc@
|
|
|
|
BPF: remove redundant reference counting of file descriptors
anton@ made the problem crystal clear:
I've been looking into a similar bpf panic reported by syzkaller,
which looks somewhat related. The one reported by syzkaller is caused
by issuing ioctl(SIOCIFDESTROY) on the interface which the packet filter
is attached to. This will in turn invoke the following functions
expressed as an inverted stacktrace:
1. bpfsdetach()
2. vdevgone()
3. VOP_REVOKE()
4. vop_generic_revoke()
5. vgonel()
6. vclean(DOCLOSE)
7. VOP_CLOSE()
8. bpfclose()
Note that bpfclose() is called before changing the vnode type. In
bpfclose(), the `struct bpf_d` is immediately removed from the global
bpf_d_list list and might end up sleeping inside taskq_barrier(systq).
Since the bpf file descriptor (fd) is still present and valid, another
thread could perform an ioctl() on the fd only to fault since
bpfilter_lookup() will return NULL. The vnode is not locked in this path
either so it won't end up waiting on the ongoing vclean().
Steps to trigger the similar type of panic are straightforward, let there be
two processes running concurrently:
process A:
while true ; do ifconfig tun0 up ; ifconfig tun0 destroy ; done
process B:
while true ; do tcpdump -i tun0 ; done
The panic happens within a few seconds (Dell PowerEdge 710).
OK @visa, OK @anton
|
|
This is clearer and more consistent with the rest of the kernel.
OK deraadt@ sashan@
|
|
ok cheloha@, visa@, akoshibe@
|
|
the pressure thresholds were too low in a lot of situations, and
still produced hard to understand interactions at high thresholds.
until we understand the numbers better, and for release, we're going
back to counting the length of the per interface input queues.
this was originally based on a report of bad tcp performance with
em(4) by mlarkin, but is very convincingly demonstrated by a bunch
of work procter@ has been doing. deraadt@ is keen on the pressure
backout so he can cut a release.
|
|
ip_ether.h is where netinet/ip_ipip.h got the forward declaration
for struct tdb from though, so fix that before cutting ip_ether.h
out of gif.
|
|
jmatthew@ and i thought i'd broken lacp in trunk(4) when adding
aggr(4), but we couldn't see how. turns out we were trying lacp mode
trunk only on new drivers (mcx and ixl) that didn't set if_baudrate,
which tickled an edge case in trunk that prevented it from selecting
those interfaces for an aggregation. ixl and mcx have since been
fixed, but there's no reason for trunk to be this picky when we
make this omission again.
ok deraadt@
|
|
this is not needed now that the "public" api does not provide a way
to pass a custom copy function in for the internals to pass around.
ok claudio@ visa@
|