path: root/sys/net
Age  Commit message  Author
2022-02-22Delete unnecessary #includes of <sys/domain.h> and/or <sys/protosw.h>Philip Guenther
net/if_pppx.c pointed out by jsg@ ok gnezdo@ deraadt@ jsg@ mpi@ millert@
2022-02-21in input, clear the address union before putting an ipv4 address in it.David Gwynne
the whole vxlan address is used for lookups in the RB tree, so any garbage on the stack where the address sits could confuse the lookup. it looks like i was lucky before, but if you receive vxlan over ipsec you are less lucky. found by and fix tested by jason tubnor.
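A minimal sketch of why the clearing matters, assuming a union that can hold either an IPv4 or an IPv6 address and a lookup that compares the whole union. The names `tunnel_addr`, `tunnel_addr_set4` and `tunnel_addr_cmp` are hypothetical and are not taken from the vxlan driver.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an address union used as a lookup key. */
union tunnel_addr {
	uint32_t in4;		/* IPv4 address: 4 bytes */
	uint8_t	 in6[16];	/* IPv6 address: 16 bytes */
};

/* Compare the whole union, the way a tree keyed on it would. */
static int
tunnel_addr_cmp(const union tunnel_addr *a, const union tunnel_addr *b)
{
	return memcmp(a, b, sizeof(*a));
}

/*
 * Store an IPv4 address. Clearing first guarantees that the 12 bytes
 * the IPv4 member does not cover are zero instead of stack garbage.
 */
static void
tunnel_addr_set4(union tunnel_addr *ta, uint32_t v4)
{
	memset(ta, 0, sizeof(*ta));
	ta->in4 = v4;
}
```

Without the memset, two keys holding the same IPv4 address could compare as different because of whatever bytes happened to be on the stack.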
2022-02-20Add missing newlines in two DPRINTFs (from Matthew Martin)Theo Buehler
Tweak capitalization (from jmc)
2022-02-18dont bother running ethernet multicast ioctl handlers.David Gwynne
while here turn ENETRESET into 0 in the ioctl path. there's no hardware to reset.
2022-02-18fix inverted check of running when changing tunnel rdomain.David Gwynne
you should only be able to change the tunnel rdomain when the interface is down. i was accidentally making sure you could only change the rdomain when the interface was up.
2022-02-18only unwind multicast when in learning mode.David Gwynne
found by bluhm@ when he ran the regress tests.
2022-02-16nat-to round-robin without a pool should fallback to POOL_NONEAlexandr Nedvedicky
bug reported by giovanni@ OK giovanni@
2022-02-16check pf rule "set prio" values consistently.David Gwynne
consistently means we do the check in pf_rule_copyin() so both DIOCADDRULE and DIOCCHANGERULE have the prio values checked. this in turn prevents invalid prio values getting set on a rule via DIOCCHANGERULE. this was caught by a kassert in the ifq priq code firing. Reported-by: syzbot+a8f8e24a44b441e71d93@syzkaller.appspotmail.com ok sashan@
2022-02-16prevent (re)opening of tun/tap interfaces that are being destroyed.David Gwynne
if an open tun (or tap) device is destroyed via the clone destroy ioctl (eg, like what ifconfig destroy does), there is a window while the open device is being revoked on the vfs side that a third thread can come and open it again. this in turn triggers a kassert in the ifconfig destroy path where it expects the device to be closed. fix this by having tun_dev_open check for the TUN_DEAD flag that the destroy function sets. this still relies on the kernel lock for serialisation. Reported-by: syzbot+5df2ad232f5f8b671442@syzkaller.appspotmail.com ok visa@
2022-02-16fill in support for rx prio handling.David Gwynne
2022-02-16rewrite vxlan to better fit the current kernel infrastructure.David Gwynne
the big change is removing the integration with and reliance on bridge(4) for learning vxlan endpoints. we have the etherbridge layer now (which is used by veb, nvgre, bpe, etc) so vxlan can operate independently of bridge(4) (or any other driver) while still dynamically learning about other endpoints. vxlan now uses the udp socket upcall mechanism to receive packets. this means it actually creates and binds udp sockets to use rather than adding code in the udp layer for stealing packets from the udp layer. i think it's also important to note that this adds loop prevention to the code. this stops a vxlan interface being used to transmit a packet that was encapsulated in itself. i want to clear this out of my tree where it's been sitting for nearly a year. no one seems too concerned with the change either way. ok claudio@
2022-02-15Use knote_modify_fn() and knote_process_fn() in bpf.Visa Hankala
OK dlg@
2022-02-15only tweak ifp if_flags while holding NET_LOCK.David Gwynne
tun_dev_open and tun_dev_close were being optimistic.
2022-02-15make tun_link_state take the ifnet pointer instead of tun_softc.David Gwynne
it only works on struct ifnet data, so passing ifp makes it clearer what's actually being manipulated. also fix tun_dev_open so tun_link_state is called before if_put instead of immediately after.
2022-02-15remove unused and unneeded bits in a byte defineJonathan Gray
posix requires a byte to be 8 bits
2022-02-13The length value in bpf_movein() is cast from size_t to u_intAlexander Bluhm
and then rounded before checking. Put the same check before the calculations to avoid overflow. Reported-by: syzbot+6f29d23eca959c5a9705@syzkaller.appspotmail.com OK claudio@
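A sketch of the ordering issue, not the actual bpf_movein() code: if a size_t is narrowed to unsigned int and rounded up before it is compared against a limit, a huge value can wrap to a small one and pass the check. Checking the untruncated value first avoids that. `MAX_PKT`, `ROUNDUP4` and `checked_len` are hypothetical names chosen for illustration.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_PKT		65536u			/* hypothetical upper bound */
#define ROUNDUP4(x)	(((x) + 3u) & ~3u)	/* round up to a multiple of 4 */

/*
 * Validate len while it is still a size_t, then narrow and round.
 * Returns 0 and stores the rounded length on success, or EMSGSIZE.
 */
static int
checked_len(size_t len, unsigned int *out)
{
	if (len > MAX_PKT)
		return EMSGSIZE;
	*out = ROUNDUP4((unsigned int)len);
	return 0;
}
```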
2022-02-13Rename knote_modify() to knote_assign()Visa Hankala
This avoids verb overlap with f_modify.
2022-02-11Replace manual !klist_empty()+knote() with KNOTE().Visa Hankala
OK mpi@
2022-02-09let pfattach() also initialize pf_default_rule_new to avoidAlexandr Nedvedicky
div-by-zero in pf_purge() Reported-by: syzbot+e720e3bab51366d7b667@syzkaller.appspotmail.com OK deraadt@
2022-02-08Do not /0 if timeout[PFTM_INTERVAL] manages to become zeroTheo de Raadt
crash noticed by gnezdo, a separate commit will fix the identified cause, but being careful at this point is a good idea. ok sashan
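One defensive shape such a guard could take, as a sketch only. The function name `purge_batch` and the choice to fall back to an interval of 1 are assumptions made for illustration and are not taken from the pf source.

```c
/*
 * Hypothetical helper: how many entries to scan per pass when the work
 * is spread over `interval` passes. A zero interval would otherwise
 * divide by zero, so treat it as 1.
 */
static unsigned int
purge_batch(unsigned int nstates, unsigned int interval)
{
	if (interval == 0)
		interval = 1;
	return 1 + nstates / interval;
}
```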
2022-02-07In rtredirect() change a bad assignment in an if condition to theClaudio Jeker
correct equality check. Found by and OK jsg@
2022-02-05make bpf_movein align the packet payload.David Gwynne
bluhm@ hit a problem while running a regress test where a packet generated and injected via bpf ends up being consumed by the network stack. the stack assumes that packets are aligned properly, but bpf was lazy and put whatever was written to it at the start of an mbuf. ethernet has a 14 byte header, so if you put that at the start the payload will be misaligned by 2 bytes. bpf already has handling for different link header types, so this handling is extended a bit to align the payload after the link header. while here we're fixing up a few error codes. short packets produce EINVAL instead of EPERM, and packets larger than the biggest mbuf the kernel supports generate EMSGSIZE. with tweaks and ok bluhm@
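The arithmetic behind the 2 byte figure can be sketched as follows. This is an illustration of the alignment idea, not the code from the commit, and `link_hdr_pad` is a hypothetical helper.

```c
#include <stddef.h>

/*
 * Number of bytes to leave in front of a link header of hdrlen bytes so
 * that the payload following it starts on an `align` byte boundary.
 * align must be non-zero.
 */
static size_t
link_hdr_pad(size_t hdrlen, size_t align)
{
	return (align - (hdrlen % align)) % align;
}
```

For a 14 byte Ethernet header and 4 byte alignment this gives 2, so the header starts at offset 2 and the payload at offset 16.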
2022-02-05remove an extra set of brackets. no functional change.David Gwynne
2022-01-28When it's the possessive of 'it', it's spelled "its", without thePhilip Guenther
apostrophe.
2022-01-24An af-to pf rule must have an address family naf to use afterAlexander Bluhm
translation. Make stricter sanity checks in pf ioctl to avoid later crashes during packet processing. Reported-by: syzbot+0ef9190e7d0195496d0d@syzkaller.appspotmail.com OK sashan@
2022-01-20pfkey import_flow() must do the NULL check before doing pointerAlexander Bluhm
arithmetic. found by kubsan; joint work with tobhe@; OK millert@
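The general pattern, sketched with hypothetical names (`struct hdr`, `payload_of`) rather than the pfkey code: arithmetic on a NULL pointer is undefined behavior even if the result is never dereferenced, so the check has to come first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size header that is followed by payload bytes. */
struct hdr {
	uint16_t len;
};

/* Return a pointer to the bytes after the header, or NULL if none. */
static const uint8_t *
payload_of(const struct hdr *h)
{
	if (h == NULL)			/* test before doing h + 1 */
		return NULL;
	return (const uint8_t *)(h + 1);
}
```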
2022-01-20Shifting signed integers left by 31 is undefined behavior in C.Alexander Bluhm
found by kubsan; joint work with tobhe@; OK miod@
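For illustration: with a 32 bit int, `1 << 31` shifts a one into the sign bit of a signed value, which the C standard leaves undefined. Doing the shift on an unsigned operand is well defined. `bit32` is a hypothetical helper, not a function from the tree.

```c
#include <stdint.h>

/* Return a 32 bit mask with only bit n set. n must be in the range 0..31. */
static inline uint32_t
bit32(unsigned int n)
{
	return (uint32_t)1 << n;	/* unsigned shift: no undefined behavior */
}
```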
2022-01-18return EIO, not ENXIO, when the interface underneath ifq_deq_sleep dies.David Gwynne
this is consistent with other drivers when they report their underlying device being detached.
2022-01-18a comment about bridges shouldnt list switch(4), but can have veb(4).David Gwynne
2022-01-16activate/notify waiting kq kevents from bpf_wakeup directly.David Gwynne
this builds on the mpsafe kq/kevent work visa has been doing. normally kevents are notified by calling selwakeup, but selwakeup needs the KERNEL_LOCK. because bpf runs from all sorts of contexts that may or may not have the kernel lock, the call to selwakeup is deferred to the systq which already has the kernel lock. while this avoids spinning in bpf for the kernel lock, it still adds latency between when the buffer is ready for a program and when that program gets notified about it. now that bpf kevents are mpsafe and bpf_wakeup is already holding the necessary locks, we can avoid that latency. bpf_wakeup now checks if there are waiting kevents and notifies them immediately. if there are no other things to wake up, bpf_wakeup avoids the task_add (and associated reference counting) to defer the selwakeup call. selwakeup can still try to notify waiting kevents, so this uses the hint passed to knote() to differentiate between the notification from bpf_wakeup and selwakeup and returns early from the latter. ok visa@
2022-01-13Make bpf event filter MP-safeVisa Hankala
Use bd_mtx to serialize bpf knote handling. This allows calling the event filter without the kernel lock. OK mpi@
2022-01-13Return an error if bpfilter_lookup() fails in bpfkqfilter()Visa Hankala
The lookup should not fail because the kernel lock should prevent simultaneous detaching on the vnode layer. However, most other device kqfilter routines check the lookup's outcome anyway, which is maybe a bit more forgiving. OK mpi@
2022-01-11move allocations in DIOCADDRULE and DIOCCHANGERULE outside of locks.Alexandr Nedvedicky
this diff lets pf_rule_copyin() be called outside of PF_LOCK()/NET_LOCK(). OK bluhm@
2022-01-10Use NULL instead of 0 for pointers.Jan Klemkow
OK bluhm@
2022-01-07SIOCSIFXFLAGS drops into the SIOCSIFFLAGS to perform auto-up of theTheo de Raadt
interface. If this operation fails (probably due to missing firmware), we must undo changes to the SIOCSIFXFLAGS xflags. ok stsp.
2022-01-05add NSH and NHRP ethertypes, mostly for tcpdump stuff.David Gwynne
ok deraadt@
2022-01-05rename ETHERTYPE_PAE to ETHERTYPE_EAPOL.David Gwynne
everyone else seems to use ETHERTYPE_EAPOL, and as a bonus it also appears to be more correct. ok deraadt@ stsp@
2022-01-04Add `ipsec_flows_mtx' mutex(9) to protect `ipsp_ids_*' list andYASUOKA Masahiko
trees. ipsp_ids_lookup() returns `ids' with bumped reference counter. original diff from mvs ok mvs
2022-01-02spellingJonathan Gray
ok jmc@ reads ok tb@
2021-12-30Use a distinct variable while iterating the list of existing devices.Anton Lindqvist
ok mvs@ Reported-by: syzbot+e2d1df67f742a5a47938@syzkaller.appspotmail.com Reported-by: syzbot+72298724beda82ec8e7f@syzkaller.appspotmail.com
2021-12-30Prevent concurrent access to incomplete or dying `sc' caused by sleepVitaliy Makkoveev
points in pppacopen() and pppacclose() paths. Use the same "sc_ready" logic we use for 'pppx_if' structure. Reported-by: syzbot+a7ac144b48f7f471f689@syzkaller.appspotmail.com ok anton@ dlg@
2021-12-28whitespace tweak, no functional change.David Gwynne
2021-12-28it doesnt make sense to configure a vport as a span port.David Gwynne
2021-12-28move away from using the M_PROTO1 flag to prevent loops with vportsDavid Gwynne
if a vlan interface is configured on a vport interface, vlan(4) will take the packet away from ether_input before the veb bridge input handler gets to clear M_PROTO1. this leaves the flag on the mbuf as it goes through the l3 stacks. if it goes back out a vport into a veb, the presence of M_PROTO1 means the packet ends up getting dropped, which is unexpected. this diff specialises vport handling by veb even more to avoid the problem the flag was handling. vports get their own bridge input handler that skips veb processing completely because a packet being received on a vport can only occur if a veb has decided to forward it there and has already processed it. when the stack sends a packet out a vport interface, then we do actual veb bridge input handling. bug reported on misc@ and the fix tested by Simon Baker
2021-12-26DIOCCHANGERULE ioctl must set pointer to ruleset in rule it inserts.Alexandr Nedvedicky
Reported-by: syzbot+7718c5f69c595f76b298@syzkaller.appspotmail.com OK bluhm@, OK jmatthew@
2021-12-26make 'set skip on ...' in pf.conf dynamicAlexandr Nedvedicky
This is an old issue in pf(4): whenever a new interface appears in the IP stack, we must reload pf.conf to apply 'set skip on ...' to newly plumbed network interfaces. The time has come to fix it. The idea is to also create pfi_kif for interfaces that are referred to by 'set skip on ...'. Such pfi_kif instances are created/destroyed by pfi_set_flags()/pfi_clear_flags(). claudio@ drew my attention to this in Gouveia. His feedback also helped me to put the change into shape. OK claudio@
2021-12-23IPsec is not MP safe yet. To allow forwarding in parallel withoutAlexander Bluhm
dirty hacks, it is better to protect IPsec input and output with kernel lock. Not much is lost as crypto needs the kernel lock anyway. From here we can refine the lock later. Note that there is no kernel lock in the SPD lookup path. Goal is to keep that lock free to allow fast forwarding with non IPsec traffic. tested by Hrvoje Popovski; OK tobhe@
2021-12-20Use per-CPU counters for tunnel descriptor block (TDB) statistics.Vitaliy Makkoveev
'tdb_data' struct became unused and was removed. Tested by Hrvoje Popovski. ok bluhm@
2021-12-19There are occasions where the walker function in tdb_walk() mightAlexander Bluhm
sleep. So holding the tdb_sadb_mtx() when calling walker() is not allowed. Move the TDB from the TDB-Hash to a temporary list that is protected by netlock. Then unlock tdb_sadb_mtx and traverse the list to call the walker. OK mvs@
2021-12-16When adding the extra 10% of space to a needed sysctl buffer use mathClaudio Jeker
that is less likely to overflow the int type used. A BGP fullfeed is now so big that this calculation overflowed and then got sign extended. The result was for example 'route -n show' failures. Problem identified with deraadt@ OK deraadt@ (more cleanup needed but this fix is a good start)
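A sketch of the difference, under the assumption that the size is a non-negative int: computing `needed * 11 / 10` overflows once `needed` exceeds roughly INT_MAX / 11, while adding `needed / 10` only overflows when the final result itself does not fit. `add_ten_percent` is a hypothetical helper, and the clamp to INT_MAX is an illustrative choice rather than what the kernel does.

```c
#include <limits.h>

/*
 * Add 10% headroom to a non-negative size without an overflowing
 * intermediate product. Clamps to INT_MAX if the result would not fit.
 */
static int
add_ten_percent(int needed)
{
	int extra = needed / 10;

	if (needed > INT_MAX - extra)
		return INT_MAX;
	return needed + extra;
}
```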