path: root/sys/net
Age [Author] Commit message
2021-01-02 [mvs] Don't call if_deactivate() in bridge_clone_destroy(). Following
if_detach() will do this. ok kn@
2021-01-02 [mvs] Remove PIPEX{S,G}MODE ioctl(2) commands. This time they are pretty dummy
and were kept only for backward compatibility reasons. ok mpi@ yasuoka@
2021-01-02 [David Gwynne] optimise bpf_catchpacket and bpf_wakeup.
bpf_catchpacket had a chunk to deal with reader timeouts, but that has largely been moved to bpfread. the vestigial code that was left still tried to wake up a reader when a buffer got full, but there already is a chunk of code that wakes up readers when the buffer gets full.

bpf_wakeup now checks for readers before calling wakeup directly, rather than pushing the wakeup to a task and calling it unconditionally. the task_add is now only done when the bpfdesc actually has something that needs it.

ok visa@
2021-01-02 [cheloha] bpf(4): remove ticks
Change bd_rtout to a uint64_t of nanoseconds. Update the code in bpfioctl() and bpfread() accordingly. Add a local copy of nsecuptime() to make the diff smaller. This will need to move to kern_tc.c if/when we have another user elsewhere in the kernel. Prompted by mpi@. With input from dlg@. ok dlg@ mpi@ visa@
2020-12-30 [mvs] Fix pppoe_dispatch_disc_pkt definition to be in accordance with style(9)
ok claudio@ kn@
2020-12-30 [mvs] Convert the `off' argument of pppoe_dispatch_disc_pkt function to
local variable. This argument was always passed as 0. ok kn@
2020-12-28 [kn] Remove unused start routine
enc(4) does not use the ifqueue API at all; IPsec packets are directly transformed in the IP input/output routines. enc_start() is never called (by design) so remove it for clarity. OK mpi
2020-12-26 [cheloha] bpf(4): bpf_d struct: replace bd_rdStart member with bd_nreaders member
bd_rdStart is strange. It nominally represents the start of a read(2) on a given bpf(4) descriptor, but there are several problems with it:

1. If there are multiple readers, the bd_rdStart is not set by subsequent readers, so their timeout is screwed up. The read timeout should really be tracked on a per-thread basis in bpfread().

2. We set bd_rdStart for poll(2), select(2), and kevent(2), even though that makes no sense. We should not be setting bd_rdStart in bpfpoll() or bpfkqfilter().

3. bd_rdStart is buggy. If ticks is 0 when the read starts then bpf_catchpacket() won't wake up the reader. This is a problem inherent to the design of bd_rdStart: it serves as both a boolean and a scalar value, even though 0 is a valid value in the scalar range.

So let's replace it with a better struct member. "bd_nreaders" is a count of threads sleeping in bpfread(). It is incremented before a thread goes to sleep in bpfread() and decremented when a thread wakes up. If bd_nreaders is greater than zero when we reach bpf_catchpacket() and fbuf is non-NULL we wake up all readers.

The read timeout, if any, is now tracked locally by the thread in bpfread().

Unlike bd_rdStart, bpfpoll() and bpfkqfilter() don't touch bd_nreaders.

Prompted by mpi@. Basic idea from dlg@. Lots of input from dlg@.

Tested by dlg@ with tcpdump(8) (blocking read) and flow-collector (https://github.com/eait-itig/flow-collector, non-blocking read).

ok dlg@
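The difference between the two designs can be illustrated with a small userland sketch. Everything here is hypothetical: `struct old_desc`, `struct new_desc` and the helper names are stand-ins invented for illustration, not the kernel's `struct bpf_d` or its functions. The sketch only shows why a value that doubles as a boolean misfires at 0, and how a plain sleeper count avoids that.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the old design: one field, two meanings. */
struct old_desc {
	int rd_start;		/* tick count at read start; 0 also means "no reader" */
};

/* Hypothetical stand-in for the new design: a count of sleeping readers. */
struct new_desc {
	unsigned int nreaders;
};

static void
old_read_begin(struct old_desc *d, int now_ticks)
{
	d->rd_start = now_ticks;
}

/* A read that begins at tick 0 is indistinguishable from no read at all. */
static bool
old_has_reader(const struct old_desc *d)
{
	return d->rd_start != 0;
}

/* Called before a reader goes to sleep. */
static void
new_sleep_begin(struct new_desc *d)
{
	d->nreaders++;
}

/* Called when a reader wakes up. */
static void
new_sleep_end(struct new_desc *d)
{
	d->nreaders--;
}

/* Wake condition as worded in the commit message: readers present, fbuf non-NULL. */
static bool
new_should_wake(const struct new_desc *d, const void *fbuf)
{
	return d->nreaders > 0 && fbuf != NULL;
}
```

With the counter, the per-read timeout no longer needs to live in the descriptor at all, which is why the commit can track it locally in each reading thread.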
2020-12-25 [Visa Hankala] Refactor klist insertion and removal
Rename klist_{insert,remove}() to klist_{insert,remove}_locked(). These functions assume that the caller has locked the klist. The current state of locking remains intact because the kernel lock is still used with all klists. Add new functions klist_insert() and klist_remove() that lock the klist internally. This allows some code simplification. OK mpi@
2020-12-16 [kn] Reject rules with invalid port ranges
Ranges where the left boundary is bigger than the right one are always bogus as they work like `port any' (`port 34<>12' means "all ports") or in a way that inverts the rule's action (`pass ... port 34:12' means "pass no port at all").

Add checks for all ranges and invalidate those that yield no or all ports.

For this to work on redirections, make pfctl(8) pass the range's type, otherwise boundary including ranges are not detected as such; that is to say, `struct pf_pool's `port_op' member was unused in the kernel so far.

`rdr-to' rules with invalid ranges could panic the kernel when hit.

Reported-by: syzbot+9c309db201f06e39a8ba@syzkaller.appspotmail.com
OK sashan
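The two bogus cases named above can be reproduced with a short sketch. The enum and both functions are hypothetical names chosen for illustration and do not mirror pf's internal operator constants or its actual validation code; ports are assumed to be in host byte order. The matching semantics follow the three range operators described in pf.conf(5).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tags for the three range syntaxes of pf.conf(5). */
enum range_op {
	RANGE_EXCLUSIVE,	/* a >< b : strictly between a and b */
	RANGE_EXCEPT,		/* a <> b : below a or above b */
	RANGE_INCLUSIVE		/* a : b  : from a to b, boundaries included */
};

/* How a port p is matched against a range, ignoring validity. */
static bool
port_matches(enum range_op op, uint16_t a, uint16_t b, uint16_t p)
{
	switch (op) {
	case RANGE_EXCLUSIVE:
		return p > a && p < b;
	case RANGE_EXCEPT:
		return p < a || p > b;
	case RANGE_INCLUSIVE:
		return p >= a && p <= b;
	}
	return false;
}

/* Sketch of the check: a left boundary above the right one is rejected. */
static bool
port_range_ok(enum range_op op, uint16_t left, uint16_t right)
{
	switch (op) {
	case RANGE_EXCLUSIVE:
	case RANGE_EXCEPT:
	case RANGE_INCLUSIVE:
		return left <= right;
	}
	return false;
}
```

With these definitions `34<>12` matches every port and `34:12` matches none, which is exactly why such rules are better refused at load time than evaluated.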
2020-12-15 [Alexandr Nedvedicky] missing NET_LOCK()/NET_UNLOCK() in pf_osfp_flush()
OK mpi@
2020-12-15 [David Gwynne] clear M_TIMESTAMP in if_enqueue.
this is to avoid a timestamp being used on the way out of the stack (eg, in bpf), or if it reenters the stack (eg, if it goes between rdomains with pair(4)).
2020-12-14 [tobhe] Make sure that the address families of a flow's source address,
destination address and their netmasks match, otherwise return EINVAL. ok bluhm@ patrick@
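A minimal sketch of such a consistency check, assuming the four values are available as generic `struct sockaddr` pointers. The function name `flow_check_families` is hypothetical and the real code in the tree is not reproduced here.

```c
#include <errno.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: return 0 when source, destination and both
 * netmasks share one address family, EINVAL otherwise.
 */
static int
flow_check_families(const struct sockaddr *src, const struct sockaddr *srcmask,
    const struct sockaddr *dst, const struct sockaddr *dstmask)
{
	sa_family_t af = src->sa_family;

	if (srcmask->sa_family != af || dst->sa_family != af ||
	    dstmask->sa_family != af)
		return EINVAL;
	return 0;
}
```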
2020-12-12 [jan] Correct wrong type of variable and remove useless casts.
OK bluhm@
2020-12-12 [jan] Rename the macro MCLGETI to MCLGETL and remove the dead parameter ifp.
OK dlg@, bluhm@ No Opinion mpi@ Not against it claudio@
2020-12-12 [David Gwynne] call if_enqueue() to send a packet, not a member port's (*ifp->if_enqueue)
the latter is too clever, and nothing else does it.
2020-12-12 [David Gwynne] get bpf_mtap_ether to call _bpf_mtap directly instead of via bpf_mtap.
this is so _bpf_mtap can look at the mbuf with packet headers on it so it can fill in more stuff in the bpf_hdr struct. i've been running this in production for most of a month now and it's working well.
2020-12-12 [David Gwynne] try to read the mbuf timestamp from the mbuf with the pkthdrs in it.
2020-12-11 [cheloha] bpf(4): BIOCGRTIMEOUT, BIOCSRTIMEOUT: protect bd_rtout with bd_mtx
Reading and writing bd_rtout is not an atomic operation, so it needs to be done under the per-descriptor mutex. While here, start annotating locking in bpfdesc.h. There's lots more to do on this front, but you have to start somewhere. Tweaked by mpi@. ok mpi@
2020-12-10 [David Gwynne] classify packets without a flowid into bucket 0, not a random bucket.
putting packets into random buckets means packets in a flow/connection will be reordered. pf assigns a flowid if it's enabled, and you need pf to configure codel, so it's reasonable to assume that most packets will have a flowid. using bucket 0 like this is what we do in most other places that bin packets with the flowid.
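The classification rule reads roughly like the sketch below. `struct pkt`, its fields and `classify` are hypothetical names for illustration; the real code works on mbufs and its bucket selection may differ in detail. The sketch assumes `nbuckets` is greater than zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical packet summary; the kernel keeps this state in the mbuf header. */
struct pkt {
	bool has_flowid;
	uint32_t flowid;
};

/*
 * Pick a bucket for a packet. Packets of one flow always land in the
 * same bucket, so they cannot overtake each other; packets without a
 * flowid all share bucket 0 instead of being scattered at random.
 */
static unsigned int
classify(const struct pkt *p, unsigned int nbuckets)
{
	if (!p->has_flowid)
		return 0;
	return p->flowid % nbuckets;
}
```

The point of the fixed bucket is determinism: two packets from the same unlabelled connection take the same queue, which preserves their order.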
2020-12-10 [David Gwynne] when setting a flowid, set the M_FLOWID csum_flags bit too.
this "fixes" TCP going over an interface with fq codel enabled. the way the codel code classifies a packet without a flowid set is to randomly assign it to a bucket. this in turn means that packets will get reordered, and tcp hates that.

sthen was able to find a test case and narrow down at which time the problem appeared, which helped greatly.

tested by sthen@ and millert@ ok sashan@ jmatthew@
2020-12-10 [gnezdo] Convert gre_sysctl to sysctl_bounded_arr
Fixed up a reference to gre_wccp where a fixed value from the wccp standard was intended. ok gkoehler@
2020-12-09 [Theo Buehler] add RCS tags
2020-12-07 [Alexandr Nedvedicky] synproxy should be processing incoming SYN packets only.
issue noticed by sthen@. fix discussed with bluhm@ and procter@ OK bluhm@, kn@, procter@
2020-12-01 [Stuart Henderson] bzero the antireplay counter rwlock before rw_init'ing it, not after.
This was triggering a WITNESS detection

witness: lock_object uninitialized: 0xffff800000bcf0d8
Starting stack trace...
witness_checkorder(ffff800000bcf0d8,9,0) at witness_checkorder+0xab
rw_enter_write(ffff800000bcf0c8) at rw_enter_write+0x43
noise_remote_decrypt(ffff800000bcea48,c4992785,0,fffffd80073c89bc,10) at noise_remote_decrypt+0x135
wg_decap(ffff80000054a000,fffffd805f53ac00) at wg_decap+0xda
wg_decap_worker(ffff80000054a000) at wg_decap_worker+0x7a
taskq_thread(ffff80000012d900) at taskq_thread+0x9f

alternating between two lock objects.

From Matt Dunwoodie, thanks semarie@ for explanations about witness and looking at the code.
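The ordering problem is easy to show with a toy lock in userland. `struct toy_lock`, `struct counter` and the setup functions are hypothetical and only model the idea that initialisation leaves state in the lock which a later zeroing destroys; they are not the kernel's rwlock or the wg(4) structures.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock whose init routine records state inside the object. */
struct toy_lock {
	bool initialized;
};

/* Hypothetical structure embedding the lock next to its data. */
struct counter {
	struct toy_lock lock;
	unsigned long long value;
};

static void
toy_lock_init(struct toy_lock *l)
{
	l->initialized = true;
}

/* Wrong order: zeroing after init wipes what the init routine set up. */
static void
counter_setup_wrong(struct counter *c)
{
	toy_lock_init(&c->lock);
	memset(c, 0, sizeof(*c));
}

/* Right order: zero the whole structure first, then initialise the lock. */
static void
counter_setup_right(struct counter *c)
{
	memset(c, 0, sizeof(*c));
	toy_lock_init(&c->lock);
}
```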
2020-11-12 [Martin Pieuchot] Document art locking.
ok denis@, jmatthew@
2020-11-07 [denis] Rework source IP address setting.
- Move most of the processing out of rtable.c (reasonable tb@, ok bluhm@)
- Remove memory allocation, store pointer to existing ifaddr
- Fix tunnel interface handling
looks fine mpi@
2020-11-05 [Peter Hessler] Enable support for ASN1_DN ipsec identifiers.
Tested with multiple Windows 10 Pro (ver 2004) clients, and OpenBSD+iked as the server. OK tobhe@ sthen@ kn@
2020-11-05 [denis] Replace wrong cast with satosin.
Advised by bluhm@
2020-11-04 [gnezdo] Use sysctl_int_bounded in bpf_sysctl
Unlike the other cases of sysctl_bounded_arr this one uses a dynamic limit. OK millert@
2020-11-03 [David Gwynne] replace the nvgre node when the endpoint ip changes.
this helps nvgre follow things like carp masters changing on the inside of the virtual network. "makes sense" jmatthew@
2020-10-31 [Jasper Lievisse Adriaanse] release the correct lock in noise_remote_begin_session()
fixes a "noise_keypair: lock not held" panic observed by Caspar Schutijser
from Jason A. Donenfeld
2020-10-29 [denis] Add feature to force the selection of source IP address
Based/previous work on an idea from deraadt@ Input from claudio@, djm@, deraadt@, sthen@ OK deraadt@
2020-10-22 [Alexandr Nedvedicky] - missing NET_UNLOCK() in pf_ioctl.c error path
Reported-by: syzbot+b9af9c29ed1a6dabda25@syzkaller.appspotmail.com OK anton@
2020-10-21 [Mark Kettenis] Provide dummy definitions for NET_LOCK() and PF_LOCK() when compiling this
file as part of tcpdump(8). Unbreaks the tree. ok deraadt@
2020-10-21 [Alexandr Nedvedicky] - fixing fatal typos fp vs fp_prealloc.
OK mpi
2020-10-21 [Martin Pieuchot] Prevent NULL dereference introduced in previous.
Used a different variable to not shadow `entry' allocated before grabbing the lock.
2020-10-21 [Alexandr Nedvedicky] - move NET_LOCK() further down in pf_ioctl.c. Also move memory allocations
outside of NET_LOCK()/PF_LOCK() scope in easy spots. OK kn@
2020-10-14 [Christian Weisgerber] replace a MAXPATHLEN that slipped back in with PATH_MAX so userland won't
have to pull in <sys/param.h> ok kn@ sashan@ deraadt@
2020-10-04 [anton] fix indent
2020-10-03 [mvs] Introduce `if_cloners_lock' rwlock and use it to serialize
if_clone_{create,destroy}(). This fixes the races described below.

if_clone_{create,destroy}() are kernel locked, but since they pass through various sleep points introduced by rwlocks and M_WAITOK allocations, without serialization they can interleave and race. The avoided races are:

1. While performing if_clone_create(), a concurrent thread also performing if_clone_create() can attach an `ifp' with the same `if_xname' and make `if_list', where all attached interfaces are linked, inconsistent.

2. While performing if_clone_create(), a concurrent thread performing if_clone_destroy() can kill this incomplete `ifp'.

3. While performing if_clone_destroy(), a concurrent thread performing if_clone_destroy() can kill this dying `ifp'.

ok claudio@ kn@ mpi@ sashan@
2020-10-02 [Claudio Jeker] relax check for valid onrdomain range. onrdomain is -1 if the value is
unused by the rule. So skip the rest of the check in that case. Fixes ruleset loading for semarie@ OK semarie@
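The relaxed check can be sketched as follows. This is an illustration only: `onrdomain_ok` and its `max_id` parameter are hypothetical names, with `max_id` standing in for the largest valid routing domain ID, and the actual copyin routine in pf(4) is not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical helper: validate the "on rdomain" value of a rule. */
static bool
onrdomain_ok(int onrdomain, int max_id)
{
	/* -1 means the rule does not use "on rdomain"; nothing else to check. */
	if (onrdomain == -1)
		return true;

	/* Otherwise only the ID itself has to be within the valid range. */
	return onrdomain >= 0 && onrdomain <= max_id;
}
```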
2020-10-01 [kn] rdomain IDs do not need to exist for "on rdomain N" to work
Unlike "... rtable N", pf.conf(5)'s "on rdomain N" does not alter packet state and will always work no matter if rdomain N currently exists or not, i.e. the rule "pass on rdomain 42" will simply match (and pass) packets if rdomain 42 exists, and it will simply not match (neither pass nor block) packets if 42 does not exist.

There's no need to reload the ruleset whenever routing domains are created or deleted, which can already be observed now by creating an rdomain, loading rules referencing it and deleting the same rdomain immediately afterwards: pf will continue to work as expected.

Relax both pfctl(8)'s parser check as well as pf(4)'s copyin routine to accept any valid routing domain ID without expecting it to exist at the time of ruleset creation - this lifts the requirement to create rdomains before referencing them in pf.conf while keeping pf behaviour unchanged.

Prompted by yasuoka's recent pfctl parse.y r1.702 commit requiring an rtable to exist upon ruleset creation. Discussed with claudio and bluhm at k2k20.

Feedback sashan
OK sashan yasuoka claudio
2020-10-01 [Jonathan Gray] fix indentation
2020-09-30 [mvs] We have no if_attachtail() function so remove the declaration.
ok deraadt@ claudio@
2020-09-23 [mvs] Fix declaration of `routedomain'. It's not external here.
"Correct" by deraadt@
2020-09-22 [mvs] Document locks which protect `rtpcb' struct members.
ok mpi@
2020-09-20 [mvs] Set `if_snd' queue maximum length to 1. This enforces calls of
pppx_if_qstart() and pppac_qstart() with netlock held. Otherwise we can't be sure about netlock status while performing these handlers. Problem reported by Glen Faustino. ok yasuoka@
2020-09-13 [kn] Start documenting locks for struct pppoe_softc members
Pretty much all members are under the net lock, some are protected by both net and kernel lock, e.g. the start routine is called with KERNEL_LOCK(). OK mpi
2020-09-12 [kn] Keep port interface UP on removal
There is no reason to change flags on member interfaces when removing them, aggr(4) does not pull its members down either. OK florian bluhm