Age | Commit message | Author |
|
not return any pointers without lock anymore.
OK mvs@ mbuhl@
|
|
pppoe_start(), so we can't use it for pppoe(4) data protection. Except for
the input path, pppoe(4) is always accessed with the kernel lock held, so
grab it around pppoeintr() too.
Interfaces should not use netlock for their data protection. They should
rely on the kernel lock or implement their own locking.
ok bluhm@ bket@
|
|
prevent future diffs from being ugly.
ok bluhm@
|
|
|
|
to make media structures MP safe.
OK mvs@
|
|
A long time ago a pipex(4) session couldn't be deleted until both its
pipex(4) input and output queues became empty. Dead sessions stayed linked
to the stack and the `ip_forward' flag was used to prevent packet forwarding.
npppd(8) marked such sessions with the PIPEXCSESSION ioctl(2) call.
But since we started to unlink closed sessions from the stack, this logic
became unnecessary. Also, a pipex(4) session can now be closed right after
the close request.
npppd(8) was the only userland program which did the PIPEXCSESSION ioctl(2)
call, and we removed it a week ago. It's time to remove the remains.
Now the `flags' member of the `pipex_session' structure has become immutable.
ok yasuoka@
|
|
pipex_ip_output(). The loop over all sessions was reworked to make it
possible to drop the lock within it.
ok bluhm@ yasuoka@
|
|
routines finished.
Call ifq_barrier(9) just after we unlinked the dying interface from the stack.
From this point it is not accessible by if_get(9) and if_unit(9), and all
concurrent threads owning the interface pointer have finished. It is also
detached from pseudo drivers like bridge(4). We could only have concurrent
(*if_qstart)() handlers running, so wait for them and then continue
destruction.
Reported and tested by Hrvoje Popovski.
ok bluhm@
|
|
easier to read and grep, as ifm_status was used in both structs
ifmediareq and ifmedia with different meanings.
OK mvs@
|
|
value if the `error' is set instead of continuing to sppp_ioctl().
ok bluhm@
|
|
OK jsg@
|
|
Also remove unneeded seltrue() and selfalse().
OK mpi@ jsg@
|
|
Also remove unneeded includes of <sys/poll.h> and <sys/select.h>.
Some addenda from jsg@.
OK miod@ mpi@
|
|
remove the route from the list. In rtable_match() check if the
route entry is NULL.
discussed with mpi@ jmatthew@ claudio@; OK mpi@
|
|
ok claudio@ mpi@
|
|
exclusive. Do the pppoe(4) input within the netisr handler with the exclusive
netlock held and remove the kernel lock hack from ether_input().
This is a step back, but it makes the ether_input() path better than it
is now.
Tested by Hrvoje Popovski.
ok bluhm@ claudio@
|
|
Reported by Hrvoje Popovski. ok bluhm@
|
|
This really pointed out that the place syncookies were hooked in was almost,
but not completely right. The way it was, the special case for tcp fast port
reuse in pf_test_state wasn't hit, because the first packet
hitting that was the ACK from the peer finishing the 3WHS, and the
reconstructed SYN came after. We're now doing pf_find_state (and *only* that)
first, then syncookies, then going on so that the old state is thrown away
properly and we get a new one with the sequence number modulator set up
correctly.
Bonus: -11 lines of code
tracked down (that took a while) + fixed under contract with Hush
Communications Canada; special thanks to Lyndon
ok sashan
|
|
operations.
OK mvs@
|
|
supported interface.
pointed out by bluhm@
OK bluhm@
|
|
PPPOE packets within. Do (*if_output)() calls within the netisr handler with
netlock held.
We can't predict the netlock state when pipex(4) related (*if_qstart)()
handlers are called. This means we can't use netlock within the
pppac_qstart() and pppx_if_qstart() handlers.
ok bluhm@
|
|
use a per rttimer struct timeout. On enqueue the struct rttimer belongs
to the timeout; in case the route is removed before the timer fires,
clean up based on the timeout_del() return value. If the timeout is currently
running, then just clear the rtt_rt pointer and let the timeout handle the
cleanup. This should hopefully fix the icmp_pmtu_timeout crashes reported
by some people.
OK bluhm@
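The ownership handoff described above can be sketched in userland C. This is
only an illustration: `fake_timeout', fake_timeout_del(), `rtt_sketch' and the
`freed' flag are hypothetical stand-ins, not the kernel's rttimer code. The
one thing carried over from timeout_del(9) is its return convention: 1 if a
pending timeout was removed, 0 otherwise.

```c
#include <assert.h>
#include <stddef.h>

struct fake_timeout {
	int	pending;	/* 1 while scheduled and not yet fired */
};

/* Same return convention as timeout_del(9): 1 if cancelled, else 0. */
static int
fake_timeout_del(struct fake_timeout *to)
{
	int was_pending = to->pending;

	to->pending = 0;
	return was_pending;
}

struct rtt_sketch {
	struct fake_timeout	 to;
	const char		*rt;	/* stand-in for the rtt_rt pointer */
	int			 acted;	/* handler did its per-route work */
	int			 freed;	/* marks cleanup instead of free() */
};

static void
rtt_enqueue(struct rtt_sketch *r, const char *rt)
{
	r->rt = rt;
	r->acted = 0;
	r->freed = 0;
	r->to.pending = 1;	/* from here on the timeout owns r */
}

/* The timer begins to run its handler and leaves the pending list. */
static void
rtt_start_firing(struct rtt_sketch *r)
{
	r->to.pending = 0;
}

/* Timeout handler: acts on the route if it is still there, then cleans up. */
static void
rtt_handler(struct rtt_sketch *r)
{
	assert(!r->freed);
	if (r->rt != NULL)
		r->acted = 1;
	r->freed = 1;
}

/* Route removal path: who cleans up depends on the cancel result. */
static void
rtt_route_removed(struct rtt_sketch *r)
{
	if (fake_timeout_del(&r->to))
		r->freed = 1;	/* cancelled in time: we own r now */
	else
		r->rt = NULL;	/* handler is running: let it clean up */
}
```

In the sketch exactly one side ever sets `freed', which is the property the
per-rttimer timeout is meant to provide.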
|
|
prevent concurrent access to rt_llinfo from rtrequest_delete().
But the common case, when the MAC address is already known, works
without lock.
tested by Hrvoje Popovski; OK mvs@
|
|
|
|
disabled by default. Also add a tso option to ifconfig(8) to enable and
disable this feature.
ok deraadt
|
|
In the rt msg buffer the size of the full buffer is calculated first then
filled out after allocating the mbuf. In the sysctl code this is not needed
since the buffer is already provided.
OK mvs@
|
|
(*if_qstart)() and we don't worry that it's not serialized with the rest of
the output path. Also we will process already enqueued pipex(4) packets
regardless of the `pipex_enable' state.
Use the local copy of `pipex_enable' within pppx_if_output(), otherwise we
lose consistency.
pointed out and ok by bluhm@
|
|
processing path. Such sessions have already reached their time to live
timeout, and the garbage collector waits a little before killing them.
Otherwise we could make a session's lifetime longer than PIPEX_CLOSE_TIMEOUT.
ok bluhm@
|
|
is not required. In the packet processing path we hold the shared netlock,
but we only do read-only access to the per-session `flags' and `ifindex'. We
always modify them from the ioctl(2) path with the exclusive netlock held.
The rest of the pipex(4) session is immutable or uses per-session locks.
ok bluhm@
|
|
|
|
check to the less awkward w->w_needed <= w->w_given.
OK bluhm@
|
|
(*if_qstart)() be always called with netlock held doesn't work anymore
with PPPOE sessions.
Introduce `pipex_list_mtx' mutex(9) and use it to protect global pipex(4)
lists and radix trees.
Protect pipex(4) `session' dereference with reference counters, because we
could sleep when accessing pipex(4) from ioctl(2) path, and this is not
possible with mutex(9) held.
ok bluhm@
|
|
which represent flags. We mix unlocked access to immutable flags with
protected access to mutable ones. This might not be MP independent on
some architectures, so convert these fields to u_int `flags' variables.
ok bluhm@
|
|
OK bluhm
Reported-by: syzbot+50ea4f33ed5dd9264918@syzkaller.appspotmail.com
Reported-by: syzbot+df65f8b7ee8c0089e885@syzkaller.appspotmail.com
|
|
been spotted and reported by jmc@
OK kn@
|
|
route socket. All messages passed are by definition done. This may
allow sharing more code between the sysctl and route socket parsers.
OK mpi@
|
|
a state in PFTM_PURGE could potentially hide another state on the same state
key that is active and we'd incorrectly block the packet
I believe that cannot happen as things are now.
ok sashan
|
|
by either passing it further or releasing it.
OK mvs@
|
|
same panic can be triggered when an address table is part
of an anchor loaded by a 'load anchor ... from ...' statement.
The pf_find_or_create_ruleset() function called by pfr_add_tables()
must receive the ruleset name which comes from the pre-allocated root
table.
OK claudio@ dlg@
|
|
for the lock operation and to pass a value to the unlock operation.
sofree() still needs an extra flag to know if sounlock() should be called
or not. But sofree() is called less often and mostly without keeping the lock.
OK mpi@ mvs@
|
|
by firewall.
OK dlg@
|
|
were truncated. Drop such packets instead.
Reported-by: syzbot+91abd3aa2fdfe900f9ce@syzkaller.appspotmail.com
OK sashan@ claudio@
|
|
|
|
removed in 1.970.
ok bluhm@
|
|
handle malloc(9) failure.
from markus@; OK sashan@
|
|
OK dlg@
|
|
|
|
claudio@ is right that as a rule of thumb it is a bad idea to call
arbitrary code from an smr crit section because the scope of what
is called is very hard to keep in your head. in this particular
case sashan@ points out that if_enqueue can call vport handlers,
which calls if_vinput, which will push a packet into the network
stack, which will call pf and try to take an rwlock. you can't sleep
in an smr crit section.
SMRs in this situation are protecting references to ports in the
list of span and actual ports attached to a veb. when we needed to
send an unknown unicast, broadcast, or multicast packet
the code would SMR_TAILQ_FOREACH over all the ports, duplicating
the mbuf and calling if_enqueue against the port. span port handling
is basically the same, but we unconditionally send to them.
this replaces the SMR_TAILQ with maps (arrays) of ports. the veb
port map data structure contains a struct refcnt and the number of
ports. the forwarding paths use an SMR crit section to get a reference
to the map, increase the refcnt, and then leave the smr crit section
before iterating over the array of ports in the map. after the
iteration it releases the refcnt.
this does add a couple of atomic ops in the forwarding path, but
only in the uncommon case (most packets are (should be) to known
unicast addresses), and it's only one set of ops for all ports
instead of ops per port. the known unicast case follows this pattern
too.
reported by Barbaros Bilek on bugs@
fix tested by me and hrvoje popovski
ok claudio@ sashan@ bluhm@ (who also did a lot of the initial analysis)
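the take-a-reference-then-iterate pattern described above can be sketched in
userland C with C11 atomics. this is an illustration only, not the veb(4)
code: `port_map', map_take(), map_rele(), map_replace() and flood() are
hypothetical names, ports are plain ints, and there is no SMR here. in this
sketch nothing protects the window between loading the map pointer and
bumping its refcnt, which is the job the SMR crit section does in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a veb port map: a refcnt plus an array. */
struct port_map {
	atomic_uint	refs;		/* 1 for the owner, +1 per reader */
	size_t		nports;
	int		ports[];	/* stand-in for port pointers */
};

static _Atomic(struct port_map *) current_map;

static struct port_map *
map_create(const int *ports, size_t n)
{
	struct port_map *m;
	size_t i;

	m = malloc(sizeof(*m) + n * sizeof(m->ports[0]));
	if (m == NULL)
		return NULL;
	atomic_init(&m->refs, 1);
	m->nports = n;
	for (i = 0; i < n; i++)
		m->ports[i] = ports[i];
	return m;
}

/* In the kernel, this load and increment sit inside the SMR crit section. */
static struct port_map *
map_take(void)
{
	struct port_map *m = atomic_load(&current_map);

	if (m != NULL)
		atomic_fetch_add(&m->refs, 1);
	return m;
}

static void
map_rele(struct port_map *m)
{
	unsigned int prev = atomic_fetch_sub(&m->refs, 1);

	assert(prev > 0);
	if (prev == 1)
		free(m);	/* last reference gone */
}

/* Publish a new map; the old one lives until its last reader is done. */
static void
map_replace(struct port_map *newm)
{
	struct port_map *old = atomic_exchange(&current_map, newm);

	if (old != NULL)
		map_rele(old);
}

/* "Send" to every port: one take/release pair covers all ports. */
static size_t
flood(int *out, size_t outlen)
{
	struct port_map *m = map_take();
	size_t i, n = 0;

	if (m == NULL)
		return 0;
	/* Outside the crit section here, so sleeping would be allowed. */
	for (i = 0; i < m->nports && n < outlen; i++)
		out[n++] = m->ports[i];
	map_rele(m);
	return n;
}
```

a reader that still holds a reference keeps the old array alive across
map_replace(), so the iteration never sees a half-updated list.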
|
|
Keeping and combining tags from multiple previous packets could result in
a single accumulated reply overrunning mbuf size limits. Also make sure
the tag size fields are reset to 0 if allocation fails.
Add size check on mbuf cluster allocation and fail if more than MCLBYTES
are requested.
From NetBSD.
tested by naddy@
ok bluhm@
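The three rules in this message (do not accumulate tags across packets, cap
the size, reset the size field when allocation fails) can be sketched in
userland C. This is a hedged illustration, not the committed code:
`tag_sketch', tag_replace() and SKETCH_MCLBYTES are hypothetical, and the
value 2048 is an assumption standing in for the kernel's MCLBYTES.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MCLBYTES	2048	/* assumption: stand-in for MCLBYTES */

struct tag_sketch {
	unsigned char	*data;
	size_t		 len;	/* must be 0 whenever data is NULL */
};

/*
 * Store the tag from the current packet, replacing any earlier one.
 * Returns 0 on success, -1 if the request is too large or malloc fails.
 */
static int
tag_replace(struct tag_sketch *t, const void *src, size_t len)
{
	/* Drop the previous tag instead of appending to it. */
	free(t->data);
	t->data = NULL;
	t->len = 0;

	if (len == 0)
		return 0;
	if (len > SKETCH_MCLBYTES)
		return -1;	/* refuse oversized requests */

	t->data = malloc(len);
	if (t->data == NULL)
		return -1;	/* len was already reset to 0 above */

	memcpy(t->data, src, len);
	t->len = len;
	assert(t->len <= SKETCH_MCLBYTES);
	return 0;
}
```

Resetting `len' before any check means every failure path leaves the
structure in the same empty state.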
|
|
NET_LOCK()/PF_LOCK() scope. bluhm@ helped a lot
to put this diff into shape.
OK bluhm@
|