src - OpenBSD base system

Age	Commit message	Author
2016-05-18	rework the srp api so it takes an srp_ref struct that the caller provides.	David Gwynne
	the srp_ref struct is used to track the location of the caller's hazard pointer so later calls to srp_follow and srp_enter already know what to clear. this in turn means most of the caveats around using srps go away. specifically, you can now:
	- switch cpus while holding an srp ref
	- ie, you can sleep while holding an srp ref
	- you can take and release srp refs in any order
	the original intent was to simplify use of the api when dealing with complicated data structures. the caller now no longer has to track the location of the srp a value was fetched from, the srp_ref effectively does that for you.
	srp lists have been refactored to use srp_refs instead of srpl_iter structs.
	this is in preparation of using srps inside the ART code. ART is a complicated data structure, and lookups require overlapping holds of srp references.
	ok mpi@ jmatthew@
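The idea of a caller-provided reference that remembers its own hazard slot can be sketched in userland C11 as below. This is a simplified model, not the kernel implementation: every `sk_` name is hypothetical, the hazard slots live in one global array rather than per CPU, and slot selection is not safe against concurrent callers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_NHAZARDS 4	/* assumed slot count; the kernel keeps slots per CPU */

struct sk_srp { _Atomic(void *) value; };	/* shared reference pointer */
struct sk_hazard { _Atomic(void *) ptr; };	/* one hazard slot */
struct sk_srp_ref { struct sk_hazard *hz; };	/* caller-provided, remembers the slot */

struct sk_hazard sk_hazards[SK_NHAZARDS];

/* Publish the value held by srp in a free hazard slot and record the slot in sr. */
void *
sk_srp_enter(struct sk_srp_ref *sr, struct sk_srp *srp)
{
	void *v;
	int i;

	for (i = 0; i < SK_NHAZARDS; i++) {
		if (atomic_load(&sk_hazards[i].ptr) == NULL)
			break;
	}
	assert(i < SK_NHAZARDS);	/* the sketch simply gives up when out of slots */
	sr->hz = &sk_hazards[i];

	/* Re-read until the published value matches what the srp currently holds. */
	do {
		v = atomic_load(&srp->value);
		atomic_store(&sr->hz->ptr, v);
	} while (atomic_load(&srp->value) != v);

	return v;
}

/* The ref knows which slot to clear, so refs can be released in any order. */
void
sk_srp_leave(struct sk_srp_ref *sr)
{
	atomic_store(&sr->hz->ptr, NULL);
	sr->hz = NULL;
}

/* What an updater would check before freeing a value it has replaced. */
bool
sk_srp_in_use(void *v)
{
	for (int i = 0; i < SK_NHAZARDS; i++) {
		if (atomic_load(&sk_hazards[i].ptr) == v)
			return true;
	}
	return false;
}
```

Because the slot address travels with the ref, the caller does not need to remember which srp a value came from when it releases it.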
2016-05-10	make bpf_mtap callers set the M_FILDROP flag if they care about it.	David Gwynne
	ok mpi@
2016-05-08	Do not export the IFXF_MPSAFE flag to userland, it is a kernel-only	Martin Pieuchot
	hint. ok kettenis@, deraadt@
2016-05-03	Stop using a soft-interrupt context to process incoming network packets.	Martin Pieuchot
	Use a new task that runs holding the KERNEL_LOCK to execute mp-unsafe code. Our current goal is to progressively move input functions to the unlocked task. This gives a small performance boost confirmed by Hrvoje Popovski's IPv4 forwarding measurement:
	before:                 after:
	send      receive       send      receive
	400kpps   400kpps       400kpps   400kpps
	500kpps   500kpps       500kpps   500kpps
	600kpps   600kpps       600kpps   600kpps
	650kpps   650kpps       650kpps   640kpps
	700kpps   700kpps       700kpps   700kpps
	720kpps   640kpps       720kpps   710kpps
	800kpps   640kpps       800kpps   650kpps
	1.4Mpps   570kpps       1.4Mpps   590kpps
	14Mpps    570kpps       14Mpps    590kpps
	ok kettenis@, bluhm@, dlg@
2016-03-16	if ticks diverge from ifq_congestion too far the diff will go negative	David Gwynne
	detect this and bump ifq_congestion forward rather than claim the system is congested for a long period of time. ok mpi@ henning@ jmatthew@
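A minimal sketch of the check described above, assuming a signed tick counter and a congestion window of hz/100 ticks. The names `sk_congestion_mark`, `sk_mark_congested`, `sk_is_congested` and the `SK_HZ` value are illustrative assumptions, not the kernel's identifiers.

```c
#include <assert.h>

#define SK_HZ 100	/* assumed tick rate, for illustration only */

int sk_congestion_mark;	/* tick value at which congestion was last flagged */

/* Record that congestion was observed at tick `now`. */
void
sk_mark_congested(int now)
{
	sk_congestion_mark = now;
}

/* Return 1 if congestion was flagged within the last SK_HZ / 100 ticks. */
int
sk_is_congested(int now)
{
	/* Subtract as unsigned so wraparound is well defined, then view as signed. */
	int diff = (int)((unsigned int)now - (unsigned int)sk_congestion_mark);

	if (diff < 0) {
		/* The counters diverged too far: move the mark into the past. */
		sk_congestion_mark = (int)((unsigned int)now - SK_HZ);
		return 0;
	}

	return diff <= SK_HZ / 100;
}
```

Without the negative-diff branch, a mark that appears to lie in the future would keep satisfying the comparison until the tick counter caught up with it.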
2016-03-07	Sync no-argument function declaration and definition by adding (void).	Christian Weisgerber
	ok mpi@ millert@
2016-03-02	provide generic ioctls for managing an interface's parent	David Gwynne
	in the future this will subsume the individual vlandev, carpdev, pppoedev, foodev options for things like vlan, carp, pppoe, etc. inspired by vnetid ok mpi@ jmatthew@
2016-02-28	Support for running Linux binaries under emulation is going away.	Christian Weisgerber
	Remove "option COMPAT_LINUX" and everything directly tied to it from the kernel and the corresponding man page documentation. ok visa@ guenther@
2015-12-09	rework the if_start mpsafe serialisation so it can serialise arbitrary work	David Gwynne
	work is represented by struct task. the start routine is now wrapped by a task which is serialised by the infrastructure. if_start_barrier has been renamed to ifq_barrier and is now implemented as a task that gets serialised with the start routine.
	this also adds an ifq_restart() function. it serialises a call to ifq_clr_oactive and calls the start routine again. it exists to avoid a race that kettenis@ identified in between when a start routine discovers there's no space left on a ring, and when it calls ifq_set_oactive. if the txeof side of the driver empties the ring and calls ifq_clr_oactive in between the above calls in start, the queue will be marked oactive and the stack will never call the start routine again. by serialising the ifq_set_oactive call in the start routine and ifq_clr_oactive calls we avoid that race.
	tested on various nics
	ok mpi@
2015-12-08	Kill unused iftxlist.	Martin Pieuchot
	ok dlg@
2015-12-08	split the interface send queue (struct ifqueue) implementation out.	David Gwynne
	the intention is to make it more clear what belongs to a transmit queue and what belongs to an interface. suggested by and ok mpi@
2015-12-05	remove old lint annotations	Ted Unangst

2015-12-04	Grab the KERNEL_LOCK() around bridge_output().	Martin Pieuchot
	It is now safe to call if_enqueue() without holding the KERNEL_LOCK() even on an interface that is part of a bridge(4). ok dlg@, henning@, kettenis@
2015-12-03	Use SRPL_HEAD() and SRPL_ENTRY() to be consistent with and allow	Martin Pieuchot
	falling back to a SLIST. ok dlg@, jasper@
2015-12-03	Remove broadcast matching from ifa_ifwithaddr(), use in_broadcast() where	Vincent Gross
	required. ok bluhm@ mpi@.
2015-12-03	rework if_start to allow nics to provide an mpsafe start routine.	David Gwynne
	existing start routines will still be called under the kernel lock and at IPL_NET.
	mpsafe start routines will be serialised so only one instance of each interface's function will be running in the kernel at any point in time. this guarantees packets will be dequeued in order, and the start routines don't have to lock against themselves because if_start does it for them. the code to do that is based on the scsi runqueue code.
	this also provides an if_start_barrier() function that should wait until any currently running instances of if_start have finished.
	a driver can opt in to the mpsafe if_start call by doing the following:
	1. setting ifp->if_xflags = IFXF_MPSAFE
	2. only calling if_start() instead of its own start routine
	3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
	4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)
	to simplify the implementation the tx mitigation code has been removed.
	tested by several
	ok mpi@ jmatthew@
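The serialisation described above (one running instance per interface, with concurrent requests folded into another pass by the current runner) can be sketched with a C11 atomic counter. This is an illustration of the technique under assumed names; `sk_ifq`, `sk_if_start` and `sk_demo_start` are hypothetical, and the demo start routine fakes a concurrent request by calling back into `sk_if_start` on its first run.

```c
#include <stdatomic.h>

struct sk_ifq {
	atomic_uint serializer;		/* 0 = idle, >0 = a runner is active */
	void (*start)(struct sk_ifq *);	/* the driver's start routine */
	unsigned int runs;		/* demo bookkeeping only */
};

/* Only the caller that moves the counter from 0 becomes the runner. */
static int
sk_enter(struct sk_ifq *q)
{
	return atomic_fetch_add(&q->serializer, 1) == 0;
}

/* Leave if nobody asked for another pass; otherwise collapse the requests. */
static int
sk_leave(struct sk_ifq *q)
{
	unsigned int expected = 1;

	if (atomic_compare_exchange_strong(&q->serializer, &expected, 0))
		return 1;

	/* Requests arrived while running: fold them into one more pass. */
	atomic_store(&q->serializer, 1);
	return 0;
}

void
sk_if_start(struct sk_ifq *q)
{
	if (!sk_enter(q))
		return;		/* the current runner will pick up the work */

	do {
		q->start(q);
	} while (!sk_leave(q));
}

/* Demo start routine: its first run simulates a request from another CPU. */
void
sk_demo_start(struct sk_ifq *q)
{
	if (q->runs++ == 0)
		sk_if_start(q);
}
```

The nested call returns immediately because the counter is already non-zero, and the outer loop then runs the start routine one more time on its behalf, so the start routine never runs concurrently with itself.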
2015-12-02	When destroying an interface, we have to wait until all references	Alexander Bluhm
	are not used anymore. This has to be done before any interface fields become invalid. As the route delete request cannot call if_get() anymore, pass down the interface. Split rtrequest_delete() into a separate function that may take an existing interface. OK mpi@
2015-12-02	Rework the MPLS handling. Remove the lookup loops since nothing is using	Claudio Jeker
	them and they make everything so much harder with no gain. Remove the ifp argument from mpls_input since it is not needed. On the input side the lookup side is modified a bit when it comes to BOS handling. Tested in a L3VPN setup with ldpd and bgpd. Committing now so we can move on with cleaning up rt_ifp usage. If this breaks L2VPN I will fix it once reported. OK mpi@
2015-12-01	Iterating on &ifnet should only be done with the KERNEL_LOCK held.	Vincent Gross
	With input and ok mpi@.
2015-11-27	Protect the growth of the routing table arrays used by rtable_get()	Martin Pieuchot
	with SRPs. This is a simplified version of the dynamically sizeable array of pointers used by if_get() because routing table heads are never freed. ok dlg@
2015-11-25	replace IFF_OACTIVE manipulation with mpsafe operations.	David Gwynne
	there are two things shared between the network stack and drivers in the send path: the send queue and the IFF_OACTIVE flag. the send queue is now protected by a mutex. this diff makes the oactive functionality mpsafe too.
	IFF_OACTIVE is part of if_flags. there are two problems with that. firstly, if_flags is a short and we don't have any MI atomic operations to manipulate a short. secondly, while we could make the IFF_OACTIVE operations mpsafe, all changes to other flags would have to be made safe at the same time, otherwise a read-modify-write cycle on their updates could clobber the oactive change.
	instead, this moves the oactive mark into struct ifqueue and provides an API for changing it. there's ifq_set_oactive, ifq_clr_oactive, and ifq_is_oactive. these are modelled on ifsq_set_oactive, ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.
	this diff includes changes to all the drivers manipulating IFF_OACTIVE to now use the ifq_{set,clr,is}_oactive API too.
	ok kettenis@ mpi@ jmatthew@ deraadt@
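A rough userland sketch of keeping the oactive mark in its own word, so that setting or clearing it never involves a read-modify-write of unrelated flags. The `sk_` names and the field layout are assumptions for illustration; the kernel's struct ifqueue is not reproduced here.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct sk_ifqueue {
	atomic_uint ifq_oactive;	/* non-zero while the transmit ring is full */
};

/* Driver start routine: the ring has no space left. */
void
sk_ifq_set_oactive(struct sk_ifqueue *ifq)
{
	atomic_store(&ifq->ifq_oactive, 1);
}

/* Driver txeof path: descriptors were reclaimed. */
void
sk_ifq_clr_oactive(struct sk_ifqueue *ifq)
{
	atomic_store(&ifq->ifq_oactive, 0);
}

/* Stack side: is it worth calling the start routine? */
bool
sk_ifq_is_oactive(struct sk_ifqueue *ifq)
{
	return atomic_load(&ifq->ifq_oactive) != 0;
}
```

Each operation is a plain store or load on a dedicated word, so an update to some other interface flag on another CPU cannot overwrite it.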
2015-11-21	simplify ifq_deq_rollback by only having it unlock.	David Gwynne
	hfsc needed a rollback ifqop to requeue the mbuf because it used ml_dequeue in the begin op. now it uses MBUF_LIST_FIRST to get a ref to the first mbuf in deq_begin. now the disciplines don't need a rollback op, so ifq_deq_rollback can be simplified to just releasing the mutex. based on a discussion with kenjiro cho
2015-11-20	Keep if_ref() private, if_get() is what you want to use before if_put().	Martin Pieuchot
	The thread detaching an interface will sleep until all references to this interface have been released. So we decided to only keep references for a short period of time. Keeping if_ref() private will hopefully help preserve this goal as long as it makes sense. Calling if_get()/if_put() in the same function also allows us to make use of static analysis tools (thanks jsg@!) to catch our errors. ok dlg@
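The short-lived reference pattern described above, with the get and the put in the same function, might look like the following sketch. It is a single-threaded model with a plain counter; the `sk_` names, the table size, and the `sk_if_mtu` helper are hypothetical and stand in for the kernel's interface index map.

```c
#include <stddef.h>

#define SK_MAXIF 8	/* assumed table size for the sketch */

struct sk_ifnet {
	unsigned int if_index;
	unsigned int if_mtu;
	unsigned int refcnt;	/* a detaching thread would wait for this to drop */
};

struct sk_ifnet *sk_if_table[SK_MAXIF];

/* Register an interface; the caller guarantees if_index < SK_MAXIF. */
void
sk_if_attach(struct sk_ifnet *ifp)
{
	sk_if_table[ifp->if_index] = ifp;
}

/* Look up by index and take a reference; NULL if no such interface. */
struct sk_ifnet *
sk_if_get(unsigned int idx)
{
	if (idx >= SK_MAXIF || sk_if_table[idx] == NULL)
		return NULL;
	sk_if_table[idx]->refcnt++;
	return sk_if_table[idx];
}

/* Release a reference; accepting NULL keeps callers simple. */
void
sk_if_put(struct sk_ifnet *ifp)
{
	if (ifp != NULL)
		ifp->refcnt--;
}

/* Typical caller: get, use briefly, put, all in one function. */
int
sk_if_mtu(unsigned int idx)
{
	struct sk_ifnet *ifp = sk_if_get(idx);
	int mtu = -1;

	if (ifp != NULL)
		mtu = (int)ifp->if_mtu;
	sk_if_put(ifp);

	return mtu;
}
```

Storing an index rather than a pointer means a stale index simply fails the lookup, and pairing the get and put inside one function makes a missing release easy to spot by inspection or with static analysis.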
2015-11-20	i made a mistake. rename ifq_enq and ifq_deq to ifq_enqueue and ifq_dequeue	David Gwynne
	fixing it now before i regret it more.
2015-11-20	fix prio KASSERT, it should be <= not <. ok dlg@	Stuart Henderson

2015-11-20	shuffle struct ifqueue so in flight mbufs are protected by a mutex.	David Gwynne
	the code is refactored so the IFQ macros call newly implemented ifq functions. the ifq code is split so each discipline (priq and hfsc in our case) is an opaque set of operations that the common ifq code can call. the common code does the locking, accounting (ifq_len manipulation), and freeing of the mbuf if the discipline's enqueue function rejects it. they're kind of like bufqs in the block layer with their fifo and nscan disciplines.
	the new api also supports atomic switching of disciplines at runtime. the hfsc setup in pf_ioctl.c has been tweaked to build a complete hfsc_if structure which it attaches to the send queue in a single operation, rather than attaching to the interface up front and building up a list of queues.
	the send queue is now mutexed, which raises the expectation that packets can be enqueued or purged on one cpu while another cpu is dequeueing them in a driver for transmission. a lot of drivers use IFQ_POLL to peek at an mbuf and attempt to fit it on the ring before committing to it with a later IFQ_DEQUEUE operation. if the mbuf gets freed in between the POLL and DEQUEUE operations, fireworks will ensue.
	to avoid this, the ifq api introduces ifq_deq_begin, ifq_deq_rollback, and ifq_deq_commit. ifq_deq_begin allows a driver to take the ifq mutex and get a reference to the mbuf they wish to try and tx. if there's space, they can ifq_deq_commit it to remove the mbuf and release the mutex. if there's no space, ifq_deq_rollback simply releases the mutex. this api was developed to make updating the drivers using IFQ_POLL easy, instead of having to do significant semantic changes to avoid POLL that we cannot test on all the hardware.
	the common code has been tested pretty hard, and all the driver modifications are straightforward except for de(4). if that breaks it can be dealt with later.
	ok mpi@ jmatthew@
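The begin/commit/rollback protocol can be sketched in userland with a pthread mutex and a simple packet list. All `sk_` names are hypothetical, the packet type stands in for an mbuf, and the sketch assumes that begin releases the mutex itself when the queue is empty.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct sk_pkt { struct sk_pkt *next; int len; };

struct sk_ifq {
	pthread_mutex_t mtx;
	struct sk_pkt *head, *tail;
	unsigned int len;
};

void
sk_ifq_init(struct sk_ifq *q)
{
	pthread_mutex_init(&q->mtx, NULL);
	q->head = q->tail = NULL;
	q->len = 0;
}

void
sk_ifq_enqueue(struct sk_ifq *q, struct sk_pkt *p)
{
	pthread_mutex_lock(&q->mtx);
	p->next = NULL;
	if (q->tail != NULL)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
	q->len++;
	pthread_mutex_unlock(&q->mtx);
}

/* Peek at the first packet. The mutex stays held if a packet is returned. */
struct sk_pkt *
sk_ifq_deq_begin(struct sk_ifq *q)
{
	pthread_mutex_lock(&q->mtx);
	if (q->head == NULL) {
		pthread_mutex_unlock(&q->mtx);
		return NULL;
	}
	return q->head;
}

/* The packet fit on the ring: unlink it and release the mutex. */
void
sk_ifq_deq_commit(struct sk_ifq *q, struct sk_pkt *p)
{
	assert(q->head == p);
	q->head = p->next;
	if (q->head == NULL)
		q->tail = NULL;
	q->len--;
	pthread_mutex_unlock(&q->mtx);
}

/* No room: leave the packet queued and just release the mutex. */
void
sk_ifq_deq_rollback(struct sk_ifq *q, struct sk_pkt *p)
{
	assert(q->head == p);
	pthread_mutex_unlock(&q->mtx);
}

/* Hypothetical driver start loop with `space` free ring slots. */
unsigned int
sk_drv_start(struct sk_ifq *q, unsigned int space)
{
	struct sk_pkt *p;
	unsigned int sent = 0;

	while ((p = sk_ifq_deq_begin(q)) != NULL) {
		if (space == 0) {
			sk_ifq_deq_rollback(q, p);
			break;
		}
		sk_ifq_deq_commit(q, p);
		space--;
		sent++;
	}
	return sent;
}
```

Because the mutex is held from begin until commit or rollback, another thread cannot purge or free the packet while the driver is deciding whether it fits.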
2015-11-18	Factorize the bits to check if a L2 route is connected, whether it is	Martin Pieuchot
	attached to a carp(4) or bridge(4) member, to not dereference rt_ifp directly. ok visa@
2015-11-13	Store the index of the interface used for revarp instead of a pointer to	Martin Pieuchot
	its descriptor. Get rid of a if_ref(). ok dlg@
2015-11-11	Store the index of the lo0 interface instead of a pointer to its	Martin Pieuchot
	descriptor. This allows getting rid of two if_ref() in the output paths. ok dlg@
2015-11-07	Use input handlers for bridge(4).	Martin Pieuchot
	This allows more flexible configurations with vlan(4) and bridge(4) on top of the same physical interface. In particular it allows not feeding VLAN tagged packets into a bridge(4). Fix regression reported by Armin Wolfermann on bugs@, ok dlg@
2015-11-06	Rename rt_mpath_next() into rtable_mpath_next() and provide an	Martin Pieuchot
	implementation for ART based on the singly-linked list of route entries.
2015-11-03	Do not clear M_PROTO1 flag before calling if_start() because pseudo-	Martin Pieuchot
	drivers, like vlan(4), call if_enqueue() in their *start function. Prevent an infinite recursion reported by Armin Wolfermann on bugs@.
2015-11-02	Merge rtable_mpath_match() into rtable_lookup().	Martin Pieuchot
	ok bluhm@
2015-10-28	Remove linkmtu and maxmtu from struct nd_ifinfo. IN6_LINKMTU can now	Florian Obser
	die and ifp->if_mtu is the one true mtu. Suggested by and OK mpi@
2015-10-27	Use rt_ifidx rather than rt_ifp.	Martin Pieuchot
	ok bluhm@
2015-10-25	unbreak tree for ramdisks without INET6	Theo de Raadt

2015-10-25	Do not overwrite if_rtrequest() if the driver specified it before	Martin Pieuchot
	calling if_attach().
2015-10-25	arp_ifinit() is no longer required.	Martin Pieuchot

2015-10-25	Introduce if_rtrequest() the successor of ifa_rtrequest().	Martin Pieuchot
	L2 resolution depends on the protocol (encoded in the route entry) and an ``ifp''. Not having to care about an ``ifa'' makes our life easier in our MP effort. Fewer dependencies between data structures implies fewer headaches. Discussed with bluhm@, ok claudio@
2015-10-24	Add pair(4), a vether-based virtual Ethernet driver to interconnect	Reyk Floeter
	rdomains and bridges on the local system. This can be used to route through local rdomains, to create L2 devices (like trunks) between them, and many other things. Discussed with many, with input from mpi@ OK sthen@ phessler@ yasuoka@ mikeb@
2015-10-22	Kill link_rtrequest(), introduced in 1990 to "fix" the result	Martin Pieuchot
	of rt_getifa() when adding link level route from outside the kernel. ok claudio@
2015-10-22	Make sure that the address matching the key (destination) of a route	Martin Pieuchot
	entry is attached to this entry. ok phessler@, bluhm@
2015-10-22	Inspired by satosin(), use inline functions to convert sockaddr dl.	Alexander Bluhm
	Instead of casts they check whether the incoming object has the expected type. So introduce satosdl() and sdltosa() in the kernel. OK mpi@
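The benefit of inline conversion functions over bare casts is that the compiler checks the argument type at each call site. A minimal sketch, using cut-down stand-in structures (the `sk_` types, their field sizes, and the function names are assumptions, not the system definitions):

```c
#include <stdint.h>

/* Cut-down stand-ins for struct sockaddr and struct sockaddr_dl. */
struct sk_sockaddr {
	uint8_t sa_len;
	uint8_t sa_family;
	char sa_data[14];
};

struct sk_sockaddr_dl {
	uint8_t sdl_len;
	uint8_t sdl_family;
	uint16_t sdl_index;
	char sdl_data[12];
};

/*
 * A cast macro such as ((struct sk_sockaddr_dl *)(sa)) accepts any pointer,
 * or even an integer. These functions only accept the expected pointer type,
 * so passing something else is diagnosed at compile time.
 */
static inline struct sk_sockaddr_dl *
sk_satosdl(struct sk_sockaddr *sa)
{
	return (struct sk_sockaddr_dl *)sa;
}

static inline struct sk_sockaddr *
sk_sdltosa(struct sk_sockaddr_dl *sdl)
{
	return (struct sk_sockaddr *)sdl;
}
```

A call such as `sk_satosdl(&some_int)` is rejected or warned about by the compiler, whereas the equivalent cast macro would compile silently.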
2015-10-22	Do not dereference ``ifa_ifp'' when we already have an ``ifp'' pointer.	Martin Pieuchot

2015-10-12	the pattr argument to IFQ_ENQUEUE is unused, so let's get rid of it.	David Gwynne
	also remove the comment above IFQ_ENQUEUE that says the pattr argument is unused. ok mpi@
2015-10-12	Unify link state change notification.	Martin Pieuchot
	ok mikeb@
2015-10-12	protect SIOCSLIFPHYTTL, SIOCSVNETID so only root can call them, and	David Gwynne
	return EOPNOTSUPP for SIOCGLIFPHYTTL and SIOCGVNETID. all so drivers don't have to do these checks themselves. ok mikeb@ mpi@
2015-10-08	Unlock the softnet task.	Martin Pieuchot
	ok dlg@, kettenis@
2015-10-05	Revert if_oqdrops accounting changes done in kernel, per request from mpi@.	Masao Uebayashi
	(Especially adding IF_DROP() after IFQ_ENQUEUE() was completely wrong because IFQ_ENQUEUE() already does it. Oops.)
	After this revert, the situation becomes:
	- if_snd.ifq_drops is incremented in either IFQ_ENQUEUE() or IF_DROP(), but it is not shown to userland, and
	- if_data.ifi_oqdrops is shown to userland, but it is not incremented by anyone.
2015-10-05	Count IFQ_ENQUEUE() failure as output drop.	Masao Uebayashi
	mpi@ prefers checking IFQ_ENQUEUE() error, and this matches that. OK dlg@