src - OpenBSD base system

Age	Commit message	Author
2022-08-22	Use rwlock per inpcb table to protect notify list. The notify	Alexander Bluhm
	function may sleep, so holding a mutex is not possible. The same list entry and rwlock is used for UDP multicast and raw IP delivery. By adding a write lock, exclusive netlock is no longer necessary for PCB notify and UDP and raw IP input. OK mvs@
2022-08-21	Introduce a mutex per inpcb to serialize access to socket receive	Alexander Bluhm
	buffer. Later it may be used to protect more of the PCB or socket. In divert input replace the kernel lock with this mutex. OK mvs@
2022-08-08	To make protocol input functions MP safe, internet PCBs need protection.	Alexander Bluhm
	Use their reference counter in more places. The in_pcb lookup functions hold the PCBs in hash tables protected by table->inpt_mtx mutex. Whenever a result is returned, increment the ref count before releasing the mutex. Then the inp can be used as long as necessary. Unref it at the end of all functions that call in_pcb lookup. As a shortcut, pf may also hold a reference to the PCB. When pf_inp_lookup() returns it, it also increments the ref count and the caller can handle it like the inp from table lookup. OK sashan@
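The lookup pattern described in this commit can be sketched in userland C. This is a minimal illustration, not the kernel code: `struct pcb`, `struct pcb_table`, `pcb_lookup()` and `pcb_unref()` are hypothetical names, and a pthread mutex plus a C11 atomic counter stand in for the kernel mutex and refcnt.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct inpcb; only what the sketch needs. */
struct pcb {
	atomic_uint	 refcnt;	/* one reference is owned by the table */
	int		 port;
	struct pcb	*next;
};

/* Hypothetical stand-in for the PCB table and its mutex. */
struct pcb_table {
	pthread_mutex_t	 mtx;
	struct pcb	*head;
};

/*
 * Find a PCB by port.  The reference is taken while the table mutex is
 * still held, so the entry cannot go away between lookup and use.
 */
static struct pcb *
pcb_lookup(struct pcb_table *t, int port)
{
	struct pcb *p;

	pthread_mutex_lock(&t->mtx);
	for (p = t->head; p != NULL; p = p->next) {
		if (p->port == port) {
			atomic_fetch_add(&p->refcnt, 1);
			break;
		}
	}
	pthread_mutex_unlock(&t->mtx);
	return p;
}

/*
 * Drop a reference obtained from pcb_lookup().  Returns 1 if this was
 * the last reference, in which case the caller would free the PCB.
 */
static int
pcb_unref(struct pcb *p)
{
	if (p == NULL)
		return 0;
	return atomic_fetch_sub(&p->refcnt, 1) == 1;
}
```

Every caller of `pcb_lookup()` pairs it with one `pcb_unref()` at the end of the function, which mirrors the "unref it at the end of all functions that call in_pcb lookup" rule above.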
2022-08-06	Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and	Alexander Bluhm
	NET_RLOCK_IN_IOCTL, which have the same implementation. The R and W are hard to see, call the new macro NET_LOCK_SHARED. Rename the opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE. Update some outdated comments about net locking. OK mpi@ mvs@
2022-06-28	Use btrace(8) to debug reference counting. dt(4) provides a static	Alexander Bluhm
	tracepoint for each type of refcnt we have. As a start, add inpcb and tdb refcnt. When the counter changes, btrace may print the actual object, the current counter, the change value and optionally the stack trace. discussed with visa@; OK mpi@
2022-06-06	Simplify solock() and sounlock(). There is no reason to return a value	Claudio Jeker
	for the lock operation and to pass a value to the unlock operation. sofree() still needs an extra flag to know if sounlock() should be called or not. But sofree() is called less often and mostly without keeping the lock. OK mpi@ mvs@
2022-05-15	have in_pcbselsrc copy the selected address to memory provided by the caller.	David Gwynne
	having it return a pointer to something that has a lifetime managed by a lock without accounting for it or taking a reference count or anything like that is asking for trouble. copying the address to caller provided memory while still inside the lock is a lot safer. discussed with visa@ ok bluhm@ claudio@
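A minimal userland sketch of the "copy out while still inside the lock" idea follows. The names `struct src_addr`, `src_lock`, `src_current` and `select_src()` are made up for illustration and a pthread mutex stands in for whatever lock protects the address in the kernel; the real in_pcbselsrc() signature is not shown here.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address type for the sketch. */
struct src_addr {
	uint32_t addr;
};

static pthread_mutex_t src_lock = PTHREAD_MUTEX_INITIALIZER;
/* May be replaced or freed by other threads, but only under src_lock. */
static struct src_addr *src_current;

/*
 * Copy the selected address into caller-provided memory while the lock
 * is held.  The caller never sees a pointer whose lifetime depends on
 * the lock.  Returns 0 on success, -1 if no address is available.
 */
static int
select_src(struct src_addr *out)
{
	int error = -1;

	pthread_mutex_lock(&src_lock);
	if (src_current != NULL) {
		*out = *src_current;
		error = 0;
	}
	pthread_mutex_unlock(&src_lock);
	return error;
}
```

After `select_src()` returns, the copy in `out` stays valid even if `src_current` is changed or cleared by another thread.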
2022-04-14	Relax address availability check for multicast binds.	Claudio Jeker
	While it makes sense to limit bind(2) of unicast addresses that overlap each other to be all from the same UID (like 0.0.0.0:53 and 127.0.0.1:53) it makes little sense for multicast. Multicast is delivered to all sockets that match so there is no risk of someone stealing traffic from someone else. This should hopefully help with mDNS as reported by robert@ OK deraadt@ bluhm@
2022-03-22	Fix whitespace.	Alexander Bluhm

2022-03-21	Header netinet/in_pcb.h includes sys/mutex.h now. Recommit mutex	Alexander Bluhm
	for PCB tables. It does not break userland build anymore. pf_socket_lookup() calls in_pcbhashlookup() in the PCB layer. To run pf in parallel, make parts of the stack MP safe. Protect the list and hashes in the PCB tables with a mutex. Note that the protocol notify functions may call pf via tcp_output(). As the pf lock is a sleeping rw_lock, we must not hold a mutex. To solve this for now, collect these PCBs in inp_notify list and protect it with exclusive netlock. OK sashan@
2022-03-21	treat 255.255.255.255 like an mcast address in in_pcbselsrc.	David Gwynne
	this allows the IP_MULTICAST_IF sockopt to specify which address you want to send a limited broadcast (255.255.255.255) packet out of. requested by and ok claudio@
2022-03-14	Unbreak the tree, revert commitid aZ8fm4iaUnTCc0ul	Theo Buehler
	This reverts the commit protecting the list and hashes in the PCB tables with a mutex since the build of sysctl(8) breaks, as found by kettenis. ok sthen
2022-03-14	pf_socket_lookup() calls in_pcbhashlookup() in the PCB layer. To	Alexander Bluhm
	run pf in parallel, make parts of the stack MP safe. Protect the list and hashes in the PCB tables with a mutex. Note that the protocol notify functions may call pf via tcp_output(). As the pf lock is a sleeping rw_lock, we must not hold a mutex. To solve this for now, collect these PCBs in inp_notify list and protect it with exclusive netlock. OK sashan@
2022-03-04	in_pcbinit() is called during boot. There malloc(9) cannot fail,	Alexander Bluhm
	but would panic instead of waiting. Remove needless error handling. OK mvs@
2022-03-02	Use NULL instead of 0 for pointer.	Alexander Bluhm

2022-03-01	Remove outdated comment about v4-mapped v6 addresses. They are not	Alexander Bluhm
	supported anymore.
2021-10-25	The implementation of ipsp_spd_inp() is side effect free. It may	Alexander Bluhm
	set the error output parameter or return a tdb. Both are ignored in in_pcbconnect(). Remove the code that does nothing. OK tobhe@ jca@ mvs@
2021-03-10	spelling	Jonathan Gray
	ok gnezdo@ semarie@ mpi@
2021-02-11	Swap faddr/laddr and fport/lport arguments in call to stoeplitz_ipXport().	Patrick Wildt
	Technically the whole point of the stoeplitz API is that it's symmetric, meaning that the order of addresses and ports doesn't matter and will produce the same hash value. Coverity CID 1501717 ok dlg@
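The symmetry property mentioned here can be illustrated with a generic Toeplitz hash. This is a sketch under stated assumptions, not the stoeplitz implementation: `sym_key_init()`, `toeplitz_hash()` and `flow_hash()` are hypothetical helpers, and the symmetry comes from filling the key with one repeating 16-bit pattern, so that the key window at any input bit depends only on the bit position modulo 16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KEYLEN 40	/* key length in bytes, chosen for the sketch */

/* Fill the key with a repeating 16-bit pattern. */
static void
sym_key_init(uint8_t key[KEYLEN], uint16_t pattern)
{
	size_t i;

	for (i = 0; i < KEYLEN; i += 2) {
		key[i] = (uint8_t)(pattern >> 8);
		key[i + 1] = (uint8_t)(pattern & 0xff);
	}
}

/* Plain Toeplitz hash: XOR the 32-bit key window for every set input bit. */
static uint32_t
toeplitz_hash(const uint8_t key[KEYLEN], const uint8_t *data, size_t len)
{
	uint32_t hash = 0, window;
	size_t bit, next;

	assert(len * 8 + 32 <= KEYLEN * 8);
	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
	    ((uint32_t)key[2] << 8) | (uint32_t)key[3];
	for (bit = 0; bit < len * 8; bit++) {
		if (data[bit / 8] & (0x80 >> (bit % 8)))
			hash ^= window;
		/* Slide the window by one bit, shifting in the next key bit. */
		next = bit + 32;
		window <<= 1;
		if (key[next / 8] & (0x80 >> (next % 8)))
			window |= 1;
	}
	return hash;
}

/* Hash an IPv4 address/port 4-tuple laid out as src, dst, sport, dport. */
static uint32_t
flow_hash(const uint8_t key[KEYLEN], uint32_t src, uint32_t dst,
    uint16_t sport, uint16_t dport)
{
	uint8_t buf[12] = {
		(uint8_t)(src >> 24), (uint8_t)(src >> 16),
		(uint8_t)(src >> 8), (uint8_t)src,
		(uint8_t)(dst >> 24), (uint8_t)(dst >> 16),
		(uint8_t)(dst >> 8), (uint8_t)dst,
		(uint8_t)(sport >> 8), (uint8_t)sport,
		(uint8_t)(dport >> 8), (uint8_t)dport,
	};

	return toeplitz_hash(key, buf, sizeof(buf));
}
```

Swapping the addresses moves each bit by 32 positions and swapping the ports moves each bit by 16 positions. Both are multiples of the key period, so `flow_hash(key, a, b, pa, pb)` and `flow_hash(key, b, a, pb, pa)` XOR the same set of windows and return the same value.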
2021-01-25	if stoeplitz is enabled, use it to provide a flowid for tcp packets.	David Gwynne
	drivers that implement rss and multiple rings depend on the symmetric toeplitz code, and use it to generate a key that decides which rx ring a packet lands on. if the toeplitz code is enabled, this diff has the pcb and tcp layer use the toeplitz code to generate a flowid for packets they send, which in turn is used to pick a tx ring. because the nic and the stack use the same key, the tx and rx sides end up with the same hash/flowid. at the very least this means that the same rx and tx queue pair on a particular nic are used for both sides of the connection. as the stack becomes more parallel, it will also help keep both sides of the tcp connection processing in the one place.
2020-11-07	Rework source IP address setting.	denis
	- Move most of the processing out of rtable.c (reasonable tb@, ok bluhm@)
	- Remove memory allocation, store pointer to existing ifaddr
	- Fix tunnel interface handling
	looks fine mpi@
2020-11-05	Replace wrong cast with satosin.	denis
	Advised by bluhm@
2020-10-29	Add feature to force the selection of source IP address	denis
	Based/previous work on an idea from deraadt@ Input from claudio@, djm@, deraadt@, sthen@ OK deraadt@
2020-05-27	Connectionless sockets like UDP can be re-connected to a different	Alexander Bluhm
	address. In that case, the linking to the pf state must be dissolved as the latter still contains the old address. If it is a divert state, also remove the state as any divert state must be associated with a matching socket. Call pf_remove_divert_state() and pf_inp_unlink() from in_pcbconnect(). reported by Tim Kuijsten; OK sashan@ claudio@
2019-07-15	Initialize struct inpcb pool not on demand, but during initialization.	Alexander Bluhm
	Removes a global variable and avoids MP problems. OK mpi@ visa@
2018-10-04	Revert the inpcb table mutex commit. It triggers a witness panic	Alexander Bluhm
	in raw IP delivery and UDP broadcast loops. There inpcbtable_mtx is held and sorwakeup() is called within the loop. As sowakeup() grabs the kernel lock, we have a lock ordering problem. found by Hrvoje Popovski; OK deraadt@ mpi@
2018-09-20	As a step towards per inpcb or socket locks, remove the net lock	Alexander Bluhm
	for netstat -a. Introduce a global mutex that protects the tables and hashes for the internet PCBs. To detect detached PCB, set its inp_socket field to NULL. This has to be protected by a per PCB mutex. The protocol pointer has to be protected by the mutex as netstat uses it. Always take the kernel lock in in_pcbnotifyall() and in6_pcbnotify() before the table mutex to avoid lock ordering problems in the notify functions. OK visa@
2018-09-14	In general it is a bad idea to use one random secret for two things.	Alexander Bluhm
	The inet PCB uses one hash with local and foreign addresses, and one with local port numbers. Give both hashes separate keys. Also document the struct fields. OK visa@
2018-09-13	Add reference counting for inet pcb, this will be needed when we	Alexander Bluhm
	start locking the socket. An inp can be referenced by the PCB queue and hashes, by a pf mbuf header, or by a pf state key. OK visa@
2018-09-11	Make the distribution of in_ and in6_ functions in in_pcb.c and	Alexander Bluhm
	in6_pcb.c consistent, to ease comparing the code. Move all inet6 functions to in6_. Bring functions in both source files in same order. Cleanup the include section. Now in_pcb.c is a superset of in6_pcb.c. The latter contains all the special implementations. Just moving around, no code change intended. OK mpi@
2018-09-10	Remove useless INPCBHASH() macros. Just expand them.	Alexander Bluhm
	OK stsp@
2018-09-07	Explain the special case for redirect to localhost in a comment.	Alexander Bluhm
	input and OK claudio@
2018-07-11	Retire RTM_LOSING, it no longer makes sense and on busy servers the	Claudio Jeker
	route socket is flooded with those messages. Instead make sure that the removal of the dynamic route that can happen is actually also sent to the routing socket. OK mpi@ henning@
2018-06-14	In in_pcballoc() finish the inp initialization before adding it to	Alexander Bluhm
	the global inpcb queue and hashes. OK visa@ mpi@ as part of a larger diff
2018-06-14	Assert that the INP_IPV6 in in6_pcbconnect() is correct. Just call	Alexander Bluhm
	in_pcbconnect() to avoid the address family maze in syn_cache_get(). input claudio@; OK mpi@
2018-06-11	Do not unlock the KERNEL_LOCK() unconditionally in sounlock().	Martin Pieuchot
	Instead introduce two flags to deal with global lock recursion. This is necessary until we get per-socket lock. Req. by and ok visa@
2018-06-11	Push the KERNEL_LOCK() inside route_input().	Martin Pieuchot
	ok visa@, tb@
2018-06-07	The global zero addresses must not change, mark them constant.	Alexander Bluhm
	OK tb@ visa@
2018-06-06	Pass the socket to sounlock(), this prepares the terrain for per-socket	Martin Pieuchot
	locking. ok visa@, bluhm@
2018-06-03	Use variable names for rtable and rdomain consistently in the in_pcb	Alexander Bluhm
	functions. discussed with and OK mpi@ visa@
2018-06-03	Rename the inpcb table field inpt_hash to inpt_mask as it contains	Alexander Bluhm
	the hashmask. For the resize calculations it is clearer to use the field inpt_size. OK visa@ mpi@
2018-06-02	Cleanup the in_pcbnotifymiss diagnostic printfs. Always print the	Alexander Bluhm
	rdomain. Move the printf to the end of the pcb lookup functions. OK tb@ mpi@ visa@
2018-06-02	The function in_pcbrehash() does not modify the pcb table queue.	Alexander Bluhm
	So in in_pcbresize() the variant without _SAFE of the TAILQ_FOREACH macro is sufficient. OK tb@ mpi@ visa@
2018-03-30	Store the allocation size in inpcbhead for free().	David Hill
	OK visa@
2018-02-19	Remove almost unused `flags' argument of suser().	Martin Pieuchot
	The account flag `ASU' will no longer be set but that makes suser() mpsafe since it no longer messes with a per-process field. No objection from millert@, ok tedu@, bluhm@
2017-12-04	Make divert lookup similar for all socket types. If PF_TAG_DIVERTED	Alexander Bluhm
	is set, pf_find_divert() cannot fail so put an assert there. Explicitly check all possible divert types, panic in the default case. For raw sockets call pf_find_divert() before the socket loop. Divert reply should not match on TCP or UDP listen sockets. OK sashan@ visa@
2017-12-01	Fix white spaces and shorten long line.	Alexander Bluhm

2017-12-01	Simplify the reverse PCB lookup logic. The PF_TAG_TRANSLATE_LOCALHOST	Alexander Bluhm
	security check prevents that the user accidentally configures redirect where a divert-to would be appropriate. Instead of spreading the logic into tcp and udp input, check the flag during PCB listen lookup. This also reduces parameters of in_pcblookup_listen(). OK visa@
2017-08-11	Validate sockaddr from userland in central functions. This results	Alexander Bluhm
	in common checks for unix, inet, inet6 instead of partial checks here and there. Some checks are already done at a higher layer, but better be paranoid with user input. OK claudio@ millert@
2017-08-04	The in_pcbhashlookup() in in_pcbconnect() enforces that the 4-tuple	Alexander Bluhm
	of src/dst ip/port is unique for TCP. But if the socket is not bound, the automatic bind by connect happens after the check. If the socket has the SO_REUSEADDR flag, in_pcbbind() may select an existing local port. Then we had two colliding TCP PCBs. This resulted in a packet storm of ACK packets on loopback. The softnet task was constantly holding the netlock and has a high priority, so the system hung. Do the in_pcbhashlookup() again after in_pcbbind(). This creates sporadic "connect: Address already in use" errors instead of a hang. bug report and testing Olivier Antoine; OK mpi@
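The ordering problem and its fix can be sketched in a few lines of userland C. This is an illustration only: `struct tuple`, `tuple_in_use()` and `connect_sketch()` are hypothetical, the existing connections are a plain array instead of the PCB hash, and the `autoport` parameter stands in for whatever local port the automatic bind would select.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4-tuple; lport == 0 means the socket is not bound yet. */
struct tuple {
	uint32_t laddr, faddr;
	uint16_t lport, fport;
};

/* Return 1 if an identical 4-tuple already exists in the table. */
static int
tuple_in_use(const struct tuple *tab, size_t n, const struct tuple *t)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tab[i].laddr == t->laddr && tab[i].faddr == t->faddr &&
		    tab[i].lport == t->lport && tab[i].fport == t->fport)
			return 1;
	}
	return 0;
}

/*
 * Connect `sock` to faddr:fport.  The uniqueness check runs once before
 * the automatic bind and once more after it, because the bind may have
 * picked a local port that collides with an existing connection.
 */
static int
connect_sketch(const struct tuple *tab, size_t n, struct tuple *sock,
    uint32_t faddr, uint16_t fport, uint16_t autoport)
{
	struct tuple want = *sock;

	want.faddr = faddr;
	want.fport = fport;

	/* First check: catches collisions for sockets that are already bound. */
	if (tuple_in_use(tab, n, &want))
		return EADDRINUSE;

	if (want.lport == 0) {
		want.lport = autoport;	/* automatic bind by connect */
		/* Second check: the step this commit adds. */
		if (tuple_in_use(tab, n, &want))
			return EADDRINUSE;
	}

	*sock = want;
	return 0;
}
```

Without the second check, an unbound socket whose automatic bind reuses port 5000 would end up with the same 4-tuple as an existing connection; with it, the caller gets `EADDRINUSE` and the socket is left unchanged.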