path: root/sys/kern/uipc_socket.c
Age    Commit message    Author
2013-04-05remove some obsolete castsTed Unangst
2013-04-04Do not allow the listen(2) syscall for an already connected socket.Alexander Bluhm
This would create a weird set of states in TCP. FreeBSD has the same check. Issue found by and OK guenther@
2013-03-27Move soidle() into the big #ifdef SOCKET_SPLICE block to have itAlexander Bluhm
all in one place. Saves one additional #ifdef, no functional change. OK mikeb@
2013-03-19After a socket splicing timeout is fired, a network interrupt canAlexander Bluhm
unsplice() the sockets before soidle() goes to splsoftnet. In this case, unsplice() was called twice. So check whether splicing still exists within the splsoftnet protection. Uvm fault in sounsplice() reported by keith at scott-land dot net. OK claudio@
2013-02-16Fix a bug in udp socket splicing in case a packet gets diverted andAlexander Bluhm
spliced and routed to loopback. The content of the pf header in the mbuf was keeping the divert information on its way. Reinitialize the whole packet header of the mbuf and remove the mbuf tags when the packet gets spliced. OK claudio@ markus@
2013-01-17Expand the socket splicing functionality from TCP to UDP. MergeAlexander Bluhm
the code relevant for UDP from sosend() and soreceive() into somove(). That allows the kernel to directly transfer the UDP data from one socket to another. OK claudio@
2013-01-15Pass an EFBIG error to user land when the maximum splicing lengthAlexander Bluhm
has been reached. This creates a read event on the spliced source socket that can be noticed with select(2). So the kernel passes control to the relay process immediately. This could be used to log the end of an http request within a persistent connection. deraadt@ reyk@ mikeb@ like the idea
2013-01-15Changing the socket buffer flags sb_flags was not interrupt safeAlexander Bluhm
as |= and &= are non-atomic operations. To avoid additional locks, put the flags that have to be accessed from interrupt into a separate sb_flagsintr 32 bit integer field. sb_flagsintr is protected by splsoftnet. Input from miod@ deraadt@; OK deraadt@
2012-12-31Put the #ifdef SOCKBUF_DEBUG around sbcheck() into a SBCHECK macro.Alexander Bluhm
That is consistent to the SBLASTRECORDCHK and SBLASTMBUFCHK macros. OK markus@
2012-10-05add send(2) MSG_DONTWAIT support which enables us to choose nonblockingYASUOKA Masahiko
or blocking for each send(2) call. diff from UMEZAWA Takeshi ok bluhm
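A minimal userland sketch of the per-call behaviour described above, assuming a UNIX-domain socketpair as the test socket. The helper name fill_with_dontwait() is hypothetical and not part of the commit. The socket itself stays in blocking mode; only the individual send(2) calls are made non-blocking by the flag.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep sending on a blocking socket with
 * MSG_DONTWAIT until the send buffer is full. Returns the errno of
 * the first failed send(), 0 if no send failed within the iteration
 * bound, or -1 if the socketpair could not be created.
 */
int
fill_with_dontwait(void)
{
	int sv[2];
	char buf[4096];
	int err = 0;
	long i;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	memset(buf, 'x', sizeof(buf));

	/* Nobody reads from sv[1], so the buffer eventually fills up. */
	for (i = 0; i < (1L << 20); i++) {
		ssize_t n = send(sv[0], buf, sizeof(buf), MSG_DONTWAIT);

		if (n == -1) {
			/* A plain send() would have slept here instead. */
			err = errno;
			break;
		}
	}

	close(sv[0]);
	close(sv[1]);
	return err;
}
```

Calling fill_with_dontwait() is expected to report EAGAIN (or EWOULDBLOCK, which has the same value on many systems) once the buffer is full, without ever setting O_NONBLOCK on the descriptor.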
2012-09-20In somove() free the mbufs when necessary instead of freeing themAlexander Bluhm
in the release path. Especially accessing m in a KDASSERT() could go wrong. OK claudio@
2012-09-19When a socket is spliced, it may not wakeup the userland for reading.Alexander Bluhm
There was a small race in sorwakeup() where that could happen if we slept before the SB_SPLICE flag was set. ok claudio@
2012-09-19In somove() make the call to pr_usrreq(PRU_RCVD) under the sameAlexander Bluhm
conditions as in soreceive(). My goal is to make socket splicing less protocol dependent. ok claudio@
2012-09-17Fix indent white spaces.Alexander Bluhm
2012-07-22unp_dispose() walks not just the mbuf chain (m_next) but also the packetPhilip Guenthe
chain (m_nextpkt), so the mbuf passed to it must be disconnected completely from the socket buffer's chains. Problem noticed by yasuoka@; tweak from krw@, ok deraadt@
2012-07-10For setsockopt(SO_{SND,RCV}TIMEO), convert the timeval to ticks usingPhilip Guenthe
tvtohz() so that the rounding is correct and we don't time out a tick early ok claudio@
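The effect of the rounding direction can be illustrated in userland. This is only a sketch and not the kernel's tvtohz() implementation; the helper names and the tick rate of 100 Hz used in the usage note are assumptions.

```c
#include <assert.h>

/*
 * Hypothetical helper: convert a timeout to ticks by truncating.
 * A timeout that is not a whole number of ticks comes out short.
 */
long
ticks_trunc(long sec, long usec, long hz)
{
	long usec_per_tick;

	assert(hz > 0 && hz <= 1000000L);
	usec_per_tick = 1000000L / hz;
	return sec * hz + usec / usec_per_tick;
}

/*
 * Hypothetical helper: convert a timeout to ticks by rounding up,
 * so the sleep never ends before the requested time has passed.
 */
long
ticks_roundup(long sec, long usec, long hz)
{
	long usec_per_tick;

	assert(hz > 0 && hz <= 1000000L);
	usec_per_tick = 1000000L / hz;
	return sec * hz + (usec + usec_per_tick - 1) / usec_per_tick;
}
```

With hz = 100 (10 ms per tick), a 15 ms timeout truncates to 1 tick, which expires 5 ms early, while rounding up yields 2 ticks.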
2012-07-10Try to clean up the macro magic because of socket splicing. Since structClaudio Jeker
socket is no longer affected by option SOCKET_SPLICE we can simplify the code. OK bluhm@
2012-07-07Fix two races in socket splicing. When somove() gets called fromAlexander Bluhm
sosplice() to move the data already there, it might sleep in m_copym(). Another process must not unsplice during that sleep, so also lock the receive buffer when sosplice is called with fd -1. The same sleep can allow network interrupts to modify the socket buffer. So use sbsync() to write back modifications within the loop instead of fixing the socket buffer after the loop. OK claudio@
2012-04-24In sosend() for AF_UNIX control message sending, correctly calculateTheo de Raadt
the size (internalized ones can be larger on some architectures) for fitting into the socket. Avoid getting confused by sb_hiwat as well. This fixes a variety of issues where sendmsg() would fail to deliver a fd set or fail to wait; even leading to file leakage. Worked on this with claudio for about a week...
2012-04-22Add struct proc * argument to FRELE() and FILE_SET_MATURE() inPhilip Guenthe
anticipation of further changes to closef(). No binary change. ok krw@ miod@ deraadt@
2012-03-23Make rusage totals, itimers, and profile settings per-process insteadPhilip Guenthe
of per-rthread. Handling of per-thread tick and runtime counters inspired by how FreeBSD does it. ok kettenis@
2012-03-17remove IP_JUMBO, SO_JUMBO, and RTF_JUMBO.David Gwynne
no objection from mcbride@ krw@ markus@ deraadt@
2012-03-14Close a race that would corrupt a sockbuf because the code that externalizesMark Kettenis
an SCM_RIGHTS message may sleep. Bits and pieces from NetBSD with some simplifications by yours truly. Fixes the "receive 1" panic seen by many. ok guenther@, claudio@
2011-08-23Prevent that a socket splicing timeout error in one direction isAlexander Bluhm
also added to the other direction. ok mikeb@
2011-07-04Implement an idle timeout for the socket splicing. A new `sp_idle'Mike Belopuhov
field of the `splice' structure can be used to specify a period of inactivity after which splicing will be dissolved. ETIMEDOUT error retrieved with a SO_ERROR indicates the idle timeout expiration. With comments from and OK bluhm.
2011-07-02kqueue attach functions should return an errno or 0, not a plain 1. FixNicholas Marriott
the obvious cases to return EINVAL and ENXIO. ok tedu deraadt
2011-05-02recognize SO_RTABLE socket option at the SOL_SOCKET level;Mike Belopuhov
discussed with and ok claudio
2011-04-19Put splice cleanup code into a common function sounsplice().Alexander Bluhm
ok claudio@
2011-04-04Plug mbuf leaks in SO_PEERCRED by not double allocating mbufs intoClaudio Jeker
the same variable. Leak found with dlg's magic mbuf leakage finder. OK henning@, deraadt@
2011-04-04If the socket was half closed then don't let userland change theClaudio Jeker
socketbuffer size of the closed side since on half close the high watermark was set to 0. OK blambert@
2011-03-14When a process reads from a spliced socket that already got anAlexander Bluhm
end-of-file but still has data in the receive buffer, soreceive() should block until all data has been moved. To make kqueue work with socket splicing, it has to report spliced sockets as non-readable. ok deraadt@
2011-03-12There existed a race when a process was trying to read from a splicedAlexander Bluhm
socket. soreceive() releases splsoftnet for uiomove(). In that moment, somove() could pull the mbuf from the receive buffer. After that, soreceive removed the mbuf again. The corrupted length accounting resulted in a panic. The fix is to block read calls in soreceive() until splicing has been finished. just commit deraadt@
2011-02-28When the maximum splice length has been reached, send out the dataAlexander Bluhm
immediately by unsetting the SS_ISSENDING flag. This prevents a possible 5 seconds delay in socket splicing. ok markus@; commit it deraadt@
2011-01-07Add socket option SO_SPLICE to splice together two TCP sockets.Alexander Bluhm
The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
2010-09-24TCP send and recv buffer scaling.Claudio Jeker
Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org. Based on work by markus@ and djm@. OK dlg@, henning@, put it in deraadt@
2010-07-03Fix the naming of interfaces and variables for rdomains and rtablesPhilip Guenthe
and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0. Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped. Written by claudio@, criticized^Wcritiqued by me
2010-07-02remove support for compat_sunos (and m68k4k). ok deraadt guentherTed Unangst
2010-07-01SO_PEERCRED should return ENOTCONN when the sockets are not connectedTheo de Raadt
2010-06-30Add getsockopt SOL_SOCKET SO_PEERCRED support. This behaves similar toTheo de Raadt
getpeereid(2), but also supplies the remote pid. This is supplied in a 'struct sockpeercred' (unlike Linux -- they showed how little they know about real unix by calling theirs 'struct ucred'). ok guenther ajacoutot
2009-10-31Use suser when possible. Suggested by miod@.Federico G. Schwindt
miod@ deraadt@ ok.
2009-08-10Don't use char arrays for sleep wchans and reuse them.Thordur I. Bjornsson
just use strings and make things unique. ok claudio@
2009-06-05Initial support for routing domains. This allows to bind interfaces toClaudio Jeker
alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
2009-03-15Introduce splsoftassert(), similar to splassert() but for soft interruptMiod Vallat
levels. This will allow for platforms where soft interrupt levels do not map to real hardware interrupt levels to have soft ipl values overlapping hard ipl values without breaking spl asserts.
2009-02-22fix PR 6082: do not create more fd's than will fit in the message onOtto Moerbeek
the receiving side when passing fd's. ok deraadt@ kettenis@
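From userland, the receiving side decides how many descriptors fit by the size of the control buffer it hands to recvmsg(2). The following is a sketch of passing a single descriptor over a socketpair, sizing the buffer with CMSG_SPACE() and the header with CMSG_LEN(). The helper names send_fd(), recv_fd() and fd_pass_roundtrip() are hypothetical and the code does not reproduce the kernel fix itself.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer with room for exactly one descriptor, properly aligned. */
union fdbuf {
	struct cmsghdr hdr;
	char buf[CMSG_SPACE(sizeof(int))];
};

/* Point msg at a one-byte payload and at the control buffer. */
static void
setup_msg(struct msghdr *msg, struct iovec *iov, char *byte, union fdbuf *cb)
{
	memset(msg, 0, sizeof(*msg));
	memset(cb, 0, sizeof(*cb));
	iov->iov_base = byte;
	iov->iov_len = 1;
	msg->msg_iov = iov;
	msg->msg_iovlen = 1;
	msg->msg_control = cb->buf;
	msg->msg_controllen = sizeof(cb->buf);
}

/* Hypothetical helper: send fd over sock; returns 0 on success. */
int
send_fd(int sock, int fd)
{
	union fdbuf cb;
	struct msghdr msg;
	struct iovec iov;
	struct cmsghdr *cmsg;
	char byte = 'F';

	setup_msg(&msg, &iov, &byte, &cb);
	cmsg = CMSG_FIRSTHDR(&msg);
	assert(cmsg != NULL);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one fd from sock; returns it or -1. */
int
recv_fd(int sock)
{
	union fdbuf cb;
	struct msghdr msg;
	struct iovec iov;
	struct cmsghdr *cmsg;
	char byte;
	int fd;

	setup_msg(&msg, &iov, &byte, &cb);
	if (recvmsg(sock, &msg, 0) != 1 || (msg.msg_flags & MSG_CTRUNC))
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}

/* Pass the write end of a pipe across a socketpair and write through it. */
int
fd_pass_roundtrip(void)
{
	int sv[2], p[2], got = -1, rc = -1;
	char c = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	if (pipe(p) == -1) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (send_fd(sv[0], p[1]) == 0 && (got = recv_fd(sv[1])) != -1 &&
	    write(got, "k", 1) == 1 && read(p[0], &c, 1) == 1 && c == 'k')
		rc = 0;
	if (got != -1)
		close(got);
	close(p[0]);
	close(p[1]);
	close(sv[0]);
	close(sv[1]);
	return rc;
}
```

fd_pass_roundtrip() returns 0 when the byte written through the received descriptor shows up on the read end of the pipe. A receiver that wants to detect descriptors that did not fit should check MSG_CTRUNC, as recv_fd() does.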
2009-01-13Change sbreserve() to return 0 on success, 1 on failure, as god intended.Bret Lambert
This sort of breaking with traditional and expected behavior annoys me. "yes!" henning@
2008-10-09Change sb_timeo to unsigned, so that even if some calculation (ie. n * HZ)Theo de Raadt
becomes a very large number it will not wrap the short into a negative number and screw up timeouts. It will simply become a max of 65535. Since this happens when HZ is cranked to a high number, this will still only take n seconds, or less. Safer than crashing. Prompted by PR 5511 ok guenther
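The wrap described above can be reproduced with ordinary integer conversions. This is a sketch only: the helper names are hypothetical, the negative result of the signed conversion assumes a typical two's complement compiler such as gcc or clang, and the clamp helper is one possible reading of "a max of 65535" rather than the code the commit added.

```c
#include <assert.h>
#include <limits.h>

/* Store a tick count in a signed 16-bit field, as before the change. */
short
as_signed_short(long ticks)
{
	return (short)ticks;	/* values above 32767 typically go negative */
}

/* Store a tick count in an unsigned 16-bit field, as after the change. */
unsigned short
as_unsigned_short(long ticks)
{
	return (unsigned short)ticks;
}

/* Hypothetical clamp: never store more than the field can hold. */
unsigned short
clamp_to_ushort(long ticks)
{
	/* The 65535 figure in the commit message assumes a 16-bit short. */
	assert(USHRT_MAX == 65535);
	if (ticks < 0)
		return 0;
	if (ticks > USHRT_MAX)
		return USHRT_MAX;
	return (unsigned short)ticks;
}
```

For example, a 40 second timeout with HZ set to 1000 is 40000 ticks: as_signed_short(40000) comes out negative on common compilers, while as_unsigned_short(40000) keeps the value intact.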
2008-08-07don't wait for a free mbuf cluster in sosend() and enter the existingReyk Floeter
error handler that was never used before. this fixes a bug that a userland process might hang if the system ran out of mbuf clusters or even other unexpected behaviour in the network drivers. this bug is very old - it is also found in rev 1.1/stevens v2/44lite2/... discussed with many ok markus@ thib@ dlg@
2008-06-14A bunch of pool_get() + bzero() -> pool_get(..., .. | PR_ZERO)Michael Knudsen
conversions that should shave a few bytes off the kernel. ok henning, krw, jsing, oga, miod, and thib (``even though i usually prefer FOO|BAR''; thanks for looking.
2008-05-23Deal with the situation when TCP nfs mounts timeout and processesThordur I. Bjornsson
get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect. OK markus@, blambert@. "go ahead" deraadt@. Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
2008-05-09Add SO_BINDANY socket option from BSD/OS.Markus Friedl
The option allows a socket to be bound to addresses which are not local to the machine. In order to receive packets for these addresses SO_BINDANY needs to be combined with matching outgoing pf(4) divert rules, see pf.conf(5). ok beck@