path: root/sys/kern
Age  Commit message  Author
2013-01-02Fix a bug in ptcwrite() that could result in up to 100 lost bytesTodd C. Miller
when we block due to hitting the TTYHOG limit. OK miod@
2013-01-01copyright++;Jasper Lievisse Adriaanse
2012-12-31Put the #ifdef SOCKBUF_DEBUG around sbcheck() into a SBCHECK macro.Alexander Bluhm
That is consistent with the SBLASTRECORDCHK and SBLASTMBUFCHK macros. OK markus@
2012-12-31Extend the sbcheck() function to make it work with socket buffersAlexander Bluhm
containing m_nextpkt chains. OK markus@
2012-12-30In sysctl_proc_cwd(), vref() the target proc's fd_cdir before callingPhilip Guenthe
malloc(), so that it can't exit and be freed if we sleep. (another sparc.p nightmare test case) ok beck@, phessler@
2012-12-28Avoid spinning in the cleaner when there are insufficient clean pages, butJoel Sing
there are no buffers on the dirty queue to clean. ok beck@
2012-12-24Fix compilation with POOL_DEBUG but !DDBPhilip Guenthe
ok jsing@ krw@ mikeb@
2012-12-02Fix kva reserve - ensure that kva reserve is checked for, as wellBob Beck
as fix the case where buffers can be returned on the vinvalbuf path and we do not get woken up when waiting for kva. An earlier version looked at and ok'd by guenther@ in coimbra. - helpful comments from kettenis@
2012-12-02Don't wake the cleaner and potentially throw away pages we shouldn'tBob Beck
be throwing away when growing the buffer cache - ok mlarkin@
2012-12-02Determine whether we're currently on the alternative signal stackPhilip Guenthe
dynamically, by comparing the stack pointer against the altstack base and size, so that you get the correct answer if you longjmp out of the signal handler, as tested by regress/sys/kern/stackjmp/. Also, fix alt stack handling on vax, where it was completely broken. Testing and corrections by miod@, krw@, tobiasu@, pirofti@
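A minimal sketch of the range check this entry describes, in userland C. The helper name `on_altstack` and the half-open interval `[base, base + size)` are assumptions for illustration; the kernel's actual code and its handling of the boundary address may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: report whether the stack pointer lies inside
 * the alternate signal stack, treated here as [base, base + size).
 */
static bool
on_altstack(uintptr_t sp, uintptr_t base, size_t size)
{
	/*
	 * Unsigned subtraction wraps to a huge value when sp < base,
	 * so a single comparison covers both the lower and upper bound.
	 */
	return size != 0 && (sp - base) < size;
}
```

Because the answer is computed from the current stack pointer rather than from a saved flag, it stays correct after a longjmp out of the handler.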
2012-11-19If uvm_km_kmemalloc_pla() fails when just creating a thread (and not aPhilip Guenthe
process), then don't decrement the total and per-user counts of processes. ok deraadt@ miod@
2012-11-18These functions all should be called with splbio, so splassert(IPL_BIO)Bob Beck
everywhere instead of setting splbio. ok krw@ pirofti@
2012-11-17Don't map a buffer (and potentially sleep) when invalidating it in vinvalbuf.Bob Beck
This fixes a problem where we could sleep for kva and then our pointers would not be valid on the next pass through the loop. We do this by adding buf_acquire_nomap() - which can be used to busy up the buffer without changing its mapped or unmapped state. We do not need to have the buffer mapped to invalidate it, so it is sufficient to acquire it for that. In the case where we write the buffer, we do map the buffer, and potentially sleep.
2012-11-07Fix the buffer cache.Bob Beck
A long time ago (in Vienna) the reserves for the cleaner and syncer were removed. softdep and many things have not performed the same ever since. Follow-on generations of buffer cache hackers assumed the existing code was the reference and have been in a frustrating state of coprophagia ever since. This commit: 0) Brings back a (small) reserve allotment of buffer pages, and the kva to map them, to allow the cleaner and syncer to run even when under intense memory or kva pressure. 1) Fixes a lot of comments and variables to represent reality. 2) Simplifies and corrects how the buffer cache backs off down to the lowest level. 3) Corrects how the page daemon asks the buffer cache to back off, ensuring that uvmpd_scan is done to recover inactive pages in low memory situations. 4) Adds a high water mark to the pool used to allocate struct buf's. 5) Corrects the cleaner and the sleep/wakeup cases in both low memory and low kva situations (including accounting for the cleaner/syncer reserve). Tested by many, with very much helpful input from deraadt, miod, tobiasu, kettenis and others. ok kettenis@ deraadt@ jj@
2012-11-05unifdef -D __HAVE_TIMECOUNTERMiod Vallat
2012-10-21Fix problem reported by Nathan Weeks <weeks@iastate.edu> where a userlandBob Beck
program could induce the kernel to panic by attempting to do a semop with nsops > kern.seminfo.semume and the SEM_UNDO flag set. This fixes it so we return ENOSPC, like the man page says, rather than panicking. ok miod@, millert@
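The bounds check described in this entry can be sketched as follows. `SEMUME_LIMIT` and `check_undo_request` are hypothetical names standing in for the kern.seminfo.semume limit and the kernel's own validation; this is an illustration of the idea, not the committed code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kern.seminfo.semume limit. */
#define SEMUME_LIMIT 10

/*
 * Return 0 if the request is acceptable, or ENOSPC if it asks for
 * more undo entries than the per-process limit allows.
 */
static int
check_undo_request(size_t nsops, int wants_undo)
{
	if (wants_undo && nsops > SEMUME_LIMIT)
		return ENOSPC;
	return 0;
}
```

Rejecting the request up front with an error code keeps an oversized user-supplied count from reaching code that assumes the limit already holds.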
2012-10-17use wakeup here, not wakeup_one - avoids problem of not waking up writersBob Beck
when there are more of them than size of queue waiting, and nothing else going on. ok miod@ kettenis@
2012-10-17Swap arguments to wdog_register() since it is nicer, and prepareTheo de Raadt
wdog_shutdown() for external usage.
2012-10-17In sys_accept(), don't sleep between pulling the new socket from thePhilip Guenthe
queue and calling soaccept(), so that the socket can't get torn down by a TCP RST in the middle and trigger "panic: soaccept: !NOFDREF", as seen by halex@ Analysis, original diff, and ok bluhm@
2012-10-17If a thread calls __threxit() or _exit() immediately after anotherPhilip Guenthe
thread coredumps, the former thread needs to be released by the later single_thread_set(SINGLE_EXIT) call, even though its P_WEXIT flag is set. ok kettenis@
2012-10-16Cleanup.Bob Beck
- Whitespace KNF - Removal/fixing of old useless comments - Removal of unused counter - Removal of pointless test that had no effect ok krw@
2012-10-12For consistency with other OSes and ease of porting, makePhilip Guenthe
get{sock,peer}name() behave like accept() when the involved UNIX-domain socket isn't bound to an address, returning an AF_UNIX sockaddr with zero-length sun_path. Based on diff from robert@ and mikeb@ ok robert@ deraadt@
2012-10-09Capitalization in comment, and document leftoverroom, + knf nit, spotted by theoBob Beck
2012-10-09Add nscan as a disk queueing algorithm, and make it the default withBob Beck
n = 128. Nscan is essentially the disksort()-style elevator algorithm for ordering disk io operations. The difference is that we will re-order in chunks of 128 operations before continuing with the rest of the work. This avoids the problem that the basic SCAN (aka elevator algorithm) has where continued inserts can cause starvation, where requests can sit for a long time. This solves problems where usb sticks could be unusable while long sequential writes happened, and systems would become unresponsive while dumping core. hacked upon (and this version largely rewritten by) tedu and myself. Note, can be "backed out" by changing BUFQ_DEFAULT back to disksort in buf.h ok kettenis@, tedu@, krw@
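The batching idea can be illustrated with a small userland sketch: requests are ordered by block number only within consecutive batches, so a request is never pushed back behind later arrivals from another batch. The function `nscan_order` and the use of a plain array with qsort() are assumptions for illustration; the kernel works on its own queue structures.

```c
#include <stddef.h>
#include <stdlib.h>

/* Compare two block numbers for qsort(). */
static int
cmp_blkno(const void *a, const void *b)
{
	long x = *(const long *)a;
	long y = *(const long *)b;

	return (x > y) - (x < y);
}

/*
 * Hypothetical sketch: sort requests by block number inside each
 * consecutive batch of `batch` entries, leaving batch order intact.
 */
static void
nscan_order(long *blkno, size_t n, size_t batch)
{
	size_t i, len;

	if (batch == 0)
		return;
	for (i = 0; i < n; i += batch) {
		len = (n - i < batch) ? n - i : batch;
		qsort(blkno + i, len, sizeof(blkno[0]), cmp_blkno);
	}
}
```

With a batch size of 3, the queue 9, 1, 5, 2, 8, 3 becomes 1, 5, 9, 2, 3, 8: each batch is swept in order, and the second batch waits for at most one batch ahead of it.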
2012-10-09bufq write limitingBob Beck
This change ensures that writes in flight from the buffer cache via bufq are limited to a high water mark - when the limit is reached the writes sleep until the amount of IO in flight reaches a low water mark. This avoids the problem where userland can queue an unlimited amount of asynchronous writes resulting in the consumption of all/most of our available buffer mapping kva, and a long queue of writes to the disk. ok kettenis@, krw@
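The high/low water mark behaviour can be sketched as a small state machine. The struct and function names are hypothetical, and where the kernel would put the writer to sleep this sketch simply returns false.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for writes in flight. */
struct wlimit {
	unsigned inflight;	/* writes currently outstanding */
	unsigned hiwat;		/* stop admitting writes at this count */
	unsigned lowat;		/* resume once we drain to this count */
	bool throttled;
};

/* Try to start a write; false means the caller would have to wait. */
static bool
wl_start(struct wlimit *w)
{
	if (w->throttled)
		return false;
	w->inflight++;
	if (w->inflight >= w->hiwat)
		w->throttled = true;
	return true;
}

/* Account for a completed write and lift the throttle at the low mark. */
static void
wl_done(struct wlimit *w)
{
	if (w->inflight > 0)
		w->inflight--;
	if (w->throttled && w->inflight <= w->lowat)
		w->throttled = false;
}
```

Using two marks instead of one avoids waking writers after every single completion: once throttled, writers stay blocked until the queue has drained to the low mark.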
2012-10-08Revamp the sequences for suspend/hibernate -> resume so that the codeTheo de Raadt
paths are reflexive. It is now possible to fail part-way through a suspend sequence, and recover along the resume code path. Split DVACT_SUSPEND by adding a new DVACT_POWERDOWN method, which is used after hibernate (and suspend too) to finish the job. Some drivers must be converted at the same time to use this instead of shutdown hooks (the others will follow at a later time) ok kettenis mlarkin
2012-10-05add send(2) MSG_DONTWAIT support which enables us to choose nonblockingYASUOKA Masahiko
or blocking for each send(2) call. diff from UMEZAWA Takeshi ok bluhm
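A userland sketch of per-call non-blocking sends, assuming a system whose send(2) supports MSG_DONTWAIT. The helper `fill_until_wouldblock` is hypothetical: it writes into one end of a socketpair that nobody reads from, and returns the errno of the first send that fails.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical demo: fill a socket buffer using MSG_DONTWAIT sends.
 * Returns the errno of the first failed send, or 0 if none failed
 * within the iteration bound.
 */
static int
fill_until_wouldblock(void)
{
	int sv[2], i, err = 0;
	char buf[4096];

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return errno;
	memset(buf, 0, sizeof(buf));
	/* The socket itself stays blocking; only these calls do not block. */
	for (i = 0; i < 100000; i++) {
		if (send(sv[0], buf, sizeof(buf), MSG_DONTWAIT) == -1) {
			err = errno;
			break;
		}
	}
	close(sv[0]);
	close(sv[1]);
	return err;
}
```

Once the buffer is full, the send fails with EAGAIN (or EWOULDBLOCK) instead of sleeping, without the descriptor having been switched to O_NONBLOCK.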
2012-10-01Make groupmember() check the effective gid too, so that the checks arePhilip Guenthe
consistent when the effective gid isn't also a supplementary group. ok beck@
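The membership test described here can be sketched as below. `struct cred_sketch` and `groupmember_sketch` are hypothetical simplifications; the kernel's credential structure has a different layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical, simplified credential record. */
struct cred_sketch {
	gid_t egid;		/* effective group id */
	size_t ngroups;		/* number of supplementary groups */
	const gid_t *groups;	/* supplementary group list */
};

/* True if gid is the effective gid or one of the supplementary groups. */
static bool
groupmember_sketch(gid_t gid, const struct cred_sketch *cr)
{
	size_t i;

	if (cr->egid == gid)
		return true;
	for (i = 0; i < cr->ngroups; i++)
		if (cr->groups[i] == gid)
			return true;
	return false;
}
```

Checking the effective gid first gives the same answer whether or not that gid also appears in the supplementary list.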
2012-09-29When running a.out OMAGIC binaries, be sure to round ep_daddr to a pageMiod Vallat
boundary; uvm depends on this and will KASSERT this for its own safety. Found the hard way, rounding direction discussed with ariane@ (I initially wanted to round down, but it makes more sense to round up). Of course no one in his right mind ought to run OMAGIC binaries (-:
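For reference, rounding an address up versus down to a page boundary can be sketched as follows. The 4096-byte page size and the helper names are assumptions for illustration; the kernel uses its own per-architecture page size and macros.

```c
#include <stdint.h>

/* Assumed page size for this sketch; must be a power of two. */
#define PAGE_SIZE_SKETCH 4096u

/* Round addr up to the next page boundary (no change if aligned). */
static uintptr_t
round_page_up(uintptr_t addr)
{
	return (addr + (PAGE_SIZE_SKETCH - 1)) &
	    ~(uintptr_t)(PAGE_SIZE_SKETCH - 1);
}

/* Round addr down to the start of its page. */
static uintptr_t
trunc_page_down(uintptr_t addr)
{
	return addr & ~(uintptr_t)(PAGE_SIZE_SKETCH - 1);
}
```

Under the assumed page size, an address of 4097 rounds up to 8192 and down to 4096, while an already aligned address is left unchanged by both.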
2012-09-26add M_ZEROIZE as an mbuf flag, so copied PFKEY messages (with embedded keys)Markus Friedl
are cleared as well; from hshoexer@, feedback and ok bluhm@, ok claudio@
2012-09-20In somove() free the mbufs when necessary instead of freeing themAlexander Bluhm
in the release path. Especially accessing m in a KDASSERT() could go wrong. OK claudio@
2012-09-19When a socket is spliced, it may not wakeup the userland for reading.Alexander Bluhm
There was a small race in sorwakeup() where that could happen if we slept before the SB_SPLICE flag was set. ok claudio@
2012-09-19In somove() make the call to pr_usrreq(PRU_RCVD) under the sameAlexander Bluhm
conditions as in soreceive(). My goal is to make socket splicing less protocol dependent. ok claudio@
2012-09-19vhold() and vdrop() are prototyped in vnode.h, so don't repeat them herePhilip Guenthe
ok beck@
2012-09-17Recognize executables tagged with ELFOSABI_OPENBSD (such as generatedMatthew Dempsky
by the Go linker) as native executables even if they don't contain an OpenBSD PT_NOTE segment. Confirmed to fix Go by sthen ok kettenis, deraadt
2012-09-17Fix indent white spaces.Alexander Bluhm
2012-09-11Remove the 'OLF method' used for the transition from a.out to ELF andTheo de Raadt
for all the compat layers which are now gone. Linux compat still works because it always used another method in any case, and nothing looks at p_os anymore. ok jsing
2012-09-10Cleanup VFS mount string handling:Joel Sing
- Avoid using copyinstr() without checking the return value. - sys_mount() has already copied the path in, so pass this to the filesystem mount code so that it does not have to copy it in again. - Avoid copyinstr()/bzero() dance when we can simply bzero() and strlcpy(). ok krw@
2012-09-10syncTheo de Raadt
2012-09-10compat_o48_sys_getdirentries can die; ok guentherTheo de Raadt
2012-09-10delete compat_o48_sys_getdirentries; ok guentherTheo de Raadt
2012-09-08Plug a race where we're trying to kill a traced process while it is alreadyMark Kettenis
exiting. At that point ps_single may point to a proc that's already freed. Since there is no point in killing a process that's already exiting, just skip this step. ok guenther@
2012-09-02Do not need bcopy trickery to update the file descriptorTheo de Raadt
pointer array; we can access it directly. ok guenther
2012-08-30Remove a useless test for "elem_count < 0", which can never be trueMatthew Dempsky
because elem_count has an unsigned type (size_t). Noted by Brad/Clang; no binary change on amd64 using GCC either.
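A short sketch of why such a test can never fire: converting a negative value to size_t wraps it to a very large positive one, so the result is never below zero. The helper `as_count` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a signed value to size_t the way an
 * implicit conversion at a call site would.
 */
static size_t
as_count(long n)
{
	/* Negative inputs wrap modulo SIZE_MAX + 1; -1 becomes SIZE_MAX. */
	return (size_t)n;
}
```

A caller passing -1 therefore produces a huge count rather than a negative one, which is why compilers flag `elem_count < 0` on an unsigned type as always false.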
2012-08-28Add __guard_local as a hidden symbol to ld.so, kernel, and everyMatthew Dempsky
executable and DSO (via crtbegin.c/crtbeginS.c). Not used yet, but needed before GCC can start emitting -fstack-protector code that uses them instead of __guard.
2012-08-23syncTheo de Raadt
2012-08-23kill nnpfs deadTheo de Raadt
2012-08-23To protect assumptions inside systrace, don't let systrace fds bePhilip Guenthe
shared between processes. ok djm@
2012-08-21Stop "inlining" setrunnable(); we already had two bugs because of it.Christiano F. Haesbaert
This also makes sure we call cpu_unidle() on the correct cpu, since the inlining order was wrong and could call it on the old cpu. ok kettenis@
2012-08-20Add support for .openbsd.randomdata sections and PT_OPENBSD_RANDOMIZEMatthew Dempsky
segments to the kernel, ld (2.15), and ld.so. Tested on alpha, amd64, i386, macppc, and sparc64 (thanks naddy, mpi, and okan!). Idea discussed for some time; committing now for further testing. ok deraadt