path: root/sys/kern/vfs_bio.c
2012-12-28  Joel Sing
    Avoid spinning in the cleaner when there are insufficient clean pages,
    but there are no buffers on the dirty queue to clean. ok beck@
2012-12-02  Bob Beck
    Fix kva reserve - ensure that the kva reserve is checked for, and fix the
    case where buffers can be returned on the vinvalbuf path and we do not
    get woken up when waiting for kva. An earlier version was looked at and
    ok'd by guenther@ in Coimbra. - helpful comments from kettenis@
2012-12-02  Bob Beck
    Don't wake the cleaner and potentially throw away pages we shouldn't be
    throwing away when growing the buffer cache - ok mlarkin@
2012-11-07  Bob Beck
    Fix the buffer cache.
    A long time ago (in Vienna) the reserves for the cleaner and syncer were
    removed. softdep and many things have not performed the same ever since.
    Follow-on generations of buffer cache hackers assumed the existing code
    was the reference and have been in a frustrating state of coprophagia
    ever since.
    This commit
    0) Brings back a (small) reserve allotment of buffer pages, and the kva
       to map them, to allow the cleaner and syncer to run even when under
       intense memory or kva pressure.
    1) Fixes a lot of comments and variables to represent reality.
    2) Simplifies and corrects how the buffer cache backs off down to the
       lowest level.
    3) Corrects how the page daemon asks the buffer cache to back off,
       ensuring that uvmpd_scan is done to recover inactive pages in low
       memory situations.
    4) Adds a high water mark to the pool used to allocate struct buf's.
    5) Corrects the cleaner and the sleep/wakeup cases in both low memory and
       low kva situations (including accounting for the cleaner/syncer
       reserve).
    Tested by many, with very much helpful input from deraadt, miod, tobiasu,
    kettenis and others. ok kettenis@ deraadt@ jj@
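As a rough illustration of the reserve in item 0, the sketch below shows how an
ordinary buffer allocation might be refused while the cleaner and syncer are
still allowed to dip into the last few pages. The constant and the helper are
hypothetical, not the names used in vfs_bio.c.

    /*
     * Hypothetical sketch of a cleaner/syncer reserve: normal allocations
     * must leave RESERVE_PAGES (and the kva to map them) untouched, so the
     * cleaner and syncer can still make progress under memory/kva pressure.
     */
    #define RESERVE_PAGES   64              /* assumed size of the reserve */

    int
    can_alloc_bufpage(int is_cleaner_or_syncer, long freepages)
    {
            if (is_cleaner_or_syncer)
                    return (freepages > 0);        /* may use the reserve */
            return (freepages > RESERVE_PAGES);    /* must stay above it */
    }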
2012-10-16  Bob Beck
    Cleanup.
    - Whitespace KNF
    - Removal/fixing of old useless comments
    - Removal of unused counter
    - Removal of pointless test that had no effect
    ok krw@
2012-10-09  Bob Beck
    bufq write limiting
    This change ensures that writes in flight from the buffer cache via bufq
    are limited to a high water mark - when the limit is reached the writes
    sleep until the amount of IO in flight reaches a low water mark. This
    avoids the problem where userland can queue an unlimited amount of
    asynchronous writes, resulting in the consumption of all/most of our
    available buffer mapping kva and a long queue of writes to the disk.
    ok kettenis@, krw@
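A minimal sketch of the high/low water mark throttle this describes, using the
kernel's tsleep(9)/wakeup(9) primitives; the counter, limits, and function
names are illustrative, not the identifiers actually used by the bufq code.

    #include <sys/param.h>
    #include <sys/systm.h>

    /* All names below are hypothetical. */
    long    bufq_inflight;          /* amount of write I/O currently queued */
    long    bufq_hiwat = 256;       /* sleep once this much is in flight */
    long    bufq_lowat = 64;        /* wake writers when we drop back here */

    void
    bufq_write_start(long amount)
    {
            int s = splbio();

            while (bufq_inflight >= bufq_hiwat)
                    tsleep(&bufq_inflight, PRIBIO, "bqwrite", 0);
            bufq_inflight += amount;
            splx(s);
            /* ... queue the buffer and call the driver's strategy routine ... */
    }

    void
    bufq_write_done(long amount)
    {
            int s = splbio();

            bufq_inflight -= amount;
            if (bufq_inflight <= bufq_lowat)
                    wakeup(&bufq_inflight);
            splx(s);
    }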
2012-05-30  Miod Vallat
    Fix a few issues in the pressure logic when the available buffers run low:
    - make sure the buffer reclaiming loop in buf_get() actually does
      something other than spin, if `backoffpages' is nonzero and all free
      queues have been drained.
    - don't forget to set a poor man's condition variable to nonzero before
      tsleeping on it in bufadjust(), otherwise you'll never get woken up.
    - don't be too greedy and reassign backoffpages a large amount
      immediately after bufadjust() has been called.
    This fixes reproducible hangs seen during heavy I/O (such as `make
    install' of many large files, e.g. run in /usr/src/lib with NOMAN=) on
    systems with a challenged number of pages (fewer than a few thousand,
    total). Part of this is a temporary bandaid until a better pressure logic
    is devised, but it solves an immediate problem. Been in snapshots for a
    solid month.
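The "poor man's condition variable" mentioned above is the classic
tsleep(9)/wakeup(9) flag idiom: the flag must be made nonzero before sleeping
on it, or the matching wakeup is lost. A hedged sketch of the pattern (the
variable and function names are illustrative):

    #include <sys/param.h>
    #include <sys/systm.h>

    volatile int needpages;         /* hypothetical flag */

    void
    wait_for_pages(void)
    {
            needpages = 1;          /* set *before* sleeping, or the wakeup is missed */
            while (needpages)
                    tsleep(&needpages, PRIBIO, "needpg", 0);
    }

    void
    pages_released(void)
    {
            if (needpages) {
                    needpages = 0;
                    wakeup(&needpages);
            }
    }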
2012-03-23  Philip Guenther
    Make rusage totals, itimers, and profile settings per-process instead of
    per-rthread. Handling of per-thread tick and runtime counters inspired by
    how FreeBSD does it. ok kettenis@
2011-09-19  Bob Beck
    clean up buffer cache statistics somewhat to remove some now useless
    statistics, and add some relevant ones regarding kva usage in the cache.
    make systat io and show bcstats in ddb both show these counters.
    ok deraadt@ krw@
2011-07-06  Bob Beck
    the rest of the uvm commit - I committed from uvm instead of sys
    (part missed from previous commit)
2011-07-04  Theo de Raadt
    move the specfs code to a place people can see it; ok guenther thib krw
2011-07-04  Ted Unangst
    bread does nothing with its ucred argument. remove it. ok matthew
2011-06-05  Theo de Raadt
    Move the bufcachepercent setting code to MI locations -- set it to 42%
    for now; that is unlikely to hit some of the remaining starvation bugs.
    Repair the bufpages calculation too; i386 was doing it ahead of time
    (incorrectly) and then re-calculating it. ok thib
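A sketch of the arithmetic implied here; the 42 comes from the commit, while
the variable names and the choice of the total page count as the base are
assumptions (the real MI code may use a different base, e.g. only the
dma-reachable pages, per the 2011-04-02 entry below).

    /* Hypothetical helper: turn a percentage of memory into a page count. */
    long bufcachepercent = 42;

    long
    compute_bufpages(long totalpages)
    {
            return (totalpages * bufcachepercent / 100);
    }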
2011-04-07  Bob Beck
    Revert previous diff decrementing bcstats.numbufpages here. This function
    does not do what it purports to do: it shrinks the mapping, not the
    allocation, as the pages have already been given away to other buffers.
    This also renames the function to make this a little more obvious, and
    art should not name functions. ok thib@, art@
2011-04-02  Bob Beck
    Constrain the buffer cache to use only the dma reachable region of
    memory. With this change bufcachepercent will be the percentage of dma
    reachable memory that the buffer cache will attempt to use.
    ok deraadt@ thib@ oga@
2010-11-13  Theo de Raadt
    backout 1.86
    it is totally wrong to convert bdwrite into bawrite on the fly. this just
    causes way bigger issues. ok beck blambert
2010-08-03  Theo de Raadt
    matthew did not commit the diff he passed around for us to inspect...
    repair that situation. Darn newbies...
2010-08-03  Matthew Dempsky
    If an asynchronous request invalidates a buf, then we might remove it
    from its vnode's buffer cache in an interrupt context. Therefore we need
    interrupt protection when searching the buffer red-black tree.
    ok deraadt@, thib@, art@
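A hedged sketch of the locking pattern this describes: blocking bio interrupts
with splbio(9) while walking the per-vnode red-black tree, so an
interrupt-context biodone() cannot remove a buffer out from under the search.
The helper, the tree name, and the key field are assumptions based on code of
that era, not verbatim vfs_bio.c.

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/tree.h>
    #include <sys/buf.h>
    #include <sys/vnode.h>

    struct buf *
    lookup_buf(struct vnode *vp, daddr_t blkno)
    {
            struct buf *bp, key;
            int s;

            key.b_lblkno = blkno;   /* assumed key field */
            s = splbio();           /* keep biodone() out while we walk the tree */
            bp = RB_FIND(buf_rb_bufs, &vp->v_bufs_tree, &key);
            splx(s);
            return (bp);
    }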
2010-07-01  Thordur I. Bjornsson
    Call bufq_done at the top of biodone, so we don't call it on a freed buf,
    as that causes problems...
2010-06-30  Thordur I. Bjornsson
    Disable/partially backout the bufq quiesce changes as this is causing
    havoc with vnds and release must be buildable.
2010-06-29  Mark Kettenis
    Introduce bufq_quiesce(), which will block I/O from getting on the
    queues, and waits until all I/O currently on the queues has been
    completed. To get I/O going again, call bufq_restart(). To be used for
    suspend/resume.
    Joint effort with thib@, tedu@; tested by mlarkin@, marco@
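A sketch of how a suspend path would bracket the machine-dependent sleep with
this pair; only the bufq_quiesce()/bufq_restart() calls are from the commit,
the surrounding hook is a placeholder.

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/buf.h>

    int machine_sleep(void);        /* hypothetical MD suspend hook */

    int
    suspend_with_quiesced_io(void)
    {
            int error;

            bufq_quiesce();         /* stop new I/O, wait for what is queued */
            error = machine_sleep();
            bufq_restart();         /* allow queued I/O to flow again */
            return (error);
    }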
2010-02-05  Joel Sing
    Use correct format specifiers for 'show bcstats'.
    ok beck@ krw@
2009-08-08  Bob Beck
    two things:
    1) fix buffer cache low water mark to allow for extremely low memory
       machines without dying
    2) Add "show bcstats" to ddb to allow for looking at the buffer cache
       statistics in ddb
    ok art@ oga@
2009-08-02  Bob Beck
    Dynamic buffer cache support - a re-commit of what was backed out after
    c2k9. Allows the buffer cache to be extended and grow/shrink dynamically.
    tested by many, ok oga@, "why not just commit it" deraadt@
2009-06-25  Thordur I. Bjornsson
    backout the "buf_acquire() does the bremfree()" change, since all callers
    were doing bremfree() before calling buf_acquire(). This is causing us
    headaches pinning down a bug that showed up when deraadt@ took cvs to
    current, and will have to be done anyway as a preparation for backouts.
    OK deraadt@
2009-06-15  Bob Beck
    Back out all the buffer cache changes I committed during c2k9. This
    reverts three commits:
    1) The sysctl allowing bufcachepercent to be changed at boot time.
    2) The change moving the buffer cache hash chains to a red-black tree.
    3) The dynamic buffer cache (which depended on the earlier two).
    ok on the backout from marco and todd
2009-06-06  Artur Grabowski
    All callers of buf_acquire were doing bremfree before the call.
    Just put it in the buf_acquire function. oga@ ok
2009-06-05  Bob Beck
    Dynamic buffer cache sizing.
    This commit won't change the default behaviour of the system unless the
    buffer cache size is increased with sysctl kern.bufcachepercent. By
    default our buffer cache is 10% of memory, which with this commit is now
    treated as a low water mark. If the buffer cache size is increased, the
    new size is treated as a high water mark and the buffer cache is
    permitted to grow to that percentage of memory.
    If the page daemon is invoked, it will ask the buffer cache to relinquish
    pages. If the buffer cache has more than the low water mark it will
    relinquish pages, allowing them to be consumed by uvm. After a short
    period the buffer cache will attempt to re-grow back to the high water
    mark.
    This permits the use of a large buffer cache without penalizing the
    available memory for other purposes. Above the low water mark the buffer
    cache remains entirely subservient to the page daemon, so if uvm requires
    pages, the buffer cache will abandon them. ok art@ thib@ oga@
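A sketch of the policy spelled out above: below the low water mark the cache
keeps its pages, between the marks it yields pages when the page daemon asks,
and it never grows past the high water mark. All names are illustrative, not
the counters in vfs_bio.c.

    long buf_lowpages;      /* ~10% of memory: never shrink below this */
    long buf_highpages;     /* kern.bufcachepercent of memory: never grow past this */
    long buf_curpages;      /* pages the cache currently holds */

    /* How many pages we are willing to hand back to the page daemon. */
    long
    buf_backoff(long wanted)
    {
            long spare = buf_curpages - buf_lowpages;

            if (spare <= 0)
                    return (0);
            return (wanted < spare ? wanted : spare);
    }

    /* Whether the cache may take another page for itself. */
    int
    buf_may_grow(void)
    {
            return (buf_curpages < buf_highpages);
    }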
2009-06-03  Bob Beck
    add kern.bufcachepercent sysctl to allow adjusting the buffer cache size
    on a running system. ok art@, oga@
2009-06-03  Bob Beck
    Change bufhash from the old grotty hash table to red-black trees hanging
    off the vnode. ok art@, oga@, miod@
2009-04-22  Artur Grabowski
    Make the interactions in allocating buffers less confusing.
    - getnewbuf dies. instead of having getnewbuf, buf_get, buf_stub and
      buf_init we now have buf_get that is smaller than some of those
      functions were before.
    - Instead of allocating anonymous buffers and then freeing them if we
      happened to lose the race to the hash, always allocate a buffer knowing
      which <vnode, block> it will belong to.
    - In cluster read, instead of allocating an anonymous buffer to cover the
      whole read and then stubs for every buffer under it, make the first
      buffer in the cluster cover the whole range and then shrink it in the
      callback.
    now, all buffers are always on the correct hash and we always know their
    identity. discussed with many, kettenis@ ok
2009-03-23  Bob Beck
    fix buffer cache pending writes statistic so it does not go negative.
    this ensures we ignore counting any buffers returning through biodone()
    for which B_PHYS has been set - which should be set on all transfers that
    manually do raw io bypassing the buffer cache by setting up their own
    buffer and calling strategy.
    ok thib@, todd@, and now that he is a buffer cache and nfs hacker oga@
2009-01-11  Owain Ainsworth
    backout revision 1.109
    "keep b_proc set to the process that's doing the io as advertised"
    This broke dvd playing on my laptop (page fault trap in vmapbuf in the
    physio path). thib's cookie privileges are hereby suspended until further
    notice.
2009-01-09  Thordur I. Bjornsson
    keep b_proc set to the process that's doing the io, as advertised.
    closes PR3948. OK tedu@ (and blambert@ I think).
2008-11-22  Pedro Martelletto
    Move diagnostic assertions concerning the recycle process of buffers from
    getnewbuf() to buf_put(), since getnewbuf() does not directly recycle
    buffers anymore. While at it, remove two lines of dead code from
    getnewbuf(), which used to disassociate a vnode from a buffer.
    "just go for it, because everyone had a chance" deraadt@.
2008-06-14  Artur Grabowski
    Belt, suspenders, duct tape and glue.
    In brelse, if we end up in the B_INVAL case without mappings, check for
    B_WANTED and wake up the sleeper if there's one before freeing the
    buffer. This shouldn't happen, but it looks like there might actually be
    some dodgy corner cases in nfs where this could just happen if the phase
    of the moon is right and the wind is blowing from the right direction.
    thib@ ok
2008-06-12  Theo de Raadt
    Bring biomem diff back into the tree after the nfs_bio.c fix went in.
    ok thib beck art
2008-06-11  Theo de Raadt
    back out biomem diff since it is not right yet. Doing very large file
    copies to nfsv2 causes the system to eventually peg the console. On the
    console ^T indicates that the load is increasing rapidly, ddb indicates
    many calls to getbuf, and there is some very slow nfs traffic making no
    (or extremely slow) progress. Eventually some machines seize up entirely.
2008-06-10  Bob Beck
    Buffer cache revamp
    1) remove multiple size queues, introduced as a stopgap.
    2) decouple pages containing data from their mappings
    3) only keep buffers mapped when they actually have to be mapped (right
       now, this is when buffers are B_BUSY)
    4) New functions to make a buffer busy, and release the busy flag
       (buf_acquire and buf_release)
    5) Move high/low water marks and statistics counters into a structure
    6) Add a sysctl to retrieve buffer cache statistics
    Tested in several variants and beat upon by bob and art for a year. run
    accidentally on henning's nfs server for a few months...
    ok deraadt@, krw@, art@ - who promises to be around to deal with any
    fallout
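A sketch of the busy-buffer protocol from item 4; the caller shown is
hypothetical, and (as the 2009-06-06 and 2009-06-25 entries above show) whether
the caller or buf_acquire() does the bremfree() changed later.

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/buf.h>

    /*
     * Hypothetical caller: take a buffer off its free queue, mark it busy
     * (per item 3, busy buffers stay mapped), work on its data, then drop
     * the busy flag again.  A real caller would also requeue or brelse
     * the buffer afterwards.
     */
    void
    work_on_buffer(struct buf *bp)
    {
            int s = splbio();

            bremfree(bp);           /* at this point in history, callers did this */
            buf_acquire(bp);        /* sets B_BUSY */
            splx(s);

            /* ... read or modify bp->b_data while it is guaranteed mapped ... */

            buf_release(bp);        /* clears B_BUSY so it can be unmapped/recycled */
    }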
2008-03-16  Otto Moerbeek
    Widen some struct statfs fields to support large filesystem data and add
    some to be able to support statvfs(2). Do the compat dance to provide
    backward compatibility. ok thib@ miod@
2007-10-21  Bob Beck
    This QUEUE_DEBUG should really be DIAGNOSTIC - we need these checks
    normally. ok deraadt@ tedu@ otto@
2007-10-18  Bob Beck
    Correct possible spl problem in buffer cleaning daemon - the buffer
    cleaning daemon requires splbio when doing dirty buffer queue
    manipulation. Since version 1.88 of vfs_bio.c, it was possible to break
    out of the processing loop when the cleaner had been running long enough,
    and this early exit would mean a future pass through would manipulate the
    buffer queues not at splbio. This change corrects this.
    ok krw@, deraadt@, tedu@, thib@
2007-09-15  Martin Reindl
    replace ctob and btoc with ptoa and atop respectively
    help and ok miod@ thib@
2007-08-07  Bob Beck
    A few changes to deal with multi-user performance issues seen. this
    brings us back roughly to 4.1 level performance, although this is still
    far from optimal as we have seen in a number of cases. This change
    1) puts a lower bound on buffer cache queues to prevent starvation
    2) fixes the code which looks for a buffer to recycle
    3) reduces the number of vnodes back to 4.1 levels to avoid complex
       performance issues better addressed after 4.2
    ok art@ deraadt@, tested by many
2007-07-09  Miod Vallat
    Do not allow clustered reads for filesystems whose block size is smaller
    than the hardware page size, as was the case in the old clustering code.
    This fixes vnd reads on alpha and sparc64.
    On behalf of pedro@, ok art@
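A sketch of the guard this implies; the helper is hypothetical, and pulling the
block size from the mount point's f_iosize is an assumption about where that
value would come from.

    #include <sys/param.h>
    #include <sys/mount.h>
    #include <sys/vnode.h>

    /* Hypothetical predicate: only cluster reads when an fs block covers a page. */
    int
    can_cluster_read(struct vnode *vp)
    {
            long bsize = vp->v_mount->mnt_stat.f_iosize;    /* assumed source of block size */

            return (bsize >= PAGE_SIZE);
    }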
2007-06-17  Jasper Lievisse Adriaanse
    de-register
    ok thib@
2007-06-09  Pedro Martelletto
    Protect access to 'bufhead' with splbio(), okay art@ millert@ marco@
2007-06-03  Otto Moerbeek
    backout rev 1.91 and 1.92, it causes processes to hang on low-memory
    machines. ok deraadt@
2007-06-01  Pedro Martelletto
    Uninline bio_doread(), okay art@
2007-06-01  David Gwynne
    don't request zeroed memory when we allocate data regions for buffers.
    this moves memset from the 20th most expensive function in the kernel to
    the 331st when doing heavy io. ok tedu@ thib@ pedro@ beck@ art@