Age | Commit message | Author |
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
In order to reduce the number of breadcrumbs the kernel must emit to
track our batches, reuse the last query until it has retired.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
If we filled the batch exactly and then subtracted 1 for the reserved
BATCH_BUFFER_END, the remaining space would underflow to a large value,
convincing us that we had sufficient room to stuff many, many more
commands in. However, all the callsites should already be guarded by
checking that they have sufficient space to emit at least one operation...
References: https://bugs.freedesktop.org/show_bug.cgi?id=77074
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
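
The underflow described above can be illustrated with a minimal sketch. The
struct, the field names and both helpers are hypothetical and are not taken
from the driver; the sketch only assumes the remaining space is tracked as an
unsigned dword count with one dword held back for BATCH_BUFFER_END.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical batch bookkeeping: 'size' dwords in total, 'used' already
 * written, one dword held back for the closing BATCH_BUFFER_END. */
struct batch {
	uint32_t size;
	uint32_t used;
};

#define BATCH_RESERVED 1

/* Unguarded: when used == size, 0 - 1 wraps around to UINT32_MAX. */
static uint32_t batch_space_unguarded(const struct batch *b)
{
	return b->size - b->used - BATCH_RESERVED;
}

/* Guarded: report no room once only the reserved dword (or less) is left. */
static uint32_t batch_space(const struct batch *b)
{
	uint32_t rem;

	assert(b->used <= b->size);
	rem = b->size - b->used;
	return rem > BATCH_RESERVED ? rem - BATCH_RESERVED : 0;
}
```

With a batch that is exactly full, the unguarded helper reports an enormous
amount of free space, whereas the guarded one reports zero.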
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Some early machines have 512MiB apertures, but we can still only use
the low 256MiB for fencing. Separate out the mappable restriction checks
from the fencing in order to further constrain those devices.
Reported-by: Matti Hämäläinen <ccr@tnsp.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
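
As a rough sketch of keeping the two restrictions separate, the limits can be
tracked as independent values and checked independently. The struct, field
names and helpers below are assumptions made for illustration, as is the idea
of comparing the end offset of a bo against each limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MiB (1024ull * 1024ull)

/* Hypothetical description of the aperture limits on one device. */
struct aperture {
	uint64_t mappable;  /* bytes reachable through the CPU-visible aperture */
	uint64_t fenceable; /* bytes that the fence registers can cover */
};

/* 'end' is the offset just past the bo (offset + size), in bytes. */
static bool bo_can_map(const struct aperture *a, uint64_t end)
{
	return end <= a->mappable;
}

static bool bo_can_fence(const struct aperture *a, uint64_t end)
{
	/* The fenceable range is assumed never to exceed the mappable one. */
	assert(a->fenceable <= a->mappable);
	return end <= a->fenceable;
}
```

On a device with a 512MiB aperture and a 256MiB fencing limit, a bo ending at
300MiB would pass the mappable check but fail the fence check.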
|
|
As root, X gets away with many things, including submitting commands to
the DRM device whilst it is no longer authorised (i.e. when it has
relinquished master to another client across a VT switch). In the
non-root future, if we attempt to use the device whilst unauthorized the
rendering will be lost and we will mark the device as unusable. So flush
our render queue to the device around a VT switch.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Fresh bo (those without a reservation already defined, à la
presumed_offset) will cause the kernel to do a full relocation pass. So,
if possible flush the already correct batch in the hope of trimming the
amount of checking the kernel has to perform on this new batch.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This was disabled in
commit 9f4f855ba37966fb91d31e9081d03cf72affb154
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon May 26 07:06:18 2014 +0100
sna: Implicit release of upload buffers considered bad
as retiring the buffers during the command setup could free one of the
earlier bo used in the command. But discarding the snooped bo could
still be advantageous. So restore the automatic discard of upload
proxies, but make sure we only do so between operations.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matti Hämäläinen <ccr@tnsp.org>
|
|
The primary benefit of this is to avoid the extra blit when using a
compositor and instead propagate the compositor flip on the frontbuffer
to the scanout; equivalently, it allows a fullscreen game to flip onto
the scanout without intervention by TearFree.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This should always be set during bo creation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Currently upload buffers are automatically decoupled when the buffer is
retired. As retiring can happen during command setup after we have
selected which bo to render with, this can free the bo we plan to use.
Which is bad.
Instead of making the release of upload buffers automatic, we manually
check whether the buffer is idle before use as a source to consider
scrapping it and replacing it with a real GPU bo. This is likely to keep
upload buffers alive for longer (limiting reuse between Pixmaps but
making reuse of the buffer within a Pixmap more likely) which is both
good and bad. (Good - may improve the content cache, bad - may increase
the amount of memory used by upload buffers for arbitrarily long periods.)
Reported-by: Matti Hämäläinen <ccr@tnsp.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79238
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
In particular allow the pointer cache to be disabled for valgrind.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Otherwise, we may never decouple it again afterwards leading to a
dangling pointer dereference.
Bugzilla: https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/1289923
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Since we call kgem_bo_submit() along one path when synchronising a
cached bo (which is known to be inactive) but still want to keep the
assertion on the refcnt, simply rearrange the code to only assert on the
active path.
References: https://bugs.freedesktop.org/show_bug.cgi?id=73406
Reported-by: Matti Hamalainen <ccr@tnsp.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
If the CPU bo is wholly damaged, then it makes an ideal candidate for
simply converting into the GPU bo.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Reduce the logging verbosity of DBG so that it only appears in the
logfile by default - makes debugging much more pleasant.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Remove the attempt to trick us into mapping large bo.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Some linear GPU bo that we create must be naturally aligned, and the
extra alignment imposed for pure paranoia is counterproductive.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
We can optimistically only require that we waste the largest fence
region in a batch, as all other fences will then be naturally aligned as
well. So long as the kernel succeeds in defragmenting the aperture...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
On older architectures, large BO have to be untiled and so we can reuse
an existing CPU bo by adjusting its caching mode.
References: https://bugs.freedesktop.org/show_bug.cgi?id=70924
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Adapt the legacy BLT commands in preparation for future changes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
After a failure to use the BLT, and before attempting to map the
destination in order to upload into it, we need to recheck that it is
indeed mappable.
References: https://bugs.freedesktop.org/show_bug.cgi?id=70924
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
After converting aperture_mappable to count in pages, there were a few
residual users expecting a byte count.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71117
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
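
A minimal sketch of the unit mismatch, assuming a 4096-byte page and using
hypothetical helper names: once the limit is held as a page count, any caller
still holding a byte size has to convert it before comparing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */

/* Round a byte count up to whole pages. */
static size_t num_pages(size_t bytes)
{
	assert(bytes <= SIZE_MAX - (SKETCH_PAGE_SIZE - 1));
	return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* The limit is a page count, so convert the byte size before comparing.
 * Comparing 'bytes' directly against the limit would reject almost
 * everything, since a byte count is far larger than a page count. */
static bool fits_mappable(size_t bytes, size_t aperture_mappable_pages)
{
	return num_pages(bytes) <= aperture_mappable_pages;
}
```

For example, an 8192-byte bo occupies two pages and fits within a 64-page
limit, even though 8192 is numerically greater than 64.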
|
|
Make sure we never unwind a used buffer.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Expunge our caches if we fail to write into a bo (presuming that
allocation failure is the likely fixable cause).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
When trying to conserve power, reduce the number of small batches we
emit - trying to maximise GPU efficacy and minimise CPU overhead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This is required to ensure that the tiled offsets are tile-row aligned.
Bugzilla: https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/1232546
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Now that we use CPU mmaps to read/write to tiled X surfaces, we find
ourselves frequently switching between CPU and GTT mmaps and so wish to
cache both.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This problematic GPU still seems to like to fall over when faced with
Y-tiling. It was reserved only for use with glyphs, but even that
occasionally runs into trouble, so disable all selection of Y-tiling for
our own use.
Bugzilla: https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/1222203
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Fixes cache bookkeeping when mixing userptr uploads.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Whilst we reserved exec entry slots for the deferred VBO, there were no
relocation spaces reserved. So if we submitted a render command followed
by a multitude of BLT copies, we could then overrun the relocation array
when adding the deferred vbo to the batch.
Reported-by: Danny <moondrake@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67504
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
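
The overrun above suggests checking both arrays with their reservations
included. The following is only a sketch: the array sizes, the reservation
counts and every name in it are hypothetical, picked to show the shape of
such a check rather than the driver's actual values.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical array sizes and reservations, chosen only for illustration. */
#define MAX_EXEC       256
#define MAX_RELOC      1024
#define RESERVED_EXEC  1 /* exec slot held back for the deferred vbo */
#define RESERVED_RELOC 4 /* relocations held back for the deferred vbo */

struct batch_state {
	int nexec;  /* exec entries used so far */
	int nreloc; /* relocation entries used so far */
};

/* An operation may only be emitted if, after adding its own entries, the
 * deferred vbo can still be attached without overrunning either array. */
static bool batch_has_room(const struct batch_state *b,
			   int num_exec, int num_reloc)
{
	assert(num_exec >= 0 && num_reloc >= 0);
	return b->nexec + num_exec + RESERVED_EXEC <= MAX_EXEC &&
	       b->nreloc + num_reloc + RESERVED_RELOC <= MAX_RELOC;
}
```

With only the exec reservation in place, a long run of BLT copies could pass
the exec check while consuming the relocation entries the vbo needs later;
the second condition is what closes that gap in this sketch.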
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
If we allocate the scanout from stolen, we cannot then access it via the
CPU - so prevent the mapping in those cases.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
On Iris, we may store the framebuffer in the eLLC/LLC and mark it as
being Write-Through cached. This means that we can treat it as being
cached for read accesses (either by the GPU or CPU), but must be careful
to still not write directly to the scanout with the CPU (only the GPU
writes are cached and coherent with the display).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
If the linear bo is still in the CPU domain, we can map it through the
CPU with no penalty, so treat it as mappable.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|