Age | Commit message | Author |
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Check if the source and mask are identical pictures and just copy the
source channel to the mask in that case.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
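The check described above can be sketched in C roughly as follows. This is an illustration only: struct picture, struct channel, upload_picture() and prepare_channels() are hypothetical stand-ins invented for the example, not the driver's actual types or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's picture and channel state. */
struct picture { int id; };
struct channel { const struct picture *pict; int bo; };

int upload_count; /* counts simulated uploads, for illustration only */

/* Pretend to upload the pixel data and return a handle for the new bo. */
int upload_picture(const struct picture *p)
{
	assert(p != NULL);
	return ++upload_count;
}

/* Prepare the source and optional mask channels, uploading only once
 * when both refer to the identical picture. */
void prepare_channels(const struct picture *src, const struct picture *mask,
		      struct channel *src_ch, struct channel *mask_ch)
{
	src_ch->pict = src;
	src_ch->bo = upload_picture(src);

	if (mask == NULL) {
		mask_ch->pict = NULL;
		mask_ch->bo = 0;
		return;
	}

	if (mask == src) {
		/* Identical pictures: just copy the source channel. */
		*mask_ch = *src_ch;
		return;
	}

	mask_ch->pict = mask;
	mask_ch->bo = upload_picture(mask);
}
```

Passing the same picture as both source and mask then costs a single upload, while distinct pictures still get one upload each.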
|
|
When simply creating a source GPU bo it is preferable not to mark it as
all-damaged.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
GTK+ has a clever trick for premultiplying its images by loading the
same pixel data into both the source and mask, and then performing the
composite. This causes us to upload the same pixel data twice!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Replace the source picture+alpha with a bo that contains the RGB
channels from source and A from the alpha map.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Only marginally better than falling all the way back to using the CPU
is to perform a double copy to work around the overlapping copy.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
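A minimal CPU-side sketch of the double-copy idea, assuming a linear byte surface: the block is first copied to a scratch buffer and then copied from there to its destination, so the source is never overwritten before it has been read. The function copy_overlapping() and its parameters are hypothetical; the driver performs the equivalent with GPU copies rather than memcpy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy a w x h block of bytes within one linear
 * surface where the source and destination rectangles may overlap.
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int copy_overlapping(unsigned char *surface, size_t stride,
		     size_t src_x, size_t src_y,
		     size_t dst_x, size_t dst_y,
		     size_t w, size_t h)
{
	unsigned char *tmp;
	size_t y;

	assert(src_x + w <= stride && dst_x + w <= stride);
	if (w == 0 || h == 0)
		return 0;

	tmp = malloc(w * h);
	if (tmp == NULL)
		return -1;

	/* First copy: source rectangle into the scratch buffer. */
	for (y = 0; y < h; y++)
		memcpy(tmp + y * w, surface + (src_y + y) * stride + src_x, w);

	/* Second copy: scratch buffer into the destination rectangle. */
	for (y = 0; y < h; y++)
		memcpy(surface + (dst_y + y) * stride + dst_x, tmp + y * w, w);

	free(tmp);
	return 0;
}
```

The cost is twice the memory traffic of a direct copy, which is why it is only marginally better than the CPU fallback.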
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
If we used the BLT to prepare the source, see if we can continue the
operation on the BLT.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
If we incurred a context switch to the BLT in order to prepare the
target (uploading damage for instance), we should recheck whether we can
continue the operation on the BLT rather than force a switch back to
RENDER.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
If we will need to extract either the source or the destination, we
should see if we can do the entire operation on the BLT.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Replace the growing bitfield with an enum marking where it was last
used.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
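As a rough illustration of the change, a set of one-bit flags can be collapsed into a single enum recording the most recent user. All names below (enum last_used, struct buffer, buffer_mark_used()) are hypothetical and chosen for the example; the actual fields in the driver may differ.

```c
#include <assert.h>

/* Before (hypothetical): one flag per consumer, growing with each new one:
 *   struct usage { unsigned cpu:1, blt:1, render:1; };
 * After: a single value naming where the buffer was last used. */
enum last_used {
	USED_NONE = 0,
	USED_CPU,
	USED_BLT,
	USED_RENDER,
	USED_COUNT
};

struct buffer {
	enum last_used last;
};

/* Record the new user. Returns 1 if the buffer changes hands between
 * two different users, 0 otherwise. */
int buffer_mark_used(struct buffer *b, enum last_used next)
{
	int switched;

	assert(next > USED_NONE && next < USED_COUNT);
	switched = b->last != USED_NONE && b->last != next;
	b->last = next;
	return switched;
}
```

With an enum, adding a new consumer is one more enumerator, and mutually exclusive states can no longer be set at the same time.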
|
|
In order to avoid having to perform a copy of the cacheable buffer into
GPU space, we can map a bo as cacheable and write directly to its
contents. This is only a win on systems that can avoid the clflush, and
also we have to go to greater measures to avoid unnecessary
serialisation upon that CPU bo. Sadly, we do not yet go to sufficient lengths
to avoid negatively impacting ShmPutImage, but that does not appear to
be an artefact of stalling upon a CPU buffer.
Note, LLC is a SandyBridge feature enabled by default in kernel 3.1 and
later. In time, we should be able to expose similar support for
snoopable buffers for other generations.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
For large render targets, we prefer to use tiled bo in order to avoid
severe performance degradation. However, if we don't have a GPU bo but
do have a CPU bo and the operation would be untiled, then simply use the
CPU bo.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
If we change contexts, then we will submit the batch, obsoleting the
earlier resource checks.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
i.e. only force the BLT if using the sampler is going to be incredibly
slow.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The cost of the TLB miss on every sample far outweighs the impact of the
context (and ring) switch.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Make sure that the damage is always set, even if only to NULL, so that
we are safe if in future the operation state is not initially cleared.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
There is no point even attempting a BLT operation if we know that it is
an unusual render operation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Performance of this lazy interface looks inconclusive:
Speedups
========
xlib swfdec-giant-steps 1063.56 -> 710.68: 1.50x speedup
xlib firefox-asteroids 3612.55 -> 3012.58: 1.20x speedup
xlib firefox-canvas-alpha 15837.62 -> 13442.98: 1.18x speedup
xlib ocitysmap 1106.35 -> 970.66: 1.14x speedup
xlib firefox-canvas 33140.27 -> 30616.08: 1.08x speedup
xlib poppler 629.97 -> 585.95: 1.08x speedup
xlib firefox-talos-gfx 2754.37 -> 2562.00: 1.08x speedup
Slowdowns
=========
xlib gvim 1363.16 -> 1439.64: 1.06x slowdown
xlib midori-zoomed 758.48 -> 904.37: 1.19x slowdown
xlib firefox-fishbowl 22068.29 -> 26547.84: 1.20x slowdown
xlib firefox-planet-gnome 2995.96 -> 4231.44: 1.41x slowdown
It remains off and a curiosity for the time being.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
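The factors quoted in the table appear to be the larger timing divided by the smaller one, for example 1063.56 / 710.68 = 1.50 (rounded). A small sketch of that arithmetic, with change_factor() being a hypothetical helper name:

```c
#include <assert.h>

/* Given the timings before and after a change, return how many times
 * faster or slower the new timing is. The result is always >= 1;
 * *is_speedup is set to 1 when the new timing is lower, else 0. */
double change_factor(double before, double after, int *is_speedup)
{
	assert(before > 0.0 && after > 0.0);
	*is_speedup = after < before;
	return *is_speedup ? before / after : after / before;
}
```

Applied to the firefox-planet-gnome row, 4231.44 / 2995.96 gives the 1.41x slowdown reported above.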
|
|
Applications may use the same pixmap with multiple formats within the
same operation. For instance, you can premultiply and composite a normal
pixmap in this manner. However, as we reused the sampler binding
locations of the source (without an alpha channel) for the mask, we
failed to read and multiply by the alpha channel causing it to remain
black instead of transparent.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=40926
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This is set in configure and redefining it later inside the C files just
leads to trouble and broken compilation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This reverts commit 15266e1b9500f6b348661c60d1982bde911f2d0e.
KDE relies upon the ability to render into a sampler and then render
upon itself. Not the first sign of madness...
Will have to find another way of winning back the compwinwin
performance.
|
|
As exemplified by KDE (using Kate) on gen3, an application may attempt to
render a large set of boxes using OVER and a transparent colour. As gen3
copied across some of the BLT assumptions, it incorrectly reduced that to
a CLEAR and thus rendered incorrectly.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This appeared to introduce a visual glitch into the xfce4 selection box
on gen6 at least.
References: https://bugs.freedesktop.org/show_bug.cgi?id=42367
Reported-by: Paul Neumann <paul104x@yahoo.de>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
|
|
Although this is slower than falling back to swrast for x11perf (up to 4x
slower on SNB), it is still faster than doing that rasterisation through a
WC-mapping and much faster in ordinary usage due to avoiding the
readback hit.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The backends are all expected to initialise the state required.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
For many of the core drawing routines, passing a BoxRec for the fill is
more convenient since they already have one generated by the clip
intersection.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
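To illustrate why a box-based fill is convenient, the sketch below intersects a rectangle with a clip and hands the resulting box straight to the fill. struct box is a local stand-in with the same x1, y1, x2, y2 fields as the X server's BoxRec; box_intersect() and fill_box() are hypothetical names, and the fill here writes to a plain pixel array rather than going through the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in mirroring the fields of the X server's BoxRec. */
struct box {
	short x1, y1, x2, y2;
};

/* Intersect a with clip, storing the result in out.
 * Returns 1 if the intersection is non-empty, 0 otherwise. */
int box_intersect(const struct box *a, const struct box *clip,
		  struct box *out)
{
	out->x1 = a->x1 > clip->x1 ? a->x1 : clip->x1;
	out->y1 = a->y1 > clip->y1 ? a->y1 : clip->y1;
	out->x2 = a->x2 < clip->x2 ? a->x2 : clip->x2;
	out->y2 = a->y2 < clip->y2 ? a->y2 : clip->y2;
	return out->x1 < out->x2 && out->y1 < out->y2;
}

/* Hypothetical fill entry point that takes the box directly.
 * stride is measured in pixels. */
void fill_box(uint32_t *pixels, int stride, const struct box *b,
	      uint32_t colour)
{
	int x, y;

	assert(b->x1 >= 0 && b->y1 >= 0 && b->x2 <= stride);
	for (y = b->y1; y < b->y2; y++)
		for (x = b->x1; x < b->x2; x++)
			pixels[y * stride + x] = colour;
}
```

The caller already holds the clipped box after the intersection, so no conversion to an x, y, width, height form is needed before filling.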
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
In the vain hope of reducing both the switching between rings and the
stalls introduced between batches.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
We can only emit state between primitives, ergo we need only check for
state updates if we've finished the vbo or are starting a new operation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
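The reasoning above can be sketched as a small model in which state is only checked when an operation starts or the vertex buffer has just been exhausted. struct render_state, begin_op() and emit_rect() are hypothetical names for illustration; the counter stands in for the real state-update check.

```c
#include <assert.h>

/* Hypothetical model of the emission path. */
struct render_state {
	int vbo_used;     /* rectangles placed in the current vbo */
	int vbo_size;     /* capacity of one vbo, in rectangles */
	int new_op;       /* set when a new operation begins */
	int state_checks; /* how often we checked for state updates */
};

void begin_op(struct render_state *s)
{
	s->new_op = 1;
}

void emit_rect(struct render_state *s)
{
	assert(s->vbo_size > 0);

	/* State can only be emitted between primitives, so only look for
	 * updates at the start of an operation or once the vbo is full. */
	if (s->new_op || s->vbo_used == s->vbo_size) {
		if (s->vbo_used == s->vbo_size)
			s->vbo_used = 0; /* pretend to flush, start a new vbo */
		s->state_checks++;
		s->new_op = 0;
	}

	s->vbo_used++;
}
```

In this model, ten rectangles emitted in one operation with a four-entry vbo trigger three checks instead of ten.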
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Didn't spot anything that might have led to a genuine bug, but this
should help improve the signal-to-noise ratio of warnings in the future.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|