Age | Commit message | Author |
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Highlights of that distribution include xorg-xserver-1.6.5, kernel
3.0.76 and gcc-4.3.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
On gt1, the BCS is faster than the RCS for all equivalent operations,
unlike gt2+ where the RCS is faster (but at greater power draw).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The BLT is more power-efficient for the operations it can handle, so use
it when possible (following the usual caveats) if we know we only have
battery power.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As we only use these buffers once, we should not benefit from requesting
them to be moved into L3/LLC cache - over and above the default
recommendations we make when creating the buffer. Indeed, this may even
lead to artefacts if we fail to invalidate those other caches when
reusing the buffers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
If the target is already on the render ring, don't force the switch away.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Rearrange the tests so that we check both src/dst for which rings they
are currently on.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Reduce the number of pipe-controls we emit by combining one of the
frequent flushes with a stall.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As we get more well-endowed GPUs with ever more execution units, it
becomes advantageous to do even basic copies through the render ring.
However, the extra performance comes at a cost - higher power usage. To
mitigate this, we apply a heuristic of only allowing a switch over to
the render ring if the render ring is already active with an early
request (in addition to the usual stall avoidance and general
performance heuristics).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
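
The heuristic described above might be sketched roughly as follows. This is an illustrative outline only: the `ring_state` structure, its fields and `choose_copy_ring()` are hypothetical names, not the driver's actual code, and the other stall-avoidance and performance checks are collapsed into a single `can_switch` flag.

```c
#include <stdbool.h>

enum ring { RING_BLT, RING_RENDER };

/* Hypothetical summary of what the driver knows when queuing a copy. */
struct ring_state {
	enum ring current;   /* ring used by the batch under construction */
	bool render_busy;    /* render ring already has an earlier request */
	bool can_switch;     /* the other heuristics permit a ring switch */
};

/* Pick a ring for a basic copy, only waking the render ring if it is
 * already active, so that an idle render engine stays powered down. */
static enum ring choose_copy_ring(const struct ring_state *s)
{
	if (s->current == RING_RENDER)
		return RING_RENDER;   /* already there, no switch needed */
	if (s->can_switch && s->render_busy)
		return RING_RENDER;   /* faster, and the power cost is already paid */
	return RING_BLT;              /* default: the lower-power blitter */
}
```

The point of the design is that the faster path is only taken when its extra power cost has already been incurred by earlier work.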
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Don't force us to select BLT too early if we allow ring switching. As
the RENDER ring benefits from more caching over time (e.g. HSW:GT3e) it
becomes much more preferable to use it over the BLT. Since we already
have the logic to decide if ring switching is possible/preferred, relax
the initial checks on where the current activity is to allow switching
between batches.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
On Iris, we may store the framebuffer in the eLLC/LLC and mark it as
being Write-Through cached. This means that we can treat it as being
cached for read accesses (either by the GPU or CPU), but must be careful
to still not write directly to the scanout with the CPU (only the GPU
writes are cached and coherent with the display).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
GT3 has twice the number of cores and URB as GT2, and so we can use
more threads and URB entries.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The codename changed midcycle - along more rational lines (all the chips
within the platform are now part of the Baytrail family rather than
different codenames for each).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Occasionally when forced to use an intermediate destination surface, we
know that we will completely overwrite the contents of the surface and
so we can forgo the initial copy from the target.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66297
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Complete logic fail for finding the bounding box of the boxes to be
copied.
Reported-by: Clemens Eisserer <linuxhippy@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66168
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
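
For reference, a minimal sketch of computing the bounding box of a set of boxes. The `struct box` type and `boxes_extents()` are hypothetical stand-ins chosen for illustration (the field names mirror the usual x1/y1/x2/y2 convention); this is not the code touched by the fix.

```c
#include <assert.h>

/* Hypothetical stand-in for a box with an exclusive lower-right corner. */
struct box { int x1, y1, x2, y2; };

/* Return the smallest box enclosing all n input boxes (n must be > 0). */
static struct box boxes_extents(const struct box *b, int n)
{
	struct box e;
	int i;

	assert(n > 0);
	e = b[0];
	for (i = 1; i < n; i++) {
		/* Grow each edge independently: min of the origins... */
		if (b[i].x1 < e.x1) e.x1 = b[i].x1;
		if (b[i].y1 < e.y1) e.y1 = b[i].y1;
		/* ...and max of the far corners. */
		if (b[i].x2 > e.x2) e.x2 = b[i].x2;
		if (b[i].y2 > e.y2) e.y2 = b[i].y2;
	}
	return e;
}
```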
|
|
The clear hint is correctly updated when performing the move-to-gpu and
so it is being superfluously repeated by the callers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Otherwise the sampler on Haswell will just read all zeros when trying to
playback a video.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65699
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
If the target bo is not bound when we start to emit the composite state
for the operation, we are screwed.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This is useful, for example, with the multiple gen7 variants.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The two similar chipsets do not use the same PCI-ID encoding schema.
Fixes regression from
commit 235a3981ea9759317b392302a2b2b8f4fafab410
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Mar 26 20:37:14 2013 +0000
sna/gen7: Use GT2 values for GT2 variants
Reported-by: zaverel@free.fr
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The memory attributes changed slightly, and in particular there is now
an explicit uncached setting - which of course happened to be the value
currently selected.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The check was only testing for GT2+ and excluding the normal GT2
devices. See also
commit ce9f0448367ea6a90490a28150bfdc0a76500129
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Feb 8 16:01:54 2013 +0000
sna/gen6: Use GT2 settings for both GT2 and GT2+
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Without the casts, the division ends up as 0 rather than the fractional
offset into the texture.
The casts were missed in the claimed fix:
commit 89038ddb96aabc4bc1f04402b2aca0ce546e8bf3
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu Feb 28 14:35:54 2013 +0000
sna/video: Correct scaling of source offsets
Reported-by: Roman Elshin <roman.elshin@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62343
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
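
The class of bug is easy to reproduce in isolation. In the sketch below, the function names and parameters are made up for illustration and do not correspond to the driver's video code; they only show how an integer division truncates before the result is converted to float.

```c
/* Hypothetical helper: offset of a source pixel as a fraction of the
 * texture width, written without a cast. */
static float offset_truncated(int src_x, int tex_width)
{
	/* Both operands are int, so the division is done in integer
	 * arithmetic and truncates: any src_x < tex_width yields 0. */
	return src_x / tex_width;
}

/* The same computation with one operand cast to float. */
static float offset_scaled(int src_x, int tex_width)
{
	/* The cast forces a floating-point division, preserving the
	 * fractional part. */
	return (float)src_x / tex_width;
}
```

With `src_x = 16` and `tex_width = 64`, the first form gives 0 whereas the second gives 0.25.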
|
|
When applying pan and zoom to a mismatched video, it would inevitably
miscompute the origin and scale factors.
Reported-by: Matti Hamalainen <ccr@tnsp.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61610
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Oops, the assertion that we had sufficient free space was inverted.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Allow use of advanced ISA when available by detecting support at
runtime. This initial work just uses GCC to emit varying ISA; future
work could use hand-written code for these hot spots.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
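
One way to get this effect with GCC is the `target` function attribute combined with `__builtin_cpu_supports()`. The sketch below is an assumption about the general approach, not the driver's code: `fill_generic()`, `fill_avx2()` and `select_fill()` are illustrative names, and the choice of AVX2 is arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable baseline: fill n pixels with a 32-bit colour. */
static void fill_generic(uint32_t *dst, uint32_t pixel, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] = pixel;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Identical C source, but the compiler is allowed to use AVX2 here. */
__attribute__((target("avx2")))
static void fill_avx2(uint32_t *dst, uint32_t pixel, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] = pixel;
}
#define HAVE_AVX2_VARIANT 1
#endif

typedef void (*fill_fn)(uint32_t *, uint32_t, size_t);

/* Choose an implementation based on what the running CPU reports. */
static fill_fn select_fill(void)
{
#ifdef HAVE_AVX2_VARIANT
	__builtin_cpu_init();
	if (__builtin_cpu_supports("avx2"))
		return fill_avx2;
#endif
	return fill_generic;
}
```

A caller would typically invoke `select_fill()` once at start-up and cache the returned pointer, so the detection cost is not paid on every operation.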
|
|
If we end up contending for the vertex lock, we need to double check
there is sufficient vertex space left for us.
Bugzilla: https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/1124576
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
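
The general pattern is to make the space check under the lock rather than trusting a check made before acquiring it. A minimal sketch, assuming a pthread mutex and a hypothetical `struct vbo` (the names and layout are illustrative, not the driver's):

```c
#include <pthread.h>

/* Hypothetical shared vertex buffer; names are illustrative only. */
struct vbo {
	pthread_mutex_t lock;
	int used;   /* floats already reserved */
	int size;   /* total capacity in floats */
};

/* Reserve 'want' floats. Returns the start index, or -1 if there is
 * not enough room and the caller must flush and retry. */
static int vbo_reserve(struct vbo *v, int want)
{
	int start = -1;

	pthread_mutex_lock(&v->lock);
	/* Check under the lock: if we contended for it, another thread
	 * may have consumed the space we saw before waiting. */
	if (v->size - v->used >= want) {
		start = v->used;
		v->used += want;
	}
	pthread_mutex_unlock(&v->lock);
	return start;
}
```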
|
|
Various assertions to track down a potential programming error.
References: https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/1124576
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The idea is to implement more fine-grained checks as we may want
different heuristics for desktops with GT1s than for mobile GT2s, etc.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
We need to be careful not just to synchronize with the sharing threads
when finishing the current vbo, but also to ensure, before we emit the
batch state, that no other thread will try to do the same.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
When completing a batch mid-operation, we need to wait upon the other
threads to complete their writes so that memory is coherent before
submitting the work to the GPU. This was achieved by forcing the finish,
but all that we need from that is the wait. Waiting directly makes the
handling of threads much more explicit and removes the unnecessary vbo
refresh.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|