path: root/src/sna/sna_accel.c
Age  Commit message  Author
2016-09-28sna: Handle GetImage planemask inplaceChris Wilson
As found by Adam Jackson, we can perform the masking of the planemask on the user buffer and so avoid hitting the fallback paths, so long as we have no 24bpp Pixmaps. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
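A minimal sketch of the idea, assuming the image has already been copied into the user's buffer: the planemask is ANDed into each pixel in place, and unsupported depths such as 24bpp are reported back so the caller can take the fallback path. The name `mask_image_inplace` and its signature are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: clear every bit not selected by planemask in a
 * buffer of `count` pixels. Returns false for depths this sketch does
 * not handle (e.g. 24bpp), leaving the buffer untouched. */
static bool mask_image_inplace(void *data, size_t count, int bpp,
                               uint32_t planemask)
{
    assert(data != NULL || count == 0);

    switch (bpp) {
    case 8: {
        uint8_t *p = data;
        for (size_t i = 0; i < count; i++)
            p[i] &= (uint8_t)planemask;
        return true;
    }
    case 16: {
        uint16_t *p = data;
        for (size_t i = 0; i < count; i++)
            p[i] &= (uint16_t)planemask;
        return true;
    }
    case 32: {
        uint32_t *p = data;
        for (size_t i = 0; i < count; i++)
            p[i] &= planemask;
        return true;
    }
    default:
        return false; /* caller falls back */
    }
}
```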
2016-08-17sna: Add CPU damage to DRI flushChris Wilson
When we damage the CPU shadow of a DRI exported pixmap, we must remember to add that pixmap to the list to be flushed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-08-02sna: Split SHM and DRI flush trackingChris Wilson
Tracking SHM flushes precludes some of the optimisations we can make in future for tracking DRI flushes, so split the two paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-08-02sna: Only flush GPU bo for a damage eventChris Wilson
Based on xf86-video-ati commit 9a1afbf61fbb2827c86bd86d295fa0848980d60b (Author: Michel Dänzer <michel.daenzer@amd.com>, Date: Mon Jul 11 12:22:09 2016 +0900, "Use EventCallback to avoid flushing every time in the FlushCallback"), which reports seeing an improvement in reducing flushes at the expense of checking every event for a DamageNotifyEvent. Since we also mix rendering with SHM buffers, we have a more diverse set of conditions under which to flush, but maybe we will see enough of a win for DRI to merit it. So far we are seeing an improvement of ~20% for series of small operations under the compositor without any regressions, which should benefit composited desktop users. The biggest danger here is missed flushes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-07-20Adapt to libXfont2 ABI changesChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-07-20Update to ABI 22 and NotifyFdChris Wilson
ABI 22 brings in a new BlockHandler/WakeupHandler interface (SetNotifyFd) and throws out the current interface (albeit without delivering any improvements). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-06sna/gen9: Quick and dirty implementationChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-03-16sna: Ensure the scanout is fully flushed on LeaveVTChris Wilson
Just in case we haven't otherwise flushed and invalidated the scanout prior to losing control of the output. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-11-27sna: Add DBG for small_copy()Chris Wilson
Emit a DBG when we decide that the region is small and pass that hint to the backends. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-11-19Re-enable acceleration!Chris Wilson
Double negatives are most confusing before coffee. In removing the double negation from the xorg.conf, I inverted the option in the code but didn't invert the test. As a result, acceleration was now disabled unless you explicitly asked for NoAccel. Reported-by: Jan Steffens Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
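The inversion described above can be illustrated with three tiny predicates. This is only a sketch with hypothetical function names; it does not reproduce the driver's option handling, just the logic of the mistake and its fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Before the rename: the option was "NoAccel" (default false), so
 * acceleration is enabled unless the user asked to disable it. */
static bool accel_enabled_old(bool no_accel)
{
    return !no_accel;
}

/* After the rename, with the bug: the option is now "Accel" (default
 * true) but the test still treats a true value as "disable". */
static bool accel_enabled_buggy(bool accel)
{
    return !accel;
}

/* The fix: the test is inverted to match the inverted option. */
static bool accel_enabled_fixed(bool accel)
{
    return accel;
}
```

With the default `Accel = true`, the buggy predicate returns false, which matches the reported symptom that acceleration was off unless NoAccel was requested.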
2015-11-19Rename Option "NoAccel" to "Accel"Chris Wilson
Totally cribbed from xf86-video-amdgpu/-radeon commit 560b7fe6dc66405762020f00e9a05918a36f3a17 (Author: Michel Dänzer <michel.daenzer@amd.com>, Date: Wed Nov 11 17:31:34 2015 +0900), titled: Rename Option "NoAccel" to "Accel". Renaming the option removes the need for a double negation when forcing acceleration on and is backwards compatible as the option parser automagically handles the 'No' prefix. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
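To show why the rename stays backwards compatible, here is a toy matcher that accepts a boolean option either under its own name or with a "No" prefix, negating the value in the latter case. It is an illustration of the behaviour the message describes, not the X server's parser; `match_bool_option` and its signature are assumptions made for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <strings.h> /* strcasecmp, strncasecmp (POSIX) */

/* Toy stand-in for a boolean option parser. `key` is the name written
 * in the config file, `option` is the name the driver registered.
 * Returns true on a match and stores the effective value in *result. */
static bool match_bool_option(const char *key, const char *option,
                              bool value, bool *result)
{
    assert(key && option && result);

    if (strcasecmp(key, option) == 0) {
        *result = value;
        return true;
    }
    /* "No<option>" means the opposite of "<option>". */
    if (strncasecmp(key, "No", 2) == 0 &&
        strcasecmp(key + 2, option) == 0) {
        *result = !value;
        return true;
    }
    return false;
}
```

Under this scheme an existing configuration that sets "NoAccel" to true still resolves to Accel = false.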
2015-11-09sna: Allow a CPU bo to be read by the GPU as we read from itChris Wilson
Just a minor assertion relaxation to treat a bo as being read by both parties as still coherent. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-08-20sna: Check for system memory contents when looking for empty sourcesChris Wilson
Fixes a regression from commit 3f128867d957e30690218404337b00bb327e647b (Author: Chris Wilson <chris@chris-wilson.co.uk>, Date: Fri Aug 7 15:19:17 2015 +0100, "sna: Skip a no-op copy") that forgot that we can flush damage but still have valid contents to copy from. Reported-by: Timo Aaltonen <tjaalton@ubuntu.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-08-07sna: Skip a no-op copyChris Wilson
If the source has no contents, the destination is equally undefined. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-07-29sna: Fix off by one in constructing XCopyPlane on bdwChris Wilson
Broadwell expanded all the relocations and we needed to adjust our command construction to match. I missed offsetting the XY_SRC_COPY_IMM used for XCopyPlane resulting in garbage for small copies on Broadwell. Reported-by: Omar Sandoval Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91499 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-07-26sna: Add a few more DBG and assertions around Present/TearFree interactionsChris Wilson
References: https://bugs.freedesktop.org/show_bug.cgi?id=91467#c12 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-07-26sna: Add a small pixmap sanity checkChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-07-26sna: Add a some DBG info to Window creation/destructionChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-07-26sna: Add a DBG trace to reusing pixmap headersChris Wilson
References: https://bugs.freedesktop.org/show_bug.cgi?id=91467#c9 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-06-09sna: Pass scanout flag when creating PRIME boChris Wilson
For PRIME bo, we need to use uncached render targets so that any writes are flushed out to main memory where they can be immediately read by a PCI device. For simplicity, we just request that PRIME bo be also SCANOUTs as that ensures that they will be created with the right attributes for coherent main memory. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-06-09sna: Flush SlavePixmap dirty rects before calling ProcessPendingChris Wilson
As the slave may use the ProcessPending damage callback to do its own copying, we need to flush before. Reported-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-06-05sna: Only add the COW to the flush write if exported for writingChris Wilson
If the source is only being exported for reading, we can skip adding it to the flush list only to perform a no-op. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-06-05sna: Add COW source pixmap to flushing listChris Wilson
In the case of an exported pixmap, e.g. with DRI3, it is possible for the client to render into the pixmap whilst we are unaware. To serialise the xserver and the client, we flush all operations on exported pixmaps before talking to the client. In the case of COW however, we did not flush the copy-on-write when transferring control to the client, and thereby we could capture the modified contents. Bugzilla: https://bugs.kde.org/show_bug.cgi?id=340202 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90836 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-06-02sna: Mark GPU as wholly damage when replacing a drawableChris Wilson
References: https://bugs.freedesktop.org/show_bug.cgi?id=90725#c37 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-06-01sna/dri2: Only attempt to change tiling when requiredChris Wilson
If we have no fence, we then try to discard the tiling. However, the change_tiling routine assumes that it is only called when a change is actually required. Make it so for dri2. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-06-01sna: Double check that a tiling change request results in a changeChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-05-18sna: Wrap GetImage with sigtrapChris Wilson
Mostly for completeness, though it is still remotely possible for the dst pointer to raise a SIGBUS (just less likely since it is not an i915 bo). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-05-18sna/glyphs: Improve handling of low bitdepth mask format conversionsChris Wilson
We shouldn't just discard the mask if the user requests that we render the glyphs through a low bitdepth mask - and in doing so we should also be careful not to improve the bitdepth of that mask (since we don't take into account the extra quantisation desired). Testcase: render-glyphs Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-05-17sna: Markup a couple more potential mmap() accessesChris Wilson
All pointer access into a mmap() arena should be wrapped by sigtrap, in case the kernel generates a SIGBUS (oom, eio, bugs, etc). Add a couple more missing annotations. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-05-17sna: Wrap CPU access for composite operations with sigtrapChris Wilson
Anytime we access a mmap() we need to be prepared for the kernel to send us a SIGBUS, but we were missing a few sigtraps around calls to pixman_fill and pixman_blt. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
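The general technique of guarding mmap() accesses against SIGBUS can be sketched with `sigsetjmp`/`siglongjmp`. This is a simplified, single-threaded illustration under assumed names (`trap_install`, `guarded_call`); it is not the driver's sigtrap implementation, and the fault is simulated with `raise()` rather than a real failing mapping.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>

static sigjmp_buf trap_env;
static volatile sig_atomic_t trap_armed;

static void trap_handler(int sig)
{
    if (trap_armed)
        siglongjmp(trap_env, 1);
    /* Fault outside a guarded region: fall back to the default action. */
    signal(sig, SIG_DFL);
    raise(sig);
}

/* Install the SIGBUS handler once at start-up. */
static bool trap_install(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = trap_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL) == 0;
}

/* Run fn(arg); return false if a SIGBUS interrupted it. */
static bool guarded_call(void (*fn)(void *), void *arg)
{
    assert(fn != NULL);

    if (sigsetjmp(trap_env, 1)) { /* saves and restores the signal mask */
        trap_armed = 0;
        return false;
    }
    trap_armed = 1;
    fn(arg);
    trap_armed = 0;
    return true;
}

/* Example accesses: one succeeds, one simulates the kernel's SIGBUS. */
static void fill_bytes(void *arg)     { memset(arg, 1, 4); }
static void simulate_fault(void *arg) { (void)arg; raise(SIGBUS); }
```

A caller would wrap each CPU access to a mapping, for instance a pixman fill or blit, in `guarded_call()` and treat a false return as a failed operation rather than crashing.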
2015-04-21sna: Fix ancient typo in DEFAULT_TILING == YChris Wilson
We could just fix the typo, but that whole if block is redundant. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-04-21sna: Add a define to change scanout tilings by defaultChris Wilson
Just for testing, you hear? Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-04-17sna: Enable blitting with Y-tiled surfacesChris Wilson
Since gen6, there has been a magic register bit to change the interpretation of the tiling mode between X and Y for BLT operations. With the advent of DRI3 and scanouts supporting Y, enabling support at last appears interesting, perhaps even by default for non-scanouts? Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-04-05sna/dri2: Prevent the sw cursor from copyig to a buffer as we discard itChris Wilson
During swapbuffers, the sw cursor tries to write to the old buffer. Ordinarily this is not an issue as we are discarding it, but under TearFree that write causes us to instantiate the shadow buffer with a possible recursion into set_bo and mayhem. v2: commit 226a58bc592d4ed305b7ad0e460f1ee2548e0ddf (Author: Chris Wilson <chris@chris-wilson.co.uk>, Date: Sat Apr 4 20:58:24 2015 +0100, "sna/dri2: Prevent the sw cursor from copyig to a buffer as we discard it") tried to fix it by disabling SourceValidate. However, it is a direct hook into the Damage code by miSprite that triggers the copy. Since there appears to be no way to intervene, we just mark that copy as internal and ignore it. Reported-by: Chris Bainbridge <chris.bainbridge@gmail.com> References: https://bugs.freedesktop.org/show_bug.cgi?id=89903 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-04-02sna: Do not call an extra busy ioctl for scanout flushsChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-04-01sna: Don't unroll BLT pointsChris Wilson
The compiler is smarter than I am; unrolling hurts here. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-04-01sna: Relax unclean rules to check busyness on all foreign pixmapsChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-04-01sna: Fixup inverted logic for new boChris Wilson
We only want to inspect the busy status of bo we have not yet added to our execbuffer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-03-31sna: Flush BLT operations to an idle GPUChris Wilson
We improve dispatch latency if, after creating a command buffer, we immediately submit it when the GPU is idle. This improves concurrency as we continue to build the next command buffer while the GPU executes, and helps prevent needlessly using one engine for too long (i.e. sometimes we may be able to execute the work much earlier and do the ring switch cheaply). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-03-31sna: Query the engine residency on foreign bo before useChris Wilson
Since knowing which ring the bo is currently active on is important when considering the impact of semaphores on the next operation, be sure to query it on foreign bo before we use them. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-03-31sna/gen6+: Prefer the BLT for small copiesChris Wilson
Even on GT3, it is preferable to use the blitter if the copy is small (due to the latency in execution). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-03-31sna/gen6+: Prefer the BLT for small self-copiesChris Wilson
Even on GT3, it is preferable to use the blitter if the copy is small (due to the latency in execution). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-03-14sna: Skip inplace operation to a busy clear GPU boChris Wilson
Since clearing is a relatively trivial operation, allow us to do the clear to a CPU bo rather than block on the GPU. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-03-12sna: Futureproof acceleration backend selectionChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-03-05sna: Remove the flush after waking up between clientsChris Wilson
In the normal command processing stream, we will have lots of opportunity to ask whether we should be batching requests together. If we wake up without doing any work, then we will check inside the block handler whether the GPU is idle and flush then. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-03-02sna: Tweak find clip box assertsChris Wilson
Reordering the asserts to save one predicate! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-03-01sna: Unrecurse clip box searchChris Wilson
Unwind the trivial tail recursion from the clip box bisection and add a couple of assertions on the inlined fast-paths. References: https://bugs.freedesktop.org/show_bug.cgi?id=89295 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
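As an illustration of unwinding a tail-recursive bisection into a loop, the sketch below finds the first clip box whose bottom edge lies below a given y. It assumes the boxes are sorted so that `y2` never decreases, as in a y-x banded region; `struct box` and `find_box_for_y` are hypothetical names and this is not the driver's routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct box { int16_t x1, y1, x2, y2; };

/* Return the first box in [begin, end) with y2 > y, or `end` if none.
 * The recursive form would call itself on one half of the range; here
 * the halving simply updates begin/end and loops. */
static const struct box *find_box_for_y(const struct box *begin,
                                        const struct box *end,
                                        int16_t y)
{
    assert(begin <= end);

    while (begin < end) {
        const struct box *mid = begin + (end - begin) / 2;

        if (mid->y2 > y)
            end = mid;       /* answer is mid or earlier */
        else
            begin = mid + 1; /* answer is after mid */
    }
    return begin;
}
```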
2015-02-24sna: Discard addition of drawable offset if 0Chris Wilson
Missing trim of "add 0" from commit 0b7a6666f82b4fa07f9c9d9a9c1819efc363b31b (Author: Chris Wilson <chris@chris-wilson.co.uk>, Date: Mon Jan 5 14:00:44 2015 +0000, "sna: Partially unroll conversion of rectangles to boxes for fills"): not all redundant +(dx,dy) were dropped. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
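One way to picture the "add 0" trim: when converting rectangles to boxes, the test for a zero drawable offset is hoisted out of the loop so that the common case performs no additions of (dx, dy) at all. The types and the function `rects_to_boxes` below are assumptions for this sketch and do not mirror the driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct rect { int16_t x, y; uint16_t width, height; };
struct box  { int16_t x1, y1, x2, y2; };

/* Convert n rectangles to boxes, translating by (dx, dy) only when the
 * offset is non-zero. */
static void rects_to_boxes(const struct rect *r, size_t n,
                           int16_t dx, int16_t dy, struct box *b)
{
    assert((r != NULL && b != NULL) || n == 0);

    if (dx | dy) {
        for (size_t i = 0; i < n; i++) {
            b[i].x1 = (int16_t)(r[i].x + dx);
            b[i].y1 = (int16_t)(r[i].y + dy);
            b[i].x2 = (int16_t)(b[i].x1 + r[i].width);
            b[i].y2 = (int16_t)(b[i].y1 + r[i].height);
        }
    } else {
        /* Zero offset: skip the redundant additions. */
        for (size_t i = 0; i < n; i++) {
            b[i].x1 = r[i].x;
            b[i].y1 = r[i].y;
            b[i].x2 = (int16_t)(r[i].x + r[i].width);
            b[i].y2 = (int16_t)(r[i].y + r[i].height);
        }
    }
}
```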
2015-02-11sna: Partially unroll conversion of rectangles to boxes for fillsChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-02-06sna/trapezoids: Use incremental region clipping for spansChris Wilson
Within a span, we have the advantage of knowing that we only need to intersect one box with the clip region, and that box has monotonically increasing y. This avoids having to compute RegionIntersect for every span element, which was very slow (e.g. libreoffice). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
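A rough sketch of incremental clipping follows, assuming the clip boxes are in y-x banded order and that successive spans arrive with non-decreasing y. A cursor remembers how far into the clip list earlier spans have advanced, so each span only inspects the boxes that can still overlap it. The names `clip_cursor` and `clip_span` are hypothetical, not the driver's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct box { int16_t x1, y1, x2, y2; };

struct clip_cursor {
    const struct box *cur; /* first box that may still matter */
    const struct box *end;
};

static int16_t max16(int16_t a, int16_t b) { return a > b ? a : b; }
static int16_t min16(int16_t a, int16_t b) { return a < b ? a : b; }

/* Intersect one span box with the clip, writing up to max_out pieces
 * to out[]. Returns the number of pieces written. */
static int clip_span(struct clip_cursor *c, const struct box *span,
                     struct box *out, int max_out)
{
    int n = 0;

    assert(c && span && (out || max_out == 0));

    /* Boxes wholly above this span are above every later span too. */
    while (c->cur != c->end && c->cur->y2 <= span->y1)
        c->cur++;

    for (const struct box *b = c->cur;
         b != c->end && b->y1 < span->y2; b++) {
        struct box r;

        r.x1 = max16(b->x1, span->x1);
        r.x2 = min16(b->x2, span->x2);
        r.y1 = max16(b->y1, span->y1);
        r.y2 = min16(b->y2, span->y2);
        if (r.x1 < r.x2 && r.y1 < r.y2 && n < max_out)
            out[n++] = r;
    }
    return n;
}
```

The cursor is initialised once per clip region and reused for every span, so the skipped prefix of the clip list is never rescanned.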