Age | Commit message (Collapse) | Author |
|
When glamor is enabled, we have to route all the drawing functions to
glamor first. This adds a few more functions that previously just fell
back to swrast and passes them to glamor instead.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
With the introduction of GEM, we can continue to submit batch buffers
irrespective of ownership of the console, so do so.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
So that we avoid leaking the region if hooking into glamor.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This commit hooks up all the remaining rendering routines to call into
glamor; the takeover is nearly complete! When tested with the latest
glamor master branch, it passes rendercheck.
One thing that needs to be pointed out is the handling of pictures.
Pictures support many different color formats, but glamor's
textures only support a few. The most common scenario is that
we create a pixmap with a given color depth and then attach it
to a picture which has a specific color format of the same
depth. But there is no way to change a texture's internal
format after the texture has been allocated. If you do that,
OpenGL will allocate a new texture, and then the glamor side
and the UXA side will be inconsistent. So for all
picture-related operations, we can't fall back to the UXA path
directly, even for a rather straightforward operation. So for
get_image, Addtraps, etc., we have to add wrapper functions
that jump into glamor first.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
[ickle: prefer access; ok = glamor(); finish; if (!ok) goto fallback; return; ]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
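
The bracketed note above describes a control-flow pattern: take access, try glamor, release access, and only then decide whether to fall back. A minimal sketch of that shape follows. Every name in it is a hypothetical stand-in rather than the driver's real API, and reading "access" as a prepare-access call is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pixmap state used only to observe the control flow. */
struct pixmap {
    int access_depth;   /* outstanding prepare_access calls */
    int hw_ops;         /* operations handled by the "glamor" path */
    int sw_ops;         /* operations handled by the fallback path */
};

static bool prepare_access(struct pixmap *p) { p->access_depth++; return true; }
static void finish_access(struct pixmap *p)  { p->access_depth--; }

/* Stand-in for a glamor entry point: pretend it only handles even widths. */
static bool glamor_try_fill(struct pixmap *p, int width)
{
    if (width % 2 != 0)
        return false;
    p->hw_ops++;
    return true;
}

/* Stand-in for the software path; it takes and releases access itself. */
static void software_fill(struct pixmap *p, int width)
{
    (void)width;
    if (!prepare_access(p))
        return;
    p->sw_ops++;
    finish_access(p);
}

void fill(struct pixmap *p, int width)
{
    bool ok;

    if (!prepare_access(p))
        goto fallback;
    ok = glamor_try_fill(p, width);
    finish_access(p);           /* always released before deciding */
    if (!ok)
        goto fallback;
    return;

fallback:
    software_fill(p, width);
}
```

Releasing access before the `goto` keeps the prepare/finish calls balanced on both the success and the fallback route.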
|
|
Fixes regression from e0066e77e026b0dd0daa0c3765473c7d63aa6753
(uxa: Simplify Composite solid acceleration for spans by only clipping
once) [2.15.901]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43649
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This commit only enables two glamor functions for
uxa_fill_spans and uxa_poly_fill_rects.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Integrate glamor acceleration into the UXA framework. Add the
necessary flushing at the following points:
1. Flush the UXA batch buffer before calling into glamor.
2. Flush GL operations after returning from a glamor function.
3. Wherever we need to flush the UXA batch buffer, we also
need to flush GL operations, for example in
intel_flush_callback and a couple of places in intel_display.c.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
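
The three flushing rules above amount to an ordering constraint. The sketch below records events in a log to make that ordering visible. The function names and the log structure are hypothetical illustrations, not code from the driver.

```c
#include <stddef.h>

enum event { EV_FLUSH_BATCH, EV_GLAMOR_OP, EV_FLUSH_GL };

/* Hypothetical event log used to observe call ordering. */
struct event_log {
    enum event ev[16];
    size_t n;
};

static void record(struct event_log *l, enum event e)
{
    if (l->n < sizeof(l->ev) / sizeof(l->ev[0]))
        l->ev[l->n++] = e;
}

static void flush_uxa_batch(struct event_log *l) { record(l, EV_FLUSH_BATCH); }
static void flush_gl(struct event_log *l)        { record(l, EV_FLUSH_GL); }
static void glamor_op(struct event_log *l)       { record(l, EV_GLAMOR_OP); }

/* Rules 1 and 2: bracket every glamor call with the two flushes. */
void call_into_glamor(struct event_log *l)
{
    flush_uxa_batch(l);   /* pending UXA work must land first */
    glamor_op(l);
    flush_gl(l);          /* GL work must land before UXA resumes */
}

/* Rule 3: any point that flushes the batch also flushes GL. */
void flush_callback(struct event_log *l)
{
    flush_uxa_batch(l);
    flush_gl(l);
}
```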
|
|
The attempt was still ridden with bugs, such as
http://bugs.freedesktop.org/show_bug.cgi?id=28768
http://bugs.freedesktop.org/show_bug.cgi?id=28798
http://bugs.freedesktop.org/show_bug.cgi?id=28908
http://bugs.freedesktop.org/show_bug.cgi?id=29401
A fresh approach was taken with SNA, but in the meantime, before that
can be enabled downstream, restore correct behaviour.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Reviewed-by: Keith Packard <keithp@keithp.com>
|
|
Unlike the previous commit removing this style of code, the code in
this one was originally wrong, and would fail to clip in the second
pass of clipping when y was > pbox->y2.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=37233
Reviewed-by: Keith Packard <keithp@keithp.com>
|
|
We were clipping each span against the bounds of the clip, throwing
out the span early if it was all clipped, and then walked the clip box
clipping against each of the cliprects. We would expect spans to
typically be clipped against one box, and not thrown out, so we were
not saving any work there. For multiple cliprects, we were adding
work. Only for many spans clipped entirely out of a complicated clip
region would it have saved work, and it clearly didn't save bugs as
evidenced by the many fix attempts here.
Reviewed-by: Keith Packard <keithp@keithp.com>
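
The simplification described above, clipping each span directly against every cliprect with no extents pre-pass, can be sketched as follows. The `struct box` layout (exclusive `x2`/`y2`) and the function name are assumptions made for illustration, not the driver's code.

```c
#include <stddef.h>

/* Hypothetical clip rectangle; x2 and y2 are exclusive. */
struct box { int x1, y1, x2, y2; };

/*
 * Clip the horizontal span [x1, x2) on row y against each cliprect in turn.
 * Clipped pieces are written to out (which must have room for nclips
 * entries); the number of pieces is returned.
 */
size_t clip_span(const struct box *clips, size_t nclips,
                 int x1, int x2, int y, struct box *out)
{
    size_t n = 0;

    for (size_t i = 0; i < nclips; i++) {
        const struct box *b = &clips[i];
        int cx1, cx2;

        if (y < b->y1 || y >= b->y2)
            continue;                     /* row lies outside this box */

        cx1 = x1 > b->x1 ? x1 : b->x1;
        cx2 = x2 < b->x2 ? x2 : b->x2;
        if (cx1 >= cx2)
            continue;                     /* nothing left after clipping */

        out[n].x1 = cx1;
        out[n].x2 = cx2;
        out[n].y1 = y;
        out[n].y2 = y + 1;
        n++;
    }
    return n;
}
```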
|
|
uxa_acquire_solid returns NULL under OOM. Thus the value of solid
must be checked before dereferencing it in the uxa_get_offscreen()
call.
Signed-off-by: Bryce Harrington <bryce@canonical.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
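
A minimal sketch of the check described above, with a hypothetical allocator standing in for uxa_acquire_solid (the real signature is not shown in this log):

```c
#include <stdbool.h>
#include <stdlib.h>

struct picture { unsigned int color; };

/* Hypothetical stand-in: returns NULL when allocation fails. */
static struct picture *acquire_solid(unsigned int color, bool simulate_oom)
{
    struct picture *p;

    if (simulate_oom)
        return NULL;
    p = malloc(sizeof(*p));
    if (p != NULL)
        p->color = color;
    return p;
}

/* Returns false instead of dereferencing a NULL solid. */
bool composite_solid(unsigned int color, bool simulate_oom)
{
    struct picture *solid = acquire_solid(color, simulate_oom);
    unsigned int seen;

    if (solid == NULL)
        return false;         /* bail out before any dereference */

    seen = solid->color;      /* safe: solid was checked above */
    free(solid);
    return seen == color;
}
```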
|
|
Signed-off-by: Matthias Hopf <mhopf@suse.de>
|
|
|
|
planemask is an unsigned long initialised to ~0; on 64-bit this is not equal
to an (unsigned int)-1.
Use the macro provided to do this.
Signed-off-by: Dave Airlie <airlied@redhat.com>
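
The width mismatch described above can be reproduced in isolation. The commit does not name the macro it switches to, so `pm_is_solid` and `full_mask` below are hypothetical helpers that only illustrate a depth-masked comparison.

```c
#include <stdbool.h>

/* All-ones mask for a drawable of the given depth (1..32); hypothetical. */
static unsigned long full_mask(int depth)
{
    return depth >= 32 ? 0xffffffffUL : (1UL << depth) - 1;
}

/*
 * Naive test: (unsigned int)-1 is 0xffffffff, which is widened to
 * unsigned long before the comparison. Where long is 64 bits, ~0UL is
 * 0xffffffffffffffff, so the two values differ.
 */
bool pm_is_solid_naive(unsigned long pm)
{
    return pm == (unsigned int)-1;
}

/* Depth-masked test: only the bits the drawable actually has are compared. */
bool pm_is_solid(int depth, unsigned long pm)
{
    return (pm & full_mask(depth)) == full_mask(depth);
}
```

On a platform where `unsigned long` and `unsigned int` have the same width the naive test happens to work, which is why the bug only shows up on 64-bit builds.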
|
|
A slight confusion in computing the correction image location resulted
in the application of the source offsets to the pixel location in the
target and not in the source as intended.
Fixes the visual corruption of the scrollbar in Chromium, and hopefully
the crash reported by Robert Hooker when starting gdm after plymouth.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Now with streaming uploads and downloads for composite operations in
place, shared memory pixmaps are no longer that dire performance-wise.
With careful use these can in fact be the most efficient means of
transfer between a wholly software renderer in the client and a backing
store. For instance, Chromium renders internally to an ARGB32 image
buffer and uses a shared pixmap to composite dirty regions into the
backing store, thereby using the GPU to perform either the blit or the
format conversion. Enabling shared pixmaps reduces our CPU overhead
whilst scrolling by a factor of 5 or so.
And this is achieved simply by deleting obsolete code!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
|
|
All but uxa_copy_window() perform the preliminary checks for whether
acceleration is available. The simplest method for adding the fallback
for uxa_copy_window() seems to be to add it in the core copy function,
so be it.
This allows X to survive a little longer once we encounter a GPU hang.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Trigger happy bug fixing. The sign *was* right, the endpoint was wrong.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Introduced with e5c971e7639095d38da3518a5dc404b708d45cfb.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This is wildly optimistic, but it should work in a surprising number of
error situations, and some output in those cases will hopefully be
better than none...
If we submit a batchbuffer and the kernel reports the GPU is hung (which
will be caused by an earlier execbuffer, and so the kernel should have
had enough time to determine whether or not it could reset the GPU) then
disable any further attempt to accelerate gfx and force fallbacks to map
the buffers and use the CPU. We cannot normally map any more buffers if
the GPU is hung, so only those already mapped prior to the hang can be
written to, or those allocated in system memory. However, we can expect
that the framebuffer is already mapped, and so have a reasonable
expectation to continue to see the display update.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
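
The behaviour described above boils down to a sticky flag that is set when batch submission reports a hang and that every drawing entry point consults. A rough sketch follows. The structure, the function names, and the use of -EIO as the hang indication are assumptions for illustration, not taken from this log.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical driver state. */
struct driver {
    bool wedged;        /* set once the GPU is reported hung */
    int accelerated;    /* operations sent to the GPU */
    int fallbacks;      /* operations done on the CPU */
};

/* Stand-in for batch submission: 0 on success, negative errno on error. */
static int submit_batch(bool gpu_hung)
{
    return gpu_hung ? -EIO : 0;
}

void flush_batch(struct driver *d, bool gpu_hung)
{
    if (d->wedged)
        return;                     /* nothing more to submit */
    if (submit_batch(gpu_hung) == -EIO)
        d->wedged = true;           /* disable acceleration from now on */
}

void fill_rect(struct driver *d)
{
    if (d->wedged) {
        d->fallbacks++;             /* CPU path on already-mapped buffers */
        return;
    }
    d->accelerated++;
}
```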
|
|
Fixes GPU hang on gen6.
|
|
Due to the relocation overhead, using a single composite with many
rectangles outperforms many solid blits.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
If the destination cannot fit into the 3D pipeline when we need to
composite, we fall back to doing the operation on the CPU. This is very
slow, and quite easy to trigger on i915 by plugging in an external
display.
An alternative is to extract the extents of the operation from the
destination using the blitter which can usually handle much larger
operations. This gives us a temporary target that can fit into the 3D
pipeline and thus be accelerated, before copying back into the larger
real destination.
For x11perf this boosts glyph rendering on PineView from 38kglyphs/s to
480kglyphs/s, just a little shy of the native performance of 601kglyphs/s.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
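
The approach described above (copy the affected extents out with the blitter, operate on a small temporary, copy back) can be modelled on plain byte buffers. In this sketch `MAX_3D_SIZE`, `struct surface`, and every function are hypothetical, and the "3D" operation is simply an add to each pixel.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_3D_SIZE 4   /* hypothetical pipeline limit, tiny for illustration */

struct surface { int width, height; unsigned char *pixels; };

/* Stand-in for the blitter: copies a w x h rectangle, no size limit. */
static void blit(const struct surface *src, int sx, int sy,
                 struct surface *dst, int dx, int dy, int w, int h)
{
    for (int row = 0; row < h; row++)
        memcpy(dst->pixels + (size_t)(dy + row) * dst->width + dx,
               src->pixels + (size_t)(sy + row) * src->width + sx,
               (size_t)w);
}

static bool fits_3d(const struct surface *s)
{
    return s->width <= MAX_3D_SIZE && s->height <= MAX_3D_SIZE;
}

/* Stand-in for the 3D pipeline: refuses surfaces that are too large. */
static bool composite_3d(struct surface *dst, int x, int y, int w, int h,
                         unsigned char add)
{
    if (!fits_3d(dst))
        return false;
    for (int row = y; row < y + h; row++)
        for (int col = x; col < x + w; col++) {
            unsigned char *p = &dst->pixels[(size_t)row * dst->width + col];
            *p = (unsigned char)(*p + add);
        }
    return true;
}

/* The rectangle is assumed to lie entirely inside dst. */
bool composite(struct surface *dst, int x, int y, int w, int h,
               unsigned char add)
{
    struct surface tmp;
    bool ok;

    if (fits_3d(dst))
        return composite_3d(dst, x, y, w, h, add);
    if (w > MAX_3D_SIZE || h > MAX_3D_SIZE)
        return false;                       /* extents too big: CPU fallback */

    tmp.width = w;
    tmp.height = h;
    tmp.pixels = malloc((size_t)w * (size_t)h);
    if (tmp.pixels == NULL)
        return false;

    blit(dst, x, y, &tmp, 0, 0, w, h);      /* extract the extents */
    ok = composite_3d(&tmp, 0, 0, w, h, add);
    if (ok)
        blit(&tmp, 0, 0, dst, x, y, w, h);  /* copy the result back */
    free(tmp.pixels);
    return ok;
}
```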
|
|
Use composite rather than solid blits in order to bring performance on
a par with the CPU when using GEM and relocations.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Allow us to check whether we can handle the operation using the blitter
prior to doing any work.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This reverts commit 6d50553e8f70d8f2142efdfd6c90bc27a599d0bc.
Now we have taught the fallback path not to infinitely recurse,
re-enable the accelerated path for ShmPutImage and friends.
|
|
This reverts commit 27195d7dba0f3ff08b92f3fd916cdf5113cbef58.
put_image often calls copy_area, which calls put_image. Exhaustion of
the stack follows.
|
|
This reverts commit 299b0338d0811192dc4f8eae5d79453e9882c5d1.
A debugging patch, it was never intended to go into master.
|
|
Often, for example in the fallback for ShmPutImage, we will attempt to
use uxa_copy_area() copying to a normal pixmap from a memory buffer.
This triggers a fallback, and maps the destination pixmap back into the
GTT. The accelerated put_image path will attempt to stream a blit to the
destination pixmap if it is currently active, avoiding the stall.
|
|
|
|
|
|
Around a call to uxa_put_image() it is possible to mix both accelerated
and fallback paths, with the fallback code making the presumed
optimisation of only trying to call uxa_prepare_access() once. This
fails if the accelerated path also uses prepare/finish access on the
same drawable and then later falls back to the fallback path. This can
happen currently if an error is reported whilst attempting to accelerate
PutImage.
#0 memcpy () at ../sysdeps/x86_64/memcpy.S:162
#1 0x00007ffff43ce4bd in fbBlt (srcLine=<value optimized out>, srcStride=40, srcX=<value optimized out>, dstLine=0xffffffffffffffff, dstStride=64, dstX=0, width=<value optimized out>, height=8, alu=3, pm=4294967295, bpp=8, reverse=0, upsidedown=0) at fbblt.c:93
#2 0x00007ffff43ce740 in fbBltStip (src=0xffffffffffffffff, srcStride=156555204, srcX=34, dst=0xfffffffc, dstStride=64, dstX=40, width=304, height=8, alu=3, pm=4294967295, bpp=8) at fbblt.c:944
#3 0x00007ffff4c32c53 in uxa_do_put_image (pDrawable=0x246aa410, pGC=0x2c0a4f0, depth=8, x=0, y=0, w=38, h=8, leftPad=0, format=2, bits=0x954d7c4 "") at uxa-accel.c:196
#4 uxa_do_shm_put_image (pDrawable=0x246aa410, pGC=0x2c0a4f0, depth=8, x=0, y=0, w=38, h=8, leftPad=0, format=2, bits=0x954d7c4 "") at uxa-accel.c:223
#5 uxa_put_image (pDrawable=0x246aa410, pGC=0x2c0a4f0, depth=8, x=0, y=0, w=38, h=8, leftPad=0, format=2, bits=0x954d7c4 "") at uxa-accel.c:289
#6 0x00000000004d574f in damagePutImage (pDrawable=0x246aa410, pGC=0x2c0a4f0, depth=8, x=0, y=0, w=38, h=8, leftPad=0, format=2, pImage=0x954d7c4 "") at damage.c:905
#7 0x00000000004287db in ProcPutImage (client=0x47ca72d0) at dispatch.c:2073
#8 0x000000000042bd94 in Dispatch () at dispatch.c:445
#9 0x000000000042513a in main (argc=4, argv=0x7fffffffe2a8, envp=<value optimized out>) at main.c:285
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
We've talked about doing this since the start of the project, putting it off
until "some convenient time". Just after removing a third of the driver seems
like a convenient time, when backporting's probably not happening much anyway.
|
|
Signed-off-by: Keith Packard <keithp@keithp.com>
|
|
|
|
|
|
|
|
|
|
uxa_prepare_access may fail to map the pixmap into user space. Recover from
this without crashing.
Signed-off-by: Keith Packard <keithp@keithp.com>
|
|
Failing xalloc in a rendering function means just dropping the drawing on
the floor (that's what we've always done).
|
|
|
|
|
|
This eliminates the cost of EXA migration management while providing full
pixmap allocation control to the driver. The goal is to make something
useful for UMA drivers.
|