path: root/src/i830_render.c
Age, commit message, author
2012-05-23intel: convert to new screen conversion APIsDave Airlie
The compat header takes care of the old server vs new server. This commit was autogenerated from util/modular/x-driver-screen-scrn-conv.sh Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-03-15uxa: Simplify flush trackingChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-23i830: amalgamate consecutive composites into a single primitiveChris Wilson
Improve aa10text on i845 from 218kglyphs/s to 234kglyphs/s Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
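The idea of amalgamating consecutive composites can be sketched roughly as below: while the render state is unchanged, each new rectangle extends the primitive that is already open instead of emitting a fresh primitive header. This is only an illustration; `struct batch`, `emit_rect` and the state id are hypothetical names, and the three vertices per rectangle are an assumption about a rectangle-list primitive, not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the batch buffer; not the driver's structure. */
struct batch {
    int num_primitives;   /* primitive headers emitted so far */
    int open_vertices;    /* vertices in the open primitive, 0 if none */
    uint32_t open_state;  /* state id the open primitive was started with */
};

/* Emit one rectangle (assumed: 3 vertices) under the given state id. */
static void emit_rect(struct batch *b, uint32_t state)
{
    if (b->open_vertices == 0 || b->open_state != state) {
        /* Nothing open, or the state changed: start a new primitive. */
        b->num_primitives++;
        b->open_vertices = 0;
        b->open_state = state;
    }
    /* Append this rectangle's vertices to the open primitive. */
    b->open_vertices += 3;
}
```

With this shape, a run of glyphs drawn with identical state costs one primitive header rather than one per glyph, which is the kind of saving the aa10text figure above would reflect.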
2010-12-03i965: Amalgamate surface binding tablesChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-07Include a chipset generation number to clarify device specific paths.Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-25Rename common infrastructure to the intel namespace.Chris Wilson
After splitting out the i810 driver into its own legacy directory, we can identify the common routines not as i830 but as intel. This clarifies the code which *is* i830 specific. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-14i965: Sanity check ComponentAlpha status in prepare_compositeChris Wilson
Fixes: Bug 28446 - Garbled Font with Mathematica 7
https://bugs.freedesktop.org/show_bug.cgi?id=28446

Rewriting the glyphs to render to the destination directly and removing the more expensive multiple invocations of CompositePicture per picture was a great performance boost -- except that it needs special handling in the backend in order to not fallback. Having done so for i915, I neglected to ensure the sanity checking in i965_prepare_composite() was sufficient. As it turns out, it was not and so we misrendered CA-glyphs when rendering directly to the destination.

This causes us to fallback properly, but is a performance regression as we no longer try the 2-pass magic helper before resorting to s/w. At the moment, I'd rather live with the temporary regression and fix i965 to do the same magic as i915, as it is critical to fixing the severe performance issues currently crippling i965, and as I believe that this regression only affects the minority of applications (incorrect, as it turns out, as the glyphs are overlapping) rendering directly to the destination.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-09Revert "xp:trapezoids"Chris Wilson
This reverts commit f429fb9d872950705e11171d0e7407fb7673c786. An experimental patch I forgot was on my main branch as I was bugfixing. ARGH!
2010-06-08xp:trapezoidsChris Wilson
2010-05-24uxa: Use temporary dest when target is too large for compositorChris Wilson
If the destination cannot fit into the 3D pipeline when we need to composite, we fallback to doing the operation on the CPU. This is very slow, and quite easy to trigger on i915 by plugging in an external display. An alternative is to extract the extents of the operation from the destination using the blitter which can usually handle much larger operations. This gives us a temporary target that can fit into the 3D pipeline and thus be accelerated, before copying back into the larger real destination. For x11perf this boosts glyph rendering on PineView, from 38kglyphs/s to 480kglyphs/s. Just a little shy of the native performance of 601kglyphs/s Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
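The decision described in this message can be sketched as a three-way choice. This is a minimal illustration, not the uxa code: `struct box`, `choose_path` and the `max` parameter (the 3D pipeline's size limit, whatever it is on a given chipset) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Half-open extents of the operation within the destination (hypothetical). */
struct box { int x1, y1, x2, y2; };

enum path { PATH_DIRECT, PATH_TEMPORARY, PATH_SOFTWARE };

/* True if a surface of this size fits within the assumed 3D limit. */
static bool fits_3d(int width, int height, int max)
{
    return width <= max && height <= max;
}

/* Pick how to composite into a dst_w x dst_h destination. */
static enum path choose_path(int dst_w, int dst_h, struct box op, int max)
{
    if (fits_3d(dst_w, dst_h, max))
        return PATH_DIRECT;      /* destination is usable as a render target */
    if (fits_3d(op.x2 - op.x1, op.y2 - op.y1, max))
        return PATH_TEMPORARY;   /* blit extents out, composite, blit back */
    return PATH_SOFTWARE;        /* even the extents are too large */
}
```

The temporary path trades two extra blits for staying on the GPU, which matches the message's observation that the result lands a little short of native performance.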
2010-05-24Kill paranoid assertions on every write into the batchbuffer.Chris Wilson
On my PineView box these represent ~5% overhead on x11perf text:

Before:
  16000000 trep @ 0.0020 msec (495000.0/sec): Char in 80-char aa line (Charter 10)
  12000000 trep @ 0.0022 msec (461000.0/sec): Char in 80-char rgb line (Charter 10)
After:
  16000000 trep @ 0.0020 msec (511000.0/sec): Char in 80-char aa line (Charter 10)
  16000000 trep @ 0.0021 msec (480000.0/sec): Char in 80-char rgb line (Charter 10)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-16i830: Encode surface bpp into formatChris Wilson
References: Bug 28135 - [855GM] Slowdown/High CPU-Usage after Git-Commit 926fbc7d90ac1d0d49d154f136f9c9ed613c98c2
https://bugs.freedesktop.org/show_bug.cgi?id=28135

The simple answer is that I had assumed that 0 was a reserved value. However, without the bpp encoded into the format, 0 was used for a8r8g8b8 and r5g6b5, which are very common formats! The other possibility for the slowdown is that gtkperf is using one of the now verboten xrgb formats -- which would in fact be valid if the source covers the clip and we could fixup the alpha value in the fixed function combine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
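To illustrate why folding the bpp into the format value matters, here is a small sketch. The macro names and bit positions are hypothetical, chosen only so that the per-depth type code is 0 for both common formats, as the message describes; they are not the driver's register definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical surface-depth field; never zero for a real format. */
#define SURF_16BIT     (2u << 6)
#define SURF_32BIT     (3u << 6)

/* Hypothetical per-depth type codes; both common formats happen to be 0. */
#define TYPE_RGB565    (0u << 3)
#define TYPE_ARGB8888  (0u << 3)

/* Combine depth and type so that 0 can safely mean "no such format". */
static uint32_t encode_format(uint32_t surf_bpp, uint32_t type)
{
    return surf_bpp | type;
}
```

With the depth folded in, a lookup that returns 0 for "unsupported" no longer collides with a8r8g8b8 or r5g6b5.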
2010-05-15i830: Remove incorrectly mapped tex formats.Chris Wilson
We no longer workaround the lack of alpha expansion for xrgb textures as this interferes with EXTEND_NONE, though we could if we know the source covers the clip... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-10uxa: Rearrange checking and preparing of composite textures.Chris Wilson
x11perf regression caused by 2D driver
https://bugs.freedesktop.org/show_bug.cgi?id=28047
caused by commit a7b800513fcc94e063dfd68d2f63b6bab7fae47d
uxa: Extract sub-region from in-memory buffers.

The issue is that as we extract the region prior to checking whether the composite can in fact be accelerated, we perform expensive surplus operations. This is particularly noticeable for ComponentAlpha text, such as rgb10text. The solution here is to move check_composite() ahead of acquiring the sources, and to extract the subregion only if the render path cannot actually handle the texture.

Performance (on PineView):
  a7b800513^: aa=68600 glyphs/s, rgb=29900 glyphs/s
  a7b800513:  aa=65700 glyphs/s, rgb=13200 glyphs/s
  now:        aa=66800 glyphs/s, rgb=28800 glyphs/s

The residual lossage seems to be from the extra function call and dixPrivate lookups. Hmm. More worrying is the extremely low performance, however the results are consistent so the improvement looks real...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
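The reordering described here amounts to running the cheap acceptance test before any expensive source handling. A minimal sketch, assuming hypothetical helpers (`check_op`, `texture_fits`, `composite`) and a counter standing in for the costly sub-region extraction:

```c
#include <assert.h>
#include <stdbool.h>

struct picture { int width, height; };  /* hypothetical source description */

static int extractions;                 /* counts expensive sub-region copies */

/* Cheap test: can the render path accelerate this operation at all? */
static bool check_op(int op, int num_supported_ops)
{
    return op >= 0 && op < num_supported_ops;
}

/* Cheap test: can the source be used as a texture directly? */
static bool texture_fits(const struct picture *p, int max_size)
{
    return p->width <= max_size && p->height <= max_size;
}

/* Returns false to signal a software fallback. */
static bool composite(int op, const struct picture *src,
                      int num_supported_ops, int max_size)
{
    if (!check_op(op, num_supported_ops))
        return false;                   /* bail out before any expensive work */
    if (!texture_fits(src, max_size))
        extractions++;                  /* extract a sub-region only when needed */
    return true;
}
```

An operation that is going to fall back anyway then never pays for the extraction, which is where the rgb10text figure recovers.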
2010-04-13i830 render: check aperture space requirementsDaniel Vetter
No point not doing this. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2010-04-13i830 render: use tiling bits where possibleDaniel Vetter
This is in preparation to explicit fence allocation with execbuf2. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2010-03-16Fill alpha on xrgb images.Chris Wilson
Do not try to fixup the alpha in the ff/shaders as this has the side-effect of overriding the alpha value of the border color, causing images to be padded with black rather than transparent. This can generate large and obnoxious visual artefacts. Fixes: Bug 17933 - x8r8g8b8 doesn't sample alpha=0 outside surface bounds http://bugs.freedesktop.org/show_bug.cgi?id=17933 and many related cairo test suite failures. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-03-16i830: Remove coord-adjust for nearest centre-sampling.Chris Wilson
Fixes a number of cairo test suite failures. Also affects: Bug 16917 - Blur on y-axis also when only x-axis is scaled bilinear http://bugs.freedesktop.org/show_bug.cgi?id=16917 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-02-20Remove dead assignments noticed by clang.Eric Anholt
2009-12-07batch: Ensure we send a MI_FLUSH in the block handler for TFPChris Wilson
This should restore the previous level of synchronisation between textures and pixmaps, but *does not* guarantee that a texture will be flushed before use. tfp should be fixed so that the ddx can submit the batch if required to flush the pixmap.

A side-effect of this patch is to rename intel_batch_flush() to intel_batch_submit() to reduce the confusion of executing a batch buffer with that of emitting a MI_FLUSH.

Should fix the remaining rendering corruption involving tfp [inc compiz]:

  Bug 25431 [i915 bisected] piglit/texturing_tfp regressed
  http://bugs.freedesktop.org/show_bug.cgi?id=25431

  Bug 25481 Wrong cursor format and cursor blink rate with compiz enabled
  http://bugs.freedesktop.org/show_bug.cgi?id=25481

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-29batch: Emit a 'pipelined' flush when using a dirty source.Chris Wilson
Ensure that the render caches and texture caches are appropriately flushed when switching a pixmap from a target to a source. This should fix bug 24315, [855GM] Rendering corruption in text (usually) https://bugs.freedesktop.org/show_bug.cgi?id=24315 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-10Check that batch buffers are atomic.Chris Wilson
Since batch buffers are rarely emitted by themselves but as part of a sequence of state and vertices, the whole sequence is emitted atomically. Here we just enforce that batches are marked as being part of an atomic sequence as appropriate. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
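What "atomic" means for a batch can be sketched as follows: space for the whole state-plus-vertices sequence is reserved up front, so the batch is submitted before the sequence starts rather than in the middle of it, and every write is checked against the reservation. The structure and function names below are hypothetical and only illustrate the invariant being enforced.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical batch bookkeeping; sizes are in dwords. */
struct batch {
    int used;          /* dwords written so far */
    int size;          /* capacity of the batch */
    bool in_atomic;    /* inside a reserved sequence? */
    int atomic_limit;  /* end of the reserved region */
    int submits;       /* times the batch was submitted */
};

/* Reserve room for a sequence that must not be split across batches. */
static void sketch_start_atomic(struct batch *b, int dwords)
{
    assert(!b->in_atomic);              /* atomic sections do not nest */
    assert(dwords <= b->size);
    if (b->used + dwords > b->size) {   /* sequence would not fit */
        b->submits++;                   /* submit now, start a fresh batch */
        b->used = 0;
    }
    b->in_atomic = true;
    b->atomic_limit = b->used + dwords;
}

/* Write one dword; only legal inside the reservation. */
static void sketch_emit_dword(struct batch *b)
{
    assert(b->in_atomic);
    assert(b->used < b->atomic_limit);
    b->used++;
}

static void sketch_end_atomic(struct batch *b)
{
    assert(b->in_atomic);
    b->in_atomic = false;
}
```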
2009-11-05Fix "Remove flow-control macros for fallbacks in the 2D driver."Eric Anholt
I guess this is the sort of failure due to rebase-happiness that makes Linus yell at us for rebasing.
2009-11-05Remove flow-control macros for fallbacks in the 2D driver.Eric Anholt
It's poor style, and has confused new developers.
2009-10-14conf: Add debugging flush optionsChris Wilson
Make the following options available via xorg.conf:

  Section "Driver"
    Option "DebugFlushBatches" "1" # Flush the batch buffer after every
                                   # single operation;
    Option "DebugFlushCaches" "1"  # Include a MI_FLUSH at the end of every
                                   # batch buffer to force data to be
                                   # flushed out of cache and into memory
                                   # before the completion of the batch.
    Option "DebugWait" "1"         # Wait for the completion of every batch buffer
                                   # before continuing, i.e. perform synchronous
                                   # rendering.
  EndSection

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-10-08Call pPixmaps plain old pixmaps.Eric Anholt
2009-10-08de-pCamelHungarian the Render pictures and pixmaps.Eric Anholt
2009-10-08Share several render fields between render implementations.Eric Anholt
Also, start settling on the cairo naming for things: source, mask, and dest.
2009-10-08Rename the xf86 screen private from pScrn to scrn.Eric Anholt
2009-10-08Rename the screen private from I830Ptr pI830 to intel_screen_private *intel.Eric Anholt
This is the beginning of the campaign to remove some of the absurd use of Hungarian in the driver. Not that I don't like Hungarian, but I don't need to know that pI830 is a pPointer.
2009-10-06Move to kernel coding style.Eric Anholt
We've talked about doing this since the start of the project, putting it off until "some convenient time". Just after removing a third of the driver seems like a convenient time, when backporting's probably not happening much anyway.
2009-09-22Revert "8xx: Fallback for any non-affine transformation."Chris Wilson
This reverts commit 505025053d66d415e1c23ac858b9238fa8541d37. In theory, the non-affine paths work -- at least for the stated test case, so re-enable them and avoid the slow work-around. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-09-22i8xx: Format projective texture coordinates correctly.Keith Packard
Projective texture coordinates must be delivered as TEXCOORDFMT_3D using TEXCOORDTYPE_HOMOGENOUS. This meant selecting the correct type in i830_texture_setup, the correct format in i830_emit_composite_state and sending only 3 coordinates in i830_emit_composite_primitive. Signed-off-by: Keith Packard <keithp@keithp.com> [ickle: tweaked to fix up a couple of use-before-initialised] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
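As a rough illustration of the coordinate handling described above: for an affine transform two texture coordinates per vertex suffice, while a projective transform needs a third, homogeneous component. The sketch below uses a hypothetical `struct transform` and `emit_texcoord`; it shows only the arithmetic, not the driver's vertex emission or register setup.

```c
#include <assert.h>
#include <stdbool.h>

/* Row-major 3x3 matrix; a stand-in for the picture transform. */
struct transform { float m[3][3]; };

/* Affine if the bottom row is (0, 0, 1). */
static bool is_affine(const struct transform *t)
{
    return t->m[2][0] == 0.0f && t->m[2][1] == 0.0f && t->m[2][2] == 1.0f;
}

/* Write the texture coordinates for (x, y) into out (room for 3 floats).
 * Returns how many floats were written: 2 if affine, 3 if projective. */
static int emit_texcoord(const struct transform *t, float x, float y,
                         float *out)
{
    out[0] = t->m[0][0] * x + t->m[0][1] * y + t->m[0][2];
    out[1] = t->m[1][0] * x + t->m[1][1] * y + t->m[1][2];
    if (is_affine(t))
        return 2;
    /* Projective: pass w through undivided so the divide happens per pixel. */
    out[2] = t->m[2][0] * x + t->m[2][1] * y + t->m[2][2];
    return 3;
}
```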
2009-09-21Split i915/i830 composite_emit_primitive into two functions.Keith Packard
The i915 and i830 take similar but different data when emitting the primitives, instead of trying to share code here, just split this apart and avoid potentially breaking things later on. Signed-off-by: Keith Packard <keithp@keithp.com>
2009-09-218xx: Fallback for any non-affine transformation.Carl Worth
There are definitely bugs in the 8xx code dealing with non-affine transformations. Disable that code for now to get things working. Fixes bug #22947 ([855GM, xf86-video-intel-2.8.0] "Freeze" when RENDER extension is being used)
2009-09-14Avoid fallbacks for compositing gradient patternsChris Wilson
Currently when asked to composite using a gradient source or mask, we fallback to using fbComposite(). This has the side-effect of causing a readback on the destination surface, stalling the GPU pipeline. Instead, like uxa_trapezoids(), we can use pixman to fill a scratch pixmap and then copy that to an offscreen pixmap for use with uxa_composite(). Speedups on i915: firefox-talos-svg: 710378.14 -> 549262.96: 1.29x speedup No slowdowns. Thanks to Søeren Sandmann Pedersen for spotting the missing ValidatePicture(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-09-09i915: Restore nearest samplingChris Wilson
My recent commit [94fc93] to use the pixel centre for sampling with the i830 broke the i915. This restores the previous sampling coordinates for the i915 whilst preserving the correct coordinates for i830. Fixes: gnome characters disappear http://bugs.freedesktop.org/show_bug.cgi?id=23803 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-09-05i830/i915: Set the sample position to the pixel center.Chris Wilson
And in particular we apply the nearest sample bias separately for src/mask. Fixes cairo/test: device-offset-scale finer-grained-fallbacks mask-transformed-{similar,image} meta-surface-pattern pixman-rotate surface-pattern-big-scale-down text-transform Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-09-05i830: Update commentsChris Wilson
i830_composite() is no longer shared with i915 but i830_emit_composite_primitive() is. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-09-05i830: Trim composite setupChris Wilson
Remove a couple of redundant NOOPs from the setup and correct the required space checking for atomic batch operation.
2009-09-05i830: remove padding NOOPs from compositeChris Wilson
Bumps aa10text up from 249k to 260k! These NOOPs have existed uncommented since 04d1584737fd0d14e99608a97281fd7b1549ae0e. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-09-05i830: do not use stale mask transformChris Wilson
Not only were we incorrectly falling back if we had non-affine transformations, but we made the decision based on a stale transformation matrix. Related bug 22877: batch_start_atomic horribly breaks performance after a while https://bugs.freedesktop.org/show_bug.cgi?id=22877 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Maximilian Grothusmann <maxi@own-hero.net>
2009-07-228xx render: Add limited support for a8 dests.Eric Anholt
This improves aa10text performance from 74k to 569k on my 855 laptop. This also causes my 865 to hang on aa10text like it does on rgb10text, thanks to actually hitting render accel.
2009-07-16Fix 915-class Render after the 8xx-class Render fix.Eric Anholt
The two shared i830_composite.c, so giving i830 atomic batch support triggered anger about starting i830's atomic area while in i915's atomic area. Instead, split the emit-a-primitive stuff from the state emission.
2009-07-15Use batch_start_atomic to fix batchbuffer wrapping problems with 8xx render.Eric Anholt
Bug #22483.
2009-06-23Harden i830 render in case check_composite didn't throw out bad formats.Alan Coopersmith
Fixes a warning in a static analysis program, and the code's a little clearer. Bug #21667
2009-05-10Fix "Unkown" typo in two FatalError messagesAlan Coopersmith
Signed-off-by: Alan Coopersmith <alan.coopersmith@sun.com>
2009-04-21Replace a bunch of #ifdef debug flushing/syncing with a single function.Eric Anholt
This removes it from a callsite where it would have just resulted in a fatalerror.
2009-03-06intel: Nuke shared-entity support (zaphod mode).Eric Anholt
It's been broken for years now, and KMS offers a much better chance of getting this working sensibly without making a mess of the 2D driver.
2008-11-05Make I830FALLBACK debugging a runtime instead of compile-time option.Eric Anholt