path: root/src/i915_render.c
Age / Commit message / Author
2011-04-07i965: Avoid transform overheads for vertex emit where possibleChris Wilson
Minor improvement as the bottlenecks lie elsewhere. But it was annoying me. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-28i915: Remove unused 'w' and 'h'Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-28i915: Remove unused 'num_floats' variableChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-17Quiet compiler warning about is_affine_src same way we do is_affine_mask.Eric Anholt
2010-12-03i965: Amalgamate surface binding tablesChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-09-22Make driver compile for 1.6 Xserver series again.Matthias Hopf
Signed-off-by: Matthias Hopf <mhopf@suse.de>
2010-06-25Rename common infrastructure to the intel namespace.Chris Wilson
After splitting out the i810 driver into its own legacy directory, we can identify the common routines not as i830 but as intel. This clarifies the code which *is* i830 specific. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-17i915: Force the emission of BUF_INFO on every composite_setupChris Wilson
We should be able to eliminate these as the drawable remains unchanged. However, the implicit flush of BUF_INFO fixes the rendering in KDE. Alternatively, we need an MI_FLUSH | INHIBIT_RENDER_CACHE_FLUSH between composites. (Note that it is not stale cache data causing the rendering corruption and that a pipelined flush is not sufficient either.) Also, having tried various points at which to flush, the only place where the flush is effective seems to be between composite operations - that is, a flush after 2D is not sufficient. Reported-by: Vasily Khoruzhick <anarsoul@gmail.com> Reported-by: Clemens Eisserer <linuxhippy@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-09Revert "xp:trapezoids"Chris Wilson
This reverts commit f429fb9d872950705e11171d0e7407fb7673c786. An experimental patch I forgot was on my main branch as I was bugfixing. ARGH!
2010-06-08xp:trapezoidsChris Wilson
2010-06-08implicit-flushChris Wilson
2010-06-01i915: Centre sampling.Chris Wilson
Use centre sampling of textures to match pixman, and remove numerous off-by-one and visual artefacts when rendering. The classic example for this is cairo/text/xcomposite-projection where the edge of the rotated rectangle is jaggy due to the incorrect sample position. Fixes: Bug 16917 - [i915] Blur on y-axis also when only x-axis is scaled billiear https://bugs.freedesktop.org/show_bug.cgi?id=16917 And about 15 tests from the Cairo test suite. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
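As a rough illustration of what sampling at the texel centre means, the sketch below compares the normalised coordinate of a texel's centre with that of its top-left corner. The helper names `texel_centre` and `texel_corner` are hypothetical and do not come from the driver; how i915_render.c actually applies the offset is not shown in this log.

```c
#include <assert.h>

/* Hypothetical helper: normalised coordinate of the centre of texel `i`
 * in a texture that is `size` texels wide. */
static float texel_centre(int i, int size)
{
    assert(size > 0 && i >= 0 && i < size);
    return ((float)i + 0.5f) / (float)size;
}

/* The same texel addressed by its top-left corner, for comparison.
 * Sampling here sits on the boundary with the neighbouring texel. */
static float texel_corner(int i, int size)
{
    assert(size > 0 && i >= 0 && i < size);
    return (float)i / (float)size;
}
```

For a 4-texel-wide texture, the centre of texel 0 is at 0.125, whereas its corner is at 0.0. A corner coordinate lies exactly on a texel boundary, which is one plausible source of the off-by-one artefacts mentioned above.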
2010-06-01i915: Avoid the implicit flush on changing BUF_INFOChris Wilson
3DSTATE_BUF_INFO is an implicit flush of the pipeline, so avoid emitting that and associated state unless the destination pixmap has actually changed. This is a win of around 3-5% for cairo-perf-trace, notably for firefox. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
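The idea of skipping redundant state can be sketched as a small cache keyed on the last destination. This is a minimal sketch under assumptions: `struct render_state`, `composite_setup` and the counter are hypothetical names, and the counter merely stands in for emitting 3DSTATE_BUF_INFO. Note that the 2010-06-17 entry in this log later forces the emission on every composite setup again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state tracker, not the driver's real structure. */
struct render_state {
    const void *last_dest;    /* destination seen by the previous setup */
    unsigned buf_info_emits;  /* how many times the state was emitted */
};

/* Emit the destination state only when the destination has changed. */
static void composite_setup(struct render_state *s, const void *dest)
{
    assert(s != NULL && dest != NULL);
    if (s->last_dest != dest) {
        s->buf_info_emits++;  /* stands in for emitting 3DSTATE_BUF_INFO */
        s->last_dest = dest;
    }
}
```

Two consecutive setups against the same destination increment the counter once; switching to a different destination increments it again.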
2010-05-28i915: Don't re-emit vertex size unless it has changed.Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-24i915: Emit CA over using OutReverse + Add passesChris Wilson
On PineView: 578/621 -> 610/617 kglyphs/sec [rgb/aa]
2010-05-24uxa: Use temporary dest when target is too large for compositorChris Wilson
If the destination cannot fit into the 3D pipeline when we need to composite, we fallback to doing the operation on the CPU. This is very slow, and quite easy to trigger on i915 by plugging in an external display. An alternative is to extract the extents of the operation from the destination using the blitter which can usually handle much larger operations. This gives us a temporary target that can fit into the 3D pipeline and thus be accelerated, before copying back into the larger real destination. For x11perf this boosts glyph rendering on PineView, from 38kglyphs/s to 480kglyphs/s. Just a little shy of the native performance of 601kglyphs/s Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
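The decision described above can be sketched as a three-way choice. Everything here is illustrative: `MAX_3D_SIZE` is an assumed limit of 2048 pixels per side, and `struct box`, `enum path` and `choose_path` are hypothetical names rather than the uxa code.

```c
#include <assert.h>

#define MAX_3D_SIZE 2048  /* assumed 3D pipeline limit per dimension */

struct box { int x, y, w, h; };  /* extents of the composite operation */

enum path {
    PATH_DIRECT,     /* destination fits: composite straight into it */
    PATH_TEMPORARY,  /* blit extents to a temporary, composite, blit back */
    PATH_FALLBACK    /* even the extents are too large: CPU fallback */
};

static enum path choose_path(int dest_w, int dest_h, struct box op)
{
    assert(dest_w > 0 && dest_h > 0 && op.w > 0 && op.h > 0);
    if (dest_w <= MAX_3D_SIZE && dest_h <= MAX_3D_SIZE)
        return PATH_DIRECT;
    if (op.w <= MAX_3D_SIZE && op.h <= MAX_3D_SIZE)
        return PATH_TEMPORARY;
    return PATH_FALLBACK;
}
```

With these assumptions, a 64x16 glyph run on a 4096-pixel-wide destination takes the temporary path, because only the extents of the operation need to fit into the 3D pipeline.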
2010-05-24i915: compute normalized texcoords using a scale factor.Chris Wilson
500 -> 580kglyphs/s on i945. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
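A minimal sketch of the scale-factor approach, assuming the gain comes from computing the reciprocal of the texture size once per setup and multiplying per vertex instead of dividing. `struct tex_scale`, `tex_scale_init`, `normalise_x` and `normalise_y` are hypothetical names.

```c
#include <assert.h>

/* Hypothetical per-texture scale factors, computed once per setup. */
struct tex_scale { float sx, sy; };

static struct tex_scale tex_scale_init(int width, int height)
{
    assert(width > 0 && height > 0);
    struct tex_scale s = { 1.0f / (float)width, 1.0f / (float)height };
    return s;
}

/* Per-vertex work is now one multiply per coordinate. */
static float normalise_x(const struct tex_scale *s, float x)
{
    return x * s->sx;
}

static float normalise_y(const struct tex_scale *s, float y)
{
    return y * s->sy;
}
```

For a 256x128 texture, pixel coordinate (64, 32) normalises to (0.25, 0.25).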
2010-05-24i915: Add special case primitive emitters for glyphs.Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-24i915: Move vertices into a vertex buffer object.Chris Wilson
In theory this should allow us to pack far more operations into a single batch buffer, and reduce our overheads. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-24Use pwrite to upload the batch bufferChris Wilson
By using pwrite() instead of dri_bo_map() we can write to the batch buffer through the GTT and not be forced to map it back into the CPU domain and out again, eliminating a double clflush.
Measuring x11perf text performance on PineView:
Before:
16000000 trep @ 0.0020 msec (511000.0/sec): Char in 80-char aa line (Charter 10)
16000000 trep @ 0.0021 msec (480000.0/sec): Char in 80-char rgb line (Charter 10)
After:
16000000 trep @ 0.0019 msec (532000.0/sec): Char in 80-char aa line (Charter 10)
16000000 trep @ 0.0020 msec (496000.0/sec): Char in 80-char rgb line (Charter 10)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-24Kill paranoid assertions on every write into the batchbuffer.Chris Wilson
On my PineView box these represent ~5% overhead on x11perf text:
Before:
16000000 trep @ 0.0020 msec (495000.0/sec): Char in 80-char aa line (Charter 10)
12000000 trep @ 0.0022 msec (461000.0/sec): Char in 80-char rgb line (Charter 10)
After:
16000000 trep @ 0.0020 msec (511000.0/sec): Char in 80-char aa line (Charter 10)
16000000 trep @ 0.0021 msec (480000.0/sec): Char in 80-char rgb line (Charter 10)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-24i915: Emit composite primitive with specialised functions.Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-23i915: amalgamate composite into a single primitive listChris Wilson
Combine all the calls to composite between prepare_composite and done_composite into a single primitive list, rather than a primitive call per composite(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
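The batching described here can be sketched as accumulating rectangles between prepare and done, then issuing a single draw. The hook names follow the commit message, but the signatures, `struct prim_list`, `MAX_RECTS` and the `draw_calls` counter are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RECTS 64  /* assumed capacity of the pending list */

/* Hypothetical accumulator for rectangles queued between
 * prepare_composite() and done_composite(). */
struct prim_list {
    int rects[MAX_RECTS][4];  /* x, y, w, h per queued rectangle */
    size_t count;             /* rectangles currently queued */
    unsigned draw_calls;      /* primitive lists emitted so far */
};

static void prepare_composite(struct prim_list *p)
{
    p->count = 0;
}

/* Queue a rectangle instead of emitting a primitive immediately. */
static void composite(struct prim_list *p, int x, int y, int w, int h)
{
    assert(p->count < MAX_RECTS);
    p->rects[p->count][0] = x;
    p->rects[p->count][1] = y;
    p->rects[p->count][2] = w;
    p->rects[p->count][3] = h;
    p->count++;
}

/* Emit everything queued as one primitive list. */
static void done_composite(struct prim_list *p)
{
    if (p->count > 0) {
        p->draw_calls++;  /* stands in for one primitive list emission */
        p->count = 0;
    }
}
```

Three composite() calls between one prepare/done pair produce a single draw call in this sketch, rather than three.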
2010-05-15i915: Load texture directly into OC when possible.Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-14i915: Remove a couple of unsupported 16bpp no-alpha tex formatsChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-14i915: Don't force alpha=1 for RGB drawables in the shader.Chris Wilson
I was blindly fixing rendercheck without thinking. We need to force the alpha value to be in the blend unit and not before -- otherwise we generate the incorrect result whilst blending. D'oh.
2010-05-13i915: Force output alpha to 1. if dst has no alpha channel.Chris Wilson
Ensure that garbage is not stored in the unused alpha channel so that we can rely on it being currently initialised when used as a source or returning via GetImage. Partial fix for rendercheck -t blend
2010-05-13i915: Add a2r10g10b10 format and friendsChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-10i915: Fix pixmap based masks.Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-10uxa,i915: Handle SourcePict through uxa_composite()Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-10i915: Use 1x1R pixmap for solid drawablesChris Wilson
x11perf has a regression https://bugs.freedesktop.org/show_bug.cgi?id=25068 caused by commit e581ceb7381e29ecc1a172597d258824f6a1d2d3 i915: Use the color channels to pass along solid sources and masks. Do not convert 1x1R pixmaps into a solid color as the readback from the bo negates all the performance advantages of using a smaller vertex buffer and fewer samplers. Before (PineView): aa=66800 glyphs/s, rgb=28800 glyphs/s Now: aa=96800 glyphs/s, rgb=48500 glyphs/s Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-10uxa: Rearrange checking and preparing of composite textures.Chris Wilson
x11perf regression caused by 2D driver https://bugs.freedesktop.org/show_bug.cgi?id=28047 caused by commit a7b800513fcc94e063dfd68d2f63b6bab7fae47d uxa: Extract sub-region from in-memory buffers. The issue is that as we extract the region prior to checking whether the composite can in fact be accelerated, we perform expensive surplus operations. This is particularly noticeable for ComponentAlpha text, such as rgb10text. The solution here is to move the check_composite() ahead of acquiring the sources, and to extract the subregion only if the render path cannot actually handle the texture. Performance (on PineView): a7b800513^: aa=68600 glyphs/s, rgb=29900 glyphs/s a7b800513: aa=65700 glyphs/s, rgb=13200 glyphs/s now: aa=66800 glyphs/s, rgb=28800 glyphs/s The residual lossage seems to be from the extra function call and dixPrivate lookups. Hmm. More worrying is the extremely low performance, however the results are consistent so the improvement looks real... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-04-13i915 render: use tiling bits where possibleDaniel Vetter
This is in preparation to explicit fence allocation with execbuf2. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2010-03-17i915: Correct preamble for emit_compositeChris Wilson
Fixes: http://bugs.freedesktop.org/show_bug.cgi?id=27123 Fatal server error: i915_emit_composite_setup: ADVANCE_BATCH: under-used allocation 100/104 Introduced with commit d6b7f96fde1add92fd11f5a75869ae6fc688bf77. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-03-16Fill alpha on xrgb images.Chris Wilson
Do not try to fixup the alpha in the ff/shaders as this has the side-effect of overriding the alpha value of the border color, causing images to be padded with black rather than transparent. This can generate large and obnoxious visual artefacts. Fixes: Bug 17933 - x8r8g8b8 doesn't sample alpha=0 outside surface bounds http://bugs.freedesktop.org/show_bug.cgi?id=17933 and many related cairo test suite failures. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-12-07batch: Ensure we send a MI_FLUSH in the block handler for TFPChris Wilson
This should restore the previous level of synchronisation between textures and pixmaps, but *does not* guarantee that a texture will be flushed before use. tfp should be fixed so that the ddx can submit the batch if required to flush the pixmap. A side-effect of this patch is to rename intel_batch_flush() to intel_batch_submit() to reduce the confusion of executing a batch buffer with that of emitting a MI_FLUSH. Should fix the remaining rendering corruption involving tfp [inc compiz]: Bug 25431 [i915 bisected] piglit/texturing_tfp regressed http://bugs.freedesktop.org/show_bug.cgi?id=25431 Bug 25481 Wrong cursor format and cursor blink rate with compiz enabled http://bugs.freedesktop.org/show_bug.cgi?id=25481 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-30i915: Disable centre-point sampling.Chris Wilson
I still have no idea how this is triggering failures, but it is. So revert until the problem is solved. Should fix once again: Bug 23803 [bisected i915] gnome characters disappear http://bugs.freedesktop.org/show_bug.cgi?id=23803 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-30i915: WhitespaceChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-30i915: Remove routing of alpha channel to green.Chris Wilson
This modification is redundant since the routing is done in the blend unit anyway. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-30i915: Fix missing texture offset for mask.Chris Wilson
In commit e581ceb, I modified the shader generation to accommodate mixed textures and solids but missed applying the new computed sampler for the mask. References: Bug 23803 [bisected i915] gnome characters disappear http://bugs.freedesktop.org/show_bug.cgi?id=23803 Bug 25031 rendering and color corruption since 14109abf http://bugs.freedesktop.org/show_bug.cgi?id=25031 Bug 25047 [945GM bisected] rendercheck/repeat/triangles regressed http://bugs.freedesktop.org/show_bug.cgi?id=25047 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-29batch: Emit a 'pipelined' flush when using a dirty source.Chris Wilson
Ensure that the render caches and texture caches are appropriately flushed when switching a pixmap from a target to a source. This should fix bug 24315, [855GM] Rendering corruption in text (usually) https://bugs.freedesktop.org/show_bug.cgi?id=24315 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-13i915: Derive the correct target color from the pixmap by checking its formatChris Wilson
Particularly noting to route alpha to the green channel when blending with a8 destinations. Fixes: rendercheck/repeat/triangles regressed http://bugs.freedesktop.org/show_bug.cgi?id=25047 introduced with commit 14109a. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-10i915: Fix texture sampling coordinates.Chris Wilson
RENDER specifies that texels should be sampled from the pixel centre. This corrects a number of failures in the cairo test suite and a few off-by-one bug reports. Grey border around images https://bugs.freedesktop.org/show_bug.cgi?id=21523 Note that the earlier attempt to fix this was subverted by the buggy use of 1x1R textures for solid sources -- which caused the majority of text to disappear. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-10i915: Use the color channels to pass along solid sources and masks.Chris Wilson
Instead of allocating and utilising the texture samplers for 1x1R solid sources and masks we can simply use the default diffuse and specular colour channels and adjust the fragment shader appropriately. The big advantage is the reduction in size of batches which should give a good boost to glyph performance, irrespective of the additional boost from using simpler shaders. However, the motivating factor behind the switch is that our use of 1x1 textures turns out to be buggy... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-10Check that batch buffers are atomic.Chris Wilson
Since batch buffers are rarely emitted by themselves but as part of a sequence of state and vertices, the whole sequence is emitted atomically. Here we just enforce that batches are marked as being part of an atomic sequence as appropriate. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-05Remove flow-control macros for fallbacks in the 2D driver.Eric Anholt
It's poor style, and has confused new developers.
2009-10-08Call pPixmaps plain old pixmaps.Eric Anholt
2009-10-08de-pCamelHungarian the Render pictures and pixmaps.Eric Anholt
2009-10-08Share several render fields between render implementations.Eric Anholt
Also, start settling on the cairo naming for things: source, mask, and dest.
2009-10-08Rename the xf86 screen private from pScrn to scrn.Eric Anholt