path: root/src/sna/Makefile.am
Age | Commit message | Author
2014-06-02 | sna: Add support for Present | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2014-06-02 | sna: Add support for DRI3 | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2014-05-14 | sna: Rename DRI2 files, functions and variables | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-12-11 | sna/gen8: Initial backend for Broadwell | Chris Wilson
Should match the functionality of the earlier generations, but untuned. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-11-25 | sna: Keep @NOWARNFLAGS@ last | Chris Wilson
As the last option overrides the earlier options, make sure these particular overrides always take effect by adding them last. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
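The "last option wins" rule behind this change can be modelled in a few lines. This is a simplified sketch rather than the compiler's actual option parser: it only handles `-Wfoo` / `-Wno-foo` pairs of equal specificity, and the flag lists (`cflags`, `nowarnflags`) are hypothetical stand-ins, not the values used by the build.

```python
def effective_warnings(flags):
    """Resolve -Wfoo / -Wno-foo pairs: the last mention of a warning wins."""
    state = {}
    for flag in flags:
        if not flag.startswith("-W"):
            continue  # not a warning option; ignored in this sketch
        name = flag[2:]
        if name.startswith("no-"):
            state[name[3:]] = False  # -Wno-foo disables "foo"
        else:
            state[name] = True       # -Wfoo enables "foo"
    return state


# Hypothetical flag sets, for illustration only.
cflags = ["-Wall", "-Wredundant-decls"]
nowarnflags = ["-Wno-redundant-decls"]

# Overrides appended last take effect; placed first, they are undone.
overrides_last = effective_warnings(cflags + nowarnflags)
overrides_first = effective_warnings(nowarnflags + cflags)
print(overrides_last["redundant-decls"], overrides_first["redundant-decls"])
```

With the overrides appended last the warning ends up disabled; with them placed first, the later `-Wredundant-decls` switches it back on, which is the situation the commit guards against.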
2013-11-25 | Makefile convert @var@ to $(var) | Zdenek Kabelac
Avoid using @var@, since it cannot easily be overridden with the 'make var=xxx' option that is normally available. Makefile.am users should avoid using @var@. Signed-off-by: Zdenek Kabelac <zkabelac@redhat.com>
2013-10-10 | sna/gen4+: Share a few common routines | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-10-10 | sna/gen6+: Share the common routines for ring preference | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-10-05 | sna/trapezoids: Add a precise scan converter | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-10-04 | sna: Start splitting the trapezoids megafile into parseable blocks | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-09-07 | Revert "sna: Add XMir support" | Chris Wilson
This reverts commit 42d94356f65972eb7fb8991234a4e9388c4c2031. Ordered-by: The Management.
2013-09-06 | sna: Listen to ACPI events for power state notifications | Chris Wilson
When on-battery, we would prefer to use more power efficient operations. For example, the BCS is far more economical to move data around with, but it doesn't have quite the same throughput as the hungry RCS. (Not that there is any reason why, the BCS is supposed to run at full memory speed, unfortunately that is main memory speed and not the caches...) Note that X already listens to acpid for video switch notifications; it would be useful if we could extend that interface to emit power notifications as well. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-09-04 | sna: Add XMir support | Chris Wilson
With lots of updates by Christopher James Halse Rogers as he updated the XMir API - but now supposedly frozen! "<RAOF> ickle: I think the xmir api should be pretty much stable now, barring people coming up with more awesome ways of doing things." Signed-off-by: Christopher James Halse Rogers <raof@ubuntu.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-07-29 | intel: Suppress some extremely noisy warnings | Chris Wilson
Warning about redundant declarations within the xorg headers hides genuine warnings in our own code - disable them until the headers are cleaned up. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-07-28 | configure: Print a summary of compilation options | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-07-17 | sna: Wrap cpuid.h | Chris Wilson
Move our ifdef out of line from the main code into a header file, where we can also apply a little bit of syntactic sugar. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-06-19 | configure: test for librt (clock_gettime) | Jonathan Gray
clock_gettime() is in libc not librt on OpenBSD so check to see if linking librt is required. Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
2013-04-16 | sna: Add VALGRIND_CFLAGS whilst compiling with --enable-valgrind | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-03-07 | sna: Supply a fake pipe to run completely headless | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-02-25 | sna: Detect available instruction sets at runtime | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-27 | sna: Begin sketching out a threaded rasteriser for spans | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-24 | sna: Experiment with a threaded renderer for fallback compositing | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-01-02 | sna/gen4+: Specialise linear vertex emission | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-12-20 | sna/gen4+: Amalgamate all the gen4-7 vertex buffer emission | Chris Wilson
Having reduced all the vb code for these generations to the same set of routines, we can refactor them into a single set of functions. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-11-30 | sna: Unify gen4 acceleration again | Chris Wilson
After disabling render-to-Y, 965g seems just as happy with the new code paths as g4x. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-11-23 | sna/gen4: Revert changes to 965g[m] | Chris Wilson
The changes tested on g45/gm45 prove to be highly unstable on 965gm, suggesting a radical difference in the nature of the bugs between the two generations. In theory, g4x has additional features that could be exploited over and above gen4 which may prove interesting in the future. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-30 | sna: Add the brw assembler | Chris Wilson
In order to construct programs on the fly to cater for the combinatorial number of possible shaders, we need an assembler, whilst also taking the opportunity to remove some of the inefficiencies and mistakes from the current shaders. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-12 | sna: Fix build without DRI2 | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-09 | sna: Simplify the DBG incarnation | Chris Wilson
It was only ever used in conjunction with HAS_DEBUG_FULL. For debug purposes it is as easy to redefine DBG locally. By simplifying the DBG macro we can create it consistently and so reduce the number of compiler warnings. Long term, this has to be dynamic. Sigh. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-07-08 | sna: Fixup fb wrapper | Chris Wilson
To accommodate changes in the Xserver and avoid breakage; would have been much easier had the fb been exported in the first place.
2012-03-28 | sna: Add video sprite support for ILK+ | Chris Wilson
Based on the work by Jesse Barnes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-02-11 | Include a local copy of list.h | Chris Wilson
In 1.11.903, the list.h was renamed to xorg-list.h with a corresponding change to all structures. As we carried local fixes to list.h and extended functionality, just create our own list.h with a bit of handwaving to protect us for the brief existence of xorg/include/list.h. Reported-by: Armin K <krejzi@email.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45938 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-01-14 | sna: Use top_srcdir to detect .git rather than top_builddir | Chris Wilson
For srcdir != builddir builds, we need to be searching the source tree for the git id. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-11 | sna: Enable hooking up of valgrind during debugging | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-11-16 | sna: Correct dependencies for DRI2 | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-11-16 | sna: Reduce and clarify dependencies | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-11-11 | sna: Begin debugging gen7 | Chris Wilson
This is the stub of the decoder, sufficient to give details of the ops within the batch and to keep the debugger happy. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-19 | sna: Micro-optimise fill-spans | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-09-09 | sna: Record git-tree used for compilation | Chris Wilson
Hopefully, I have all the dependencies correct for auto-updating and should continue to work with tarballs... The next step is to perhaps include it in the usual version number, perhaps as patch level? Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-30 | Fix typos for distcheck | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-30 | sna: Port IVB acceleration code (Xrender + Xv) | Chris Wilson
Based on the superlative work by Kenneth Graunke and Xiang, Haihao. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-07-01 | sna: Downsample sources 2x too large to fit in the 3D pipeline | Chris Wilson
This is quite trivial to hit given the 2k limits on gen2/gen3. We compromise on image quality by pre-downscaling the source by a fixed factor to make it fit into the pipeline in preference to performing the entire operation on the CPU. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
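The decision described here can be sketched as follows. This is an illustration under stated assumptions, not the driver's code: `plan_source` is a hypothetical helper, the 2048-pixel limit is read from the "2k limits" remark, and the factor of 2 from the "2x" in the subject line.

```python
def plan_source(width, height, max_size=2048, factor=2):
    """Decide how a source of the given size would be fed to the 3D pipeline.

    max_size and factor are assumptions for this sketch.
    """
    if width <= max_size and height <= max_size:
        return ("direct", width, height)
    # Pre-downscale by the fixed factor, rounding up so no pixels are dropped.
    dw = -(-width // factor)
    dh = -(-height // factor)
    if dw <= max_size and dh <= max_size:
        return ("downsample", dw, dh)
    # Still too large: the whole operation would have to run on the CPU.
    return ("fallback", width, height)


for size in [(1024, 768), (3000, 1000), (5000, 1000)]:
    print(size, plan_source(*size))
```

The trade-off is the one the message names: a source up to twice the limit loses some image quality but stays on the GPU, and only larger sources fall back to the CPU.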
2011-06-26 | sna: Only create bo up to half the size of the mappable aperture | Chris Wilson
As we use GTT mappings if writing directly into the tiled buffer and the available aperture is reported by the kernel as the total GTT and not limited to the fenceable/mappable region, we need to manually probe this value and ensure that our creation and fenced routines observe this distinct limit. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-06-10 | sna: Remove the ability to disable chipset specific code | Chris Wilson
This was a fun little, but pointless, exercise. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-06-04 | sna: Introduce a new acceleration model. | Chris Wilson
The premise is that switching between rings (i.e. the BLT and RENDER rings) on SandyBridge imposes a large latency overhead whilst rendering. The cause is that in order to switch rings, we need to split the batch earlier than is desired and to add serialisation between the rings. Both incur a large overhead.

By switching to using a pure 3D blit engine (ok, not so pure as the BLT engine still has uses for the core drawing model which can not be easily represented without a combinatorial explosion of shaders) we can take advantage of additional efficiencies, such as relative relocations, that have been incorporated into recent hardware advances. However, even older hardware performs better from avoiding the implicit context switches and from the batching efficiency of the 3D pipeline...

But this is X, and PolyGlyphBlt still exists and remains in use. So for the operations that are not worth accelerating in hardware, we introduce a shadow buffer mechanism throughout and reintroduce pixmap migration. Doing this efficiently is the cornerstone of ensuring that we do exploit the increased potential of recent hardware for running old applications and environments (i.e. so that the latest and greatest chip is actually faster than gen2!)

For the curious, sna is SandyBridge's New Acceleration. If you are running older chipsets and welcome the performance increase offered by this patch, then you may choose to call it Snazzy instead.

Speedups
========
gen3 firefox-fishtank 1203584.56 (1203842.75 0.01%) -> 85561.71 (125146.44 14.87%): 14.07x speedup
gen5 grads-heat-map 3385.42 (3489.73 1.44%) -> 350.29 (350.75 0.18%): 9.66x speedup
gen3 xfce4-terminal-a1 4179.02 (4180.09 0.06%) -> 503.90 (531.88 4.48%): 8.29x speedup
gen4 grads-heat-map 2458.66 (2826.34 4.64%) -> 348.82 (349.20 0.29%): 7.05x speedup
gen3 grads-heat-map 1443.33 (1445.32 0.09%) -> 298.55 (298.76 0.05%): 4.83x speedup
gen3 swfdec-youtube 3836.14 (3894.14 0.95%) -> 889.84 (979.56 5.99%): 4.31x speedup
gen6 grads-heat-map 742.11 (744.44 0.15%) -> 172.51 (172.93 0.20%): 4.30x speedup
gen3 firefox-talos-svg 71740.44 (72370.13 0.59%) -> 21959.29 (21995.09 0.68%): 3.27x speedup
gen5 gvim 8045.51 (8071.47 0.17%) -> 2589.38 (3246.78 10.74%): 3.11x speedup
gen6 poppler 3800.78 (3817.92 0.24%) -> 1227.36 (1230.12 0.30%): 3.10x speedup
gen6 gnome-terminal-vim 9106.84 (9111.56 0.03%) -> 3459.49 (3478.52 0.25%): 2.63x speedup
gen5 midori-zoomed 9564.53 (9586.58 0.17%) -> 3677.73 (3837.02 2.02%): 2.60x speedup
gen5 gnome-terminal-vim 38167.25 (38215.82 0.08%) -> 14901.09 (14902.28 0.01%): 2.56x speedup
gen5 poppler 13575.66 (13605.04 0.16%) -> 5554.27 (5555.84 0.01%): 2.44x speedup
gen5 swfdec-giant-steps 8941.61 (8988.72 0.52%) -> 3851.98 (3871.01 0.93%): 2.32x speedup
gen5 xfce4-terminal-a1 18956.60 (18986.90 0.07%) -> 8362.75 (8365.70 0.01%): 2.27x speedup
gen5 firefox-fishtank 88750.31 (88858.23 0.14%) -> 39164.57 (39835.54 0.80%): 2.27x speedup
gen3 midori-zoomed 2392.13 (2397.82 0.14%) -> 1109.96 (1303.10 30.35%): 2.16x speedup
gen6 gvim 2510.34 (2513.34 0.20%) -> 1200.76 (1204.30 0.22%): 2.09x speedup
gen5 firefox-planet-gnome 40478.16 (40565.68 0.09%) -> 19606.22 (19648.79 0.16%): 2.06x speedup
gen5 gnome-system-monitor 10344.47 (10385.62 0.29%) -> 5136.69 (5256.85 1.15%): 2.01x speedup
gen3 poppler 2595.23 (2603.10 0.17%) -> 1297.56 (1302.42 0.61%): 2.00x speedup
gen6 firefox-talos-gfx 7184.03 (7194.97 0.13%) -> 3806.31 (3811.66 0.06%): 1.89x speedup
gen5 evolution 8739.25 (8766.12 0.27%) -> 4817.54 (5050.96 1.54%): 1.81x speedup
gen3 evolution 1684.06 (1696.88 0.35%) -> 1004.99 (1008.55 0.85%): 1.68x speedup
gen3 gnome-terminal-vim 4285.13 (4287.68 0.04%) -> 2715.97 (3202.17 13.52%): 1.58x speedup
gen5 swfdec-youtube 5843.94 (5951.07 0.91%) -> 3810.86 (3826.04 1.32%): 1.53x speedup
gen4 poppler 7496.72 (7558.83 0.58%) -> 5125.08 (5247.65 1.44%): 1.46x speedup
gen4 gnome-terminal-vim 21126.24 (21292.08 0.85%) -> 14590.25 (15066.33 1.80%): 1.45x speedup
gen5 firefox-talos-svg 99873.69 (100300.95 0.37%) -> 70745.66 (70818.86 0.05%): 1.41x speedup
gen4 firefox-planet-gnome 28205.10 (28304.45 0.27%) -> 19996.11 (20081.44 0.56%): 1.41x speedup
gen5 firefox-talos-gfx 93070.85 (93194.72 0.10%) -> 67687.93 (70374.37 1.30%): 1.37x speedup
gen4 evolution 6696.25 (6854.14 0.85%) -> 4958.62 (5027.73 0.85%): 1.35x speedup
gen3 swfdec-giant-steps 2538.03 (2539.30 0.04%) -> 1895.71 (2050.62 62.43%): 1.34x speedup
gen4 gvim 4356.18 (4422.78 0.70%) -> 3276.31 (3281.69 0.13%): 1.33x speedup
gen6 evolution 1242.13 (1245.44 0.72%) -> 953.76 (954.54 0.07%): 1.30x speedup
gen6 firefox-planet-gnome 4554.23 (4560.69 0.08%) -> 3758.76 (3768.97 0.28%): 1.21x speedup
gen3 firefox-talos-gfx 6264.13 (6284.65 0.30%) -> 5261.56 (5370.87 1.28%): 1.19x speedup
gen4 midori-zoomed 4771.13 (4809.90 0.73%) -> 4037.03 (4118.93 0.85%): 1.18x speedup
gen6 swfdec-giant-steps 1557.06 (1560.13 0.12%) -> 1336.34 (1341.29 0.32%): 1.17x speedup
gen4 firefox-talos-gfx 80767.28 (80986.31 0.17%) -> 69629.08 (69721.71 0.06%): 1.16x speedup
gen6 midori-zoomed 1463.70 (1463.76 0.08%) -> 1331.45 (1336.56 0.22%): 1.10x speedup

Slowdowns
=========
gen6 xfce4-terminal-a1 2030.25 (2036.23 0.25%) -> 2144.60 (2240.31 4.29%): 1.06x slowdown
gen4 swfdec-youtube 3580.00 (3597.23 3.92%) -> 3826.90 (3862.24 0.91%): 1.07x slowdown
gen4 firefox-talos-svg 66112.25 (66256.51 0.11%) -> 71433.40 (71584.31 0.14%): 1.08x slowdown
gen4 gnome-system-monitor 5691.60 (5724.03 0.56%) -> 6707.56 (6747.83 0.33%): 1.18x slowdown
gen3 ocitysmap 3494.05 (3502.44 0.20%) -> 4321.99 (4524.42 2.78%): 1.24x slowdown
gen4 ocitysmap 3628.42 (3641.66 9.37%) -> 5177.16 (5828.74 8.38%): 1.43x slowdown
gen5 ocitysmap 4027.77 (4068.11 0.80%) -> 5748.26 (6282.25 7.38%): 1.43x slowdown
gen6 ocitysmap 1401.61 (1402.24 0.40%) -> 2365.74 (2379.14 4.12%): 1.69x slowdown

[Note that the performance regression for ocitysmap arises because we now attempt to support rendering to and (more importantly) from large surfaces. Enabling such operations is the only way to one day be faster than purely using the CPU; in the meantime we suffer a regression due to the increased migration and aperture thrashing. The other couple of regressions will be eliminated with improved span and shader support, now that the framework for such is in place.]

The performance increase for Cairo completely overlooks the other critical aspects of the architecture:

World of Padman:
gen3 (800x600): 57.5 -> 96.2
gen4 (800x600): 47.8 -> 74.6
gen6 (1366x768): 100.4 -> 140.3 [F15], 144.3 -> 146.4 [drm-intel-next]

x11perf (gen6):
aa10text: 3.47 -> 14.3 Mglyphs/s [unthrottled!]
copywinwin10: 1.66 -> 1.99 Mops/s
copywinpix10: 2.28 -> 2.98 Mops/s

And we do not have a good measure for how much improvement the reworking of the fallback paths give, except that xterm is now over 4x faster...

PS: This depends upon the Xorg patchset "Remove the cacheing of the last scratch PixmapRec" for correct invalidations of scratch Pixmaps (used by the dix to implement SHM operations, used by chromium and gtk+ pixbufs).

PPS: ./configure --enable-sna

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
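The speedup and slowdown factors in the benchmark listing above can be reproduced from the first figure on each side of the arrow. The sketch below assumes that figure is the elapsed time being compared (lower is better); the helper name `change` is hypothetical, and the meaning of the parenthesised figures is not stated in the message, so they are left out.

```python
def change(before, after):
    """Express a before/after timing pair as an 'N.NNx speedup' or 'slowdown'."""
    if after <= before:
        return f"{before / after:.2f}x speedup"
    return f"{after / before:.2f}x slowdown"


# Two rows taken from the listing: gen3 firefox-fishtank and gen6 ocitysmap.
print(change(1203584.56, 85561.71))  # 14.07x speedup
print(change(1401.61, 2365.74))      # 1.69x slowdown
```

Both results agree with the factors reported for those rows.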