path: root/src/sna/compiler.h
Age  Commit message  Author
2020-04-17  sna: fix typo for --enable-debug=full  Alexei Podtelezhnikov
A typo in tightly_packed define for builds with optimisation disabled left us creating many packed objects. When compiled with -fno-common the compiler rightfully complains about the duplication.
Signed-off-by: Alexei Podtelezhnikov <apotele@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2019-02-21  Fix build on i686  Adam Jackson
Presumably this only matters for i686 because amd64 implies sse2, but:
    BUILDSTDERR: In file included from gen4_vertex.c:34:
    BUILDSTDERR: gen4_vertex.c: In function 'emit_vertex':
    BUILDSTDERR: sna_render_inline.h:40:26: error: inlining failed in call to always_inline 'vertex_emit_2s': target specific option mismatch
    BUILDSTDERR: static force_inline void vertex_emit_2s(struct sna *sna, int16_t x, int16_t y)
    BUILDSTDERR: ^~~~~~~~~~~~~~
    BUILDSTDERR: gen4_vertex.c:308:25: note: called from here
    BUILDSTDERR: #define OUT_VERTEX(x,y) vertex_emit_2s(sna, x,y) /* XXX assert(!too_large(x, y)); */
    BUILDSTDERR: ^~~~~~~~~~~~~~~~~~~~~~~~
    BUILDSTDERR: gen4_vertex.c:360:2: note: in expansion of macro 'OUT_VERTEX'
    BUILDSTDERR: OUT_VERTEX(dstX, dstY);
    BUILDSTDERR: ^~~~~~~~~~
The bug here appears to be that emit_vertex() is declared 'sse2' but vertex_emit_2s is merely always_inline. gcc8 decides that since you said always_inline you need to have explicitly cloned it for every permutation of targets. Merely saying inline seems to do the job of cloning vertex_emit_2s as much as necessary.
So to reiterate: if you say always-inline, it won't, but if you just say maybe inline, it will. Thanks gcc, that's helpful.
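The mismatch described in this commit can be sketched in isolation. This is a minimal illustration, not the driver's code: `target_sse2`, `pack_2s` and the two `emit_vertex__*` functions are hypothetical names. The helper is declared plain `static inline`, which leaves the compiler free to inline it into callers built with different target options.

```c
#include <stdint.h>

/* Hypothetical macro: build a function for SSE2 on x86, no-op elsewhere. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define target_sse2 __attribute__((target("sse2")))
#else
#define target_sse2
#endif

/* Plain inline, deliberately NOT always_inline: the compiler may inline
 * this into callers regardless of their target attribute. */
static inline uint32_t pack_2s(int16_t x, int16_t y)
{
    return ((uint32_t)(uint16_t)y << 16) | (uint16_t)x;
}

/* Caller built with the SSE2 target attribute. */
target_sse2 uint32_t emit_vertex__sse2(int16_t x, int16_t y)
{
    return pack_2s(x, y);
}

/* Caller built with the default target options. */
uint32_t emit_vertex__generic(int16_t x, int16_t y)
{
    return pack_2s(x, y);
}
```

Both callers share one helper body; marking the helper `always_inline` instead is what triggered the "target specific option mismatch" error quoted above on i686.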
2016-04-06  sna: Mark sse2 routines as "fast"  Chris Wilson
Trying to unify all the target attributes to chase down:
    blt.c: In function ‘memcpy_from_tiled_x__swizzle_0__sse2’:
    blt.c:345:1: error: inlining failed in call to always_inline ‘memcpy_sse64xN’: target specific option mismatch
    memcpy_sse64xN(uint8_t *dst, const uint8_t *src, int bytes)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-05  sna: Add alignment hints to tiled memcpy  Chris Wilson
Telling the compiler the known alignment should improve the memcpy operation, but only has a small impact today (a few bytes/instructions per function).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
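An alignment hint of this kind might look like the sketch below. The macro name `assume_aligned`, the function `copy_aligned16` and the 16-byte figure are assumptions for illustration; the commit does not say which alignment or builtin it uses. `__builtin_assume_aligned` is a GCC builtin (assumed available from GCC 4.7), and passing a pointer that is not actually aligned is undefined behaviour.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper: use the builtin where GCC provides it,
 * otherwise pass the pointer through unchanged. */
#if defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 7))
#define assume_aligned(ptr, align) __builtin_assume_aligned((ptr), (align))
#else
#define assume_aligned(ptr, align) (ptr)
#endif

/* Copy bytes between two buffers that the CALLER guarantees are
 * 16-byte aligned; the hint lets the compiler pick aligned loads/stores. */
static void copy_aligned16(void *dst, const void *src, size_t bytes)
{
    uint8_t *d = assume_aligned(dst, 16);
    const uint8_t *s = assume_aligned(src, 16);
    size_t i;

    for (i = 0; i < bytes; i++)
        d[i] = s[i];
}
```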
2015-04-24  sna: Reuse compiler attribute fast to build fast_memcpy  Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-04-24  sna: Mark avx as being a subset of avx2 optimisations  Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-01-17  sna: Provide a few compiler hints  Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-08-05  sna: Rename the attribute macro __packed__ to avoid clang barfing  Chris Wilson
Using __packed__ as shorthand for __attribute__((__packed__)) confuses clang. (I guess it expands (__packed__), which gcc skips.) As clang also uses packed in its builtins, we have to find a compromise, and so tightly_packed wins for being a more verbose description without the dangerous leading underscores.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
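A sketch of what such a macro and its use could look like. Only the name `tightly_packed` comes from the commit; the exact definition in compiler.h and the two example structs are assumptions.

```c
#include <stdint.h>

/* Assumed definition: expand to the packed attribute on GCC-compatible
 * compilers and to nothing elsewhere. */
#if defined(__GNUC__)
#define tightly_packed __attribute__((__packed__))
#else
#define tightly_packed
#endif

/* Default layout: the compiler may pad after 'tag' to align 'value'. */
struct padded {
    uint8_t tag;
    uint32_t value;
};

/* Packed layout: no padding, so the members sit back to back. */
struct compact {
    uint8_t tag;
    uint32_t value;
} tightly_packed;
```

Because the macro name has no leading underscores, it stays out of the identifier space reserved for the compiler and its builtins.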
2013-08-04  sna: Define fast function attribute for old gcc or other compilers  Chris Wilson
Also written by Mark Kettenis and reported by Sedat Dilek.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-08-01  sna: Don't force inline string-ops for the general memcpy_blt routine  Chris Wilson
As we need optimal copy code for the general case, where unlike swizzling the run lengths are not known beforehand, we need to call the arch-specific routines from glibc.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-07-10  sna: Ofast was introduced with gcc-4.6  Chris Wilson
Thomas Jones reported that the build was failing with gcc-4.5 due to the memcpy routines requesting an unsupported optimisation mode (-Ofast) and supplied this patch to only enable Ofast for gcc-4.6+.
Reported-by: Thomas Jones <thomas.jones@utoronto.ca>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
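The version gate can be sketched as follows. The macro name `fast` appears in other commits in this log, but the exact condition used in compiler.h is not shown here, so the test below (GCC 4.6 or later, optimisation enabled via `__OPTIMIZE__`, not clang) is an assumption, as is the example function `fast_copy`.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed gate: request -Ofast per function only on GCC >= 4.6 and only
 * when the build has optimisation enabled; otherwise expand to nothing. */
#if defined(__GNUC__) && !defined(__clang__) && defined(__OPTIMIZE__) && \
    (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6))
#define fast __attribute__((optimize("Ofast")))
#else
#define fast
#endif

/* Hypothetical hot routine tagged with the attribute. */
static fast void fast_copy(uint8_t *dst, const uint8_t *src, size_t bytes)
{
    size_t i;

    for (i = 0; i < bytes; i++)
        dst[i] = src[i];
}
```

On older GCC or other compilers the attribute disappears and the function is built with whatever flags the rest of the file uses.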
2013-06-29  sna: Add the Ofast option to the critical memcpy routines  Chris Wilson
Always enable gcc to fully optimize the core memcpy routines (provided that optimisations are not entirely disabled, for instance for debugging).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-04-10  sna: Align uploads to start on page boundaries  Chris Wilson
This reduces the number of loops and restarts required in the kernel.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-04-01  sna: Allow the compiler to inline memcpy for the bitblt routines  Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-03-12  sna/gen4: Tweak compilation flags to avoid mixed settings across functions  Chris Wilson
Confusing gcc with different flags for supposedly inlined functions is not a good idea.
References: https://bugs.freedesktop.org/show_bug.cgi?id=62198
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-02-27  sna: Prettify GCC version detection in headers  Chris Wilson
And fixup a basic error in the process.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-02-26  sna: Force GCC to use the SSE unit for SSE2 routines  Chris Wilson
Merely hinting that it was preferred by using sse+387 was not enough for GCC to emit the faster SSE2 code.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-02-26  sna: Bump required GCC for sse2  Chris Wilson
gcc-4.4.5 (on squeeze) triggers an ICE when using target(sse2).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-02-26  sna: Conditionally compile sse2 routines  Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-02-26  sna: Conditionally compile sse4_2 routines  Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-02-26  sna: Conditionally compile avx routines  Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-02-26  sna/gen4: Cluster ISA  Chris Wilson
Otherwise we seem to confuse the poor little compiler. This should also make it easier to use CPP to turn off blocks.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-02-26  sna/gen3: Allow conditional use of SSE2  Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-02-26  sna/gen4+: Begin specialising vertex programs for ISA  Chris Wilson
Allow use of advanced ISA when available by detecting support at runtime. This initial work just uses GCC to emit varying ISA; future work could use hand-written code for these hot spots.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
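Runtime selection of this kind can be sketched as below: one plain inline body, compiled several times under different `target` attributes, with a function pointer chosen once at startup. All names (`offset_span`, `select_offset_span`, the `target_*` macros) are hypothetical, and the use of GCC's `__builtin_cpu_supports` is an assumption; the commit does not say how the driver detects CPU features.

```c
#include <stddef.h>
#include <stdint.h>

/* Enable the x86 variants only where the target attribute and the
 * CPU-detection builtins are assumed to exist (GCC >= 4.8 or clang). */
#if (defined(__i386__) || defined(__x86_64__)) && defined(__GNUC__) && \
    (defined(__clang__) || __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 8))
#define HAVE_ISA_VARIANTS 1
#define target_sse2 __attribute__((target("sse2")))
#define target_avx  __attribute__((target("avx")))
#else
#define HAVE_ISA_VARIANTS 0
#endif

/* Shared body: plain inline so it can be inlined into every variant. */
static inline void offset_span(int16_t *v, size_t n, int16_t dx)
{
    size_t i;

    for (i = 0; i < n; i++)
        v[i] = (int16_t)(v[i] + dx);
}

static void offset_span__generic(int16_t *v, size_t n, int16_t dx)
{
    offset_span(v, n, dx);
}

#if HAVE_ISA_VARIANTS
static target_sse2 void offset_span__sse2(int16_t *v, size_t n, int16_t dx)
{
    offset_span(v, n, dx);
}

static target_avx void offset_span__avx(int16_t *v, size_t n, int16_t dx)
{
    offset_span(v, n, dx);
}
#endif

typedef void (*offset_span_fn)(int16_t *v, size_t n, int16_t dx);

/* Pick the most capable variant the running CPU supports. */
static offset_span_fn select_offset_span(void)
{
#if HAVE_ISA_VARIANTS
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx"))
        return offset_span__avx;
    if (__builtin_cpu_supports("sse2"))
        return offset_span__sse2;
#endif
    return offset_span__generic;
}
```

A caller would store the result of `select_offset_span()` once and call through the pointer in the hot path; every variant computes the same result, only the emitted instructions differ.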
2013-02-25  sna/trapezoids: Instruct the compiler to flatten the callees whilst rasterising  Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2013-02-11  sna: Fix inaccurate use of __attribute__((const))  Chris Wilson
'const' is only allowed to use the function parameters and is not allowed to access global memory - that includes not being allowed to dereference its arguments... Thanks to Jiri Slaby for spotting my mistake.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
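The distinction can be illustrated with GCC's `const` and `pure` function attributes. The macro names `attr_const` and `attr_pure`, the `struct box` type and both functions are hypothetical; they are not taken from compiler.h.

```c
/* Hypothetical wrappers around the two GCC attributes. */
#if defined(__GNUC__)
#define attr_const __attribute__((const))
#define attr_pure  __attribute__((pure))
#else
#define attr_const
#define attr_pure
#endif

struct box {
    int x1, y1, x2, y2;
};

/* Result depends only on the argument VALUE: 'const' is valid here. */
static attr_const int clamp_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : v;
}

/* Reads memory through a pointer argument, so 'const' would be wrong;
 * 'pure' (no side effects, but may read memory) is the most it can claim. */
static attr_pure int box_width(const struct box *b)
{
    return b->x2 - b->x1;
}
```

Marking `box_width` as `const` would let the compiler reuse an earlier result even after the pointed-to box has changed, which is the class of mistake this commit fixes.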
2013-01-27  sna: Begin sketching out a threaded rasteriser for spans  Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-06-05  sna: Add inline keyword in conjunction with attribute(always_inline)  Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-06-02  sna/trapezoids: Implement trapezoidal opaque fills inplace  Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-24  sna: Encourage large operations to be migrated to the GPU  Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-14  sna: Protect against deferred malloc failures for pixel data  Chris Wilson
As we now defer the allocation of pixel data until first use, it can fail in the middle of a rendering routine. In order to prevent us passing a NULL pointer into the fallback routines, we need to propagate the failure from the malloc and suppress the failure, discarding the operation, which is less than ideal.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-12-11  sna: Enable hooking up of valgrind during debugging  Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-11-18  sna/gen7: minor tidy of redundant defines  Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-11-08  sna: Begin hooking up valgrind/memcheck  Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-10-19  sna: Micro-optimise fill-spans  Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>