Age | Commit message | Author |
|
These defines are identical, so using a single one is much clearer.
It also lets the fixes apply to GM45, which the original define
missed.
|
|
Conflicts:
src/i965_render.c
|
|
It's very convenient that the hardware supports this non-default
mode since it's exactly what is specified by the Render extension.
This provides a more efficient means of fixing bug #16820:
[EXA] Composition result in black for areas outside of source-surface bo
https://bugs.freedesktop.org/show_bug.cgi?id=16820
without the software fallback we had in the earlier fix,
(commit 76c9ece36e6400fd10f364ee330faea470e2da64).
|
|
This is consistent with the documentation, (and just plain makes
more sense).
|
|
This reverts commit 76c9ece36e6400fd10f364ee330faea470e2da64.
We've learned a new technique that should let us avoid this fallback
to software. See following commit.
|
|
We wish it wouldn't, but the hardware ignores the alpha in the
BorderColor we set when the source picture format has no alpha
in it, (and it uses alpha of 1.0 where we want 0.0). For now,
fallback for these cases. This gives a correct result, but
obviously is not as fast as we would like.
This fixes bug #16820:
[EXA] Composition result in black for areas outside of source-surface bounds
https://bugs.freedesktop.org/show_bug.cgi?id=16820
|
|
Eric informed me that the repeat field exists only for backwards
compatibility with old drivers that weren't prepared for values
other than 0 or 1 here. Since we are, we can just ignore that
field and examine only repeatType. So the code's a (tiny) bit
simpler this way.
|
|
It's quite simple to support these modes---we simply need to
turn on the support for them in the hardware.
These changes have been verified with the extend-pad and
extend-reflect tests in cairo's test suite. However, this
currently requires using a custom-modified version of cairo.
The issue is that released versions of cairo, (and even
cairo master so far), don't pass RepeatPad and RepeatReflect
to Render, (due to various bugs and workarounds in cairo
and pixman). I do plan to fix those issues in cairo, so that
in a future release of cairo, (1.8.2 perhaps?), the cairo
test suite will usefully test these new repeat modes in our
driver.
|
|
The existing switch statement was switching on the Boolean
repeat field rather than the correct repeatType field. This
had not caused any problem before as only two possible repeat
values were supported (RepeatNone = 0 and RepeatNormal = 1)
so they were always the same as the repeat field.
Soon, however, we'll be supporting more repeat types, so we'll
need to switch on the correct value.
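
The difference between the two fields can be sketched as follows. This is a minimal illustration, not the driver's code: the struct, the wrap-mode enum, and the function names are hypothetical stand-ins. Only the four Repeat* values are taken from the Render protocol.

```c
#include <assert.h>

/* Repeat types as defined by the Render extension. */
enum { RepeatNone = 0, RepeatNormal = 1, RepeatPad = 2, RepeatReflect = 3 };

/* Hypothetical stand-in for the two picture fields discussed above. */
struct sketch_picture {
    unsigned int repeat;     /* Boolean: nonzero whenever any repeat is set */
    unsigned int repeatType; /* one of the Repeat* values */
};

/* Hypothetical hardware wrap modes, named for illustration only. */
enum sketch_wrap {
    SKETCH_WRAP_BORDER,
    SKETCH_WRAP_TILE,
    SKETCH_WRAP_CLAMP,
    SKETCH_WRAP_MIRROR,
    SKETCH_WRAP_UNSUPPORTED
};

static enum sketch_wrap sketch_wrap_for(unsigned int value)
{
    switch (value) {
    case RepeatNone:    return SKETCH_WRAP_BORDER;
    case RepeatNormal:  return SKETCH_WRAP_TILE;
    case RepeatPad:     return SKETCH_WRAP_CLAMP;
    case RepeatReflect: return SKETCH_WRAP_MIRROR;
    default:            return SKETCH_WRAP_UNSUPPORTED;
    }
}

/* Old behavior: switching on the Boolean field can only ever see 0 or 1. */
static enum sketch_wrap sketch_wrap_from_repeat(const struct sketch_picture *p)
{
    return sketch_wrap_for(p->repeat);
}

/* New behavior: switching on repeatType distinguishes all four modes. */
static enum sketch_wrap sketch_wrap_from_repeat_type(const struct sketch_picture *p)
{
    return sketch_wrap_for(p->repeatType);
}

/* Usage: both agree for RepeatNormal, but diverge for RepeatPad. */
static void sketch_repeat_demo(void)
{
    struct sketch_picture pad = { 1, RepeatPad };
    assert(sketch_wrap_from_repeat(&pad) == SKETCH_WRAP_TILE);
    assert(sketch_wrap_from_repeat_type(&pad) == SKETCH_WRAP_CLAMP);
}
```

With only RepeatNone and RepeatNormal in play the two functions always agree, which is why the old switch went unnoticed.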
|
|
We'll probably end up doing this differently, but avoid this path for now.
|
|
Otherwise just use the GTT address.
|
|
|
|
This reverts commit 1abf4d3a7a203ff5d6e5ceda29573e7fd69ddf8e.
Conflicts:
src/i965_render.c - flushing was removed, keep it that way
|
|
ssh://git.freedesktop.org/git/xorg/driver/xf86-video-intel into drm-gem
|
|
This improves 'x11perf -aa10text' performance from ~144k to ~169k
|
|
|
|
|
|
This allows us to only call i830WaitSync once every 128 calls to composite
rather than on every call. However, we do need to also call MI_FLUSH to
avoid the vertex cache getting in our way, (since our "separate" buffers
are all allocated as one contiguous chunk).
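
The batching idea can be sketched roughly as below. This is an assumption-laden illustration rather than the driver's implementation: every name is hypothetical, the wait is replaced by a counter standing in for i830WaitSync, and the MI_FLUSH mentioned above is left out.

```c
#include <assert.h>

#define SKETCH_VB_SLOTS 128 /* composite calls served before we must wait */

struct sketch_vb {
    unsigned int next_slot;  /* next free vertex slot in the buffer */
    unsigned int sync_count; /* number of waits performed so far */
};

/* Hypothetical stand-in for the expensive wait (i830WaitSync in the text). */
static void sketch_wait_sync(struct sketch_vb *vb)
{
    vb->sync_count++;
}

/* Return the slot to fill for this composite call. The wait happens only
 * when all slots have been handed out and the buffer is about to be reused. */
static unsigned int sketch_next_slot(struct sketch_vb *vb)
{
    if (vb->next_slot == SKETCH_VB_SLOTS) {
        sketch_wait_sync(vb);
        vb->next_slot = 0;
    }
    return vb->next_slot++;
}

/* Usage: the first 128 calls never wait; the 129th triggers one wait. */
static void sketch_vb_demo(void)
{
    struct sketch_vb vb = { 0, 0 };
    unsigned int i;

    for (i = 0; i < SKETCH_VB_SLOTS; i++)
        sketch_next_slot(&vb);
    assert(vb.sync_count == 0);

    sketch_next_slot(&vb);
    assert(vb.sync_count == 1);
}
```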
|
|
Using more than one (in the future) will allow for doing less frequent calls
to i830WaitSync.
|
|
This is in preparation for having larger (or multiple) vertex buffers
in the future.
|
|
|
|
|
|
Depending on the value returned by a function called within an assert is wrong.
This fixes weird render corruption on i965.
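
One common way this class of bug arises is sketched below; it may not be exactly what happened in the driver. When NDEBUG is defined, assert() does not evaluate its argument, so a function call placed inside it silently disappears. All names here are hypothetical, and SKETCH_RELEASE_ASSERT merely imitates that release-build behavior so the effect is visible without rebuilding.

```c
#include <assert.h>

/* Imitates what assert() expands to when NDEBUG is defined:
 * the expression is not evaluated at all. */
#define SKETCH_RELEASE_ASSERT(expr) ((void)0)

static int sketch_calls; /* counts how often the setup step really ran */

/* Hypothetical setup step with a side effect; returns 0 on success. */
static int sketch_setup(void)
{
    sketch_calls++;
    return 0;
}

/* Fragile: the call lives inside the assertion, so it vanishes
 * entirely once assertions are compiled out. */
static void sketch_init_fragile(void)
{
    SKETCH_RELEASE_ASSERT(sketch_setup() == 0);
}

/* Robust: always make the call, then assert only on the saved result. */
static void sketch_init_robust(void)
{
    int ret = sketch_setup();

    assert(ret == 0);
    (void)ret; /* avoids an unused-variable warning under NDEBUG */
}
```

Calling sketch_init_fragile() leaves sketch_calls unchanged, while sketch_init_robust() increments it regardless of build flags.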
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The gen4_render_state is now always called "render_state" (i965_render.c
bookkeeping) and gen4_state_t is now always called "card_state" (the buffer
for state used by the chip).
|
|
(cherry picked from a2b5c23184d19b386fdfd04f578a55566df60132 commit)
|
|
|
|
|
|
We have a collection of wm_state objects for each ps kernel,
(one for each combination of src and mask extend and repeat
values).
Thanks to Dave Airlie for noticing an errant write through a
wild wm_state pointer in an early version of this commit.
(cherry picked from 7763706a93d3021907273f9b330750ba110e2fc3 commit)
This cherry-pick required more reformatting than most, due to the
projective texturing merge.
|
|
This will eventually allow for the elimination of sampler state
updates while compositing---and initializing everything in the
initialization function.
(cherry picked from commit d0874697be8086cd64740c24698df8cd4d31c76f)
|
|
We need one for each possible combination of src and dst
blend_factors. Again, as with recent changes, this eliminates
state updates from prepare_composite and allows that function
to instead simply reference an existing object initialized
within gen4_state_init.
Thanks to Dave Airlie (and git-bisect) for pointing out that with
gnome-terminal all text was appearing as solid black with an early
version of this commit. As expected the bug was an alignment issue.
(cherry picked from 0c0ab52c2d100c47f38c7ef826ef585c8b9815e9 commit)
Performance is approximately equivalent on text tests, but may be
around +2%.
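
The precomputed-object approach can be sketched as a table filled once at init time and only indexed afterwards. This is a simplified illustration under stated assumptions: the factor count, the struct layout, and all sketch_* names are hypothetical, and the real blend state carries far more than two fields.

```c
#include <assert.h>

#define SKETCH_NUM_FACTORS 4 /* hypothetical number of blend factors */

/* Hypothetical, heavily simplified blend state object. */
struct sketch_blend_state {
    int src_factor;
    int dst_factor;
};

/* One object per (src, dst) blend factor combination. */
static struct sketch_blend_state
    sketch_blend_table[SKETCH_NUM_FACTORS][SKETCH_NUM_FACTORS];

/* Plays the role of gen4_state_init: build every combination up front. */
static void sketch_state_init(void)
{
    int src, dst;

    for (src = 0; src < SKETCH_NUM_FACTORS; src++) {
        for (dst = 0; dst < SKETCH_NUM_FACTORS; dst++) {
            sketch_blend_table[src][dst].src_factor = src;
            sketch_blend_table[src][dst].dst_factor = dst;
        }
    }
}

/* Plays the role of prepare_composite: no state is written here,
 * the function only references an already initialized object. */
static const struct sketch_blend_state *sketch_lookup_blend(int src, int dst)
{
    assert(src >= 0 && src < SKETCH_NUM_FACTORS);
    assert(dst >= 0 && dst < SKETCH_NUM_FACTORS);
    return &sketch_blend_table[src][dst];
}
```

The trade-off is a little more memory for the table in exchange for no per-composite state writes.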
|
|
This reverts commit 346cf57deabb4c336612df4c13650a87b5ef6775.
Mixing randr transforms and video caused screen corruption for Render
operations. No, I don't understand why.
|
|
Instead of leaving pixel values in src_sample registers, compute the pixel
values directly to the data port to save 8 moves. This cannot work when no
computation is done, both because there is no way to wait for the sampler to
finish and because the sampler returns data in a different order from that
required by the data port (sigh).
|
|
Performance change is in the noise. Also from Carl Worth.
|
|
|
|
This reduces the CPU overhead of memcpying them in every time, for a speedup
in aa24text of around 30%. This is based on work by Carl Worth which is
in the intel-batchbuffer branch.
|
|
|
|
|
|
Saving registers means we can run more in parallel.
|
|
Clean up register allocation to never overlap.
Always write 4 values for each texture vertex.
|
|
|
|
Use macros for register names, modularize functions into separate files.
|
|
This involves correctly computing u/v locations based on x/y vectors and
line constants computed in new sf program.
Also, use fewer instructions to make this go a bit faster (2X for 500x500
composite).
|
|
The homogeneous coordinate computation in the core server cannot be used for
many legal matrices as it overflows. Just use floats in the driver; faster
and avoids troubles.
When compositing with bilinear filter, don't push the dst coordinates around
as that makes the output blurry when pixels are aligned.
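
The float-based homogeneous computation can be sketched as below. This is a minimal illustration, not the driver's code: the function name, the row-major matrix layout, and the return convention are assumptions made for the example.

```c
#include <assert.h>

/* Apply a 3x3 projective matrix (row-major, hypothetical layout) to the
 * point (x, y) entirely in floating point.
 * Returns 1 on success, or 0 if the homogeneous coordinate w is zero. */
static int sketch_transform_point(float m[3][3], float x, float y,
                                  float *u, float *v)
{
    float tx = m[0][0] * x + m[0][1] * y + m[0][2];
    float ty = m[1][0] * x + m[1][1] * y + m[1][2];
    float w  = m[2][0] * x + m[2][1] * y + m[2][2];

    if (w == 0.0f)
        return 0; /* point maps to infinity */

    /* Divide by w to get back to ordinary 2D coordinates. */
    *u = tx / w;
    *v = ty / w;
    return 1;
}

/* Usage: with w fixed at 2, the point (3, 4) maps to (1.5, 2). */
static void sketch_transform_demo(void)
{
    float m[3][3] = { { 1, 0, 0 }, { 0, 1, 0 }, { 0, 0, 2 } };
    float u, v;

    assert(sketch_transform_point(m, 3.0f, 4.0f, &u, &v) == 1);
    assert(u == 1.5f && v == 2.0f);
}
```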
|
|
|