Age | Commit message | Author |
|
This reverts commit d2106384be6f9df498392127c3ff64d0a2b17457.
Breaks compiz (but not mutter/gnome-shell) on gen6. Not sure if this is
some deep interaction issue with multiple clients sharing the GPU or
just with compiz, but for now we have to revert and suffer the inane
performance hit. It looks suspiciously like another deferred damage
issue...
Bugzilla: 51a27e88b073cff229fff4362cb6ac22835c4044
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Minor improvement as the bottlenecks lie elsewhere. But it was annoying me.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
So that we always remember to re-emit the initial vertex elements state.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Rather than just creating and submitting a batch that simply contains a
flush in order to periodically ensure that rendering reaches the
scanout, we can simply ask the kernel whether the scanout is busy. The
kernel will then submit a flush on our behalf if it is dirty, which
takes advantage of the kernel's dirty state tracking.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Reduce the number of relocations emitted by only emitting one relocation
per vertex element per vertex buffer.
References: https://bugs.freedesktop.org/show_bug.cgi?id=35733
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
References: https://bugs.freedesktop.org/show_bug.cgi?id=35733
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Since, with GPU-on-package, it's hard to talk about a model number for
a specific chipset like 855GM, just use the platform names.
Signed-off-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This reverts commit 03e8351179b1c25d219842ef3e01ee8e176f594f.
* sigh.
This was only meant to be a temporary debugging hack, not for public
consumption (or embarrassment).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
... or else we may forget to flush them again.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
... now who can explain why.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Something is wrong: we should be tracking when to invalidate the caches
as appropriate, yet I cannot find the missing flush to replace the
implicit one of DRAW_RECTANGLE.
Fixes cacomposite.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
A corollary to filling one vertex array and beginning a new one is
remembering to emit the old one before overwriting...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
There was a reason why we need to check at the start of every composite
operation to see if we have enough space in the array to fit the
vertices, which I promptly forgot when moving the code around to make
it look pretty.
* sigh.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
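
A minimal sketch of the space check described above, not the driver's actual code: before each composite operation, confirm that the vertex array can hold the new vertices, and if it cannot, emit the old contents before overwriting them. The names `vertex_array`, `vertex_flush()` and `vertex_reserve()` and the capacity of 1024 floats are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define VERTEX_CAPACITY 1024 /* hypothetical size, in floats */

struct vertex_array {
    float data[VERTEX_CAPACITY];
    size_t used;      /* floats written so far */
    unsigned flushes; /* how many times the array was emitted */
};

/* Emit the accumulated vertices. A real driver would hand them to the
 * hardware here; this sketch only counts the event and resets the array. */
static void vertex_flush(struct vertex_array *va)
{
    if (va->used == 0)
        return;
    va->flushes++;
    va->used = 0;
}

/* Reserve room for nfloats at the start of an operation. If the request
 * does not fit, the old contents are emitted first so that they are
 * never overwritten. */
static float *vertex_reserve(struct vertex_array *va, size_t nfloats)
{
    float *p;

    assert(nfloats <= VERTEX_CAPACITY);
    if (va->used + nfloats > VERTEX_CAPACITY)
        vertex_flush(va);

    p = &va->data[va->used];
    va->used += nfloats;
    return p;
}
```

With a capacity of 1024, reserving 1000 floats and then another 100 triggers exactly one flush and leaves 100 floats in use.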
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As the kernel controls the relocation of state buffers, we should not
hard code the maximum permissible value for them.
Fixes an eventual hang with full-gtt.
Reported-by: Peter Clifton <pcjc2@cam.ac.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
|
|
It is the same as commit 73d4c7d7
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Matthias Hopf <mhopf@suse.de>
|
|
After splitting out the i810 driver into its own legacy directory, we
can identify the common routines not as i830 but as intel. This
clarifies the code which *is* i830 specific.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The driver is still built but is no longer under active development so
move it and supporting files to a new directory.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Oops, I spent more time discussing these flushing bugs than I spent
paying attention to what I was actually doing.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The key difference between i965 and earlier is that the surfaces are
passed to the samplers through an indirect table, and so the batch and
render target were not being marked dirty by the relocation (since the
relocation only happens within prepare_composite(), which may have been
in another batch). Simply call intel_pixmap_mark_dirty() when binding
the sampler table into the batch to ensure that the dirty state is
tracked appropriately.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As the batch submit may not trigger further drawing through flushing the
vertices, pass the requirement to emit the flush down to the submission
routine so that the flush can be appended after the final commands.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
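
The idea of passing the flush requirement down to the submission routine can be sketched roughly as follows. This is an illustration under stated assumptions, not the driver's code: `cmd_batch`, `batch_emit()` and `batch_submit()` are hypothetical names, and the two command values follow the usual MI opcode layout (opcode shifted left by 23) but should be treated as placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CMD_FLUSH (0x04u << 23) /* placeholder for MI_FLUSH */
#define CMD_END   (0x0Au << 23) /* placeholder for MI_BATCH_BUFFER_END */
#define BATCH_DWORDS 64

struct cmd_batch {
    uint32_t dwords[BATCH_DWORDS];
    size_t used;
    int submitted;
};

static void batch_emit(struct cmd_batch *b, uint32_t dw)
{
    assert(b->used < BATCH_DWORDS);
    b->dwords[b->used++] = dw;
}

/* The caller states whether a flush is required. The submission routine
 * appends it after the final commands, immediately before the batch end,
 * so nothing emitted late can slip in behind the flush. */
static void batch_submit(struct cmd_batch *b, int flush)
{
    if (flush)
        batch_emit(b, CMD_FLUSH);
    batch_emit(b, CMD_END);
    b->submitted = 1; /* a real driver would execute the batch here */
}
```

Because the flush is appended inside the submission routine, it always lands after whatever the last drawing command happened to be.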
|
|
Fixes:
Bug 28446 - Garbled Font with Mathematica 7
https://bugs.freedesktop.org/show_bug.cgi?id=28446
Rewriting the glyphs to render to the destination directly and removing
the more expensive multiple invocations of CompositePicture per picture
was a great performance boost -- except that it needs special handling
in the backend in order to not fall back. Having done so for i915, I
neglected to ensure the sanity checking in i965_prepare_composite() was
sufficient. As it turns out, it was not, and so we misrendered CA-glyphs
when rendering directly to the destination. This causes us to fall back
properly, but is a performance regression as we no longer try the 2-pass
magic helper before resorting to s/w. At the moment, I'd rather live
with the temporary regression and fix i965 to do the same magic as i915,
as it is critical to fixing the severe performance issues currently
crippling i965, as I believe that this regression only affects the
minority of applications (incorrect, as it turns out, as the glyphs are
overlapping) rendering directly to the destination.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This reverts commit f429fb9d872950705e11171d0e7407fb7673c786.
An experimental patch I forgot was on my main branch as I was bugfixing.
ARGH!
|
|
|
|
This paranoid check is deceased; pining for the fjords.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
|
|
If the destination cannot fit into the 3D pipeline when we need to
composite, we fall back to doing the operation on the CPU. This is very
slow, and quite easy to trigger on i915 by plugging in an external
display.
An alternative is to extract the extents of the operation from the
destination using the blitter which can usually handle much larger
operations. This gives us a temporary target that can fit into the 3D
pipeline and thus be accelerated, before copying back into the larger
real destination.
For x11perf this boosts glyph rendering on PineView from 38kglyphs/s to
480kglyphs/s, just a little shy of the native performance of 601kglyphs/s.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
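
The coordinate bookkeeping behind the temporary target might look like the sketch below. It is only an illustration: the real driver performs the copies with the blitter, which is omitted here, and `needs_temporary()`, `plan_temporary()`, `struct temp_plan` and the `max_dim` parameter are hypothetical names, with the pipeline limit passed in rather than assumed.

```c
#include <assert.h>

struct box { int x1, y1, x2, y2; }; /* x2/y2 are exclusive */

/* Size of the temporary target and the offset that maps destination
 * coordinates into it. */
struct temp_plan { int width, height; int dx, dy; };

/* Does the destination exceed what the 3D pipeline can address? */
static int needs_temporary(int dst_w, int dst_h, int max_dim)
{
    return dst_w > max_dim || dst_h > max_dim;
}

/* Plan a temporary covering only the extents of the operation. Returns
 * non-zero if the extents fit in the pipeline; otherwise the caller
 * still has to fall back to the CPU. */
static int plan_temporary(const struct box *op, int max_dim,
                          struct temp_plan *plan)
{
    assert(op->x2 > op->x1 && op->y2 > op->y1);

    plan->width = op->x2 - op->x1;
    plan->height = op->y2 - op->y1;
    plan->dx = -op->x1; /* add to destination x to get temporary x */
    plan->dy = -op->y1;

    return plan->width <= max_dim && plan->height <= max_dim;
}
```

For a 3200x1200 destination and an assumed limit of 2048, an operation covering (2100,100)-(2400,300) needs only a 300x200 temporary, which is then copied back at its original offset.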
|
|
On my PineView box these represent ~5% overhead on x11perf text:
Before:
16000000 trep @ 0.0020 msec (495000.0/sec): Char in 80-char aa line (Charter 10)
12000000 trep @ 0.0022 msec (461000.0/sec): Char in 80-char rgb line (Charter 10)
After:
16000000 trep @ 0.0020 msec (511000.0/sec): Char in 80-char aa line (Charter 10)
16000000 trep @ 0.0021 msec (480000.0/sec): Char in 80-char rgb line (Charter 10)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
x11perf regression caused by 2D driver
https://bugs.freedesktop.org/show_bug.cgi?id=28047
caused by
commit a7b800513fcc94e063dfd68d2f63b6bab7fae47d
uxa: Extract sub-region from in-memory buffers.
The issue is that as we extract the region prior to checking whether the
composite can in fact be accelerated, we perform expensive surplus
operations. This is particularly noticeable for ComponentAlpha text,
such as rgb10text. The solution here is to rearrange the
check_composite() prior to acquiring the sources, and only extracting
the subregion if the render path can not actually handle the texture.
Performance (on PineView):
a7b800513^: aa=68600 glyphs/s, rgb=29900 glyphs/s
a7b800513: aa=65700 glyphs/s, rgb=13200 glyphs/s
now: aa=66800 glyphs/s, rgb=28800 glyphs/s
The residual lossage seems to be from the extra function call and
dixPrivate lookups. Hmm. More worrying is the extremely low performance;
however, the results are consistent so the improvement looks real...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
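
The reordering described above can be illustrated with a small sketch: run the cheap acceleration check first, and pay for the sub-region extraction only when the operation will be accelerated and the source is too large to use as a texture. All names here (`struct picture`, `struct counters`, `try_composite()`, `max_dim`) are hypothetical, and treating component-alpha masks as the unsupported case is merely an assumption chosen to make the example concrete.

```c
#include <assert.h>
#include <stddef.h>

struct picture { int width, height; int component_alpha; };
struct counters { unsigned extractions; };

/* Cheap test: can the render path handle this operation at all? In this
 * sketch a component-alpha mask is the case that cannot be accelerated. */
static int check_composite(const struct picture *mask)
{
    return mask == NULL || !mask->component_alpha;
}

/* Can the render path sample the source directly? */
static int texture_fits(const struct picture *p, int max_dim)
{
    return p->width <= max_dim && p->height <= max_dim;
}

/* Returns 1 if the operation would be accelerated, 0 on fallback. */
static int try_composite(const struct picture *src,
                         const struct picture *mask,
                         int max_dim, struct counters *c)
{
    assert(src != NULL);

    /* Check first, so a fallback costs no extraction work. */
    if (!check_composite(mask))
        return 0;

    /* Extract a sub-region only if the source cannot be used as is. */
    if (!texture_fits(src, max_dim))
        c->extractions++;

    return 1;
}
```

An operation that is going to fall back anyway therefore never triggers the expensive extraction.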
|
|
|
|
The PRM (Vol 1, p32) specifies that the URB_FENCE command must not cross
a cache-line boundary (64 bytes) in order to work around a silicon issue.
Ensure that it does not by inserting an alignment point before the atomic
section.
This is a slightly too large hammer, but the easiest method to work with
the current BEGIN_BATCH/ADVANCE_BATCH protections.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
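
A sketch of the cache-line constraint, assuming the batch buffer itself starts on a 64-byte boundary. Unlike the "large hammer" described above, this version pads only when the command would actually straddle a boundary; `aligned_batch`, `pad_for_cacheline()` and `emit_unsplit()` are hypothetical names, not the driver's BEGIN_BATCH/ADVANCE_BATCH macros, and the padding value stands in for a no-op command.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_BYTES 64u
#define PAD_DWORD 0u /* stands in for a no-op command */
#define BATCH_DWORDS 256

struct aligned_batch {
    uint32_t dwords[BATCH_DWORDS];
    size_t used; /* dwords written so far */
};

/* Number of padding dwords needed so that a command of cmd_dwords
 * starting at index used_dwords stays within one cache line. */
static size_t pad_for_cacheline(size_t used_dwords, size_t cmd_dwords)
{
    size_t offset = (used_dwords * 4) % CACHELINE_BYTES;
    size_t len = cmd_dwords * 4;

    assert(len <= CACHELINE_BYTES);
    if (offset + len <= CACHELINE_BYTES)
        return 0;
    return (CACHELINE_BYTES - offset) / 4;
}

/* Emit a command, padding first if it would cross a cache line. */
static void emit_unsplit(struct aligned_batch *b,
                         const uint32_t *cmd, size_t cmd_dwords)
{
    size_t pad = pad_for_cacheline(b->used, cmd_dwords);
    size_t i;

    assert(b->used + pad + cmd_dwords <= BATCH_DWORDS);
    for (i = 0; i < pad; i++)
        b->dwords[b->used++] = PAD_DWORD;
    for (i = 0; i < cmd_dwords; i++)
        b->dwords[b->used++] = cmd[i];
}
```

A three-dword command starting at dword 14 (byte 56) would end at byte 68, so two padding dwords push it to the next cache line; starting at dword 13 it ends exactly at byte 64 and needs none.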
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Beware the potential buffer overflow.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This should restore the previous level of synchronisation between
textures and pixmaps, but *does not* guarantee that a texture will be
flushed before use. tfp should be fixed so that the ddx can submit the
batch if required to flush the pixmap.
A side-effect of this patch is to rename intel_batch_flush() to
intel_batch_submit() to reduce the confusion between executing a batch
buffer and emitting an MI_FLUSH.
Should fix the remaining rendering corruption involving tfp [inc compiz]:
Bug 25431 [i915 bisected] piglit/texturing_tfp regressed
http://bugs.freedesktop.org/show_bug.cgi?id=25431
Bug 25481 Wrong cursor format and cursor blink rate with compiz enabled
http://bugs.freedesktop.org/show_bug.cgi?id=25481
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
There is only a single caller that wishes to forcibly append a flush
into the batch: intel_sync(). So move the logic there.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Ensure that the render caches and texture caches are appropriately
flushed when switching a pixmap from a target to a source.
This should fix bug 24315,
[855GM] Rendering corruption in text (usually)
https://bugs.freedesktop.org/show_bug.cgi?id=24315
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|