[cairo] GL backend work

Tue Jan 12 09:56:28 PST 2010

On Tue, Jan 12, 2010 at 6:09 PM, Eric Anholt <eric at anholt.net> wrote:
> On Fri, 8 Jan 2010 20:48:30 +1800, Zach Laine <whatwasthataddress at gmail.com> wrote:
>> I've recently been working on the OpenGL backend on a repo hosted on GitHub:
>>
>> git://github.com/tzlaine/cairo_gl_contributions.git
>
> Very cool!  I love to see someone picking up this code.  Bunch of review
> below, but please don't take my concerns badly

Not at all -- I welcome the discussion!

> there's stuff to work
> on but I definitely want to see a bunch of these improvements land!

Me too. ;)

>> Ive fixed a majority of the GL regressions.  There are two main classes of
>> failures that I understand, but so far cannot fix.  A full test summary is
>> below.
>>
>> Known failure type #1: The cairo-gl span renderer was pretty badly broken
>> before I got there.  It rendered all spans as black, with the alpha associated
>> with each span.  No creation of a mask and then masking a source was being
>> done.  I changed it to render the source color, using the alpha for each span.
>> This at least makes tests like ft-text-vertical-layout-type1 look
>> approximately right, though it's producing incorrect color values.  If I
>> understood better what pixman was doing to combine a span and source color
>> with a destination color I could fix this.  Anyone got any advice on this?
>
> What's a source color when you've got a texture as the source?  The
> spans renderer is providing the mask in the form of scanlines with
> alpha.  The color of the scanlines is entirely ignored, since it isn't
> used in the mask part of the IN operator (unless you're doing component
> alpha, which isn't the case with the span renderer).

Well, it might have been acting as a mask in the texturing case (I
don't remember now if that case worked or not), but it definitely was
not in the solid color case, which is why I made the change I did.
The color used for texturing is (1.0, 1.0, 1.0, span_alpha), which
does the right thing.

>> Known failure type #2: Texturing in shaders does not work given the current
>> use of FBOs.  I have no idea why.  I have a post in on the opengl.org forums
>> asking for help.  Maybe one of the gurus there can help me fix it.  Shaders
>> are certainly desirable for doing certain things more efficiently, but they
>> are essentially optional in most cases.  However, for cases in which a source
>> texture is being rendered onto a target texture under extend policy
>> CAIRO_EXTEND_NONE, a shader is required.  This is because CAIRO_EXTEND_NONE is
>> implemented as GL_CLAMP_TO_BORDER, which is completely appropriate, except
>> that it can't be done with textures bound to framebuffers as all the textures
>> are in cairo_gl_surface_t's.  Setting a gl_surface's texture border to 1
>> causes the framebuffer to be incomplete.  This is true when using the
>> EXT_framebuffer_object extension, and when using the framebuffer support in
>> core GL > 3.0.  It's undocumented, but I found it to be consistent, at least
>> for NVidia GL implementations (and that's a pretty big chunk of the end user
>> environment space).  The only way to simulate the behavior of
>> GL_CLAMP_TO_BORDER is with a truly trivial shader.  I have already implemented
>> it, but haven't committed it until I get a fix to the shader texturing/FBO
>> problem.
>
> I suspect you've confused texture borders with the texture border
> color.  It's a trap, and I don't know of anyone that's seen them both
> and not been confused by it.

[snip]

Quite right.  I completely misremembered how these work, since I never
use them.  I went back to the Red Book and boned up.  It looks like
the problem has nothing to do with border colors at all, but instead
has to do with the fact that the tests surface-pattern-scale-up and
rotate-image-surface-paint, the ones that motivated this "bug" hunt,
both use intermediate CAIRO_FORMAT_RGB24-format image surfaces.  So
while texturing, even if the border color is (0, 0, 0, 0), the
texturing operation uses (0, 0, 0, 1) to do its interpolation, since
the texture has no alpha channel.  This is why I still get bad results
for these two tests.  Changing the surface image in the test to use
format CAIRO_FORMAT_ARGB32 makes the tests pass.  Since handling of
the intermediate surface's alpha channel isn't what the tests are
testing anyway, maybe the format of the intermediate image surface
could be changed.

Moreover, it seems that either clip-unbounded.rgb24.ref.png is wrong
to have a black left side or rotate-image-surface-paint is wrong not
to have a black background behind its painted texture.  In both cases,
we are painting a surface with no alpha channel onto another surface,
using CAIRO_OPERATOR_OVER.  Shouldn't they both fill the target, or
neither fill it?

Am I missing something?

> Note that texture border colors are a bit tricky.  Most hardware will
> sample texture border color's alpha as 1 when the texture format itself
> doesn't have an alpha channel.  This is not what's desired for our
> common usage of GL_CLAMP_TO_BORDER ("please sample 0 alpha outside of
> the texture").  In order to do this right, I think we're going to want
> to promote CAIRO_CONTENT_COLOR to be RGBA and adjust our rendering to
> fill the alpha with 1.

Do you mean that you'd like to make all cairo_gl_surface_t's use RGBA?
 If so, I guess that would fix things too.  It just seems to me that
if the user explicitly asks for RGB, she should get it.  Silently
padding out the format to make this case work seems wrong to me.  It
seems that the user should instead just be notified that she needs to
use RGBA to get proper alpha blending.

>> New Work:
>> =========
>>
>> In addition to fixing regressions, I added some new functionality.  I
>> implemented several more cases in _cairo_gl_surface_paint().  It only
>> implemented CAIRO_OPERATOR_CLEAR before.  For all supported
>> CAIRO_OPERATOR_*'s, it now paints: constant color patterns, surface patterns,
>> and gradient patterns (though these are nonfunctional due to the shader
>> texturing issue).  These code paths are far shorter than the fallback of using
>> the much more general _cairo_gl_surface_composite().  They do things in a much
>> smaller number of GL calls, and use shaders where appropriate, such as the
>> gradients.
>
> Cool.  Note that in many cases optimizing by doing work to reduce GL
> calls won't help, since inside of the GL no-op state changes get removed
> before being sent out to the hardware.  Of course, the answer is testing
> and profiling.  But this sounds promising.

Truly redundant calls can probably be elided, but in cases where you
ping pong back and forth between values in the GL using multiple
related calls, there's probably extra work being done.  As you say,
though, this looks like a job for profiling.

>> Futute Work:
>> ============
>>
>> Once I get the shader texturing thing worked out, I'd like to keep adding to
>> the implementations of paint, mask, stroke, and fill.  I'd really like to
>> simplify the code paths used to render everything, so that the number of GL
>> calls is reduced, hopefully making things more efficient, and so that
>> maintainance (specifically, optimization) is easier.  The current
>> implementations seems to follow what cairo-image-surface.c does, even when
>> there are GL shortcuts available.  I also haven't looked at the glyph code
>> much at all -- it might do with some simplification as well.

BTW, I solved the "shader" issue.  It was actually a badly initialized
texture state issue.  Setting the proper min/mag filters solved the
problem.

>> Please have a look at the work so far, and let me know if you see any issues.
>
> I have to second Chris on "please make sure you do one change per
> commit."  For example, you removed the loops on glGetError() even
> though glGetError may have multiple errors that it reports in
> succession.  From the manpage:
>
>> If more than one flag has recorded an error, glGetError returns and
>> clears an arbitrary error flag value.  Thus, glGetError should always
>> be called in a loop, until it returns GL_NO_ERROR, if all error flags
>> are to be reset.

I made that change after reading the man page myself.  I totally
misread it before.  Mea culpa.

> You'd want to remove that change, but since it's mixed in with 2 other
> changes it's now hard to do.  With a bunch of rebase -i, though, you
> should be able to clean things up.

I'm usually much better about that.  I'll stick to that more closely
from now on.

> commit 5becb8ef955ea22cd84e2a6f11146fcb0efabcac
> Author: T. Zachary Laine <whatwasthataddress at gmail.com>
> Date:   Fri Dec 18 14:48:54 2009 -0600
>
>    Simplified _cairo_gl_surface_fill_rectangles() dramatically by removing its
>    use of a color vector -- it renders all in one color.
>
> A Mesa driver with your change ends up doing basically what this code
> did before -- particularly, it's loading the color as a vertex
> attribute, even though it seems obviously like a constant to the GL
> programmer.  (and a bunch of really terrible stuff happens to figure out
> how to load it as a vertex attr, which may actually hurt performance).

Finding all that stuff out, and passing the color array, are certain
to be slower and harder to maintain than a single call to glColor().

> I want to see cairo-gl move from deprecated functionality to current
> stuff, so this would be a step backwards by using immediate mode.

I assume by "deprecated" you mean 3.x-deprecated.  That would be a
pretty radical step, requiring us to dump all the
glTexEnv(GL_TEXURE_ENV, *) calls, glEnable(GL_TEXTURE_*D) calls, use
of GL_CLAMP for TEXTURE_WRAP_*, etc., etc.  I think it's safe to say
that that sort of change is very large, and not going to happen soon.
It seems prudent to make the current code, rife as it is with GL
3.x-deprecated code, as simple as possible.  This will only make it
easier to do the conversion to full-shader, fully-non-deprecated code
later.

If anything, we could replace all the glTexEnv(GL_TEXURE_ENV, *) calls
(specifically, related to GL_COMBINE) with a shader-based
implementation, which would make the implementation *much* clearer.

As an aside, it seems very unlikely that the 3.x-deprecated
functionality will be going anywhere anytime soon (years at least).
IIUC, NVidia (maybe AMD too?) is even leaving all that deprecated
stuff in the core profile, despite what the spec says, to make
developers' lives easier.

> We *really*
> need to get to using shaders on shaderful hardware, though.

Agreed.

> Actually, chatting with idr, I'm doing it wrong in the texenv setup.  We
> shouldn't even need the dummy texture, since texture environment state
> is separate from texture object state and is always active unless
> shaders are bound.  Just use GL_COMBINE and we can play with all the
> stages we want.

Good, I was hoping that dummy texture could go away.  I hadn't
investigated removing it yet, but it was on my TODO list.

> commit dbda90996852e67e4b766d46cc08b58cb96aee55
> Author: T. Zachary Laine <whatwasthataddress at gmail.com>
> Date:   Fri Dec 25 01:01:27 2009 -0600
>
>    A major step in the right direction for the span renderer.  It seems that the
>    span renderer was only setting the alpha channel in the color values in its
>    VBOs.  It now passes through the span renderer's fixed color, for constant
>    color rendering, or rgba(1.0, 1.0, 1.0, prev_alpha) if the source is a
>    texture.  The ft-text-vertical-layout-type3 test still fails.  This was the
>    test I was attempting to fix with these changes.  At least the colors are a
>    lot closer now.  I still have to figure out why the alpha values are
>    off.
>
> Not sure what's going on with this one.  The span renderer is setting up
> the source operand like normal using shared code -- GL_TEXTURE0 unit
> produces the constant color or the texture color or whatever for the
> source.  The color provided by the spans is used in GL_TEXTURE1 blending
> to multiply the mask alpha by GL_TEXTURE0.  So the color of that should
> be entirely ignored.  It looks like you ended up multiplying by source
> alpha twice in the case of a constant source.

Yeah, this is the change I'm least happy with.  It still doesn't do
the right thing, in that the final color values still aren't quite
right, but the code before my changes was rendering all constant-color
spans in black, so it seemed at least to be an improvement.  AFAICT,
my current output for constant color spans is very similar to the
wip/compositor output (they're both failing, but look visually very
close to the ref image).

> commit 23a649801ba23d23951e6f5df2fc833f1d53f672
> Author: T. Zachary Laine <whatwasthataddress at gmail.com>
> Date:   Fri Dec 25 22:07:42 2009 -0600
>
>    Added an optimization that elides the emission of any 0-alpha spans.  As an
>    example, the ft-text-vertical-layout-type3 test goes from emitting around 16k
>    spans per text outline to emitting around 700. Also removed a misplaced
>    /* fall through */ comment.
>
> For some cairo_operator_t, alpha in the mask still has an effect on the
> destination, right?  DestAtop, as an example.  This would break those,
> I think.
>
> commit 5e025d3cdfc742b5857ce2c37b5c751779ef76f8
> Author: T. Zachary Laine <whatwasthataddress at gmail.com>
> Date:   Sat Dec 26 02:14:55 2009 -0600
>
>    Made the 0-alpha span elision predicate a bit smarter, since the dumber
>    version was causing clip-twice and rotated-clip to fail.  Now, it only
>    performs the elision if 0-alpha spans would not affect the output, based on
>    the current operation type.
>
> OK, looks like you fixed it here.  A simpler implementation might just
> be (blend_factors.dst != GL_ONE && blend_factors.dst !=
> GL_ONE_MINUS_SOURCE_ALPHA).  It would be good to squash these commits
> together so there's one change of "gl: avoid emitting spans with 0 alpha
> when they are a no-op"

Ok.

> commit 5bb924b3e60867302112cc86e721e33baf18a24b
> Author: T. Zachary Laine <whatwasthataddress at gmail.com>
> Date:   Sat Dec 26 21:50:28 2009 -0600
>
>    Removed a redundant call to _cairo_gl_glyph_cache_add_glyph() in
>    _render_glyphs()
>
> Hmm, that second glyph_cache_add_glyph() looks wrong, but it also looks
> like the surrounding code is wrong, and it looks like it's been wrong
> since I first copy'n'pasted it from the cairo-drm stuff.

[snip]

> Note how the if (status == CAIRO_INT_STATUS_UNSUPPORTED) recovery case
> won't get hit because we've already jumped to FINISH up above.  I think
> the intent was:

[snip]

Got it.

> commit e63fda04b7a1c6e4cf6b4c87148be3531ec910d1
> Author: T. Zachary Laine <whatwasthataddress at gmail.com>
> Date:   Tue Jan 5 13:36:18 2010 -0600
>
>    Greatly simplified _cairo_gl_set_operator(), which was taking measures to
>    ensure correct behavior for destinations without alpha.  The desired behavior
>    is already guaranteed by OpenGL.
>
> I think this was a workaround for Mesa driver bugs, and if we're sure
> they were just bugs then this code should go.  I do often get lost in
> the spec with regards to alpha channel handling in various pieces of the
> GL.  Given the border color concern above, we may need to keep this
> code, though.  Time for some testcases :)

This change didn't break any tests -- I checked.  The alpha value for
RGB data is always taken to be 1.0.

> Regarding the shader stuff, it looks pretty cool.  The shader gradients
> should fix up a lot of painful parts of the cairo-traces we've got.

I actually have no idea what cairo-traces is.  Could you explain this briefly?

> They'll need to grow checks for the new extensions used and fall back on
> failure, and I think we'll want to use the pre-2.0 entrypoints for them
> so that we can do them on older hardware as well.

This is at odds with the desire to dump deprecated code, no?  What's
the overarching plan here?

> It would also be nice
> to figure out how to share the gradients support between paint,
> composite, spans, glyphs, etc., and be able to do it for both source and
> mask.  I'm imagining a cairo-gl-shader.c that glues together snippets of
> code for the various consumers according to what the sources and masks
> are and what the compositing type it is.  (glyphs and composite being
> pretty similar, but spans being kind of different to to mask values
> coming in as vertex data).

I agree.  I already have plans to break the shader stuff out into its
own TU, and eventually I'd like to be able to create a mask or clip
texture directly from cairo data structures like cairo_clip_t and
cairo_region_t.

Zach