[cairo] GL backend work

Tue Jan 12 14:13:04 PST 2010

On Tue, 12 Jan 2010 11:56:28 -0600, Zach Laine <whatwasthataddress at gmail.com> wrote:
> On Tue, Jan 12, 2010 at 6:09 PM, Eric Anholt <eric at anholt.net> wrote:
> > On Fri, 8 Jan 2010 20:48:30 +1800, Zach Laine <whatwasthataddress at gmail.com> wrote:
> >> I've fixed a majority of the GL regressions.  There are two main classes of
> >> failures that I understand, but so far cannot fix.  A full test summary is
> >> below.
> >>
> >> Known failure type #1: The cairo-gl span renderer was pretty badly broken
> >> before I got there.  It rendered all spans as black, with the alpha associated
> >> with each span.  No creation of a mask and then masking a source was being
> >> done.  I changed it to render the source color, using the alpha for each span.
> >> This at least makes tests like ft-text-vertical-layout-type1 look
> >> approximately right, though it's producing incorrect color values.  If I
> >> understood better what pixman was doing to combine a span and source color
> >> with a destination color I could fix this.  Anyone got any advice on this?
> >
> > What's a source color when you've got a texture as the source?  The
> > spans renderer is providing the mask in the form of scanlines with
> > alpha.  The color of the scanlines is entirely ignored, since it isn't
> > used in the mask part of the IN operator (unless you're doing component
> > alpha, which isn't the case with the span renderer).
> 
> Well, it might have been acting as a mask in the texturing case (I
> don't remember now if that case worked or not), but it definitely was
> not in the solid color case, which is why I made the change I did.
> The color used for texturing is (1.0, 1.0, 1.0, span_alpha), which
> does the right thing.
> 
> >> Known failure type #2: Texturing in shaders does not work given the current
> >> use of FBOs.  I have no idea why.  I have a post in on the opengl.org forums
> >> asking for help.  Maybe one of the gurus there can help me fix it.  Shaders
> >> are certainly desirable for doing certain things more efficiently, but they
> >> are essentially optional in most cases.  However, for cases in which a source
> >> texture is being rendered onto a target texture under extend policy
> >> CAIRO_EXTEND_NONE, a shader is required.  This is because CAIRO_EXTEND_NONE is
> >> implemented as GL_CLAMP_TO_BORDER, which is completely appropriate, except
> >> that it can't be done with textures bound to framebuffers as all the textures
> >> are in cairo_gl_surface_t's.  Setting a gl_surface's texture border to 1
> >> causes the framebuffer to be incomplete.  This is true when using the
> >> EXT_framebuffer_object extension, and when using the framebuffer support in
> >> core GL >= 3.0.  It's undocumented, but I found it to be consistent, at least
> >> for NVidia GL implementations (and that's a pretty big chunk of the end user
> >> environment space).  The only way to simulate the behavior of
> >> GL_CLAMP_TO_BORDER is with a truly trivial shader.  I have already implemented
> >> it, but haven't committed it until I get a fix to the shader texturing/FBO
> >> problem.
> >
> > I suspect you've confused texture borders with the texture border
> > color.  It's a trap, and I don't know of anyone that's seen them both
> > and not been confused by it.
> 
> [snip]
> 
> Quite right.  I completely misremembered how these work, since I never
> use them.  I went back to the Red Book and boned up.  It looks like
> the problem has nothing to do with border colors at all, but instead
> has to do with the fact that the tests surface-pattern-scale-up and
> rotate-image-surface-paint, the ones that motivated this "bug" hunt,
> both use intermediate CAIRO_FORMAT_RGB24-format image surfaces.  So
> while texturing, even if the border color is (0, 0, 0, 0), the
> texturing operation uses (0, 0, 0, 1) to do its interpolation, since
> the texture has no alpha channel.  This is why I still get bad results
> for these two tests.  Changing the surface image in the test to use
> format CAIRO_FORMAT_ARGB32 makes the tests pass.  Since handling of
> the intermediate surface's alpha channel isn't what the tests are
> testing anyway, maybe the format of the intermediate image surface
> could be changed.
> 
> Moreover, it seems that either clip-unbounded.rgb24.ref.png is wrong
> to have a black left side or rotate-image-surface-paint is wrong not
> to have a black background behind its painted texture.  In both cases,
> we are painting a surface with no alpha channel onto another surface,
> using CAIRO_OPERATOR_OVER.  Shouldn't they both fill the target, or
> neither fill it?
> 
> Am I missing something?

The clip-unbounded.c I see uses CAIRO_OPERATOR_SOURCE.  Still confused
about its results, though.

> > Note that texture border colors are a bit tricky.  Most hardware will
> > sample texture border color's alpha as 1 when the texture format itself
> > doesn't have an alpha channel.  This is not what's desired for our
> > common usage of GL_CLAMP_TO_BORDER ("please sample 0 alpha outside of
> > the texture").  In order to do this right, I think we're going to want
> > to promote CAIRO_CONTENT_COLOR to be RGBA and adjust our rendering to
> > fill the alpha with 1.
> 
> Do you mean that you'd like to make all cairo_gl_surface_t's use RGBA?
>  If so, I guess that would fix things too.  It just seems to me that
> if the user explicitly asks for RGB, she should get it.  Silently
> padding out the format to make this case work seems wrong to me.  It
> seems that the user should instead just be notified that she needs to
> use RGBA to get proper alpha blending.

We have a similar experience inside of the OpenGL drivers.  The user
asks for the behavior of RGB, and we have to provide the behavior, but
if the hardware doesn't actually support RGB with the semantics we
require, then we have to do RGBA and work around it ourselves.  In the
Intel case, we can't do blending to XRGB destinations, so we always
treat them as ARGB and tweak the blender to get XRGB behavior.  The
workarounds aren't all that painful, so we don't necessarily have to go
hunt down hardware guys and yell at them (though we might ask kindly for
our lives to be easier).
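
To make the "tweak the blender" idea concrete, here is a rough sketch.
It is not the Intel driver's actual code: blend_factor_t and
fixup_factor_for_xrgb_dst() are made-up names standing in for the GL
blend factors, and the only point is that any factor reading
destination alpha gets rewritten as if that alpha were always 1.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the GL blend factors; real code would use
 * the GLenum values from <GL/gl.h> instead. */
typedef enum {
    FACTOR_ZERO,
    FACTOR_ONE,
    FACTOR_SRC_ALPHA,
    FACTOR_ONE_MINUS_SRC_ALPHA,
    FACTOR_DST_ALPHA,
    FACTOR_ONE_MINUS_DST_ALPHA
} blend_factor_t;

/* If the destination is logically XRGB but stored as ARGB, its alpha
 * must behave as 1.0.  Rewrite factors that read destination alpha. */
blend_factor_t
fixup_factor_for_xrgb_dst (blend_factor_t factor, bool dst_has_alpha)
{
    if (dst_has_alpha)
        return factor;

    switch (factor) {
    case FACTOR_DST_ALPHA:
        return FACTOR_ONE;              /* dst.a == 1 */
    case FACTOR_ONE_MINUS_DST_ALPHA:
        return FACTOR_ZERO;             /* 1 - dst.a == 0 */
    default:
        return factor;                  /* does not read dst alpha */
    }
}
```

A caller would run both the source and destination factors of the
chosen operator through this before handing them to the blender.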

Oh, there's no padding going on here, if that was the concern with
moving to ARGB.  Graphics hardware doesn't do 24bpp RGB, it does
32bpp/24 depth xRGB.  So we're not going to be wasting any more memory.
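
For illustration, here is a toy model (plain C, no GL) of the border
sampling problem from earlier in the thread.  toy_texture_t and
toy_sample() are invented for this sketch and only mimic the behavior
described: a format without alpha reads back alpha 1 everywhere,
including the border, while an RGBA texture whose alpha we fill with 1
still gets alpha 0 outside its bounds.

```c
#include <stdbool.h>

typedef struct { float r, g, b, a; } rgba_t;

/* Toy 1-D texture.  'has_alpha' models whether the texture format
 * stores an alpha channel (RGBA) or not (RGB). */
typedef struct {
    const rgba_t *texels;
    int           width;
    bool          has_alpha;
    rgba_t        border_color;
} toy_texture_t;

/* Nearest-neighbour fetch with clamp-to-border behavior. */
rgba_t
toy_sample (const toy_texture_t *tex, int x)
{
    rgba_t c = (x >= 0 && x < tex->width) ? tex->texels[x]
                                          : tex->border_color;

    /* A format without alpha samples alpha as 1, border included. */
    if (!tex->has_alpha)
        c.a = 1.0f;
    return c;
}
```

With a border color of (0, 0, 0, 0), sampling at x = -1 gives alpha 1
for the RGB texture and alpha 0 for the RGBA one, which is the
difference that matters for CAIRO_EXTEND_NONE.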

> > I want to see cairo-gl move from deprecated functionality to current
> > stuff, so this would be a step backwards by using immediate mode.
> 
> I assume by "deprecated" you mean 3.x-deprecated.  That would be a
> pretty radical step, requiring us to dump all the
> glTexEnv(GL_TEXTURE_ENV, *) calls, glEnable(GL_TEXTURE_*D) calls, use
> of GL_CLAMP for TEXTURE_WRAP_*, etc., etc.  I think it's safe to say
> that that sort of change is very large, and not going to happen soon.
> It seems prudent to make the current code, rife as it is with GL
> 3.x-deprecated code, as simple as possible.  This will only make it
> easier to do the conversion to full-shader, fully-non-deprecated code
> later.
> 
> If anything, we could replace all the glTexEnv(GL_TEXTURE_ENV, *) calls
> (specifically, related to GL_COMBINE) with a shader-based
> implementation, which would make the implementation *much* clearer.

Yeah, that's the intention.  glTexEnv was just the minimal
implementation that would work on all the hardware that anyone could be
interested in (though non-shader hardware doesn't work for cairo-gl
anyway because of the reliance on ARB_texture_non_power_of_two, which
needs to be fixed), and it got some code up and running quickly.  In
retrospect, that was a mistake and I should have done shaders first and
compat later :)

Right now in Mesa for cairo-gl cairo-traces we're spending probably
around 10% of CPU time on thrashing around computing the shader and
other state from the fixed function state we set up in cairo-gl.  I'm
pretty sure that cairo-gl could look up the appropriate shaders and bind
them much faster, and it would reduce the cache footprint as well (which
may ease up other areas of the profile).
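
The kind of lookup I have in mind could be as simple as the sketch
below.  Everything in it is hypothetical: the SRC_* and MASK_* state
categories, shader_cache_t, and stub_compile() (which only pretends to
build a program and counts how often it was asked).  Real code would
compile and link GLSL in the callback.

```c
/* Hypothetical state categories; the real set would come from cairo-gl. */
enum { SRC_SOLID, SRC_TEXTURE, SRC_GRADIENT, N_SRC };
enum { MASK_NONE, MASK_TEXTURE, MASK_SPANS, N_MASK };

typedef unsigned int (*compile_fn_t) (int src, int mask);

typedef struct {
    unsigned int programs[N_SRC][N_MASK];   /* 0 means "not built yet" */
    compile_fn_t compile;
} shader_cache_t;

/* Stand-in for real shader compilation: returns a nonzero fake id. */
int n_compiles = 0;

unsigned int
stub_compile (int src, int mask)
{
    n_compiles++;
    return 1u + (unsigned int) (src * N_MASK + mask);
}

void
shader_cache_init (shader_cache_t *cache, compile_fn_t compile)
{
    for (int s = 0; s < N_SRC; s++)
        for (int m = 0; m < N_MASK; m++)
            cache->programs[s][m] = 0;
    cache->compile = compile;
}

/* Return the program for this state, building it on first use only. */
unsigned int
shader_cache_get (shader_cache_t *cache, int src, int mask)
{
    if (src < 0 || src >= N_SRC || mask < 0 || mask >= N_MASK)
        return 0;
    if (cache->programs[src][mask] == 0)
        cache->programs[src][mask] = cache->compile (src, mask);
    return cache->programs[src][mask];
}
```

After the first request for a given source/mask pair, binding is a
table lookup instead of the driver re-deriving a shader from texenv
state.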

> As an aside, it seems very unlikely that the 3.x-deprecated
> functionality will be going anywhere anytime soon (years at least).
> IIUC, NVidia (maybe AMD too?) is even leaving all that deprecated
> stuff in the core profile, despite what the spec says, to make
> developers' lives easier.

In the open-source world, we're really interested in getting to 3.0
drivers.  One of the promising things there is that we could ship
separate 3.0 core and 3.0 compat drivers (built from the same driver
source) that are dynamically loaded as appropriate, which would cut out
tons of compat code in hot paths.  Given that a major concern that many
people have about GL is the number of instructions between the app
calling GL and the driver emitting code, this should help GL's popularity.

In order for a core 3.0 to be interesting, though, we want some userland
ready to live on 3.0.  So moving cairo-gl closer to being ready for it
is a feature.

> > Actually, chatting with idr, I'm doing it wrong in the texenv setup.  We
> > shouldn't even need the dummy texture, since texture environment state
> > is separate from texture object state and is always active unless
> > shaders are bound.  Just use GL_COMBINE and we can play with all the
> > stages we want.
> 
> Good, I was hoping that dummy texture could go away.  I hadn't
> investigated removing it yet, but it was on my TODO list.

I took a look into this last night, and it looks like the texture unit
has to be enabled, at least.  I think the GL may bind a dummy texture in
place, though, if we haven't bound one.  More investigation required.

> > commit e63fda04b7a1c6e4cf6b4c87148be3531ec910d1
> > Author: T. Zachary Laine <whatwasthataddress at gmail.com>
> > Date:   Tue Jan 5 13:36:18 2010 -0600
> >
> >    Greatly simplified _cairo_gl_set_operator(), which was taking measures to
> >    ensure correct behavior for destinations without alpha.  The desired behavior
> >    is already guaranteed by OpenGL.
> >
> > I think this was a workaround for Mesa driver bugs, and if we're sure
> > they were just bugs then this code should go.  I do often get lost in
> > the spec with regards to alpha channel handling in various pieces of the
> > GL.  Given the border color concern above, we may need to keep this
> > code, though.  Time for some testcases :)
> 
> This change didn't break any tests -- I checked.  The alpha value for
> RGB data is always taken to be 1.0.
> 
> > Regarding the shader stuff, it looks pretty cool.  The shader gradients
> > should fix up a lot of painful parts of the cairo-traces we've got.
> 
> I actually have no idea what cairo-traces is.  Could you explain this briefly?

Oh, they're awesome.

cd ~/src/cairo/perf
git clone git://anongit.freedesktop.org/git/cairo-traces
(cd cairo-traces && make)
CAIRO_TEST_TARGET=gl ./cairo-perf-trace -i 3 ~/src/cairo/perf/cairo-traces/benchmark/firefox-talos-gfx.trace

(or no trace file to run them all)

So you can get answers to "does my stuff make
firefox/swfdec/evolution/whatever faster/slower than before".

> > They'll need to grow checks for the new extensions used and fall back on
> > failure, and I think we'll want to use the pre-2.0 entrypoints for them
> > so that we can do them on older hardware as well.
> 
> This is at odds with the desire to dump deprecated code, no?  What's
> the overarching plan here?

It's tricky.  I want to see cairo-gl usable on everyone's hardware.  I
also want cairo-gl to be fast and awesome on new hardware.  That means
we need both compat paths for old fixed-function stuff, and
new-path-only stuff for new hardware.

For most GL stuff, we can just rely on new general features and do well.
For example, vertex buffer objects should be available everywhere, and
we can just use them in any place that's appropriate.  Vertex arrays or
BOs should always be used over immediate mode.  Rely on FBOs and not
pbuffers or worse hacks.

However, there's this unfortunate popular generation in between 3.0
hardware and 1.3ish hardware with limited fragment shader capability
(enough to do everything we want in cairo) and no occlusion queries (so
2.0 can't be exposed).  We should be able to do all the shaders we want
there, using the pre-2.0 entrypoints.  The ARB made significant changes
between the core 2.0 entrypoints and the ARB shader extensions, which
means that apps that want to support both core 2.0+ and pre-2.0 using
that functionality have to have two separate paths for using the differing
entrypoints.  What I'm expecting to end up with is a cairo-gl-shaders.c
file that handles compiling, linking, loading, and setting uniforms for
GLSL shaders that is smart about what GL version is available.
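
As a sketch of the "smart about what GL version is available" part:
the names below (shader_path_t, has_extension(), choose_shader_path())
are invented for illustration, and the extension string is assumed to
come from glGetString(GL_EXTENSIONS).  The core path would then use
entrypoints like glCreateShader(), the ARB path their counterparts
like glCreateShaderObjectARB().

```c
#include <stdbool.h>
#include <string.h>

typedef enum {
    SHADER_PATH_NONE,       /* fixed function only */
    SHADER_PATH_ARB,        /* pre-2.0, via the ARB shader extensions */
    SHADER_PATH_CORE_2_0    /* GL >= 2.0 core entrypoints */
} shader_path_t;

/* True if 'name' appears as a whole space-separated token. */
bool
has_extension (const char *extensions, const char *name)
{
    size_t len = strlen (name);
    const char *p = extensions;

    if (len == 0)
        return false;
    while ((p = strstr (p, name)) != NULL) {
        bool starts = (p == extensions) || (p[-1] == ' ');
        bool ends = (p[len] == ' ') || (p[len] == '\0');
        if (starts && ends)
            return true;
        p += len;
    }
    return false;
}

/* Pick which set of entrypoints the shader code should use. */
shader_path_t
choose_shader_path (int gl_major, const char *extensions)
{
    if (gl_major >= 2)
        return SHADER_PATH_CORE_2_0;
    if (has_extension (extensions, "GL_ARB_shader_objects") &&
        has_extension (extensions, "GL_ARB_vertex_shader") &&
        has_extension (extensions, "GL_ARB_fragment_shader"))
        return SHADER_PATH_ARB;
    return SHADER_PATH_NONE;
}
```

The rest of the file could then go through a small table of function
pointers filled in according to the chosen path, so callers never see
which entrypoints are in use.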

Luckily, the ARB seems to have realized this was a mistake and
extensions for exposing later core functionality since then have shared
entrypoints with core.