[cairo] [PATCH] cairo-gl: Make VBO size run-time settable

Chris Wilson chris at chris-wilson.co.uk
Wed Sep 4 03:18:26 PDT 2013


On Wed, Sep 04, 2013 at 01:36:04AM +0000, Bryce W. Harrington wrote:
> On Fri, Aug 30, 2013 at 10:11:12AM +0100, Chris Wilson wrote:
> > On Thu, Aug 29, 2013 at 06:56:21PM -0400, Behdad Esfahbod wrote:
> > > On 13-08-29 01:55 PM, Bryce W. Harrington wrote:
> > > > Chris, btw, sort of an aside question...
> > > > 
> > > > As I've been running various performance tests for each of the GL
> > > > compositors, I am noticing that spans and traps basically have identical
> > > > performance (any differences are in the noise).  I'm aware of the
> > > > implementational differences between the two, and I've expected to see
> > > > spans perform better than traps on at least a few of these tests, but
> > > > nothing so far.  I'm guessing the tests simply aren't exercising spans'
> > > > talents, or I'm not running the right tests.
> > 
> > This is what I measured on one of my systems:
> > 
> > old: gl-traps
> > new: gl-spans
> > Speedups
> > ========
> >    gl           firefox-fishtank: 55.16x speedup
> >    gl             grads-heat-map: 16.03x speedup
> >    gl             firefox-canvas: 12.33x speedup
> >    gl         swfdec-giant-steps: 11.99x speedup
> >    gl       firefox-canvas-alpha: 11.55x speedup
> >    gl         firefox-chalkboard:  8.96x speedup
> >    gl          firefox-paintball:  7.02x speedup
> >    gl               firefox-tron:  6.93x speedup
> >    gl       gnome-system-monitor:  6.10x speedup
> >    gl          firefox-particles:  5.63x speedup
> >    gl           firefox-fishbowl:  5.43x speedup
> >    gl          firefox-talos-svg:  5.41x speedup
> > etc.
> 
> Thanks; I don't get anywhere near these differences, so assume that
> means I'm not running the tests properly.  I'll poke around and
> hopefully figure it out.
> 
> One question though, if spans is so much better than traps, why do we
> still have traps as an option?  Are there cases where spans may
> underperform, or situations it can't handle and we must fall back to
> traps?

Traps isn't really an option for cairo-gl; I had to add a couple of
missing routines for it to even run standalone. Traps should only be used
by cairo-gl for the composite_glyphs routine. (We could move that out
into its own compositor to avoid the confusion).

I guess what you are seeing instead is the mask compositor which renders
the mask using spans on the CPU rather than emitting the spans as
geometry.
 
> > > Speaking of which, Chris, can you explain to those of us not following cairo
> > > closely these days how all the various new compositors work?
> > 
> > The difference between the compositors of cairo-1.12 and the single
> > trapezoid compositor of cairo-1.0 is that there are more of them! The surface
> > backends have to plug directly into the high level surface API (the old
> > low level compositor API is removed) and explicitly decide how they want
> > to render each individual operation. We have a few common strategies,
> > the trapezoid compositor (based on the original Xrender approach), the
> > spans scanline compositor (efficient for image based software
> > rendering), and a "mask" compositor (where the backend can render the
> > various channels separately and the horrible logic of combining mask with
> > the clip with the source onto the destination is handled by the
> > compositor).
> > 
> > For example, with cairo-gl it will first use its msaa compositor,
> > falling back to the spans compositor, and then to a mask compositor
> > (with a stage for glyphs to use the code from the traps compositor),
> > with a final fallback to the CPU.
> 
> 'traps' being the final fallback here?

No, the final fallback is to create a temporary image, read back the
destination into that image, render the operation using the image backend, 
and then copy it back. Typically if we hit a CPU fallback, we switch to
using the CPU for all future operations until the surface is flushed, as
the read-modify-write for every operation is usually an order of
magnitude slower than the actual operation itself.
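
The fallback order and the "stay on the CPU until flush" behaviour
described above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the cairo-gl code: every type
and function name here (surface_t, compositor_t, surface_fill,
surface_flush, the stage enum) is hypothetical, the integer "op" is a
stand-in for how hard an operation is to render, and the glyph stage that
borrows the traps code is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical; none of them are cairo API. */
typedef enum { STATUS_SUCCESS, STATUS_UNSUPPORTED } status_t;
typedef enum { STAGE_MSAA, STAGE_SPANS, STAGE_MASK, STAGE_IMAGE } stage_t;

typedef struct surface {
    bool on_cpu;     /* true while the CPU fallback image is in use */
    int  readbacks;  /* destination read-backs (GPU to CPU) so far */
    int  uploads;    /* copies back to the GPU so far */
} surface_t;

typedef struct compositor compositor_t;
struct compositor {
    stage_t id;
    /* Returns STATUS_UNSUPPORTED to pass the operation to the next stage. */
    status_t (*fill)(surface_t *dst, int op);
    const compositor_t *delegate;  /* next stage, NULL at the end */
};

/* Toy policy (an assumption): each stage accepts ops up to some level. */
static status_t msaa_fill(surface_t *dst, int op)
{ (void) dst; return op <= 1 ? STATUS_SUCCESS : STATUS_UNSUPPORTED; }

static status_t spans_fill(surface_t *dst, int op)
{ (void) dst; return op <= 2 ? STATUS_SUCCESS : STATUS_UNSUPPORTED; }

static status_t mask_fill(surface_t *dst, int op)
{ (void) dst; return op <= 3 ? STATUS_SUCCESS : STATUS_UNSUPPORTED; }

/* msaa first, then spans, then mask. */
static const compositor_t mask_stage  = { STAGE_MASK,  mask_fill,  NULL };
static const compositor_t spans_stage = { STAGE_SPANS, spans_fill, &mask_stage };
static const compositor_t msaa_stage  = { STAGE_MSAA,  msaa_fill,  &spans_stage };

/* Performs op on dst and reports which stage ended up handling it. */
static stage_t surface_fill(surface_t *dst, int op)
{
    assert(dst != NULL);

    if (!dst->on_cpu) {
        const compositor_t *c;

        for (c = &msaa_stage; c != NULL; c = c->delegate) {
            if (c->fill(dst, op) == STATUS_SUCCESS)
                return c->id;
        }

        /* No GPU stage took it: read the destination back once. */
        dst->on_cpu = true;
        dst->readbacks++;
    }

    /* Render with the image backend; later ops stay here until flush,
     * so the read-back is not repeated for every operation. */
    return STAGE_IMAGE;
}

/* Copies the fallback image back and returns to the GPU path. */
static void surface_flush(surface_t *dst)
{
    assert(dst != NULL);

    if (dst->on_cpu) {
        dst->uploads++;
        dst->on_cpu = false;
    }
}
```

In this sketch, once an operation falls through every stage, all later
operations report STAGE_IMAGE and the read-back counter stays at one
until surface_flush() is called, which is the point of switching to the
CPU rather than doing a read-modify-write per operation.
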
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

