Some notes on optimization work in progress (was: Re: [cairo] WinXp benchmarks)

David Reveman davidr at novell.com
Thu Mar 3 12:59:12 PST 2005


On Thu, 2005-03-03 at 21:02 +0100, Soeren Sandmann wrote: 
> Carl Worth <cworth at redhat.com> writes:
> 
> > Here's an update on where that work stands. First, I've chosen
> > gearflowers.svg[*] as a profile image. It's a rather complex image
> > with lots of splines, a mixture of strokes and fills, and a *lot* of
> > gradients.
> > 
> > Rendering this image with current cairo takes about 5-7 seconds on my
> > laptop. Here's how that breaks down under oprofile (using a slightly
> > modified version of svg2png that produces no output PNG file):
> > 
> > 	CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> > 	Profiling through timer interrupt
> > 	samples  %        app name                 symbol name
> > 	1371     31.9879  libpixman.so.1.0.0       fbRasterizeEdges8
> > 	788      18.3854  libcairo.so.1.0.0        _cairo_pattern_calc_color_at_pixel
> > 	481      11.2226  libpixman.so.1.0.0       IcCombineOverU
> > 	356       8.3061  libcairo.so.1.0.0        _cairo_pattern_begin_draw
> > 	126       2.9398  libcairo.so.1.0.0        _cairo_pattern_shader_linear
> > 	108       2.5198  libpixman.so.1.0.0       IcStepOver
> > 	102       2.3798  libpixman.so.1.0.0       pixman_compositeGeneral
> > 	89        2.0765  libpixman.so.1.0.0       IcFetch_a8
> > 	83        1.9365  libpixman.so.1.0.0       IcOver
> > 	79        1.8432  libpixman.so.1.0.0       IcCombineMaskU
> > 	... [http://cairographics.org/~cworth/images/gearflowers.oprofile]
> > 
> > So, the rasterization is topping the list, followed closely by
> > gradient computation, and then compositing. It'd be nicer to get some
> > callgraph-based sums to better estimate those things, but the prime
> > candidates for optimization are obvious enough.
> 
> With the sysprof profiler, which does do callgraph-based sums, I get
> these results:
> 
>    _cairo_pattern_calc_color_at_pixel()            36.86 %
>    pixman_composite()                              22.90 %
>            (with 17.16% of those in pixman_CompositeGeneral)
>    fbRasterizeTrapezoid()                          16.17 %
> 
> The percentages are totals, ie. they include children of the
> functions. The rasterization times reported by the two profilers are
> quite different.

Either way, the gradient calculations look quite expensive. The first
thing we should do is check that we never create larger gradients than
necessary. After the recent changes that pass patterns through to the
backends, I'm no longer sure that the size is optimal. The second thing
we could do is hook up simple optimizations for vertical and horizontal
gradients, as Owen suggested recently.
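
To illustrate the horizontal/vertical special case, here is a rough
sketch in C. It is not cairo or pixman code: the function names, the
color_at callback and the gray_ramp shader are hypothetical, and the
buffer is assumed to be 32bpp with the stride given in pixels. The idea
is only that an axis-aligned gradient needs one shader call per column
or per row instead of one per pixel.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical shader callback: maps t in [0, 1] to a 32-bit pixel. */
typedef uint32_t (*color_at_fn) (double t);

/* Example shader (assumption): opaque gray ramp from black to white. */
static uint32_t
gray_ramp (double t)
{
    uint32_t v = (uint32_t) (t * 255.0 + 0.5);
    return 0xff000000u | (v << 16) | (v << 8) | v;
}

/* Color varies only along x: shade the first row, then copy it down. */
static void
fill_horizontal_gradient (uint32_t *pixels, int width, int height,
                          int stride_pixels, color_at_fn color_at)
{
    int x, y;

    if (width <= 0 || height <= 0)
        return;

    for (x = 0; x < width; x++) {
        double t = width > 1 ? (double) x / (width - 1) : 0.0;
        pixels[x] = color_at (t);
    }
    for (y = 1; y < height; y++)
        memcpy (pixels + (size_t) y * stride_pixels, pixels,
                (size_t) width * sizeof (uint32_t));
}

/* Color varies only along y: one shader call per row, then a flat fill. */
static void
fill_vertical_gradient (uint32_t *pixels, int width, int height,
                        int stride_pixels, color_at_fn color_at)
{
    int x, y;

    if (width <= 0 || height <= 0)
        return;

    for (y = 0; y < height; y++) {
        double t = height > 1 ? (double) y / (height - 1) : 0.0;
        uint32_t color = color_at (t);
        uint32_t *row = pixels + (size_t) y * stride_pixels;

        for (x = 0; x < width; x++)
            row[x] = color;
    }
}
```

For a width x height gradient this cuts the shader calls from
width * height down to width (or height), which is where a function
like _cairo_pattern_calc_color_at_pixel would otherwise spend its time.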

Looking at gearflowers.svg and SVGs in general, it seems that most
patterns are either solid colors or gradients, and those should always
end up in this composite operation:

SRC(argb32, no transform) in MASK(a8 shape, no transform) op
DST(probably ARGB32 or 32bpp RGB24)

We should be able to accelerate that pretty well, right?
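
For reference, a minimal sketch in C of what that case computes per
scanline, assuming op is OVER, premultiplied ARGB32 for source and
destination, and an a8 mask. The function names are made up for
illustration and this is not the pixman implementation; it only shows
that the special case reduces to a tight loop with two cheap early
outs.

```c
#include <stdint.h>

/* Rounded (a * b) / 255 for 8-bit values. */
static uint8_t
mul_un8 (uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t) a * b + 0x80;
    return (uint8_t) ((t + (t >> 8)) >> 8);
}

/* (src IN mask) OVER dst for one premultiplied ARGB32 pixel. */
static uint32_t
in_over_pixel (uint32_t src, uint8_t mask, uint32_t dst)
{
    uint8_t src_alpha = mul_un8 ((uint8_t) (src >> 24), mask);
    uint8_t inv_alpha = (uint8_t) (255 - src_alpha);
    uint32_t result = 0;
    int shift;

    for (shift = 0; shift < 32; shift += 8) {
        uint32_t s = mul_un8 ((uint8_t) (src >> shift), mask);
        uint32_t d = mul_un8 ((uint8_t) (dst >> shift), inv_alpha);
        uint32_t sum = s + d;

        if (sum > 255)  /* only possible for non-premultiplied input */
            sum = 255;
        result |= sum << shift;
    }
    return result;
}

/* Composite one scanline; neither src nor mask is transformed. */
static void
composite_in_over (uint32_t *dst, const uint32_t *src,
                   const uint8_t *mask, int width)
{
    int x;

    for (x = 0; x < width; x++) {
        uint8_t m = mask[x];

        if (m == 0)
            continue;                   /* fully masked out: dst unchanged */
        if (m == 0xff && (src[x] >> 24) == 0xff)
            dst[x] = src[x];            /* opaque and unmasked: plain copy */
        else
            dst[x] = in_over_pixel (src[x], m, dst[x]);
    }
}
```

With a typical a8 shape mask most pixels are either 0 or 0xff, so the
two early outs should cover the bulk of the work and only the edge
pixels take the full blend.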

> 
> I am using CVS HEAD of cairo and libpixman. The modification I made to
> svg2png is basically this:
> 
> -    cairo_set_target_png (cr, png_file, CAIRO_FORMAT_ARGB32, width, height);
> +    cairo_set_target_image (cr, data, CAIRO_FORMAT_ARGB32, width, height, width);

-David



