Some notes on optimization work in progress (was: Re: [cairo] WinXp benchmarks)

Carl Worth cworth at
Wed Mar 2 19:10:17 PST 2005

On Wed, 02 Mar 2005 08:50:19 -0500, Owen Taylor wrote:
> Yes, trapezoid rasterization is the known problem. Carl is working
> on a rewrite ... I don't know offhand when that is scheduled to
> land or whether there is anything you can help with there.

Here's an update on where that work stands. First, I've chosen
gearflowers.svg[*] as a profile image. It's a rather complex image
with lots of splines, a mixture of strokes and fills, and a *lot* of

Rendering this image with current cairo takes about 5-7 seconds on my
laptop. Here's how that breaks down under oprofile (using a slightly
modified version of svg2png that produces no output PNG file):

	CPU: CPU with timer interrupt, speed 0 MHz (estimated)
	Profiling through timer interrupt
	samples  %        app name                 symbol name
	1371     31.9879       fbRasterizeEdges8
	788      18.3854        _cairo_pattern_calc_color_at_pixel
	481      11.2226       IcCombineOverU
	356       8.3061        _cairo_pattern_begin_draw
	126       2.9398        _cairo_pattern_shader_linear
	108       2.5198       IcStepOver
	102       2.3798       pixman_compositeGeneral
	89        2.0765       IcFetch_a8
	83        1.9365       IcOver
	79        1.8432       IcCombineMaskU
	... []

So, the rasterization is topping the list, followed closely by
gradient computation, and then compositing. It'd be nicer to get some
callgraph-based sums to better estimate those things, but the prime
candidates for optimization are obvious enough.

The current rasterization algorithm, (libpixman/src/fbedgeimp.h:rasterizeEdges),
is rather naïvely stepping to each sample row in the trapezoid,
(eg. 15 rows per pixels), and incrementally filling each pixel
within the trapezoid intersected by that sample row in sequence, then
stepping to the next row and looping.

So, that gives us slow, incremental fills of even all the 100% filled
pixels within a large trapezoid, (if we happen to get those).

So some of the questions are:

	1) What shapes are the trapezoids we are getting? Tall and
	   skinny? Short and wide? Is the shorter side most often less
	   than a single pixel?

	2) Given some of the different trapezoid shapes, what pixels
	   can we fill faster?

	3) Can we alter the tessellation algorithm to provide more
	   optimization-friendly trapezoids to the rasterizer?

I've started restructuring the code to help answer (1). For each
trapezoid, we can classify any pixel into one of 16 categories based
on whether the pixel is intersected by each of the 4 trapezoid
edges. The collected statistics on these 16 categories should be quite

For (2), the same code that counts the 16 categories also provides a
natural spot to lodge in separate optimizations for each of the 16
pixel types.

(3) is an interesting question that I haven't looked at yet. It's
basically a way of increasing the counts of pixels in the "easy"
categories. For instance, chopping out big, pixel-aligned rectangles
from the middle of large shapes would be much better than generating
many strips of very narrow trapezoids that fill the same shape, (where
the narrow shape might only be necessary on the extreme left and right

Anyway, this is a fun place to play. I'll try to get things in CVS as
soon as possible, and keep the list apprised of progress. If anyone
wants to jump in and help, just let me know.


PS. And if anyone wants to look at the gradient computation, we might
have lots of low-hanging fruit there too.

[*] By Alex "Tetromino" Rostovtsev. I found it as part of the KDE SVG
wallpaper contest:

The SVG is available under the LGPL, and I've put it, and a cairo
rendering at:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url :

More information about the cairo mailing list