Some notes on optimization work in progress (was: Re: [cairo] WinXp
cworth at redhat.com
Wed Mar 2 19:10:17 PST 2005
On Wed, 02 Mar 2005 08:50:19 -0500, Owen Taylor wrote:
> Yes, trapezoid rasterization is the known problem. Carl is working
> on a rewrite ... I don't know offhand when that is scheduled to
> land or whether there is anything you can help with there.
Here's an update on where that work stands. First, I've chosen
gearflowers.svg[*] as a profile image. It's a rather complex image
with lots of splines, a mixture of strokes and fills, and a *lot* of
Rendering this image with current cairo takes about 5-7 seconds on my
laptop. Here's how that breaks down under oprofile (using a slightly
modified version of svg2png that produces no output PNG file):
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples % app name symbol name
1371 31.9879 libpixman.so.1.0.0 fbRasterizeEdges8
788 18.3854 libcairo.so.1.0.0 _cairo_pattern_calc_color_at_pixel
481 11.2226 libpixman.so.1.0.0 IcCombineOverU
356 8.3061 libcairo.so.1.0.0 _cairo_pattern_begin_draw
126 2.9398 libcairo.so.1.0.0 _cairo_pattern_shader_linear
108 2.5198 libpixman.so.1.0.0 IcStepOver
102 2.3798 libpixman.so.1.0.0 pixman_compositeGeneral
89 2.0765 libpixman.so.1.0.0 IcFetch_a8
83 1.9365 libpixman.so.1.0.0 IcOver
79 1.8432 libpixman.so.1.0.0 IcCombineMaskU
So, the rasterization is topping the list, followed closely by
gradient computation, and then compositing. It'd be nicer to get some
callgraph-based sums to better estimate those things, but the prime
candidates for optimization are obvious enough.
The current rasterization algorithm, (libpixman/src/fbedgeimp.h:rasterizeEdges),
is rather naïvely stepping to each sample row in the trapezoid,
(eg. 15 rows per pixels), and incrementally filling each pixel
within the trapezoid intersected by that sample row in sequence, then
stepping to the next row and looping.
So, that gives us slow, incremental fills of even all the 100% filled
pixels within a large trapezoid, (if we happen to get those).
So some of the questions are:
1) What shapes are the trapezoids we are getting? Tall and
skinny? Short and wide? Is the shorter side most often less
than a single pixel?
2) Given some of the different trapezoid shapes, what pixels
can we fill faster?
3) Can we alter the tessellation algorithm to provide more
optimization-friendly trapezoids to the rasterizer?
I've started restructuring the code to help answer (1). For each
trapezoid, we can classify any pixel into one of 16 categories based
on whether the pixel is intersected by each of the 4 trapezoid
edges. The collected statistics on these 16 categories should be quite
For (2), the same code that counts the 16 categories also provides a
natural spot to lodge in separate optimizations for each of the 16
(3) is an interesting question that I haven't looked at yet. It's
basically a way of increasing the counts of pixels in the "easy"
categories. For instance, chopping out big, pixel-aligned rectangles
from the middle of large shapes would be much better than generating
many strips of very narrow trapezoids that fill the same shape, (where
the narrow shape might only be necessary on the extreme left and right
Anyway, this is a fun place to play. I'll try to get things in CVS as
soon as possible, and keep the list apprised of progress. If anyone
wants to jump in and help, just let me know.
PS. And if anyone wants to look at the gradient computation, we might
have lots of low-hanging fruit there too.
[*] By Alex "Tetromino" Rostovtsev. I found it as part of the KDE SVG
The SVG is available under the LGPL, and I've put it, and a cairo
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Size: 189 bytes
Desc: not available
Url : http://lists.freedesktop.org/archives/cairo/attachments/20050302/3964c800/attachment.pgp
More information about the cairo