[cairo] GSoC: Scan converting rasteriser update
M Joonas Pihlaja
jpihlaja at cc.helsinki.fi
Thu Aug 28 16:29:07 PDT 2008
Hey cairo-l,
I've committed and pushed the new scan converter into my spans
branch. What a relief!
Executive summary of the results: 99 scans out of 100 enjoyed the
speed boost and would convert again! The main remaining holdouts
are rectilinear polygons.
Last I updated on this I was writing the scan converter of Doom
including all the sexy lookup tables and sparse subsamplings and
things for rasterising edges. To cut a long story of an intense
two weeks short, the code and I agreed to go our separate ways.
If you're interested in the messy details, some of the results
are visible in the final commit of the spans-ugly-and-broken
branch in http://gitweb.freedesktop.org/?p=users/joonas/cairo.
Before that I had added code to make strokes and clips use the
scan converter when possible, so as far as all the internal
rendering goes it's all spanned up now. (From the status blog
you might be under the impression that only cairo_fill() still
did spans. Not so, I'm just lazy and need to update it.)
So anyway, after the GSoC final deadline passed and I'd recovered
some, I went back to the original plan of taking the Rasterizer
shootout code and incorporating it into cairo. I just posted the
results of that work in the "Survey of polygon rasterization
techniques" thread yesterday, and this is the follow up about its
effects on cairo.
I can now freely admit that one reason I posted the results in
the "Survey of [...]" thread was to show that the new rasteriser
is in the same league as the original one after I touched it --
indeed, it even has a slightly improved standing there. Just
wanted to mention this because it makes some of the massive
slowdowns below more palatable and because optimised code seems
to be having an adverse reaction to me lately. :)
The good news is that the new rasteriser can speed up your fills
and strokes by up to 2x! This is true especially for things like
complex strokes and paths to fill. The bad news alluded to
earlier is that rectilinear polygons aren't so keen on it -- a
10x slowdown for stroking pixel aligned polygons -- but this
isn't as bad as you might think. Mainly it just means reworking
some of the logic that decides on using traps or spans for
rendering a given path.
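
To make that traps-or-spans decision concrete, here is a minimal
sketch of one way it could look. This is not the code in the spans
branch: point_t, renderer_t, polygon_is_rectilinear() and
choose_renderer() are hypothetical names made up for illustration,
and the rule (send anything rectilinear to traps, everything else
to spans) is only an assumption based on the results below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration; not cairo's internal API. */
typedef struct { double x, y; } point_t;

typedef enum { RENDER_WITH_TRAPS, RENDER_WITH_SPANS } renderer_t;

/* Returns true when every edge of the closed polygon pts[0..n-1]
 * is horizontal or vertical.  Polygons with fewer than three
 * vertices are treated as not rectilinear. */
bool
polygon_is_rectilinear (const point_t *pts, size_t n)
{
    assert (pts != NULL || n == 0);

    if (n < 3)
        return false;

    for (size_t i = 0; i < n; i++) {
        const point_t *a = &pts[i];
        const point_t *b = &pts[(i + 1) % n]; /* wrap to close the polygon */

        /* An edge that changes both x and y is diagonal. */
        if (a->x != b->x && a->y != b->y)
            return false;
    }
    return true;
}

/* Assumed policy: rectilinear polygons go to the trapezoid path,
 * everything else goes to the scan converter. */
renderer_t
choose_renderer (const point_t *pts, size_t n)
{
    return polygon_is_rectilinear (pts, n) ? RENDER_WITH_TRAPS
                                           : RENDER_WITH_SPANS;
}
```

With this sketch a box outline would be handed to the trap code
and a triangle to the scan converter. A real version would also
have to look at curves, the stroke style and the clip.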
A full perf-diff is available at
http://people.freedesktop.org/~joonas/perf-diffs/1.6.4_vs_spans-tor.perf-diff
The salient bits are below. These were run on a 1.6 GHz Athlon64
in all the good benchmarking conditions.
old: 1.6.4
new: spans-tor
Speedups
========
image-rgba  stroke_solid_rgb_over-64    0.57 0.46% ->   0.27 0.06%: 2.13x speedup
****
[snip more stroking]
image-rgba  world_map-800             232.51 0.09% -> 122.60 0.12%: 1.90x speedup
***
image-rgba  zrusin_another_fill-415     9.55 0.60% ->   5.25 0.53%: 1.80x speedup
***
[snip even more stroking and some filling]
image-rgba  mosaic_fill_curves-800    126.91 0.14% ->  99.11 0.11%: 1.28x speedup
*
[snip many more]
Slowdowns
=========
image-rgba  box-outline-stroke-100      0.01 0.47% ->   0.07 0.04%: 9.79x slowdown
***********************************
image-rgba  long-dashed-lines-512      46.87 0.96% -> 170.73 0.30%: 3.67x slowdown
**********
image-rgba  unaligned_clip-100          0.03 0.35% ->   0.07 1.31%: 2.37x slowdown
*****
image-rgb   paint_solid_rgb_over-256    0.06 1.23% ->   0.07 0.85%: 1.33x slowdown
*
[snip a few more]
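
As a rough guide to reading the factors above: each one is the
ratio of the old and new times, flipped so that it is always at
least 1. The sketch below assumes that reading; perf_change_t and
perf_change() are hypothetical names, not part of cairo's perf
tools. The times printed above are rounded, so recomputing from
them can differ from the printed factor in the last digits (for
long-dashed-lines-512 it gives about 3.64 rather than 3.67).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type for illustration. */
typedef struct {
    double factor;      /* always >= 1.0 */
    bool   is_speedup;  /* true when the new time is not slower */
} perf_change_t;

/* Compare an old and a new timing for the same test.  Both times
 * must be positive and in the same unit. */
perf_change_t
perf_change (double old_time, double new_time)
{
    perf_change_t change;

    assert (old_time > 0.0 && new_time > 0.0);

    change.is_speedup = new_time <= old_time;
    change.factor = change.is_speedup ? old_time / new_time
                                      : new_time / old_time;
    return change;
}
```

For example, perf_change (232.51, 122.60) reports a speedup with
a factor of roughly 1.90, in line with the world_map-800 entry.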
The massive slowdown in box-outline-stroke is exactly due to
using the new rasteriser for *stroking* rectilinear paths. By
comparison, box-outline-fill doesn't make an appearance here
because it's hitting the region filling code path correctly.
The unaligned_clip case is slower because the current code
doesn't know to defer _unaligned_ rectilinear things to the trap
code. That the trap code itself is faster for this case is
likely due to overheads coming from rasterising with a general
scan converter vs. one that does just a left and right edge at a
time like fbRasterizeEdges. I'm not sure what's going on with
paint_solid_rgb[a]_over yet.
So yeah, please check out the code and come complain to me. The
code is available, as always, in the spans branch of
http://gitweb.freedesktop.org/?p=users/joonas/cairo
Once the rectilinear regressions are sorted, I'd say that the
code is ready for wider testing in real apps as well. (But don't
let that stop you from testing it already today. There may still
be bugs in this. :))
Cheers,
Joonas