[cairo] Survey of polygon rasterization techniques
David Turner
david at freetype.org
Tue Jul 31 12:53:45 PDT 2007
On Tue, 31 Jul 2007 15:19:05 -0400, "Jeff Muizelaar" <jeff at infidigm.net> said:
> Yeah, David wrote about this problem in his message about the FreeType
> rasterizer. However, that message doesn't seem to have hit any
> archives.. I've included it below.
>
the message didn't hit the list because I mistakenly sent it from my
personal e-mail address, so it's probably stuck in the moderation queue
or was discarded by the list processor. Thanks a lot for re-posting it
here, Jeff; I think it answers the original question pretty well
(i.e. for font rendering, glyphs are not supposed to be self-intersecting,
so we simply ignore the problem)
- David
> -Jeff
>
> On Tue, Jul 31, 2007 at 01:01:29PM +0200, David Turner wrote:
> > For your information, the FreeType anti-aliasing rasterizer has the
> > following characteristics:
> >
> > - computes the analytical area/coverage contribution of each
> > segment to each edge pixel. this is similar in essence to what
> > LibArt does, though the computations are done pretty differently.
> > Again, I would like to thank Raph Levien for inspiration.
> >
> > - the code used to record the contribution of each individual pixel
> > in a "cell record", then Q-Sort all of them before generating the
> > spans (like AGG and Qt 4 do).
> >
> > However, this has been changed to a scanline-bucket sorted list,
> > since this proved to be faster in many common cases (up to x1.25
> > last time I benchmarked it); but keep in mind this is with glyph
> > shapes, which are generally very complex but render to small
> > numbers of pixels. For a general-purpose graphics library, other
> > choices might be good as well.
> >
> > (OBNOTE: my initial implementation used Shell-sort instead of Q-Sort,
> > because the test sample fared better with it, but larger benchmarking
> > showed Q-Sort was better in the general case)
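> > (A rough sketch, in Python for illustration, of the two sorting
> > strategies described above; the cell layout (x, y, area, cover) and
> > the function names are mine, not FreeType's actual code:)

```python
from collections import defaultdict

# A cell record is (x, y, area, cover), one per pixel touched by an edge.
def sort_cells_global(cells):
    # Q-Sort approach: one big sort of every cell, keyed on (y, x).
    return sorted(cells, key=lambda c: (c[1], c[0]))

def sort_cells_bucketed(cells):
    # Scanline-bucket approach: group cells by scanline first, then
    # sort each (much smaller) bucket by x only.
    buckets = defaultdict(list)
    for c in cells:
        buckets[c[1]].append(c)
    out = []
    for y in sorted(buckets):
        out.extend(sorted(buckets[y], key=lambda c: c[0]))
    return out

# Both strategies produce the same span-generation order.
cells = [(5, 2, 10, 3), (1, 0, 7, 2), (3, 2, -4, -1), (0, 0, 5, 1)]
assert sort_cells_global(cells) == sort_cells_bucketed(cells)
```

> > (with many short scanlines, the bucketed variant sorts several tiny
> > lists instead of one big one, which is plausibly where the speed-up
> > comes from)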
> >
> > - the only things recorded are cell areas/coverage. there are no SVPs,
> > active edge lists or other auxiliary data like that.
> >
> > - doesn't perform any malloc-style allocation. Instead, you pass it a
> > block of memory (called the "render pool") and its size, and the
> > rasterizer does all the work within it.
> >
> > if the shape is too complex to be rendered directly within the render
> > pool, it is recursively sub-divided and the work is re-started. Note
> > however that even with a 16 KB render pool, finding shapes that create
> > this condition is pretty difficult (and they generally correspond to
> > shapes with lots of pixels, which means your bottleneck is likely going
> > to be painting, not rasterizing).
> >
> > in the common case, all cells are recorded and sorted in the render
> > pool before any span is sent to the painting layer. A typical CPU cache
> > loves this.
> >
> > - directly decomposes Bezier paths into line segments during the
> > rasterization process. this is also to reduce the amount of input
> > data to parse during the rasterization.
> >
> > the decomposition is made through a non-recursive de Casteljau
> > subdivision controlled by very simple heuristics, which makes it
> > very fast.
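> > (for illustration, here is a minimal non-recursive de Casteljau
> > flattening of a quadratic Bezier, using an explicit stack and a
> > simple midpoint-deviation flatness heuristic; the heuristic and
> > names are illustrative, not FreeType's actual code:)

```python
def flatten_quadratic(p0, p1, p2, tolerance=0.25):
    # Flatten a quadratic Bezier into a polyline without recursion,
    # using an explicit stack of sub-curves.
    result = [p0]
    stack = [(p0, p1, p2)]
    while stack:
        a, b, c = stack.pop()
        # Flatness heuristic: deviation of the control point from the
        # chord midpoint (a deliberately simple, cheap test).
        mx = (a[0] + c[0]) / 2 - b[0]
        my = (a[1] + c[1]) / 2 - b[1]
        if mx * mx + my * my <= tolerance * tolerance:
            result.append(c)  # flat enough: emit the chord endpoint
        else:
            # de Casteljau split at t = 1/2
            ab = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
            bc = ((b[0] + c[0]) / 2, (b[1] + c[1]) / 2)
            mid = ((ab[0] + bc[0]) / 2, (ab[1] + bc[1]) / 2)
            stack.append((mid, bc, c))  # right half, processed second
            stack.append((a, ab, mid))  # left half, processed first
    return result
```

> > (each split quarters the deviation, so the loop always terminates,
> > and points come out in curve order because the left half is always
> > popped first)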
> >
> >
> > Generally speaking, analytical algorithms are very fast and, in
> > my experience, *much* faster than multi-sampling as soon as you use more
> > than 4 samples / pixel. they have one drawback, though: they do not
> > deal well with self-intersecting polygons.
> >
> > To illustrate this, consider the following diagram describing a single
> > pixel cell that contains a self-intersecting trapezoid ABCD:
> >
> > A (0,0)    B (1,0)
> > +----------+
> >  \        /
> >   \      /
> >    \    /
> >     \  /
> >      \/
> >      /\
> >     /  \
> >    /    \
> >   /      \
> >  /        \
> > +----------+
> > C (0,1)    D (1,1)
> >
> > an analytical algorithm will always compute an area of 0 here, instead of 0.5.
> > I claim, without proof, that this 50% error is the worst case you can get
> > with an analytical algorithm. it is quite significant, though it happens
> > only in very rare cases.
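> > (to make the cancellation concrete, here is a small Python check;
> > the shoelace formula is used as a stand-in for the signed area an
> > analytical rasterizer accumulates:)

```python
def shoelace(poly):
    # Signed (net-winding) area: positive and negative lobes cancel,
    # which is exactly what happens to the analytical coverage above.
    s = 0.0
    for i in range(len(poly)):
        x0, y0 = poly[i]
        x1, y1 = poly[(i + 1) % len(poly)]
        s += x0 * y1 - x1 * y0
    return s / 2.0

A, B, C, D = (0, 0), (1, 0), (0, 1), (1, 1)
assert shoelace([A, B, C, D]) == 0.0  # net area of the crossed quad: zero

# The region actually covered is two triangles meeting at (0.5, 0.5):
halves = [[A, B, (0.5, 0.5)], [(0.5, 0.5), C, D]]
assert sum(abs(shoelace(t)) for t in halves) == 0.5
```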
> >
> > there are several ways to deal with this:
> >
> > A - get rid of all self-intersections in the input polygon before sending
> > it to the rasterizer. this is equivalent to performing tesselation.
> >
> > B - use some sort of super-sampling, e.g. 4 horizontal bands per pixel
> > cell, each with its own area/coverage value. the maximum error in
> > each band is then 50/4 = 12.5%, which is hardly perceptible in
> > real-life cases.
> >
> > C - use super-sampling or multi-sampling instead of analytical computations.
> >
> > D - ignore the problem, since most users will never notice it. this
> > is what is done in FreeType 2, because glyph shapes are not supposed to
> > self-intersect anyway (some broken fonts have them, but nobody
> > ever complained about their rendering).
> >
> > E - find a way, during rasterization, to quickly distinguish the pixels
> > containing self-intersections from the others, and deal with them
> > differently (e.g. with super-sampling). due to the rarity of these cases,
> > this should not be a performance problem for 99.9% of pixels, but the
> > "detection" code needs to be very fast to not slow down the rest of the
> > algorithm too much. false positives are allowed if they're still rare.
> >
> >
> > A is very slow (because tesselation costs a lot), but it is the most correct option.
> >
> > B has a definite performance impact compared to "naive" analytical computations.
> > last time I experimented with it, I found it was between x1.3 and x1.4 slower.
> > however, the error was really indistinguishable to the eye (a 12.5% gray instead
> > of a 50% one isn't too different in vector graphics).
> >
> > a 2 band/pixel approach would yield a 25% worst-case error, which may also sound
> > acceptable to a lot of people.
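> > (a numerical sketch of option B, under the assumption that net
> > signed coverage is accumulated per horizontal band and the non-zero
> > fill rule (absolute value) is applied per band afterwards; the
> > crossed shape is squeezed to height 0.75 so the crossing falls
> > strictly inside a band, and all names are made up for the example:)

```python
def bowtie_signed_width(y, h=0.75):
    # Self-intersecting quad A(0,0) B(1,0) C(0,h) D(1,h); the diagonal
    # edges B->C and D->A cross at (0.5, h/2). Returns the (signed,
    # unsigned) covered width at height y; the sign flips at the
    # crossing because the two lobes have opposite winding.
    if not 0.0 <= y <= h:
        return 0.0, 0.0
    w = abs((1 - y / h) - y / h)
    return (w if y < h / 2 else -w), w

def coverage_error(bands, samples=4000):
    # Accumulate net (signed) coverage per horizontal band, apply the
    # non-zero fill rule (abs) per band, and report the total deviation
    # from the true covered area of the pixel.
    err = 0.0
    for b in range(bands):
        signed = true = 0.0
        for i in range(samples):
            y = (b + (i + 0.5) / samples) / bands
            s, t = bowtie_signed_width(y)
            signed += s / (samples * bands)
            true += t / (samples * bands)
        err += abs(true - abs(signed))
    return err

# Whole pixel: the lobes cancel almost completely (error near 37.5%
# here); with 4 bands only the band containing the crossing cancels.
```

> > (in this sketch the whole-pixel error comes out around 0.375 while
> > the 4-band error drops to around 0.04, illustrating how banding
> > confines the cancellation to the band containing the crossing)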
> >
> > C in all my experiments, super-sampling and multi-sampling proved to be
> > significantly slower unless you sacrifice quality; I'm speaking of
> > slowdowns of at least 2x or 3x. I believe this is mainly due to
> > the following facts:
> >
> > - first, you need to do a *lot* more DDA steps per pixel (unless you
> > use a small number of sub-pixels, in which case the result looks bad)
> >
> > - bit-counting is slow, unless you have a specialized assembler
> > instruction to do it, and it must be done for each pixel. and
> > the more samples, the slower.
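> > (as a toy illustration of that per-pixel cost, with a 16-sample mask
> > per pixel; the mask layout is made up for the example:)

```python
def coverage_from_mask(mask, samples=16):
    # One bit per sub-pixel sample; the population count below is the
    # per-pixel bit-counting cost mentioned above (specialized popcount
    # instructions make exactly this operation cheap).
    return bin(mask).count("1") / samples

assert coverage_from_mask(0xFFFF) == 1.0   # all 16 samples inside
assert coverage_from_mask(0x00FF) == 0.5   # half the samples inside
```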
> >
> > OBNOTE: The FreeType 1 rasterizer has 2x2 and 4x4 super-sampling modes that
> > use pretty wicked tricks to speed things up and use a minimum amount
> > of memory. Only the 2x2 one is faster than FreeType 2's anti-aliasing
> > rasterizer, the 4x4 one is slower by a wide margin.
> >
> > however, I admit that ingenious implementations of super-sampling
> > might get good performance as well. Color me skeptical about the
> > claims made by the "edge-flag" paper, though.
> >
> > D well, I would not recommend it for Cairo, but it could be used as an interesting
> > "quick rendering" mode, for example when drawing animations where this
> > kind of worst case is totally insignificant...
> >
> > as for E, I've tried experimenting with it, but never found the time to actually
> > implement something very fast :-( I still think it's a promising idea though...
> >
> >
> > hope you'll find all of this interesting
> >
> > - David Turner
> > - The FreeType Project (www.freetype.org)
> >
> >
> _______________________________________________
> cairo mailing list
> cairo at cairographics.org
> http://lists.cairographics.org/mailman/listinfo/cairo