[cairo] Survey of polygon rasterization techniques
David Turner
david at freetype.org
Tue Jul 31 12:53:45 PDT 2007
On Tue, 31 Jul 2007 15:19:05 -0400, "Jeff Muizelaar" <jeff at infidigm.net> said:
> Yeah, David wrote about this problem in his message about the FreeType
> rasterizer. However, that message doesn't seem to have hit any
> archives.. I've included it below.
>
the message didn't hit the list because I mistakenly sent it from my
personal e-mail address, so it's probably stuck in the moderation queue
or was discarded by the list processor. Thanks a lot for re-posting it
here, Jeff; I think it answers the original question pretty well
(i.e. for font rendering, glyphs are not supposed to be self-intersecting,
so we simply ignore the problem)
- David
> -Jeff
>
> On Tue, Jul 31, 2007 at 01:01:29PM +0200, David Turner wrote:
> > For your information, the FreeType anti-aliasing rasterizer has the
> > following characteristics:
> >
> > - computes the analytical area/coverage contribution of each
> > segment to each edge pixel. this is similar in essence to what
> > LibArt does, though the computations are done pretty differently.
> > Again, I would like to thank Raph Levien for inspiration.
> >
> > - the code used to record the contribution of each individual pixel
> > in a "cell record", then Q-Sort all of them before generating the
> > spans (like AGG and Qt 4 do).
> >
> > However, this has been changed to a scanline-bucket sorted list,
> > since this proved to be faster in many common cases (up to x1.25
> > last time I benchmarked it); but keep in mind this is with glyph
> > shapes, which are generally very complex but render to small
> > numbers of pixels. For a general-purpose graphics library, other
> > choices might be good as well.
> >
> > (OBNOTE: my initial implementation used Shell-sort instead of Q-Sort,
> > because the test sample fared better with it, but larger benchmarking
> > showed Q-Sort was better in the general case)
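> > (A rough sketch, in Python for illustration, of the two sorting
> > strategies described above; the cell layout (x, y, area, cover) and
> > the function names are mine, not FreeType's actual code:)

```python
from collections import defaultdict

# A cell record is (x, y, area, cover), one per pixel touched by an edge.
def sort_cells_global(cells):
    # Q-Sort approach: one big sort of every cell, keyed on (y, x).
    return sorted(cells, key=lambda c: (c[1], c[0]))

def sort_cells_bucketed(cells):
    # Scanline-bucket approach: group cells by scanline first, then
    # sort each (much smaller) bucket by x only.
    buckets = defaultdict(list)
    for c in cells:
        buckets[c[1]].append(c)
    out = []
    for y in sorted(buckets):
        out.extend(sorted(buckets[y], key=lambda c: c[0]))
    return out

# Both strategies produce the same span-generation order.
cells = [(5, 2, 10, 3), (1, 0, 7, 2), (3, 2, -4, -1), (0, 0, 5, 1)]
assert sort_cells_global(cells) == sort_cells_bucketed(cells)
```

> > (with many short scanlines, the bucketed variant sorts several tiny
> > lists instead of one big one, which is plausibly where the speed-up
> > comes from)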
> >
> > - the only things recorded are cell areas/coverage. there are no SVPs,
> > active edge lists or other auxiliary data like that.
> >
> > - doesn't perform any malloc-style allocation. Instead, you pass it a
> > block of memory (called the "render pool") and its size, and the
> > rasterizer does all the work within it.
> >
> > if the shape is too complex to be rendered directly within the render
> > pool, it is recursively sub-divided and the work is re-started. Note
> > however that even with a 16 KB render pool, finding shapes that create
> > this condition is pretty difficult (and they generally correspond to
> > shapes with lots of pixels, which means your bottleneck is likely going
> > to be painting, not rasterizing).
> >
> > in the common case, all cells are recorded and sorted in the render
> > pool before any span is sent to the painting layer. A typical CPU cache
> > loves this.
> >
> > - directly decomposes Bezier paths into line segments during the
> > rasterization process. this is also to reduce the amount of input
> > data to parse during the rasterization.
> >
> > the decomposition is made through a non-recursive de Casteljau
> > subdivision controlled by very simple heuristics, which makes it
> > very fast.
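> > (for illustration, here is a minimal non-recursive de Casteljau
> > flattening of a quadratic Bezier, using an explicit stack and a
> > simple midpoint-deviation flatness heuristic; the heuristic and
> > names are illustrative, not FreeType's actual code:)

```python
def flatten_quadratic(p0, p1, p2, tolerance=0.25):
    # Flatten a quadratic Bezier into a polyline without recursion,
    # using an explicit stack of sub-curves.
    result = [p0]
    stack = [(p0, p1, p2)]
    while stack:
        a, b, c = stack.pop()
        # Flatness heuristic: deviation of the control point from the
        # chord midpoint (a deliberately simple, cheap test).
        mx = (a[0] + c[0]) / 2 - b[0]
        my = (a[1] + c[1]) / 2 - b[1]
        if mx * mx + my * my <= tolerance * tolerance:
            result.append(c)  # flat enough: emit the chord endpoint
        else:
            # de Casteljau split at t = 1/2
            ab = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
            bc = ((b[0] + c[0]) / 2, (b[1] + c[1]) / 2)
            mid = ((ab[0] + bc[0]) / 2, (ab[1] + bc[1]) / 2)
            stack.append((mid, bc, c))  # right half, processed second
            stack.append((a, ab, mid))  # left half, processed first
    return result
```

> > (each split quarters the deviation, so the loop always terminates,
> > and points come out in curve order because the left half is always
> > popped first)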
> >
> >
> > Generally speaking, analytical algorithms are very fast and, in
> > my experience, *much* faster than multi-sampling as soon as you use more
> > than 4 samples / pixel. they have one drawback, though: they do not
> > deal well with self-intersecting polygons.
> >
> > To illustrate this, consider the following diagram describing a single
> > pixel cell that contains a self-intersecting trapezoid ABCD:
> >
> > A (0,0)    B (1,0)
> > +----------+
> >  \        /
> >   \      /
> >    \    /
> >     \  /
> >      \/
> >      /\
> >     /  \
> >    /    \
> >   /      \
> >  /        \
> > +----------+
> > C (0,1)    D (1,1)
> >
> > an analytical algorithm will always compute an area of 0 here, instead of 0.5.
> > I claim, without proof, that this 50% error is the worst case you can get
> > with an analytical algorithm. it is quite significant, though it happens
> > only in very rare cases.
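> > (to make the cancellation concrete, here is a small Python check;
> > the shoelace formula is used as a stand-in for the signed area an
> > analytical rasterizer accumulates:)

```python
def shoelace(poly):
    # Signed (net-winding) area: positive and negative lobes cancel,
    # which is exactly what happens to the analytical coverage above.
    s = 0.0
    for i in range(len(poly)):
        x0, y0 = poly[i]
        x1, y1 = poly[(i + 1) % len(poly)]
        s += x0 * y1 - x1 * y0
    return s / 2.0

A, B, C, D = (0, 0), (1, 0), (0, 1), (1, 1)
assert shoelace([A, B, C, D]) == 0.0  # net area of the crossed quad: zero

# The region actually covered is two triangles meeting at (0.5, 0.5):
halves = [[A, B, (0.5, 0.5)], [(0.5, 0.5), C, D]]
assert sum(abs(shoelace(t)) for t in halves) == 0.5
```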
> >
> > there are several ways to deal with this:
> >
> > A - get rid of all self-intersections in the input polygon before sending
> > it to the rasterizer. this is equivalent to performing tesselation.
> >
> > B - use some sort of super-sampling, e.g. 4 horizontal bands per pixel
> > cell, each with its own area/coverage value. the maximum error in
> > each band is then 50/4 = 12.5%, which is hardly perceptible in
> > real-life cases.
> >
> > C - use super-sampling or multi-sampling instead of analytical computations.
> >
> > D - ignore the problem, since most users will never notice it. this
> > is what is done in FreeType 2, because glyph shapes are not supposed to
> > self-intersect anyway (some broken fonts have them, but nobody
> > ever complained about their rendering).
> >
> > E - find a way, during rasterization, to quickly distinguish the pixels
> > containing self-intersections from the others, and deal with them
> > differently (e.g. with super-sampling). due to the rarity of these cases,
> > this should not be a performance problem for 99.9% of pixels, but the
> > "detection" code needs to be very fast to not slow down the rest of the
> > algorithm too much. false positives are allowed if they're still rare.
> >
> >
> > A is very slow (because tesselation costs a lot), but it is the most correct option.
> >
> > B has a definite performance impact compared to "naive" analytical computations.
> > last time I experimented with it, I found it was between x1.3 and x1.4 slower.
> > however, the error was really indistinguishable to the eye (a 12.5% gray instead
> > of a 50% one isn't too different in vector graphics).
> >
> > a 2 band/pixel approach would yield a 25% worst-case error, which may also sound
> > acceptable to a lot of people.
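> > (a numerical sketch of option B, under the assumption that net
> > signed coverage is accumulated per horizontal band and the non-zero
> > fill rule (absolute value) is applied per band afterwards; the
> > crossed shape is squeezed to height 0.75 so the crossing falls
> > strictly inside a band, and all names are made up for the example:)

```python
def bowtie_signed_width(y, h=0.75):
    # Self-intersecting quad A(0,0) B(1,0) C(0,h) D(1,h); the diagonal
    # edges B->C and D->A cross at (0.5, h/2). Returns the (signed,
    # unsigned) covered width at height y; the sign flips at the
    # crossing because the two lobes have opposite winding.
    if not 0.0 <= y <= h:
        return 0.0, 0.0
    w = abs((1 - y / h) - y / h)
    return (w if y < h / 2 else -w), w

def coverage_error(bands, samples=4000):
    # Accumulate net (signed) coverage per horizontal band, apply the
    # non-zero fill rule (abs) per band, and report the total deviation
    # from the true covered area of the pixel.
    err = 0.0
    for b in range(bands):
        signed = true = 0.0
        for i in range(samples):
            y = (b + (i + 0.5) / samples) / bands
            s, t = bowtie_signed_width(y)
            signed += s / (samples * bands)
            true += t / (samples * bands)
        err += abs(true - abs(signed))
    return err

# Whole pixel: the lobes cancel almost completely (error near 37.5%
# here); with 4 bands only the band containing the crossing cancels.
```

> > (in this sketch the whole-pixel error comes out around 0.375 while
> > the 4-band error drops to around 0.04, illustrating how banding
> > confines the cancellation to the band containing the crossing)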
> >
> > C in all my experiments, super-sampling and multi-sampling proved to be
> > significantly slower unless you sacrifice quality; I'm speaking of
> > slowdowns of at least 2x or 3x. I believe this is mainly due to
> > the following facts:
> >
> > - first, you need to do a *lot* more DDA steps per pixel (unless you
> > use a small number of sub-pixels, in which case the result looks bad)
> >
> > - bit-counting is slow, unless you have a specialized assembler
> > instruction to do it, and it must be done for each pixel. and
> > the more samples, the slower.
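> > (as a toy illustration of that per-pixel cost, with a 16-sample mask
> > per pixel; the mask layout is made up for the example:)

```python
def coverage_from_mask(mask, samples=16):
    # One bit per sub-pixel sample; the population count below is the
    # per-pixel bit-counting cost mentioned above (specialized popcount
    # instructions make exactly this operation cheap).
    return bin(mask).count("1") / samples

assert coverage_from_mask(0xFFFF) == 1.0   # all 16 samples inside
assert coverage_from_mask(0x00FF) == 0.5   # half the samples inside
```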
> >
> > OBNOTE: The FreeType 1 rasterizer has 2x2 and 4x4 super-sampling modes that
> > use pretty wicked tricks to speed things up and use a minimum amount
> > of memory. Only the 2x2 one is faster than FreeType 2's anti-aliasing
> > rasterizer, the 4x4 one is slower by a wide margin.
> >
> > however, I admit that ingenious implementations of super-sampling
> > might get good performance as well. Color me skeptical about the
> > claims made by the "edge-flag" paper, though.
> >
> > D well, I would not recommend it for Cairo, but it could be used as an interesting
> > "quick rendering" mode, for example when drawing animations where this
> > kind of worst case is totally insignificant...
> >
> > as for E, I've tried experimenting with it, but never found the time to actually
> > implement something very fast :-( I still think it's a promising idea though...
> >
> >
> > hope you'll find all of this interesting
> >
> > - David Turner
> > - The FreeType Project (www.freetype.org)
> >
> >
> _______________________________________________
> cairo mailing list
> cairo at cairographics.org
> http://lists.cairographics.org/mailman/listinfo/cairo