[cairo] Fwd: RE: [ft-devel] latest patch file for spline flattening

Behdad Esfahbod behdad at behdad.org
Tue Sep 7 09:29:56 PDT 2010


Again, FYI.  Would be cool if someone (David?) can look into improving cairo's
de Casteljau.

behdad

-------- Original Message --------
Subject: RE: [ft-devel] latest patch file for spline flattening
Date: Tue, 7 Sep 2010 11:26:24 -0400
From: David Bevan <david.bevan at pb.com>
To: 'Graham Asher' <graham.asher at btinternet.com>, 'freetype-devel'
<freetype-devel at nongnu.org>


I've now implemented something based on Hain's research and it seems to be
measurably faster than previous FT approaches. I have used Hain's paper (now
available from http://tinyurl.com/HainBez) to provide some sensible heuristics
rather than implement all his stuff in detail, and done so without using
square roots or even any divisions.

First, here are the trace results:

                                 OLD     NEW     HAIN
average line segs per arc        13.5     11.3    2.1
min line segs per arc             2        1      1
max line segs per arc            32      133     16

average deviation per line seg    0.29     0.44   6.5
min deviation per line seg        0        0      0
max deviation per line seg       22.2     15.8   15.7

By using reasonably accurate heuristics when deciding whether to split the
curve, we create 5.5 x fewer line segments. This cuts down the number of calls
to split_cubic and the number of iterations within render_cubic.

And now the performance results:

In gray_convert_glyph, the time is distributed as follows:

                  OLD    NEW    HAIN
render_line       20%    15%     12%
render_cubic      15%    33%      9%
render_scanline   14%    10%     10%
split_cubic        6%     9%      2%

The time spent in these functions has been significantly reduced as a fraction
of processing time.

Including children, we have the following actual times per call for handling
cubic curves:

                  OLD    NEW    HAIN
render_cubic      142us  220us  61us

render_cubic is now more than twice as fast as it ever has been.

The effect of the speed-up is even measurable as a 5-10% speed-up of my font
rasterisation program (which is reading and writing data on top of using FT to
do the actual rendering).


These tests are with the same Unicode font as before. I'll run some more test
with Latin-only fonts, though previous testing didn't show any significant
performance differences between Latin and CJK. CJK glyphs just have more cubic
Bezier curves on average, but a Bezier curve is a Bezier curve wherever it
comes from.


The code is below. I hope I've tried to follow Werner's coding standards as
far as I know what they are.

Thanks.

David %^>



  static void
  gray_render_cubic( RAS_ARG_ const FT_Vector*  control1,
                              const FT_Vector*  control2,
                              const FT_Vector*  to )
  {
    FT_Vector*  arc;


    arc      = ras.bez_stack;
    arc[0].x = UPSCALE( to->x );
    arc[0].y = UPSCALE( to->y );
    arc[1].x = UPSCALE( control2->x );
    arc[1].y = UPSCALE( control2->y );
    arc[2].x = UPSCALE( control1->x );
    arc[2].y = UPSCALE( control1->y );
    arc[3].x = ras.x;
    arc[3].y = ras.y;

    for (;;)
    {
    /* Check that the arc crosses the current band. */
    TPos  min, max, y;


    min = max = arc[0].y;
    y = arc[1].y;
    if ( y < min ) min = y;
    if ( y > max ) max = y;
    y = arc[2].y;
    if ( y < min ) min = y;
    if ( y > max ) max = y;
    y = arc[3].y;
    if ( y < min ) min = y;
    if ( y > max ) max = y;
    if ( TRUNC( min ) >= ras.max_ey || TRUNC( max ) < 0 )
      goto Draw;

    /* Decide whether to split or draw */
    /* See Hain's paper at http://tinyurl.com/HainBez for more info */
    {
       TPos  dx, dy, L, dx1, dy1, dx2, dy2, s1, s2;


       /* dx and dy are x- and y- components of the P0-P3 chord vector */
       dx = arc[3].x - arc[0].x;
       dy = arc[3].y - arc[0].y;

       /* L is an (under)estimate of the Euclidean distance P0-P3 */
       L = (  236 * FT_MAX(labs(dx), labs(dy))
            +  97 * FT_MIN(labs(dx), labs(dy))) >> 8;

       /* avoid possible arithmetic overflow below by splitting */
       if (L > 32767)
          goto Split;

       /* s1 is L * the perpendicular distance from P1 to the line P0-P3 */
       s1 = labs(  dy * (dx1 = arc[1].x - arc[0].x)
                 - dx * (dy1 = arc[1].y - arc[0].y));

       /* max deviation is at least (s1 / L) * sqrt(3)/6 (if v = -1) */
       if (s1 > L * (TPos)(FT_MAX_CURVE_DEVIATION / 0.288675))
          goto Split;

       /* s2 is L * the perpendicular distance from P2 to the line P0-P3 */
       s2 = labs(  dy * (dx2 = arc[2].x - arc[0].x)
                 - dx * (dy2 = arc[2].y - arc[0].y));

       /* max deviation may be as much as (max(s1,s2)/L) * 3/4 (if v = 1) */
       if (FT_MAX(s1, s2) > L * (TPos)(FT_MAX_CURVE_DEVIATION / 0.75))
          goto Split;

       /* if P1 or P2 is outside P0-P3, split */
       if (   dy * dy1 + dx * dx1 < 0
           || dy * dy2 + dx * dx2 < 0
           || dy * (arc[3].y - arc[1].y) + dx * (arc[3].x - arc[1].x) < 0
           || dy * (arc[3].y - arc[2].y) + dx * (arc[3].x - arc[2].x) < 0
          )
          goto Split;

       /* no reason to split */
       goto Draw;
    }

    Split:

    gray_split_cubic( arc );
    arc += 3;
    continue;

    Draw:

    gray_render_line( RAS_VAR_ arc[0].x, arc[0].y );

    if (arc == ras.bez_stack)
      return;

    arc -= 3;
    }
  }





_______________________________________________
Freetype-devel mailing list
Freetype-devel at nongnu.org
http://lists.nongnu.org/mailman/listinfo/freetype-devel



More information about the cairo mailing list