[Cairo] The complexity problem in the spline stroking algorithm

Tue Aug 26 08:15:30 PDT 2003

On Aug 25, Soorya Kuloor wrote:
 > I am interested in helping out with this O(n^2) problem. I have close to
 > a week scheduled for speeding up Cairo, especially the higher layers
 > (i.e. above libic). Can you please give me hints about how to speed the
 > code up? I have gone through the code and have a pretty good idea of the
 > code and control flow. However, I am not very clear on the math. Please
 > let me know if you have any other things that can be speeded up.

Obviously, before attempting any optimization work, it's important to
do profiling to determine where the hot spots are and to be sure that
there will be some actual payoff from the optimization, (there's no
getting around Amdahl's Law).

Owen did do some profiling for ellipse drawing at one point:

http://cairographics.org/pipermail/cairo/2003-May/000053.html

which showed that big speedups will come only from improving libic (or
RENDER) [*].

But, you said you're only interested in higher layers, and in Owen's
test, most of the time above libic, (though only 14% of the total
time), was spent in the spline stroking. This brings us back to my
O(n^2) comment. I mentioned it only because I felt as I was coding it
that it was sub-optimal. I don't actually know how much speedup
potential there is here. But it's perhaps an interesting problem
anyway.

I'll describe the problem below. I'll try to be clear, but it can be
hard to get the ideas across in a succinct (ha!) email.

The core of the spline algorithm is _cairo_pen_stroke_spline which
accepts a polygonal approximation of a pen, a spline to be stroked,
and a tolerance value. It returns a list of non-intersecting
trapezoids covering the desired shape of the stroked spline, (formed
by convolving the pen with the polygonal approximation of the spline).

The current implementation first computes the polygonal boundary of
the desired shape. Then this polygon is passed to
_cairo_traps_tessellate_polygon. That function is O(n^2), (or so), and
is the call we want to be able to remove.

The polygon is computed by two calls to _cairo_pen_stroke_spline_half,
(once forward and once backward). This function walks along the spline
doing slope comparisons to find which vertex of the pen is active,
(which point is outermost and is actually contributing to the boundary
of the final desired shape).

In order to improve the current implementation, it's significant that
the pen we use always has 180 degree radial symmetry. Because of this
the active pen vertex for the "backward" half is always directly
opposite the "forward" active vertex. So, as a first step, the
algorithm could be improved to generate the entire polygon in a single
pass.

I know how to do that much. The next part I haven't completely solved
yet. Here's the challenge:

Rather than generating the complete polygon and then tessellating the
whole thing into non-intersecting trapezoids, it could be more
efficient to directly generate the trapezoids during a single pass of
the spline.

It would be straightforward to generate trapezoids while walking the
spline. As the pen steps from one spline point to the next, a
trapezoid is formed by the translation of the two active pen
vertices. Then, when the active vertices rotate around a spline point,
triangles (degenerate trapezoids) are formed by the movement of the
outermost active pen vertex. That's fairly easy.

Unfortunately, trapezoids generated as described in the previous
paragraph will intersect each other. These intersections would lead to
incorrect results if the trapezoids were composited together.
Eliminating the intersections is the hard part. The intersections can
be global in nature, (such as when a spline forms a loop and crosses
over itself), or they can be local, (at every vertex in the polygonal
approximation to the spline, the incoming trapezoid and the outgoing
trapezoid overlap slightly).

For the global intersections, it may be that the best we can do is
something like the current implementation. But I feel like there must
be a faster way to fix the local intersections. I just haven't figured
that part out yet. It may be that we do some analysis of the spline
and only optimize those that we can guarantee have no global
intersections.

Hopefully that made some sense to somebody. If so, any solutions would
be welcome.

-Carl

[*] Owen correctly stated that improving the performance of libic
probably blocks on the rewrite of IcRasterizeTrapezoid. That rewrite
hasn't been done in C yet, (though Keith does have some prototype code
in nickle already). Volunteers to help are welcome as always.