[cairo] OVER / SOURCE optimization for cairo_paint
Antoine Azar
cairo at antoineazar.com
Sat Feb 16 00:29:56 PST 2008
[ I hope this message doesn't land 3 times on the mailing list. For
some reason the first two attempts didn't go through ]
Hey all,
I've got a working optimization that speeds up paint operations for
opaque sources. The optimization replaces an OVER operator by a
SOURCE operator, and in the case of EXTEND_NONE, creates a path
around the source's extents and calls a fill operation instead. The
speedups can be huge, especially when copying a small
surface onto a much larger one.
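
To make the reasoning concrete, here is a minimal standalone sketch (not cairo code) of why the substitution is safe for opaque sources. It assumes premultiplied 8-bit ARGB pixels; pixel_t, composite_over and the other names are hypothetical helpers invented for this illustration. With a source alpha of 255, the destination term of OVER is multiplied by zero, so the result is exactly the source pixel.

```c
#include <stdint.h>

/* One premultiplied-alpha ARGB pixel, 8 bits per channel (hypothetical type). */
typedef struct { uint8_t a, r, g, b; } pixel_t;

/* Multiply two 8-bit values interpreted as fractions of 255, with rounding. */
static uint8_t
mul_div_255 (uint8_t x, uint8_t y)
{
    unsigned t = (unsigned) x * y + 128;
    return (uint8_t) ((t + (t >> 8)) >> 8);
}

/* Porter-Duff OVER on premultiplied pixels: result = src + dst * (1 - src.a) */
static pixel_t
composite_over (pixel_t src, pixel_t dst)
{
    uint8_t inv = (uint8_t) (255 - src.a);
    pixel_t out;

    out.a = (uint8_t) (src.a + mul_div_255 (dst.a, inv));
    out.r = (uint8_t) (src.r + mul_div_255 (dst.r, inv));
    out.g = (uint8_t) (src.g + mul_div_255 (dst.g, inv));
    out.b = (uint8_t) (src.b + mul_div_255 (dst.b, inv));
    return out;
}

/* SOURCE: the destination is ignored and simply replaced. */
static pixel_t
composite_source (pixel_t src, pixel_t dst)
{
    (void) dst;
    return src;
}

static int
pixel_equal (pixel_t p, pixel_t q)
{
    return p.a == q.a && p.r == q.r && p.g == q.g && p.b == q.b;
}
```

For any src with a == 255, composite_over and composite_source return the same pixel; for a translucent src they generally differ, which is why the check for an opaque source comes first.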
I'd like feedback from people with more experience with the inner
workings of Cairo. I've tried to make sense of the different matrices
stored and the logic behind the coord transforms, but I'm sure
there's a better way to do this. I ran the most relevant tests and
they all still pass (some were already failing before this
optimization; is that normal?).
The optimization currently sits directly in _cairo_gstate_paint, but I
think it would be a good idea to add a layer of functions in front of
the gstate calls that determines which call is really best for the
current gstate and converts the operation if necessary.
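
As a rough sketch of what such a layer might look like, the decision could be isolated in a small pure function that the gstate calls consult. Everything below is hypothetical: the enums are simplified stand-ins for cairo's operator, pattern-type and extend enums, and choose_paint_plan is not an existing cairo function. It only mirrors the decisions made in the attached _cairo_gstate_paint.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for cairo's enums. */
typedef enum { OP_OVER, OP_SOURCE, OP_OTHER } paint_op_t;
typedef enum { SRC_SOLID, SRC_GRADIENT, SRC_SURFACE } source_kind_t;
typedef enum { EXT_NONE, EXT_REPEAT } extend_mode_t;

/* What the caller should do once the gstate has been inspected. */
typedef enum {
    PLAN_PAINT_UNCHANGED,      /* call paint with the original operator */
    PLAN_PAINT_WITH_SOURCE,    /* call paint, but with SOURCE */
    PLAN_FILL_SOURCE_EXTENTS   /* fill a rectangle around the source */
} paint_plan_t;

static paint_plan_t
choose_paint_plan (paint_op_t    op,
                   bool          source_is_opaque,
                   source_kind_t kind,
                   extend_mode_t extend)
{
    /* Only OVER and SOURCE with an opaque source are candidates. */
    if (! source_is_opaque || (op != OP_OVER && op != OP_SOURCE))
        return PLAN_PAINT_UNCHANGED;

    /* A surface with EXTEND_NONE only covers its own extents. */
    if (kind == SRC_SURFACE && extend == EXT_NONE)
        return PLAN_FILL_SOURCE_EXTENTS;

    /* Solids, gradients and extended surfaces cover the whole destination. */
    return PLAN_PAINT_WITH_SOURCE;
}
```

Keeping the decision separate from the drawing code would let the same check be reused by other entry points without duplicating the switch.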
I'm attaching my optimized _cairo_gstate_paint function. I'm also
attaching a summary of the speedups measured with perf (using only the
cases affected by the optimization). The best speedup is 11X, many are
3X-5X. The solid/gradient cases using SOURCE should stay at 1X as they're
not affected. There's one test at 0.68X but I think it's an anomaly
as it had a 20% standard deviation.
Looking forward to your feedback.
Thanks,
Antoine
At 05:31 PM 2/7/2008, Carl Worth wrote:
>On Thu, 07 Feb 2008 16:51:03 -0500, Antoine Azar wrote:
> > >Uhm... how exactly are you determining what sources are classified
> > >as opaque here?
> >
> > There is a _cairo_pattern_is_opaque convenience function in
> > cairo-pattern.c.
>
>Right, OK. I had forgotten this function existed. This looks just
>fine.
>
>What I was wondering is whether the decision of "is_opaque"
>considered the extend mode at all, (and it does not).
>
> > There is a caveat listed concerning patterns and
> > gradients:
>...
> > If I understand this correctly a "deep channel"
>
>No need to worry about that now. That's just a future-proofing warning
>that if we add such a backend in the future we'll have to tweak
>CAIRO_COLOR_IS_OPAQUE.
>
>That's really just one of many things that will have to change for
>such a backend to be added in the future. And for you, for now, it's
>nothing you need to worry about.
>
> > You're absolutely right. I'll look more into this.
>
>Good luck!
>
>The easiest thing for now would be to just not do the OVER->SOURCE
>optimization in the face of EXTEND_NONE. Doing better than that will
>require reasoning about the extents of any given operation. I don't
>think we currently compute those extents in advance in any convenient
>form, (but see cairo-analysis-surface.c for code that *does* do all of
>the necessary extents computation).
>
>If we could compute all of that without doing any additional work,
>then it might be worth it, (and we'd likely end up wanting to pass
>such details down to the backends as well).
>
>-Carl
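
The EXTEND_NONE concern discussed above can be illustrated with a small standalone sketch (not cairo code). It models one row of 32-bit ARGB pixels and an opaque source that is narrower than the destination; all function names are hypothetical. Outside its extents an EXTEND_NONE source samples as transparent, so an unrestricted SOURCE paint clears those pixels, whereas OVER leaves them alone. Restricting SOURCE to the source's extents gives the same result as OVER.

```c
#include <stddef.h>
#include <stdint.h>

#define PIXEL_TRANSPARENT 0x00000000u

/* Sample a 1-D source placed at src_x with EXTEND_NONE semantics:
 * transparent outside the source's extents. */
static uint32_t
sample_extend_none (const uint32_t *src, long src_len, long src_x, long x)
{
    if (x < src_x || x >= src_x + src_len)
        return PIXEL_TRANSPARENT;
    return src[x - src_x];
}

/* OVER for the only two cases that occur in this sketch: a fully opaque
 * source pixel replaces dst, anything else (here: fully transparent)
 * leaves dst untouched. */
static uint32_t
over_opaque_or_clear (uint32_t s, uint32_t d)
{
    return (s >> 24) == 0xffu ? s : d;
}

/* Paint with OVER across the whole destination row. */
static void
paint_over (uint32_t *dst, long dst_len,
            const uint32_t *src, long src_len, long src_x)
{
    long x;
    for (x = 0; x < dst_len; x++)
        dst[x] = over_opaque_or_clear (
            sample_extend_none (src, src_len, src_x, x), dst[x]);
}

/* Paint with SOURCE across the whole destination row. */
static void
paint_source (uint32_t *dst, long dst_len,
              const uint32_t *src, long src_len, long src_x)
{
    long x;
    for (x = 0; x < dst_len; x++)
        dst[x] = sample_extend_none (src, src_len, src_x, x);
}

/* SOURCE restricted to the source's extents, clipped to the row. */
static void
fill_source_in_extents (uint32_t *dst, long dst_len,
                        const uint32_t *src, long src_len, long src_x)
{
    long start = src_x < 0 ? 0 : src_x;
    long end = src_x + src_len > dst_len ? dst_len : src_x + src_len;
    long x;
    for (x = start; x < end; x++)
        dst[x] = src[x - src_x];
}

static int
rows_equal (const uint32_t *a, const uint32_t *b, long len)
{
    long i;
    for (i = 0; i < len; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

The restricted fill also touches far fewer pixels than a full paint when the source is small, which is where the large speedups come from.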
-------------- next part --------------
cairo_status_t
_cairo_gstate_paint (cairo_gstate_t *gstate)
{
    cairo_status_t status;
    cairo_pattern_union_t pattern;
    cairo_operator_t op = gstate->op;
    /* Declarations used only by the optimization below. */
    cairo_rectangle_int_t extents;
    cairo_fixed_t x_fixed, y_fixed;
    cairo_fixed_t dx_fixed, dy_fixed;
    double x1, y1, x2, y2;
    cairo_path_fixed_t *path = NULL;
    cairo_matrix_t matrix_invert;

    if (gstate->source->status)
        return gstate->source->status;

    status = _cairo_surface_set_clip (gstate->target, &gstate->clip);
    if (status)
        return status;

    status = _cairo_gstate_copy_transformed_source (gstate, &pattern.base);
    if (status)
        return status;

    /* AAZAR - BEGIN OPTIMIZATION
     * Optimize the very common case of calling paint with an OVER operator
     * for opaque sources. Replace it with the more efficient SOURCE operator
     * and, for EXTEND_NONE surfaces, constrain the operation to the
     * source's extents. */
    if (_cairo_pattern_is_opaque (&pattern.base) &&
        (op == CAIRO_OPERATOR_OVER || op == CAIRO_OPERATOR_SOURCE))
    {
        switch (pattern.base.type) {
        case CAIRO_PATTERN_TYPE_SOLID:
        case CAIRO_PATTERN_TYPE_LINEAR:
        case CAIRO_PATTERN_TYPE_RADIAL:
            op = CAIRO_OPERATOR_SOURCE;
            break;
        case CAIRO_PATTERN_TYPE_SURFACE:
            /* In all cases set the operator to SOURCE. */
            op = CAIRO_OPERATOR_SOURCE;
            if (pattern.surface.base.extend != CAIRO_EXTEND_NONE) {
                /* We need to fill the whole destination anyway, so go on with paint. */
                break;
            }

            /* Create a path around the source's extents and call fill with that. */
            path = _cairo_path_fixed_create ();
            if (path == NULL) {
                status = CAIRO_STATUS_NO_MEMORY;
                goto CLEANUP;
            }

            /* Extents of the source. */
            status = _cairo_surface_get_extents (pattern.surface.surface, &extents);
            if (status)
                goto CLEANUP; /* Or continue without the optimization instead? */

            /* Work with corner points (x1,y1)-(x2,y2) for now; they are
             * converted back to a width and height further down. */
            x1 = (double) extents.x;
            y1 = (double) extents.y;
            x2 = x1 + (double) extents.width;
            y2 = y1 + (double) extents.height;

            /* Multiply by the source's inverse matrix and by the dest context
             * transformation matrix.
             * FIXME: isn't the inverse already available somewhere? */
            matrix_invert = pattern.surface.base.matrix;
            status = cairo_matrix_invert (&matrix_invert);
            if (status)
                goto CLEANUP;
            _cairo_matrix_transform_bounding_box (&matrix_invert,
                                                  &x1, &y1, &x2, &y2, NULL);
            x1 = floor (x1);
            y1 = floor (y1);
            x2 = ceil (x2);
            y2 = ceil (y2);
            if (x2 <= 0 || y2 <= 0) {
                /* Nothing to draw. */
                status = CAIRO_STATUS_SUCCESS;
                goto CLEANUP;
            }

            cairo_matrix_transform_point (&gstate->target->device_transform, &x1, &y1);
            cairo_matrix_transform_point (&gstate->target->device_transform, &x2, &y2);
            /* FIXME: why doesn't user_to_backend implement the floor and ceil
             * as done above (and why do we need them in the first place)?
             * This is apparent in the filter-nearest-offset test.
             *   _cairo_gstate_user_to_backend (gstate, &x1, &y1);
             *   _cairo_gstate_user_to_backend (gstate, &x2, &y2);
             */

            /* Go back to a width and height instead of corner points. */
            x_fixed = _cairo_fixed_from_double (x1);
            y_fixed = _cairo_fixed_from_double (y1);
            dx_fixed = _cairo_fixed_from_double (x2 - x1);
            dy_fixed = _cairo_fixed_from_double (y2 - y1);

            /* Stop at the first failure rather than OR-ing status codes. */
            status = _cairo_path_fixed_move_to (path, x_fixed, y_fixed);
            if (status == CAIRO_STATUS_SUCCESS)
                status = _cairo_path_fixed_rel_line_to (path, dx_fixed, 0);
            if (status == CAIRO_STATUS_SUCCESS)
                status = _cairo_path_fixed_rel_line_to (path, 0, dy_fixed);
            if (status == CAIRO_STATUS_SUCCESS)
                status = _cairo_path_fixed_rel_line_to (path, -dx_fixed, 0);
            if (status == CAIRO_STATUS_SUCCESS)
                status = _cairo_path_fixed_rel_line_to (path, 0, -dy_fixed);
            if (status == CAIRO_STATUS_SUCCESS)
                status = _cairo_path_fixed_close_path (path);
            if (status)
                goto CLEANUP; /* Or continue without the optimization instead? */

            /* Report the fill's status instead of discarding it. */
            status = _cairo_gstate_fill (gstate, path);
            goto CLEANUP;
        }
    }
    /* AAZAR - END OPTIMIZATION */

    status = _cairo_surface_paint (gstate->target,
                                   op,
                                   &pattern.base);

CLEANUP:
    /* Every exit past this point releases the path and the pattern copy. */
    if (path != NULL)
        _cairo_path_fixed_destroy (path);
    _cairo_pattern_fini (&pattern.base);
    return status;
}
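
The coordinate handling in the function above (map the source extents through a matrix, then round outward to whole pixels) can be tried in isolation with the following standalone sketch. It is not cairo's implementation: affine_t only copies the field layout of cairo_matrix_t, and transformed_pixel_box, floor_to_int and ceil_to_int are hypothetical helpers that assume the coordinates fit in an int.

```c
/* Affine matrix with the same field layout as cairo_matrix_t:
 *   x' = xx * x + xy * y + x0
 *   y' = yx * x + yy * y + y0
 */
typedef struct { double xx, yx, xy, yy, x0, y0; } affine_t;

/* Integer box given by its two corners (hypothetical type). */
typedef struct { int x1, y1, x2, y2; } int_box_t;

/* Round toward negative infinity; assumes v fits in an int. */
static int
floor_to_int (double v)
{
    int i = (int) v;
    return (double) i > v ? i - 1 : i;
}

/* Round toward positive infinity; assumes v fits in an int. */
static int
ceil_to_int (double v)
{
    int i = (int) v;
    return (double) i < v ? i + 1 : i;
}

/* Transform the four corners of (x1,y1)-(x2,y2) and return the smallest
 * integer-aligned box that contains all of them. */
static int_box_t
transformed_pixel_box (const affine_t *m,
                       double x1, double y1, double x2, double y2)
{
    const double cx[4] = { x1, x2, x1, x2 };
    const double cy[4] = { y1, y1, y2, y2 };
    double min_x = 0, min_y = 0, max_x = 0, max_y = 0;
    int_box_t box;
    int i;

    for (i = 0; i < 4; i++) {
        double tx = m->xx * cx[i] + m->xy * cy[i] + m->x0;
        double ty = m->yx * cx[i] + m->yy * cy[i] + m->y0;

        if (i == 0 || tx < min_x) min_x = tx;
        if (i == 0 || tx > max_x) max_x = tx;
        if (i == 0 || ty < min_y) min_y = ty;
        if (i == 0 || ty > max_y) max_y = ty;
    }

    /* Round outward so that partially covered pixels are included. */
    box.x1 = floor_to_int (min_x);
    box.y1 = floor_to_int (min_y);
    box.x2 = ceil_to_int (max_x);
    box.y2 = ceil_to_int (max_y);
    return box;
}
```

For example, a 3x3 source scaled by 2 and translated by 10.5 in x ends up covering device pixels 10 to 17 horizontally, since the fractional offset pushes both edges onto partially covered pixels.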
-------------- next part --------------
backend-content test-size Speedup (old median / new median)
image-rgba paint_solid_rgb_over-256 97%
image-rgba paint_solid_rgb_source-256 100%
image-rgba paint_image_rgb_over-256 503%
image-rgba paint_image_rgb_source-256 85%
image-rgba paint_linear_rgb_over-256 154%
image-rgba paint_linear_rgb_source-256 101%
image-rgba paint_radial_rgb_over-256 114%
image-rgba paint_radial_rgb_source-256 100%
image-rgba paint_solid_rgb_over-512 63%
image-rgba paint_solid_rgb_source-512 88%
image-rgba paint_image_rgb_over-512 313%
image-rgba paint_image_rgb_source-512 84%
image-rgba paint_linear_rgb_over-512 147%
image-rgba paint_linear_rgb_source-512 96%
image-rgba paint_radial_rgb_over-512 112%
image-rgba paint_radial_rgb_source-512 100%
image-rgb paint_solid_rgb_over-256 100%
image-rgb paint_solid_rgb_source-256 100%
image-rgb paint_image_rgb_over-256 1137%
image-rgb paint_image_rgb_source-256 96%
image-rgb paint_linear_rgb_over-256 154%
image-rgb paint_linear_rgb_source-256 101%
image-rgb paint_radial_rgb_over-256 114%
image-rgb paint_radial_rgb_source-256 102%
image-rgb paint_solid_rgb_over-512 101%
image-rgb paint_solid_rgb_source-512 98%
image-rgb paint_image_rgb_over-512 454%
image-rgb paint_image_rgb_source-512 106%
image-rgb paint_linear_rgb_over-512 148%
image-rgb paint_linear_rgb_source-512 97%
image-rgb paint_radial_rgb_over-512 113%
image-rgb paint_radial_rgb_source-512 99%
win32-rgb paint_solid_rgb_over-256 100%
win32-rgb& paint_solid_rgb_over-256 99%
win32-rgb paint_solid_rgb_source-256 100%
win32-rgb& paint_solid_rgb_source-256 100%
win32-rgb paint_image_rgb_over-256 99%
win32-rgb& paint_image_rgb_over-256 101%
win32-rgb paint_image_rgb_source-256 98%
win32-rgb& paint_image_rgb_source-256 100%
win32-rgb paint_linear_rgb_over-256 153%
win32-rgb& paint_linear_rgb_over-256 151%
win32-rgb paint_linear_rgb_source-256 99%
win32-rgb& paint_linear_rgb_source-256 100%
win32-rgb paint_radial_rgb_over-256 114%
win32-rgb& paint_radial_rgb_over-256 114%
win32-rgb paint_radial_rgb_source-256 99%
win32-rgb& paint_radial_rgb_source-256 100%
win32-rgb paint_solid_rgb_over-512 120%
win32-rgb& paint_solid_rgb_over-512 98%
win32-rgb paint_solid_rgb_source-512 121%
win32-rgb& paint_solid_rgb_source-512 99%
win32-rgb paint_image_rgb_over-512 97%
win32-rgb& paint_image_rgb_over-512 99%
win32-rgb paint_image_rgb_source-512 96%
win32-rgb& paint_image_rgb_source-512 98%
win32-rgb paint_linear_rgb_over-512 149%
win32-rgb& paint_linear_rgb_over-512 149%
win32-rgb paint_linear_rgb_source-512 104%
win32-rgb& paint_linear_rgb_source-512 104%
win32-rgb paint_radial_rgb_over-512 112%
win32-rgb& paint_radial_rgb_over-512 112%
win32-rgb paint_radial_rgb_source-512 104%
win32-rgb& paint_radial_rgb_source-512 96%
win32-rgba paint_solid_rgb_over-256 114%
win32-rgba& paint_solid_rgb_over-256 94%
win32-rgba paint_solid_rgb_source-256 101%
win32-rgba& paint_solid_rgb_source-256 96%
win32-rgba paint_image_rgb_over-256 513%
win32-rgba& paint_image_rgb_over-256 360%
win32-rgba paint_image_rgb_source-256 68%
win32-rgba& paint_image_rgb_source-256 83%
win32-rgba paint_linear_rgb_over-256 151%
win32-rgba& paint_linear_rgb_over-256 152%
win32-rgba paint_linear_rgb_source-256 99%
win32-rgba& paint_linear_rgb_source-256 100%
win32-rgba paint_radial_rgb_over-256 114%
win32-rgba& paint_radial_rgb_over-256 114%
win32-rgba paint_radial_rgb_source-256 99%
win32-rgba& paint_radial_rgb_source-256 99%
win32-rgba paint_solid_rgb_over-512 61%
win32-rgba& paint_solid_rgb_over-512 103%
win32-rgba paint_solid_rgb_source-512 100%
win32-rgba& paint_solid_rgb_source-512 101%
win32-rgba paint_image_rgb_over-512 312%
win32-rgba& paint_image_rgb_over-512 309%
win32-rgba paint_image_rgb_source-512 98%
win32-rgba& paint_image_rgb_source-512 96%
win32-rgba paint_linear_rgb_over-512 155%
win32-rgba& paint_linear_rgb_over-512 153%
win32-rgba paint_linear_rgb_source-512 104%
win32-rgba& paint_linear_rgb_source-512 104%
win32-rgba paint_radial_rgb_over-512 104%
win32-rgba& paint_radial_rgb_over-512 108%
win32-rgba paint_radial_rgb_source-512 104%
win32-rgba& paint_radial_rgb_source-512 99%