[cairo] Concerns about using filters for downscaling
Owen Taylor
otaylor at redhat.com
Fri Mar 21 15:33:32 PDT 2014
I was reviewing what changes have landed in 1.13.x, since I'd really like to
see a stable release with cairo_set_device_scale(). The other major
change in 1.13.x is the use of pixman convolutions for downscaling in
the image backend, to get a better appearance.
Better downscaling is something that has been wanted for a long time.
Yay! But I have some concerns with the code that's in there now.
Performance
===========
Here's a rough set of timings (with a fix for BEST - a missing break
currently makes it behave the same as GOOD - and with BILINEAR changed
to scale down with bilinear sampling as GOOD/BEST used to, rather than
using the convolution). The times are the number of microseconds it
takes to paint a 512x512 image scaled down by the given factor; a rough
sketch of the kind of timing loop used appears after the table.
[ I'm not sure why the NEAREST/BILINEAR numbers jump around so much
  depending on the scale - it seems to be reproducible, but in
  any case it doesn't affect the general conclusions ]
scale  NEAREST  BILINEAR    GOOD    BEST
-----  -------  --------  ------  ------
  1.1      656      2860   15163  149147
  1.5      350      1574    7918  108820
  1.9      220       965    7176   89324
  2.0       51       873    6397   95665
  2.5       34       114    4331   71975
  3.0       25       392    4074   67211
  3.5       66       290    3101   60168
  4.0       15       223    3135   59207
  4.5       12        37    2524   58586
  5.0       34       143    2643   56798
As you can see, for small downscales, where we used to have OK (if not
great) appearance, we're now slower by a factor of > 5x for the default
GOOD filter. BEST is basically unusable.
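For reference, a minimal sketch of the sort of timing harness meant
above - the actual source image, repetition counts and timer behind the
table may differ, and time_downscale is just an illustrative name:

    #include <cairo.h>
    #include <time.h>

    /* Paint a square source surface of size src_size scaled down by
     * 'scale' with the given filter, and return the time in
     * microseconds. */
    static double
    time_downscale (cairo_surface_t *source, int src_size,
                    double scale, cairo_filter_t filter)
    {
        int size = (int) (src_size / scale + 0.5);
        cairo_surface_t *dest =
            cairo_image_surface_create (CAIRO_FORMAT_ARGB32, size, size);
        cairo_t *cr = cairo_create (dest);
        struct timespec t0, t1;

        clock_gettime (CLOCK_MONOTONIC, &t0);
        cairo_scale (cr, 1.0 / scale, 1.0 / scale);
        cairo_set_source_surface (cr, source, 0, 0);
        cairo_pattern_set_filter (cairo_get_source (cr), filter);
        cairo_paint (cr);
        cairo_surface_flush (dest);
        clock_gettime (CLOCK_MONOTONIC, &t1);

        cairo_destroy (cr);
        cairo_surface_destroy (dest);

        return (t1.tv_sec - t0.tv_sec) * 1e6 +
               (t1.tv_nsec - t0.tv_nsec) / 1e3;
    }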
The size of the convolution filters for a uniform scale-down of S is NxN
where N is given by:
GOOD: N = ceil(S + 2);
BEST: N = ceil(S * 6 + 6);
So for BEST at a scale-down of 5x5 we're actually convolving a 36x36
filter.
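In code form (my reading of the 1.13.x formulas; the helper names are
mine):

    #include <math.h>

    /* Width of the (square) convolution filter for a uniform
     * scale-down factor s, per the formulas above. */
    static int
    filter_width_good (double s)
    {
        return (int) ceil (s + 2);
    }

    static int
    filter_width_best (double s)
    {
        return (int) ceil (s * 6 + 6);
    }

    /* filter_width_best (5.0) == 36, hence the 36x36 filter mentioned
     * above. */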
Backend consistency
===================
Since this functionality is specific to pixman, it means that the
default filter CAIRO_FILTER_GOOD has radically different performance
and appearance behaviors between the xlib/xcb backends and the
pixman-based backends... not only the image backend, but also the
Windows backend, etc.
An application that looks good on Windows might look ugly under X.
An application that performed OK on X might be unusably slow under
Windows.
Access to bilinear scaling
==========================
If we have convolution for GOOD and BEST, then it still seems
important to have access to BILINEAR filtering - a reliably fast (if
ugly for large downscales) method. Also, a bilinear scale-down by
exactly 2 averages each 2x2 block of source pixels, which is useful.
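For completeness, this is how an application would request that today -
a minimal sketch assuming an existing cairo_t *cr and a source surface
named image:

    /* Request the fast path explicitly: with BILINEAR, a scale-down
     * by exactly 2 averages each 2x2 block of source pixels. */
    cairo_scale (cr, 0.5, 0.5);
    cairo_set_source_surface (cr, image, 0, 0);
    cairo_pattern_set_filter (cairo_get_source (cr), CAIRO_FILTER_BILINEAR);
    cairo_paint (cr);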
Bugs?
=====
* For some reason, the results of downscaling images don't have
  left-right symmetry. The result of scaling down:

    000000 000000 000000 000000
    000000 ffffff ffffff 000000
    000000 ffffff ffffff 000000
    000000 000000 000000 000000

  by a factor of two is:

    646464 3c3c3c
    3c3c3c 242424

  (I'm a bit suspicious about the fact that pixman generates
  even-length filters, and wonder if that has something to do
  with the asymmetry.)
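  Here is a small self-contained reproducer (my own test program, not
  something from the cairo test suite), in case anyone wants to poke at
  it:

    #include <cairo.h>
    #include <stdint.h>
    #include <stdio.h>

    int
    main (void)
    {
        cairo_surface_t *src = cairo_image_surface_create (CAIRO_FORMAT_RGB24, 4, 4);
        cairo_surface_t *dest = cairo_image_surface_create (CAIRO_FORMAT_RGB24, 2, 2);
        uint32_t *data;
        int stride, x, y;
        cairo_t *cr;

        /* Fill the 4x4 source: black border, 2x2 white center */
        cairo_surface_flush (src);
        data = (uint32_t *) cairo_image_surface_get_data (src);
        stride = cairo_image_surface_get_stride (src) / 4;
        for (y = 0; y < 4; y++)
            for (x = 0; x < 4; x++)
                data[y * stride + x] =
                    (x >= 1 && x <= 2 && y >= 1 && y <= 2) ? 0xffffff : 0x000000;
        cairo_surface_mark_dirty (src);

        /* Scale down by exactly 2 with the default (GOOD) filter */
        cr = cairo_create (dest);
        cairo_scale (cr, 0.5, 0.5);
        cairo_set_source_surface (cr, src, 0, 0);
        cairo_paint (cr);
        cairo_destroy (cr);

        /* Dump the resulting 2x2 pixels */
        cairo_surface_flush (dest);
        data = (uint32_t *) cairo_image_surface_get_data (dest);
        stride = cairo_image_surface_get_stride (dest) / 4;
        for (y = 0; y < 2; y++)
            printf ("%06x %06x\n",
                    data[y * stride + 0] & 0xffffff,
                    data[y * stride + 1] & 0xffffff);

        cairo_surface_destroy (src);
        cairo_surface_destroy (dest);
        return 0;
    }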
* When the filters are created, the number of subsample bits
  passed to pixman is always 1 (even for the huge BEST filters).
  This produces artifacts that drown out the choice of filter,
  and may give results worse than bilinear filtering for some
  scale factors.
  4 is a pretty safe value for subsample bits - though it depends
  on a) the scale and b) the image. For large scale-downs it should
  be fine to use fewer subsample bits - and that's where the expense
  of generating more copies of the filter is going to be.
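  For illustration, here is roughly where that number enters the pixman
  API - a sketch assuming pixman >= 0.32, with kernel choices that are
  only examples; the point is the final two subsample_bits arguments,
  where the current code passes 1 and something like 4 would reduce the
  positioning artifacts:

    #include <stdlib.h>
    #include <pixman.h>

    static void
    set_downscale_filter (pixman_image_t *image, double scale)
    {
        int n_params;
        pixman_fixed_t *params;

        params = pixman_filter_create_separable_convolution (
            &n_params,
            pixman_double_to_fixed (scale), pixman_double_to_fixed (scale),
            PIXMAN_KERNEL_LINEAR, PIXMAN_KERNEL_LINEAR, /* reconstruct */
            PIXMAN_KERNEL_BOX, PIXMAN_KERNEL_BOX,       /* sample */
            4, 4);                                      /* subsample bits */

        pixman_image_set_filter (image, PIXMAN_FILTER_SEPARABLE_CONVOLUTION,
                                 params, n_params);
        free (params);
    }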
* In _pixman_image_set_properties, the computation of scale_x and
  scale_y uses yx twice and never uses xy.
* See the comment at the top of _cairo_pattern_analyze_filter() -
  it makes assumptions that are no longer true.
Thoughts
========
The good thing about this general approach is that it's
straightforward to understand what is going on and what results it
produces, and it isn't much code. And it's there in the repository, not
just theory about what we *could* do. :-)
Downsides with the general approach:
* The approach is not efficient for large downscales, especially
when transformed. Approaches involving:
- Sampling from a mipmap
- Sampling a sparse set of irregularly distributed points
will be more efficient.
* On a hardware backend, running even a large convolution filter in
  a shader is likely possible on current hardware, but it doesn't
  make efficient use of how hardware is designed to sample images.
Here are my suggestions, with an eye to getting a release of 1.14
out quickly:
* The convolution code is restricted to the BEST filter, leaving
  GOOD and BILINEAR untouched for now. The goal in the future
  would be to get similar quality for all backends, whether by:
  A) Triggering a fallback
  B) Implementing convolution with the same filter
  C) Using an entirely different technique with similar quality.
* For BEST, a filter is used that is more like what is used for
  GOOD in the current code - i.e. a 10x slowdown from BILINEAR,
  not 100x.
* An attempt is made to address the bugs listed above.
* In the future, the benchmark for GOOD is downscaling by
  factors of two to the next biggest power of two, and then sampling
  from that with bilinear filtering (a rough sketch of this approach
  follows below). Pixel-based backends should do at least that well
  in *both* performance and quality.
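  As a rough sketch of that benchmark (assuming a square ARGB32 source;
  the helper name and structure are mine, not existing cairo API):

    #include <cairo.h>

    /* Halve the image with BILINEAR until the remaining factor is
     * < 2, then do one final bilinear scale. */
    static cairo_surface_t *
    downscale_by_halving (cairo_surface_t *src, int src_size, double scale)
    {
        cairo_surface_t *current = cairo_surface_reference (src);
        int size = src_size;

        while (scale >= 2.0) {
            int half = (size + 1) / 2;
            cairo_surface_t *next =
                cairo_image_surface_create (CAIRO_FORMAT_ARGB32, half, half);
            cairo_t *cr = cairo_create (next);

            cairo_scale (cr, 0.5, 0.5);
            cairo_set_source_surface (cr, current, 0, 0);
            cairo_pattern_set_filter (cairo_get_source (cr), CAIRO_FILTER_BILINEAR);
            cairo_paint (cr);
            cairo_destroy (cr);

            cairo_surface_destroy (current);
            current = next;
            size = half;
            scale /= 2.0;
        }

        if (scale > 1.0) {
            int final = (int) (size / scale + 0.5);
            cairo_surface_t *next =
                cairo_image_surface_create (CAIRO_FORMAT_ARGB32, final, final);
            cairo_t *cr = cairo_create (next);

            cairo_scale (cr, 1.0 / scale, 1.0 / scale);
            cairo_set_source_surface (cr, current, 0, 0);
            cairo_pattern_set_filter (cairo_get_source (cr), CAIRO_FILTER_BILINEAR);
            cairo_paint (cr);
            cairo_destroy (cr);

            cairo_surface_destroy (current);
            current = next;
        }

        return current;
    }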