[cairo] Concerns about using filters for downscaling

Mon Mar 24 14:04:59 PDT 2014

On Sun, 2014-03-23 at 02:30 +0100, Søren Sandmann wrote:
> Owen Taylor <otaylor at redhat.com> writes:
> 
> > (Søren pointed out on IRC that the pixman implementation of convolution
> > is not SIMD optimized, so there would be future opportunity to improve
> > what we can do within the constraints of 5-10x slower.)

I took a stab at implementing a pretty straightforward port of the
pixman-fast-path.c code to SSE2 - attached. I was able to get roughly
a 2x speedup for GOOD and BEST (earlier timings quoted first, new
timings below them):

>    scale    NEAREST BILINEAR   GOOD    BEST
>    -----    ------- --------   ----    ----
>      1.1      656     2860    15163   149147 
>      1.5      350     1574     7918   108820 
>      1.9      220      965     7176    89324 
>      2.0       51      873     6397    95665 
>      2.5       34      114     4331    71975 
>      3.0       25      392     4074    67211 
>      3.5       66      290     3101    60168 
>      4.0       15      223     3135    59207 
>      4.5       12       37     2524    58586 
>      5.0       34      143     2643    56798

    scale    NEAREST BILINEAR   GOOD    BEST     MIPMAP
    -----    ------- --------   ----    ----     ------
      1.1      478     3073    10267    74341     3022 
      1.5      264     1662     5549    53639     1680 
      1.9      171     1053     3467    50495     1058 
      2.0       60      943     3080    46139      151 
      2.5       40      115     2709    40281      729 
      3.0       29      431     1907    37104      556 
      3.5       55      324     1878    35542      439 
      4.0       19      244     1515    34569      173 
      4.5       16       39     1451    33327      364 
      5.0       30      157     1219    33239      330 

I've added another column, MIPMAP, which is an implementation (in my
test program) of the technique of creating a temporary power-of-two
scaled-down image and bilinear-filtering from that. As you can see, even
tossing away the mipmaps, the performance is much better than with
convolution. Either way we have to touch all the pixels in the source
image, but it's much easier to write fast, cache-friendly
average-of-four-pixels code than convolution code.
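
For illustration, a minimal sketch of one average-of-four-pixels
halving step. It assumes packed 32-bit pixels with 8 bits per channel,
even source dimensions, and strides counted in pixels. The names
average4 and halve_image are hypothetical; this is not the code from
the test program or from pixman.

```c
#include <stddef.h>
#include <stdint.h>

/* Average four packed 8-bit-per-channel pixels, channel by channel.
 * Channels are summed two at a time in 16-bit lanes (mask 0x00ff00ff);
 * each lane sum is at most 4 * 255 + 2, so it cannot overflow a lane. */
static uint32_t
average4 (uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    const uint32_t mask = 0x00ff00ffu;
    uint32_t rb = (a & mask) + (b & mask) + (c & mask) + (d & mask);
    uint32_t ag = ((a >> 8) & mask) + ((b >> 8) & mask) +
                  ((c >> 8) & mask) + ((d >> 8) & mask);

    /* Add 2 per lane for rounding, divide by 4, drop bits that
     * were shifted across the lane boundary. */
    rb = ((rb + 0x00020002u) >> 2) & mask;
    ag = ((ag + 0x00020002u) >> 2) & mask;

    return (ag << 8) | rb;
}

/* Write a (src_width / 2) x (src_height / 2) image to dst, where each
 * output pixel is the average of a 2x2 block of source pixels.
 * Assumes src_width and src_height are even (hypothetical helper). */
static void
halve_image (const uint32_t *src, int src_width, int src_height,
             int src_stride, uint32_t *dst, int dst_stride)
{
    for (int y = 0; y < src_height / 2; y++)
    {
        const uint32_t *row0 = src + (size_t) (2 * y) * src_stride;
        const uint32_t *row1 = row0 + src_stride;
        uint32_t *out = dst + (size_t) y * dst_stride;

        for (int x = 0; x < src_width / 2; x++)
        {
            out[x] = average4 (row0[2 * x], row0[2 * x + 1],
                               row1[2 * x], row1[2 * x + 1]);
        }
    }
}
```

Applying halve_image repeatedly until the image is within a factor of
two of the target size, then sampling the result with BILINEAR, is the
general shape of the technique. Each pass reads two source rows
sequentially, which is why it tends to be cache-friendly.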

> Also, the pixman convolution filter is designed to be implemented in a
> separable way, but the current code doesn't do that. This optimization
> should be another major speedup for filters with wide support such as
> Lanczos.

What's the idea? Just an optimization of the per-pixel convolution? Or
creating an intermediate image that is a 1D convolution of the source
image and then convolving that the other way? (That seems to be
possible only for strict scales.)
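
To make the second reading concrete (a 1D pass into an intermediate
image, then a 1D pass the other way), here is a minimal sketch for the
unscaled case only. It assumes a single-channel float image, odd tap
counts, and clamped edges. The names clamp_index and convolve_separable
are hypothetical; this is not pixman code and says nothing about how
pixman would actually implement it.

```c
#include <stddef.h>

/* Clamp i to [0, n - 1] so taps past the edge reuse the border sample. */
static int
clamp_index (int i, int n)
{
    return i < 0 ? 0 : (i >= n ? n - 1 : i);
}

/* Filter a width x height single-channel image with the separable 2D
 * kernel formed by the outer product of ky (ny taps) and kx (nx taps).
 * tmp must hold width * height floats. The kernel is not flipped, which
 * makes no difference for symmetric filters such as Lanczos.
 * Cost per output pixel is nx + ny multiplies instead of nx * ny. */
static void
convolve_separable (const float *src, float *tmp, float *dst,
                    int width, int height,
                    const float *kx, int nx,
                    const float *ky, int ny)
{
    /* Pass 1: horizontal 1D filter, src -> tmp. */
    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++)
        {
            float sum = 0.0f;

            for (int i = 0; i < nx; i++)
            {
                int sx = clamp_index (x + i - nx / 2, width);
                sum += kx[i] * src[(size_t) y * width + sx];
            }
            tmp[(size_t) y * width + x] = sum;
        }
    }

    /* Pass 2: vertical 1D filter, tmp -> dst. */
    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++)
        {
            float sum = 0.0f;

            for (int j = 0; j < ny; j++)
            {
                int sy = clamp_index (y + j - ny / 2, height);
                sum += ky[j] * tmp[(size_t) sy * width + x];
            }
            dst[(size_t) y * width + x] = sum;
        }
    }
}
```

For a filter with 9 taps in each direction, this is 18 multiplies per
pixel rather than 81, which is where the speedup for wide-support
filters would come from.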

Mostly, the timings above don't change my opinion - even if we're only
slowing down GOOD by 3-5x rather than 5-10x, that's still a huge
performance regression for image-based applications and platforms. And
even if we're only slowing down BEST by 30-50x rather than 50-100x,
that still creates a trap for anybody who happens to specify it in
their application.

- Owen

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Add-a-SSE2-fast-path-for-separable-convolutions.patch
Type: text/x-patch
Size: 13432 bytes
Desc: not available
URL: <http://lists.cairographics.org/archives/cairo/attachments/20140324/3c3fb374/attachment.bin>