[cairo] Concerns about using filters for downscaling
Owen Taylor
otaylor at redhat.com
Mon Mar 24 14:04:59 PDT 2014
On Sun, 2014-03-23 at 02:30 +0100, Søren Sandmann wrote:
> Owen Taylor <otaylor at redhat.com> writes:
>
> > (Søren pointed out on IRC that the pixman implementation of convolution
> > is not SIMD optimized, so there would be future opportunity to improve
> > what we can do within the constraints of 5-10x slower.)
I took a stab at implementing a pretty straightforward port of the
pixman-fast-path.c code to SSE2 - attached. I was able to get roughly
a 2x speedup:
> scale  NEAREST  BILINEAR   GOOD    BEST
> -----  -------  --------  -----  ------
>   1.1      656      2860  15163  149147
>   1.5      350      1574   7918  108820
>   1.9      220       965   7176   89324
>   2.0       51       873   6397   95665
>   2.5       34       114   4331   71975
>   3.0       25       392   4074   67211
>   3.5       66       290   3101   60168
>   4.0       15       223   3135   59207
>   4.5       12        37   2524   58586
>   5.0       34       143   2643   56798
scale  NEAREST  BILINEAR   GOOD   BEST  MIPMAP
-----  -------  --------  -----  -----  ------
  1.1      478      3073  10267  74341    3022
  1.5      264      1662   5549  53639    1680
  1.9      171      1053   3467  50495    1058
  2.0       60       943   3080  46139     151
  2.5       40       115   2709  40281     729
  3.0       29       431   1907  37104     556
  3.5       55       324   1878  35542     439
  4.0       19       244   1515  34569     173
  4.5       16        39   1451  33327     364
  5.0       30       157   1219  33239     330
I've added another column which is an implementation (in my test
program) of the technique of creating a temporary power-of-two scaled
down image and bilinear-filtering from that. As you can see, even
tossing away the mipmaps, the performance is much better than with
convolution - even though either way we have to touch all the pixels in
the source image, it's much easier to write fast, cache-friendly
average-of-four-pixels code than convolution code.
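
The average-of-four-pixels step can be sketched roughly as below. This is
not the code from the test program, which is not shown in this message; it
assumes a single-channel 8-bit image with even dimensions and strides given
in bytes, and the name halve_image_box is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: halve a single-channel 8-bit image by averaging
 * each 2x2 block of source pixels into one destination pixel.
 * Assumes width and height are even; strides are in bytes. */
void
halve_image_box (const uint8_t *src, size_t src_stride,
                 uint8_t *dst, size_t dst_stride,
                 size_t width, size_t height)
{
    assert (width % 2 == 0 && height % 2 == 0);

    for (size_t y = 0; y < height / 2; y++) {
        /* The two source rows that feed this destination row */
        const uint8_t *row0 = src + (2 * y) * src_stride;
        const uint8_t *row1 = row0 + src_stride;
        uint8_t *out = dst + y * dst_stride;

        for (size_t x = 0; x < width / 2; x++) {
            unsigned sum = row0[2 * x] + row0[2 * x + 1]
                         + row1[2 * x] + row1[2 * x + 1];
            out[x] = (uint8_t) ((sum + 2) / 4);   /* rounded average */
        }
    }
}
```

Applying such a step repeatedly until the image is within a factor of two
of the target size, and then bilinear-filtering from the result, is one way
to read the MIPMAP column. The inner loop only walks two source rows
left to right, which is what makes it easy to keep cache friendly.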
> Also, the pixman convolution filter is designed to be implemented in a
> separable way, but the current code doesn't do that. This optimization
> should be another major speedup for filters with wide support such as
> Lanczos.
What's the idea? Is it just an optimization of the per-pixel convolution?
Or is it creating an intermediate image that is a 1D convolution of the
source image and then convolving that the other way? (That seems to be
possible only for strict scales.)
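
For concreteness, the second reading (an intermediate image holding a 1D
convolution, then a pass the other way) might look like the sketch below.
It is not pixman code: it assumes a float single-channel image at unit
scale, a symmetric kernel of odd length, and clamped edges, and the names
convolve_separable and clamp_index are invented here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Clamp a possibly out-of-range index to [0, n - 1]. */
static size_t
clamp_index (ptrdiff_t i, size_t n)
{
    if (i < 0)
        return 0;
    if ((size_t) i >= n)
        return n - 1;
    return (size_t) i;
}

/* Hypothetical two-pass filter: rows first into a temporary image,
 * then columns into dst. The kernel is assumed symmetric, so
 * correlation and convolution coincide.
 * Returns 0 on success, -1 if the temporary cannot be allocated. */
int
convolve_separable (const float *src, float *dst,
                    size_t width, size_t height,
                    const float *kernel, size_t n)
{
    ptrdiff_t r = (ptrdiff_t) (n / 2);
    float *tmp;

    assert (n % 2 == 1);
    tmp = malloc (width * height * sizeof *tmp);
    if (tmp == NULL)
        return -1;

    /* Pass 1: 1D filter along each row, src -> tmp */
    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            float acc = 0.0f;
            for (size_t i = 0; i < n; i++) {
                size_t sx = clamp_index ((ptrdiff_t) x + (ptrdiff_t) i - r, width);
                acc += kernel[i] * src[y * width + sx];
            }
            tmp[y * width + x] = acc;
        }
    }

    /* Pass 2: 1D filter along each column, tmp -> dst */
    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            float acc = 0.0f;
            for (size_t i = 0; i < n; i++) {
                size_t sy = clamp_index ((ptrdiff_t) y + (ptrdiff_t) i - r, height);
                acc += kernel[i] * tmp[sy * width + x];
            }
            dst[y * width + x] = acc;
        }
    }

    free (tmp);
    return 0;
}
```

With an n-tap kernel this costs about 2n multiply-adds per pixel instead
of n*n for the direct 2D form, which is where the benefit for wide filters
would come from. The sketch sidesteps scaling and rotation entirely.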
Mostly, the timings above don't change my opinion - even if we're only
slowing down GOOD by 3-5x rather than 5-10x, that's still a huge
performance regression for image-based applications and platforms. And
even if we're only slowing down BEST by 30-50x rather than 50-100x, that's
still creating a trap for anybody who happens to specify it in their
application.
- Owen
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Add-a-SSE2-fast-path-for-separable-convolutions.patch
Type: text/x-patch
Size: 13432 bytes
Desc: not available
URL: <http://lists.cairographics.org/archives/cairo/attachments/20140324/3c3fb374/attachment.bin>