[cairo] [Pixman] Supersampling - 2nd attempt

Tue Aug 17 10:54:44 PDT 2010

On 08/17/2010 12:44 AM, Soeren Sandmann wrote:

>          for each destination pixel
>
>                  Use transformation to find corresponding source point
>
>                  Find pixel locations surrounding that point
>
>                  Use repeat setting to map locations into the available
>                  samples
>
>                  Use filter setting to interpolate at the source point
>
>                  Composite with destination pixel into destination

You seem to have changed the word "samples" to "pixel locations" so 
perhaps this wording could describe what I want.

My understanding of the literal algorithm being attempted is that for a 
given destination pixel a set of floating-point x/y "sample locations" 
is calculated. These are then used to bilinearly-interpolate sample the 
input image and all these samples are weighed and summed to get the 
output pixel.

I think the perception is that the "set" of x/y sample locations and 
weights can be reused somehow for every output pixel and thus save on 
calculation time. It is true that this works quite well for scales 
greater than 1 with "set" of exactly one sample with a weight of 1 that 
consists of the center of the output pixel inverse transformed to the 
input image. This is what pixman is doing now.

The mistake is in thinking that this 1-sample set can be replaced with a 
larger pattern of samples and that this will somehow make scales down 
work. It will improve things, sure, and in fact the 1-sample pixman does 
a reasonable job down to .5 scale, although the phasing problem I 
describe can be seen.

> One way to collapse the three steps into one is indeed to generate a
> set of coefficients and use them to compute a weighted sum of the
> input pixels. Owen has described this process in these notes:
>
>          http://www.daimi.au.dk/~sandmann/pixbuf-transform-math.pdf

This paper is describing reconstruction filters (called F) and sampling 
filters (called G) I think correctly but it goes into the incorrect 
conclusions at the end.

The actual output is the integral of F*G over the entire area that G is 
not zero. For each pixel the ideal result is the integral of the entire 
square of the pixel. Note this is EXACTLY a square, the reconstruction F 
is handling any ideas of how a pixel contributes to the output image and 
bleeds into surrounding areas.

The function of interest is actually F*G, not F or G!

Calculating the output of F at any frequency less than 2x the image is 
going to throw away information (unless you do 1x at exactly the pixel 
centers which is what I propose). This is why any idea that involves 
calculating F independent of G will not work.

What I propose is that this be reworded so it clearly says the weights 
are an approximation of F*G for the pixel centers.

My proposal is to calculate an approximation of the integral of F*G for 
the square area of each pixel and remember these weights.

This sounds impractical at first, but the approximation is to assume F 
is centered on the pixel and sums to 1, and F*G is smooth enough that 
the approximation is equal to the value of G at the center of the pixel. 
Separability is done by reducing the projected sample area to a 
rectangle of the same area so we get two 1-d G functions. We calculate G 
at a high resolution and remember the results in an array, and then an 
input of the scale and the phase of the center of the pixel can be used 
to look up the entries that are closest to the pixel centers.

> In those notes, the resampling is based on an integral over an image
> defined on a continuous domain which in turn was the result of an
> interpolation operation. Ie., precisely what pixman does.

The error is in thinking you can sample that first interpolation.

> I'm aware that I'm not describing mip-mapping. It could make sense to
> add it at some point. To get acceptable quality out of it, you'd want
> to use trilinear filtering where sampling some location consists of
> finding the two nearest mipmaps, bilinearly interpolate in both, and
> then linearly interpolate between the two results.

Although I have not tested it for any quality, I think it may be 
possible to reuse the pixman bilinear sampling in a "sort of mipmap" 
way. A true mipmap is a waste of time if the majority of Cairo 
transforms are affine, and does a poor job if the scale is not square. 
The "sort of mipmap" way is:

1. For a given transform, decompose into two transforms: an integer 
scale where x and y scales are 1/N and 1/M, and a remaining transform 
where the determinant absolute value is as close to 1 as possible.

2. Perform the first transform as a box-filtered scale into a temporary 
buffer. This is very fast because an integer box filter means every 
pixel contributes to exactly one output pixel.

3. Use existing pixman code to apply the second transform from this 
image to the output.

4. Keep the temporary image around on the likely chance that the same 
integer scale will be used again.

I suspect this has some of the same phasing problems as what you are 
doing, and may be just as blurry. In fact it may be the equivalent 
except with the steps reversed.