[cairo] Reduce number of floating point operations

Jorn Baayen jorn at openedhand.com
Tue Nov 21 05:56:52 PST 2006


Hi,

On Mon, 2006-09-25 at 11:26 -0700, Carl Worth wrote:

> > Unfortunately, I'm very busy at work lately, so I don't have time to
> > work on this patch and won't have it in near future. It would be nice if
> > someone could continue to work on this patch.
> 
> OK. So, thanks for the patches. There's definitely lots of useful
> stuff here. I'll see if I can get some time to clean some of it up and
> get it in this week or next, (though, don't let that stop anybody from
> taking a whack at it).

I split out Aivars' patch into 3 patches as suggested and tried to
incorporate the changes you suggested in your review. I'm sending the
first two for review here, along with cairo-perf-diffs run on ARM.

 o is-identity.diff replaces the 6 FP operations to check whether
   a matrix is an identity matrix with a call to memcmp().

   This is not always faster.

 o glyphs-transform.diff extends is-identity.diff with glyph
   transformation optimizations.

   This is always faster, but by a smaller margin than
   is-identity.diff in the cases where that patch is faster.
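
To make the first item concrete, here is a minimal sketch of the two
ways of testing for an identity matrix. It is not the code from
is-identity.diff: matrix_t is a hypothetical stand-in for
cairo_matrix_t (six doubles), and both function names are made up for
illustration.

```c
#include <string.h>

/* Hypothetical stand-in for cairo_matrix_t: six doubles. */
typedef struct {
    double xx, yx;
    double xy, yy;
    double x0, y0;
} matrix_t;

/* Field-by-field test: six floating point comparisons. */
static int is_identity_fp(const matrix_t *m)
{
    return m->xx == 1.0 && m->yx == 0.0 &&
           m->xy == 0.0 && m->yy == 1.0 &&
           m->x0 == 0.0 && m->y0 == 0.0;
}

/* Bytewise test: a single memcmp() against a constant identity
 * matrix, with no floating point instructions involved. */
static int is_identity_memcmp(const matrix_t *m)
{
    static const matrix_t identity = { 1.0, 0.0, 0.0, 1.0, 0.0, 0.0 };

    return memcmp(m, &identity, sizeof identity) == 0;
}
```

One caveat with the bytewise form: on IEEE 754 systems a matrix holding
-0.0 in one of the zero fields compares equal to the identity with ==
but not with memcmp(), so the two tests are not strictly equivalent.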

The cairo-perf-diff figures are rather surprising, but this does not
seem to be caused by differences in circumstances as re-running
cairo-perf repeatedly results in the same figures.
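
For reading the attached figures: each row lists the old and new
timing, each followed by a percentage, and then a ratio. The sketch
below assumes that the last column is simply the larger timing divided
by the smaller one, which matches the rows listed (for example 35.37
-> 30.28 gives about 1.17). The helper names are hypothetical and are
not part of cairo-perf-diff.

```c
/* Hypothetical helper: ratio between two timings, always >= 1.
 * Both arguments are assumed to be positive. */
static double change_ratio(double old_time, double new_time)
{
    return old_time > new_time ? old_time / new_time
                               : new_time / old_time;
}

/* Returns 1 when the new timing is lower (a speedup), else 0. */
static int is_speedup(double old_time, double new_time)
{
    return new_time < old_time;
}
```

With the slowdown row 314.67 -> 368.85, change_ratio() gives about
1.17 and is_speedup() returns 0.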

Thanks,

Jorn

-- 
OpenedHand Ltd.
http://o-hand.com/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: is-identity.diff
Type: text/x-patch
Size: 1136 bytes
Desc: not available
Url : http://lists.freedesktop.org/archives/cairo/attachments/20061121/7d83c377/is-identity-0001.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: glyphs-transform.diff
Type: text/x-patch
Size: 8072 bytes
Desc: not available
Url : http://lists.freedesktop.org/archives/cairo/attachments/20061121/7d83c377/glyphs-transform-0001.bin
-------------- next part --------------
Speedups
========
image-rgba   text_similar_rgba_source-64    35.37 1.21% ->  30.28 2.76%:  1.17x speedup
image-rgba   text_similar_rgba_source-128   68.65 2.03% ->  59.57 1.94%:  1.15x speedup
image-rgb      text_similar_rgba_over-64    31.85 1.29% ->  27.72 2.39%:  1.15x speedup
image-rgba        text_image_rgb_over-64    27.22 2.15% ->  23.82 2.04%:  1.14x speedup
image-rgba       text_image_rgba_over-128   57.00 0.43% ->  50.00 0.80%:  1.14x speedup
image-rgb        text_solid_rgba_over-64    31.71 1.34% ->  28.23 1.18%:  1.12x speedup
image-rgb       text_similar_rgb_over-256  221.81 0.54% -> 197.62 0.77%:  1.12x speedup
image-rgb         text_image_rgb_over-64    31.25 0.66% ->  27.91 0.68%:  1.12x speedup
image-rgb        text_image_rgba_over-64    31.74 0.43% ->  28.37 0.81%:  1.12x speedup
image-rgb       text_similar_rgb_over-64    31.36 1.14% ->  28.05 0.43%:  1.12x speedup
image-rgb        text_image_rgba_over-256  216.93 0.57% -> 194.62 0.83%:  1.11x speedup
image-rgb       text_solid_rgb_source-64    39.21 0.74% ->  35.22 0.97%:  1.11x speedup
image-rgb         text_image_rgb_over-128   57.25 0.84% ->  51.64 1.14%:  1.11x speedup
image-rgb      text_solid_rgba_source-256  283.74 0.33% -> 256.60 1.60%:  1.11x speedup
image-rgb        text_image_rgba_over-128   57.31 0.70% ->  51.88 1.46%:  1.10x speedup
image-rgba        text_solid_rgb_over-256  217.21 0.29% -> 196.75 0.76%:  1.10x speedup
image-rgba     text_image_rgba_source-64    31.48 1.88% ->  28.53 0.83%:  1.10x speedup
image-rgba      text_solid_rgb_source-256  276.64 0.30% -> 250.76 0.83%:  1.10x speedup
image-rgb     text_linear_rgba_source-128   73.37 0.64% ->  66.57 2.05%:  1.10x speedup
image-rgba     text_solid_rgba_source-256  276.17 1.34% -> 250.75 0.64%:  1.10x speedup
image-rgba      text_image_rgb_source-64    34.67 1.99% ->  31.48 0.91%:  1.10x speedup
image-rgb      text_linear_rgb_source-256  284.46 0.44% -> 258.30 1.96%:  1.10x speedup
image-rgb      text_solid_rgba_source-64    39.17 0.99% ->  35.58 1.03%:  1.10x speedup
image-rgb         text_image_rgb_over-256  220.19 0.33% -> 200.12 0.75%:  1.10x speedup
image-rgb       text_linear_rgba_over-64    33.12 0.83% ->  30.13 0.90%:  1.10x speedup
image-rgb         text_solid_rgb_over-256  218.96 0.81% -> 199.24 0.89%:  1.10x speedup
image-rgba     text_similar_rgba_over-256  218.31 0.15% -> 199.04 0.72%:  1.10x speedup
image-rgb      text_similar_rgba_over-128   56.98 0.71% ->  52.02 0.34%:  1.10x speedup
image-rgb    text_similar_rgba_source-64    37.97 0.81% ->  34.71 0.31%:  1.09x speedup
 xlib-rgb        text_image_rgba_over-64    55.52 1.10% ->  50.83 0.96%:  1.09x speedup
image-rgb      text_similar_rgba_over-256  216.87 0.62% -> 198.71 0.84%:  1.09x speedup
image-rgb      text_image_rgba_source-64    37.35 1.94% ->  34.24 0.74%:  1.09x speedup
image-rgb       text_linear_rgba_over-256  239.66 0.16% -> 219.86 1.42%:  1.09x speedup
image-rgb       text_image_rgb_source-256  276.10 0.30% -> 253.51 1.29%:  1.09x speedup
image-rgb     text_radial_rgba_source-256  336.39 0.57% -> 309.42 1.53%:  1.09x speedup
image-rgb       text_image_rgb_source-64    37.97 0.45% ->  34.94 0.58%:  1.09x speedup
image-rgb     text_linear_rgba_source-256  286.50 0.13% -> 264.12 0.47%:  1.08x speedup
image-rgba       text_linear_rgb_over-128   61.44 1.15% ->  56.65 1.18%:  1.08x speedup
 xlib-rgb      text_similar_rgba_over-64    49.12 0.92% ->  45.29 0.87%:  1.08x speedup
image-rgb     text_linear_rgba_source-64    38.50 0.93% ->  35.51 0.75%:  1.08x speedup
image-rgb        text_linear_rgb_over-256  240.02 0.35% -> 221.59 0.47%:  1.08x speedup
 xlib-rgb       text_similar_rgb_over-64    49.09 0.86% ->  45.32 1.29%:  1.08x speedup
image-rgb        text_solid_rgba_over-256  218.28 0.40% -> 201.85 0.78%:  1.08x speedup
image-rgb       text_solid_rgb_source-256  281.42 0.48% -> 260.26 1.07%:  1.08x speedup
image-rgb       text_similar_rgb_over-128   56.90 0.59% ->  52.63 0.87%:  1.08x speedup
image-rgb      text_radial_rgb_source-256  336.50 0.36% -> 311.89 0.35%:  1.08x speedup
image-rgb       text_image_rgb_source-128   71.99 0.49% ->  66.75 1.88%:  1.08x speedup
image-rgb     text_similar_rgb_source-128   72.28 0.51% ->  67.08 1.04%:  1.08x speedup
image-rgba        text_image_rgb_over-256  218.91 0.24% -> 203.22 0.34%:  1.08x speedup
image-rgb        text_solid_rgba_over-128   55.82 0.64% ->  51.85 0.94%:  1.08x speedup
 xlib-rgba     text_image_rgba_source-64    68.36 0.31% ->  63.50 0.70%:  1.08x speedup
 xlib-rgba        text_solid_rgb_over-256  359.27 0.19% -> 333.96 0.60%:  1.08x speedup
 xlib-rgb        text_solid_rgba_over-64    52.70 0.75% ->  48.99 1.37%:  1.08x speedup
image-rgb      text_image_rgba_source-256  275.54 0.38% -> 256.39 0.77%:  1.07x speedup
image-rgb      text_image_rgba_source-128   70.61 0.44% ->  65.73 0.85%:  1.07x speedup
 xlib-rgba       text_solid_rgba_over-256  357.54 0.27% -> 332.91 0.31%:  1.07x speedup
image-rgba      text_radial_rgba_over-64    48.53 1.08% ->  45.20 0.63%:  1.07x speedup
 xlib-rgb         text_image_rgb_over-64    55.16 0.79% ->  51.40 0.84%:  1.07x speedup
image-rgba       text_linear_rgb_over-256  237.49 0.11% -> 221.34 0.71%:  1.07x speedup
image-rgba     text_linear_rgb_source-256  281.44 0.36% -> 262.69 0.55%:  1.07x speedup
image-rgb      text_solid_rgba_source-128   72.82 0.33% ->  68.01 0.87%:  1.07x speedup
image-rgba      text_linear_rgba_over-256  236.22 0.42% -> 220.80 0.27%:  1.07x speedup
image-rgb    text_similar_rgba_source-128   71.66 0.66% ->  67.03 1.06%:  1.07x speedup
 xlib-rgba      text_similar_rgb_over-256  329.08 0.18% -> 307.87 0.11%:  1.07x speedup
 xlib-rgba       text_solid_rgba_over-128   93.25 0.47% ->  87.29 0.35%:  1.07x speedup
image-rgba       text_image_rgba_over-256  215.88 0.69% -> 202.12 0.77%:  1.07x speedup
image-rgba    text_linear_rgba_source-256  280.03 0.51% -> 262.47 0.57%:  1.07x speedup
 xlib-rgba     text_similar_rgba_over-256  326.18 0.30% -> 305.98 0.36%:  1.07x speedup
 xlib-rgb        text_solid_rgba_over-128  108.22 0.16% -> 101.63 0.79%:  1.06x speedup
 xlib-rgba     text_solid_rgba_source-64    62.48 0.57% ->  58.75 0.55%:  1.06x speedup
image-rgba      text_similar_rgb_over-256  216.68 0.79% -> 204.09 0.22%:  1.06x speedup
image-rgb       text_linear_rgba_over-128   60.86 0.57% ->  57.35 0.38%:  1.06x speedup
 xlib-rgba      text_similar_rgb_over-128   85.31 0.52% ->  80.42 0.47%:  1.06x speedup
image-rgb        text_linear_rgb_over-128   61.00 0.63% ->  57.52 0.38%:  1.06x speedup
 xlib-rgba      text_solid_rgb_source-256  463.44 0.09% -> 437.17 0.22%:  1.06x speedup
image-rgba    text_radial_rgba_source-256  331.09 0.30% -> 312.80 0.54%:  1.06x speedup
 xlib-rgba    text_linear_rgba_source-64    71.65 0.70% ->  67.71 0.44%:  1.06x speedup
image-rgb     text_similar_rgb_source-256  276.20 0.59% -> 261.27 0.35%:  1.06x speedup
image-rgba     text_radial_rgb_source-256  331.32 0.43% -> 313.76 0.62%:  1.06x speedup
 xlib-rgba   text_similar_rgba_source-256  433.08 0.27% -> 410.24 0.56%:  1.06x speedup
image-rgb     text_radial_rgba_source-128   85.27 0.30% ->  80.79 1.38%:  1.06x speedup
image-rgb      text_linear_rgb_source-128   73.03 0.16% ->  69.25 0.29%:  1.05x speedup
 xlib-rgba     text_solid_rgba_source-256  461.81 0.48% -> 437.95 0.07%:  1.05x speedup
image-rgba       text_solid_rgba_over-256  215.13 0.45% -> 204.06 0.66%:  1.05x speedup
 xlib-rgba     text_similar_rgba_over-128   84.71 0.49% ->  80.36 0.57%:  1.05x speedup
image-rgb        text_radial_rgb_over-64    52.44 0.31% ->  49.79 0.58%:  1.05x speedup
image-rgba    text_similar_rgb_source-256  270.74 1.10% -> 257.06 0.19%:  1.05x speedup
 xlib-rgb         text_solid_rgb_over-256  419.24 0.37% -> 398.12 0.27%:  1.05x speedup
 xlib-rgba    text_similar_rgb_source-256  435.90 0.24% -> 414.03 0.21%:  1.05x speedup
 xlib-rgb     text_linear_rgba_source-64    79.89 0.63% ->  75.88 0.76%:  1.05x speedup
 xlib-rgb        text_solid_rgba_over-256  419.40 0.30% -> 398.44 0.25%:  1.05x speedup
 xlib-rgb      text_similar_rgba_over-256  389.60 0.21% -> 370.37 0.18%:  1.05x speedup
image-rgb       text_radial_rgba_over-64    51.91 0.90% ->  49.38 0.57%:  1.05x speedup
image-rgb     text_radial_rgba_source-64    41.02 1.22% ->  39.04 0.36%:  1.05x speedup

Slowdowns
=========
 xlib-rgb     paint_linear_rgb_source-512  314.67 0.06% -> 368.85 0.11%:  1.17x slowdown
 xlib-rgb    paint_linear_rgba_source-512  316.66 0.06% -> 368.74 0.24%:  1.16x slowdown
 xlib-rgb       paint_linear_rgb_over-512  336.98 0.11% -> 390.83 0.13%:  1.16x slowdown
 xlib-rgb      paint_linear_rgba_over-512  647.53 0.22% -> 699.27 0.22%:  1.08x slowdown
-------------- next part --------------
Speedups
========
 xlib-rgba       fill_image_rgba_over-256   41.33 0.67% ->  37.82 1.17%:  1.09x speedup
 xlib-rgba        fill_image_rgb_over-256   42.24 0.82% ->  38.90 1.12%:  1.09x speedup
 xlib-rgba     fill_image_rgba_source-256   55.00 0.81% ->  51.63 0.34%:  1.07x speedup
 xlib-rgba      fill_image_rgb_source-256   55.79 0.34% ->  52.83 0.73%:  1.06x speedup


