[cairo] Regarding Cairo Optimization for ARM PXA320

Sandeep Agrawal sandeep.a at samsung.com
Thu Nov 1 20:52:44 PDT 2007

Hi David,


Thanks a lot for your help. 


Your hints should finally get me started on some useful optimizations. Yes, I
can make time measurements using gettimeofday, although its resolution is at
best one microsecond. I believe that should suffice on the ARM.



Sandeep Agrawal,

Wireless Terminal Division,

Samsung India Software Operations.


From: david.freetype at gmail.com [mailto:david.freetype at gmail.com] On Behalf
Of David Turner
Sent: Thursday, November 01, 2007 2:17 PM
To: Sandeep Agrawal
Cc: Dan Amelang; cairo at cairographics.org
Subject: Re: [cairo] Regarding Cairo Optimization for ARM PXA320


Hi Sandeep,

2007/10/28, Sandeep Agrawal <sandeep.a at samsung.com>:


Actually I am unable to generate profiling data because of certain problems
with the platform I am working on. I don't have permissions to create a new
device and so opcontrol --init fails.

I am new to the text rendering field. Since Monahans does not have an FPU, I
made the assumption that the affine transformations happening in floating
point may be optimized to work in fixed point. Am I wrong in my assumption?


you would probably be able to optimize the speed of the affine
transformation itself, let's say by a factor of 2 (at the cost of reduced
precision). However, if this operation only corresponds to say, 5% of your
running time, this will only gain you 2.5% of overall time; not necessarily
a big win for what could be a lot of work.
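
The arithmetic above is an instance of Amdahl's law. A minimal sketch in C, where the function name `time_saved` is made up for illustration and is not part of Cairo:

```c
/* Fraction of the total running time that is saved when a part taking
 * `fraction` of the run (0.0 to 1.0) is made `speedup` times faster.
 * This is Amdahl's law rearranged to give the saving rather than the
 * overall speedup. */
double time_saved(double fraction, double speedup)
{
    return fraction * (1.0 - 1.0 / speedup);
}
```

With the numbers from the message, `time_saved(0.05, 2.0)` gives 0.025, i.e. 2.5% of the overall time. Even an infinitely fast transformation could save at most the 5% it occupies.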


please, please use a profiling tool, it will guide you to optimization
opportunities. more precisely, it will guide you to the low-hanging fruit,
because profile-based optimization usually leads you to some sort of
"plateau" which can only be broken by changing your internals drastically. 


if you *absolutely* cannot run a profiling tool, I advise you to write
profiling tests that basically run the same operation in loops, only varying
one parameter at a time. (yes, I'm assuming that you can at least make time
measurements in your program). 


for example, what is the time to draw 1 glyph, then the time to draw 2, 5,
10, 50, 100 ?

do these numbers scale linearly if all glyphs are identical ? is there a
"setup time" that is consistently larger than the increment between two
glyph counts ? 
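
A rough sketch of such a timing test in C, using gettimeofday as discussed in this thread. The helper names (`usec_between`, `time_op_usec`, `split_setup_cost`) and the callback signature are assumptions made for illustration; the operation being timed (for example, a function that draws `count` glyphs) is left to the caller.

```c
#include <stddef.h>
#include <sys/time.h>

/* Microseconds elapsed between two gettimeofday() samples. */
long long usec_between(const struct timeval *start, const struct timeval *end)
{
    return (long long)(end->tv_sec - start->tv_sec) * 1000000LL
         + (long long)(end->tv_usec - start->tv_usec);
}

/* Run op(count, user_data) `repeats` times and return the average time
 * per call in microseconds. Repeating the call helps when a single run
 * is close to the one-microsecond resolution of gettimeofday(). */
double time_op_usec(void (*op)(int count, void *user_data),
                    int count, void *user_data, int repeats)
{
    struct timeval start, end;
    int i;

    gettimeofday(&start, NULL);
    for (i = 0; i < repeats; i++)
        op(count, user_data);
    gettimeofday(&end, NULL);

    return (double)usec_between(&start, &end) / repeats;
}

/* Assuming a linear model t(n) = setup + per_item * n, estimate both
 * terms from two measurements taken at different counts n1 != n2. */
void split_setup_cost(int n1, double t1, int n2, double t2,
                      double *setup, double *per_item)
{
    *per_item = (t2 - t1) / (double)(n2 - n1);
    *setup = t1 - *per_item * n1;
}
```

Calling `time_op_usec` for counts 1, 2, 5, 10, 50 and 100 and feeding pairs of results to `split_setup_cost` gives a first idea of whether a fixed setup time dominates the per-glyph increment. If the estimates differ a lot between pairs, the linear model itself is probably wrong.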


what is the time to draw a string of text at 10pt, 12pt, 14pt, etc... given
that the number of pixels to be filled grows roughly as n^2 (where n is the
size), does the timing follow a similar curve ? 
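
One simple way to check the n^2 question from two measurements, sketched in C. The function name `quadratic_fit_ratio` is hypothetical; the inputs are a point size and a measured time for each of two runs.

```c
/* Compare a measured slowdown with the slowdown expected if the time
 * grew as size^2. Returns measured / predicted:
 *   about 1.0          -> timing follows the n^2 curve
 *   about size1/size2  -> timing grows only linearly with size
 *   smaller still      -> a size-independent cost probably dominates */
double quadratic_fit_ratio(double size1, double time1,
                           double size2, double time2)
{
    double predicted = (size2 / size1) * (size2 / size1);
    double measured = time2 / time1;
    return measured / predicted;
}
```

For example, if text at 10pt takes 100 time units and text at 20pt takes 400, the ratio is 1.0; if the 20pt run takes only 200, the ratio is 0.5, which points to linear rather than quadratic growth.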


what is the cost of rotating text ? what's the function of
performance/angle, etc...


this kind of test can give you significant information about what's slow
and what isn't in a given library, and such tests are usually pretty easy
to write.


and since you also have the Cairo sources to play with, feel free to put
time measurements within the library itself to get smaller grained


maybe that doesn't sound glamorous, but given the conditions you describe,
I'm pretty certain it will prove to be more fruitful than a random quest for
local optimizations...


hope this helps,


- David Turner




If so can you please give some hints as to where I can perform
optimizations? I am sure that it would be possible to do some processor
specific optimizations as well as Cairo has been written to cater to a large

Sandeep Agrawal,
Wireless Terminal Division,
Samsung India Software Operations.

-----Original Message-----
From: Dan Amelang [mailto:daniel.amelang at gmail.com ]
Sent: Saturday, October 27, 2007 12:46 AM
To: Sandeep Agrawal
Cc: cairo at cairographics.org
Subject: Re: [cairo] Regarding Cairo Optimization for ARM PXA320

On 10/26/07, Sandeep Agrawal <sandeep.a at samsung.com> wrote:
> I am currently optimizing text rendering in Cairo for the PXA320 Monahans
> processor. 
> I just want to know whether it is possible to convert the double computations
> (specifically affine) to 16.16 or 26.6 fixed point format (any chance of
> overflow or precision loss?)

Yes, there is a chance of overflow and precision loss. 

> or do I have to go with processor specific
> assembly optimizations?

I don't see how this is an either/or situation. How did converting
doubles to fixed-point and using assembly optimizations become your 
two options for text rendering improvement?

I assume that your profiles show that floating-point emulation is the
bottleneck? Can you share with us your data?
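
To make the overflow and precision-loss point concrete, here is a small sketch of 16.16 fixed-point arithmetic in C. The type and function names are made up for illustration and are not Cairo's own fixed-point code. A signed 16.16 value covers roughly -32768 to +32768 with a step of 1/65536.

```c
#include <stdint.h>

typedef int32_t fixed16_16;   /* 16 integer bits (signed), 16 fraction bits */

#define FIXED_ONE 65536       /* the value 1.0 in 16.16 */

/* Convert from double, truncating toward zero. The caller must keep d
 * inside the representable range (roughly -32768 to +32768). */
fixed16_16 fixed_from_double(double d)
{
    return (fixed16_16)(d * FIXED_ONE);
}

double fixed_to_double(fixed16_16 f)
{
    return (double)f / FIXED_ONE;
}

/* Multiply using a 64-bit intermediate so the product itself cannot
 * overflow. The final narrowing to 32 bits still loses data if the
 * result is out of range; check with fixed_mul_fits() first. */
fixed16_16 fixed_mul(fixed16_16 a, fixed16_16 b)
{
    return (fixed16_16)(((int64_t)a * b) / FIXED_ONE);
}

/* Returns 1 if a * b is representable in 16.16, 0 if it would overflow. */
int fixed_mul_fits(fixed16_16 a, fixed16_16 b)
{
    int64_t wide = ((int64_t)a * b) / FIXED_ONE;
    return wide >= INT32_MIN && wide <= INT32_MAX;
}
```

Two things show up quickly with this sketch: `fixed_from_double(0.00001)` is 0, because the value is below the 1/65536 step (precision loss), and `fixed_mul_fits` rejects 200.0 times 200.0, because 40000 does not fit in 16 integer bits (overflow). Whether either matters for a given affine transformation depends on the coordinate ranges and scale factors actually used.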



