[Xr] XrMatrixGetAffine() function

Thu Jul 3 13:19:18 PDT 2003

On Thu, 2003-07-03 at 12:16, Keith Packard wrote:
> Around 12 o'clock on Jul 3, Soorya Kuloor wrote:
> 
> Thanks very much for your comments:
> 
> > * Text handling does not seem to be as good as the rest of the drawing,
> > both API and speed-wise.
> 
> This is well known -- Xr is attempting to use Xft for text rendering and 
> the result is quite horrible.  I'm hoping to get a chance to write down 
> some ideas I've been mulling over to see if we can't start afresh and 
> build something that better matches the Xr style and can be implemented 
> efficiently.
> 

Great to hear that. In real-time SCADA-type displays (which is what we
are using Xr for), text is updated very frequently, and we found that
text drawing in Xr takes up a significant portion of the update time,
which slows the whole display down.

> > In particular, the initial query for a given font-transform
> > combination seems to take a long time too (the call into fontconfig).
> 
> Pango caches the results to make future queries faster, but fontconfig 
> itself is not likely to improve dramatically in the near future.
> 
> Drawing the text will get a lot faster as we get a chance to optimize the 
> underlying compositing routines.
> 
> > * We did a speed comparison of Xr against GDI+. For the ellipse drawing
> > test (posted by Owen) with anti-aliasing GDI+ is approx 2-2.5 times
> > faster than Xr.
> 
> That's actually very encouraging -- none of the Xr rendering codepaths 
> have seen any optimization at all, and the trapezoid filling code is 
> already scheduled for a complete rewrite which should make it a lot 
> faster.  Combine that with optimizations in the compositing code and the 
> tessellation code and getting a factor of 2 in performance should be very 
> easy.
> 

Yes, I was surprised to see this too. I would have guessed GDI+ would be
much faster than this with all the hardware acceleration MS might have
put into it. However, these timings are just on one machine.

I also wrote a test program that strokes and fills 600 ellipses
(attached below) and profiled it at run time using valgrind. Here are
screenshots of the profiling info (viewed with kcachegrind). The first
image, cpuprof-1.jpg, shows run times sorted by self time and the other
shows them sorted by cumulative time. I could not get kcachegrind to
dump to a file, so I am including screenshots. The test used an X server
with the Render extension (Red Hat 9, NVIDIA accelerated driver with
RenderAccel on).

Approx. 34.5% of the time is spent in memcpy, sorting edge lists during
tessellation. Speeding up some of these parts (as you mentioned above)
should certainly speed up Xr a lot.
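
As a rough illustration of one way to reduce copying during a sort, the
sketch below sorts an array of pointers to edges with the standard C
qsort() instead of moving the edge records themselves. This is not Xr's
tessellator code: edge_t, compare_edge_ptrs() and
sort_edges_by_pointer() are hypothetical names made up for the example,
and whether the idea applies depends on where the memcpy calls in the
profile actually come from.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical edge record; NOT Xr's actual structure. */
typedef struct {
    double x_top, y_top;   /* upper end point */
    double x_bot, y_bot;   /* lower end point */
    int    dir;            /* winding direction */
    char   payload[64];    /* stands in for any extra per-edge state */
} edge_t;

/* Order two edge pointers by top y, then by top x. */
static int
compare_edge_ptrs(const void *a, const void *b)
{
    const edge_t *ea = *(const edge_t *const *) a;
    const edge_t *eb = *(const edge_t *const *) b;

    if (ea->y_top != eb->y_top)
        return (ea->y_top > eb->y_top) - (ea->y_top < eb->y_top);
    return (ea->x_top > eb->x_top) - (ea->x_top < eb->x_top);
}

/*
 * Return a newly allocated array of n pointers into edges[], sorted by
 * compare_edge_ptrs(). The edge records themselves are never moved, so
 * each swap inside qsort() touches one pointer rather than a whole
 * edge_t. Returns NULL if n is 0 or the allocation fails; the caller
 * frees the result with free().
 */
static edge_t **
sort_edges_by_pointer(edge_t *edges, size_t n)
{
    edge_t **order;
    size_t   i;

    assert(edges != NULL || n == 0);
    if (n == 0)
        return NULL;

    order = malloc(n * sizeof *order);
    if (order == NULL)
        return NULL;

    for (i = 0; i < n; i++)
        order[i] = &edges[i];

    qsort(order, n, sizeof *order, compare_edge_ptrs);
    return order;
}
```

A caller would walk the returned array in order and free() it
afterwards; the edge records stay where they were.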

-- Soorya

>
> -keith