[cairo] cairo-perf-trace may be stopping the timer too early for xlib backend?

Chris Wilson chris at chris-wilson.co.uk
Mon Jan 28 02:15:52 PST 2013


On Mon, Jan 28, 2013 at 06:33:33AM +0200, Siarhei Siamashka wrote:
> Hello,
> 
> I tried to benchmark some Xorg drivers on an ARM embedded system with
> cairo-perf-trace using xlib backend and got somewhat weird results.
> 
> First using the original "firefox-chalkboard" trace as a reference:
> 
> [ # ]  backend                         test   min(s) median(s) stddev. count
> [  0]     xlib           firefox-chalkboard 1527.617 1527.617   0.00%    1/1
> 
> real	25m28.359s
> user	2m15.020s
> sys	0m41.710s
> 
> There is nothing wrong with the result above, except that the
> time needed to run this benchmark is ridiculously high. It is
> one of the reasons why I'm actually using trimmed traces (done
> with "./csi-trace --trim") from the following repository:
> 
>     http://cgit.freedesktop.org/~siamashka/trimmed-cairo-traces/
> 
> And looking at the result for the trimmed trace we have a problem:
> 
> [ # ]  backend                         test   min(s) median(s) stddev. count
> [  0]     xlib         t-firefox-chalkboard    7.313    7.313   0.00%    1/1
> 
> real	0m21.237s
> user	0m0.200s
> sys	0m0.060s
> 
> Here cairo-perf-trace reports the time, which does not match reality by
> a huge margin. Digging a bit in the sources reveals that the extra time
> is spent doing "target->cleanup (args.closure)" here:
> 
>     http://cgit.freedesktop.org/cairo/tree/perf/cairo-perf-trace.c?id=1.12.8#n767
> 
> This cleanup is done after the timer has been stopped already. If we
> start profiling approximately after 10-th second, it shows that there
> are some real compositing operations still being performed in Xorg
> process, and not just pixmaps destruction:
> 
>     33.09%     Xorg  libpixman-1.so.0.28.2  [.] bits_image_fetch_general                               

Hmm, that means that the X server is replying to the Fill; GetImage;
out-of-order to the rest of the operations. Or I'm being foolish and
I've optimised away that clear inside Cairo:

diff --git a/perf/cairo-perf-trace.c b/perf/cairo-perf-trace.c
index 4b8c85e..7b103b7 100644
--- a/perf/cairo-perf-trace.c
+++ b/perf/cairo-perf-trace.c
@@ -171,10 +171,12 @@ done:
 }
 
 static void
-clear_surface (cairo_surface_t *surface)
+fill_surface (cairo_surface_t *surface)
 {
     cairo_t *cr = cairo_create (surface);
-    cairo_set_operator (cr, CAIRO_OPERATOR_CLEAR);
+    /* This needs to be an operation that the backends can't optimise away */
+    cairo_set_source_rgba (cr, 0.5, 0.5, 0.5, 0.5);
+    cairo_set_operator (cr, CAIRO_OPERATOR_OVER);
     cairo_paint (cr);
     cairo_destroy (cr);
 }
@@ -768,7 +770,7 @@ cairo_perf_trace (cairo_perf_t			   *perf,
 	    fill[i] = _cairo_time_from_s (1.e-9 * cairo_device_observer_fill_elapsed (observer));
 	    glyphs[i] = _cairo_time_from_s (1.e-9 * cairo_device_observer_glyphs_elapsed (observer));
 	} else {
-	    clear_surface (args.surface); /* queue a write to the sync'ed surface */
+	    fill_surface (args.surface); /* queue a write to the sync'ed surface */
 	    cairo_perf_timer_stop ();
 	    times[i] = cairo_perf_timer_elapsed ();
 	}

As far as I know, there is no reason for the surface destruction to
actually be synchronous, so I guess it has to that we sucessfully
optimised away the GetImage on the scratch surface and so never sent an
XRequest before trying to sync.

commit 3c18bae20ee2fea24d75f6986390ef8157d0207d
Author: Chris Wilson <chris at chris-wilson.co.uk>
Date:   Mon Jan 28 10:03:54 2013 +0000

    perf; Do not allow the backends to optimize away the clear before sync

Thanks,
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


More information about the cairo mailing list