[cairo] Malloc profiler/callgraph

Chris Wilson chris at chris-wilson.co.uk
Sun Mar 11 18:09:27 PDT 2007


Recently, Behdad has turned his attention to reducing the number of
allocations Cairo makes. To measure his progress, he wrote a tool that
hooks into malloc and records the callers. Unfortunately, to get the
best results he needed to modify the source. As an alternative, I
present this valgrind skin. It is mostly based on the massif skin: it
overrides the malloc/free family of functions, records the entire stack
trace of each allocation, and accumulates statistics for each unique
trace.

At the end it prints a table of the allocators (an allocator being the
function that called malloc, or rather the first function not listed
among --alloc-fn, à la massif) and dumps the unique stack traces to a
file. At the moment, I have not translated this output into any common
format (I was thinking of writing it in the callgrind.out format so
that it can be viewed in kcachegrind); instead I include a very simple
mp-gui.py that reads in the stack traces and provides a means of
reviewing the results.
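
To make the bookkeeping described above concrete, here is a rough
Python sketch. It is not the code from the patch (the skin itself is
written in C inside valgrind); the names ALLOC_FNS, Stats, allocator_of
and accumulate are made up for illustration, and only the block and
byte counts are tracked, not reallocs or lifespan. Traces are assumed to
be lists of function names, innermost frame first.

```python
from collections import defaultdict

# Hypothetical set standing in for the functions named via --alloc-fn.
ALLOC_FNS = {"malloc", "calloc", "realloc"}


class Stats:
    """Running totals for one stack trace or one allocator."""

    def __init__(self):
        self.n_blocks = 0
        self.n_bytes = 0


def allocator_of(trace, alloc_fns=ALLOC_FNS):
    """Return the first frame (innermost first) not listed in alloc_fns."""
    for frame in trace:
        if frame not in alloc_fns:
            return frame
    return "(unknown)"


def accumulate(events, alloc_fns=ALLOC_FNS):
    """Fold (trace, size) allocation events into per-trace and
    per-allocator statistics."""
    per_trace = defaultdict(Stats)
    for trace, size in events:
        stats = per_trace[tuple(trace)]  # one entry per unique trace
        stats.n_blocks += 1
        stats.n_bytes += size

    per_allocator = defaultdict(Stats)
    for trace, stats in per_trace.items():
        total = per_allocator[allocator_of(trace, alloc_fns)]
        total.n_blocks += stats.n_blocks
        total.n_bytes += stats.n_bytes
    return per_trace, per_allocator


if __name__ == "__main__":
    demo = [
        (["malloc", "xmalloc", "foo", "main"], 16),
        (["malloc", "xmalloc", "foo", "main"], 16),
        (["malloc", "bar", "main"], 8),
    ]
    _, by_allocator = accumulate(demo, alloc_fns={"malloc", "xmalloc"})
    for name, stats in sorted(by_allocator.items()):
        print(name, stats.n_blocks, stats.n_bytes)
```

Listing a wrapper such as the made-up xmalloc among the alloc functions
is what moves the blame from the wrapper to its caller.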

The patch is relative to valgrind's svn trunk. Apply, reconfigure and
make install. Usage is similar to other valgrind skins:
$ valgrind --tool=memprof --help
$ valgrind --tool=memprof ./cairo-perf

And the output is:
==18877== 216 distinct allocators.
==18877==    nBlocks         nBytes  nReallocs  Lifespan (ms)
	...
==18877==    484,619  1,030,781,440          0              1  _cairo_traps_add_trap_from_points [cairo-traps.c::193]
==18877==    528,888     21,155,520          0              0  _cairo_pixman_format_create_masks [icformat.c::102]
==18877==    528,916     69,816,912          0             62  pixman_image_createForPixels [icimage.c::76]
==18877==    598,300      4,786,400          0              2  _cairo_freelist_alloc [cairo-freelist.c::52]
==18877==    967,584    290,275,200          0              2  _cairo_path_fixed_move_to [cairo-path-fixed.c::199]
==18877==  1,408,396    361,163,776          0              0  _cairo_spline_add_point [cairo-spline.c::110]
==18877==  1,763,825     32,374,496          0              2  skip_list_insert [cairo-skiplist.c::293]
==18877== 10,943,515  4,330,076,529          0            145  (total)
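
If you want to post-process this table rather than read it by eye, a
small parser along the following lines should do. It is only a sketch:
the names ROW and parse_row are my own, and it assumes the four numeric
fields appear in the order given by the header (nBlocks, nBytes,
nReallocs, Lifespan) and are followed by the allocator name and an
optional [file::line] suffix, as in the sample above.

```python
import re

# One table row: valgrind's "==pid==" prefix, four comma-grouped integers,
# the allocator name, and an optional "[file::line]" location.
ROW = re.compile(
    r"^==\d+==\s+"
    r"(?P<blocks>[\d,]+)\s+"
    r"(?P<bytes>[\d,]+)\s+"
    r"(?P<reallocs>[\d,]+)\s+"
    r"(?P<lifespan>[\d,]+)\s+"
    r"(?P<allocator>\S+)"
    r"(?:\s+\[(?P<file>[^:\]]+)::(?P<line>\d+)\])?\s*$"
)


def parse_row(text):
    """Parse one row of the allocator table; return None for other lines."""
    match = ROW.match(text)
    if match is None:
        return None

    def to_int(field):
        return int(match.group(field).replace(",", ""))

    line = match.group("line")
    return {
        "blocks": to_int("blocks"),
        "bytes": to_int("bytes"),
        "reallocs": to_int("reallocs"),
        "lifespan_ms": to_int("lifespan"),
        "allocator": match.group("allocator"),
        "file": match.group("file"),  # None for the "(total)" row
        "line": int(line) if line is not None else None,
    }


if __name__ == "__main__":
    sample = ("==18877== 1,763,825\t32,374,496\t0 2      "
              "skip_list_insert [cairo-skiplist.c::293]")
    print(parse_row(sample))
```

Header and summary lines (for example the "distinct allocators" line) do
not match the pattern and come back as None, so they can simply be
skipped when sorting or filtering the rows.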

The downside to this tool is that it incurs an order-of-magnitude
performance overhead. That is a nuisance: before it started extracting
the full stack for each unique call site, it was only about 3-4 times
slower.

I hope you find this a useful little tool.
Happy Profiling!
--
Chris Wilson
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vg-memprof.patch.gz
Type: application/octet-stream
Size: 9028 bytes
Desc: not available
Url : http://lists.freedesktop.org/archives/cairo/attachments/20070312/5bb0fafd/vg-memprof.patch.obj

