[cairo] Overhead reduction
jonathan.morton at movial.com
Fri May 22 02:43:26 PDT 2009
> > The image patch, which implements an object pool, is probably more
> > controversial and certainly a lot bigger. However, under my tests it is
> > very very nearly as fast as the previous unprotected stack, while being
> > threadsafe (unless there are bugs I haven't spotted). The object pool
> > should be reusable for other fixed-size objects without change.
> First, all this image creation activity comes from the X server, which
> creates and destroys up to three pixman images per Render request. I
> think the best way to fix this would be to keep one pixman_image_t
> around per Render Picture, and just update it as the Picture
> properties change. This is a fair amount of work in the X server, but
> it avoids not only the malloc() overhead, but also the initialization
> of all the fields in the image.
If the X server kept the pixman_image_t on the stack and was able to
initialise it by reference, that would also be a win which avoids malloc
and doesn't depend too much on pixman internal behaviour.
But both of these ideas involve modifying the X server in potentially
> That said, I do think an image pool makes sense, but since
> pixman_image_t is the only dynamically allocated, fixed-size object we
> have at the moment, a complete framework for object pools feels like
> overkill to me.
It did strike me as quite a large patch for such a "simple" task, yes.
I think most of it is comments though, and it boils down to a very small
binary-size increment. :-) Simplifying the copyright notice would cut
it down significantly.
I think the reason I thought a dedicated object pool would be useful, is
that the X server does a lot of small, fixed-sized mallocs in certain
situations. Admittedly, I did a lot to eliminate them in my companion
patch, but I haven't determined for certain yet whether X needs to be
thread-safe internally. (I suspect not, in which case the trivial
implementation I started with would be better.)
I *did* think about using thread-local storage, but I shied away from it
because I wasn't sure how widely supported it was. The per-thread
leakage argument is another good reason to be cautious. I think TLS
would make a lot more sense if contention was typically high (because I
suspect it scales better), but I don't think it is in practice.
> Given that, I think I'd prefer to just have a free-list protected by a
> mutex. We can use the code from cairo-mutex-impl-private.h; everybody
> who committed to that file agreed to a relicense to MIT.
The problem with a mutex is that it's a surprisingly big overhead to do
frequently, and is a large part of the overhead on the more efficient
mallocs, which is what we're trying to avoid. The main advantage of a
mutex, of course, is that it's a simple concept to understand and use.
But if there's no contention, it's still two atomic operations per
lock/unlock cycle, which as some people have mentioned, is an especially
large overhead on some particular (badly designed but still common)
processors. The dedicated pool implementation only uses one of these.
Mind you, all this would be much easier if there was a library which did
the low-level part for us. Then we'd be able to do the Right Thing
without worrying about bloat, and other projects could re-use the work.
Pthreads doesn't count; the mutex and (particularly) semaphores in there
are notoriously slow. This might be less of a problem on recent
Linux/Glibc given NPTL, futex and the kernel atomic primitives ABI, but
it's likely that other platform users would lynch us.
It looks like the OpenPA library tries to do this. Unfortunately, I
quickly read through the PowerPC primitives (they don't have native ARM
ones) and found at least one potentially serious bug related to an
uninitialised variable. It's fixable, but it doesn't inspire
confidence, given that debugging that kind of error would be extremely
difficult if it showed up in-the-wild. (Yes, I reported the bug.)
From: Jonathan Morton
jonathan.morton at movial.com
More information about the cairo