[cairo] Fixing concurrency bugs in cairo's reference counting

Kristian Høgsberg krh at bitplanet.net
Fri Dec 15 08:08:43 PST 2006


On 12/14/06, Carl Worth <cworth at cworth.org> wrote:
...
> Similarly, we test the value for 0 after decrementing. It's harder to
> guarantee that that's safe, but I think it is. The thing to look out
> for is a race condition between acquiring a pointer to an object and
> incrementing the reference count. If another thread can decrement the
> reference counter to 0 during that window we lose. (And note that
> Monty's mutex locks wouldn't even help here.) So I see three different
> ways to acquire the pointer:
>
> 1. create()
...
> 2. from an "unshared" source, (another object that's referencing it
>    already)
...
> 3. from a "shared" source, (a cache say)

I think a simpler and more consistent (and thus easier to remember)
rule would be to just always take a ref on behalf of the caller.  Put
another way: you should never dereference a pointer
(including adding a reference) to an object that you don't have a
reference to.

The create case already does this - the object returned has a reference
belonging to the caller.

In the unshared resource case, it requires you to destroy the resource
later, but the semantics are much cleaner.  If you don't take a
reference, the life span of the returned object is implicitly tied to
the life span of the unshared resource it came from.
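
To make the unshared case concrete, here is a minimal sketch of a
getter that hands back a reference owned by the caller.  The names
(surface_t, context_t, context_get_target and friends) are made up
for illustration and are not cairo's actual API; the point is only
that the returned object stays valid after its parent is destroyed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical types for illustration; not cairo's real structures. */
typedef struct surface { atomic_int ref_count; int width; } surface_t;
typedef struct context { surface_t *target; } context_t;

static surface_t *surface_create(int width)
{
    surface_t *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    atomic_init(&s->ref_count, 1);      /* this reference belongs to the caller */
    s->width = width;
    return s;
}

static surface_t *surface_reference(surface_t *s)
{
    atomic_fetch_add(&s->ref_count, 1);
    return s;
}

static void surface_destroy(surface_t *s)
{
    int old = atomic_fetch_sub(&s->ref_count, 1);
    assert(old > 0);
    if (old == 1)                       /* we dropped the last reference */
        free(s);
}

static context_t *context_create(surface_t *target)
{
    context_t *c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    c->target = surface_reference(target);  /* the context holds its own ref */
    return c;
}

static void context_destroy(context_t *c)
{
    surface_destroy(c->target);
    free(c);
}

/* Returns the target with a reference taken on behalf of the caller,
 * who must call surface_destroy() on it when done. */
static surface_t *context_get_target(context_t *c)
{
    return surface_reference(c->target);
}
```

With this, a caller can destroy the context and keep using the
surface it got from context_get_target() until it calls
surface_destroy() on it.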

Finally, in the shared resource case, it's pretty much the only way to go.
As you say, it requires locking, but what lock is the user supposed to
take?  For example, in the font cache case, some thread creating a new
font might cause the font you just looked up to disappear.  To prevent
this, there has to be a lock that font creation and font cache lookup
shares, and you have to know which entry points in the cairo API can
evict a font.  This is not feasible.  The only way to deal with this
is to take a reference on behalf of the caller while holding the lock
protecting the cache (or other global data structure we're
traversing).  While holding the data structure lock, we know that all
objects in the structure have at least one reference owned by the data
structure, and we can safely add another for the caller.  Once we lift
the lock, all bets are off and it's too late to ref the looked up
object.
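
Roughly, the lookup described above could look like the sketch below.
This is not cairo's font cache code; font_t, cache_lock,
font_cache_lookup and font_cache_evict are hypothetical names, and I
am assuming a pthread mutex and C11 atomics just to keep the example
self-contained.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cached object; not cairo's real font type. */
typedef struct font {
    atomic_int ref_count;
    char name[32];
    struct font *next;          /* cache chain, protected by cache_lock */
} font_t;

static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
static font_t *cache_head;      /* the cache owns one reference per entry */

static void font_destroy(font_t *f)
{
    int old = atomic_fetch_sub(&f->ref_count, 1);
    assert(old > 0);
    if (old == 1)               /* we dropped the last reference */
        free(f);
}

/* Look up (or create) a font.  The returned pointer always carries a
 * reference owned by the caller. */
static font_t *font_cache_lookup(const char *name)
{
    font_t *f;

    pthread_mutex_lock(&cache_lock);
    for (f = cache_head; f != NULL; f = f->next)
        if (strcmp(f->name, name) == 0)
            break;
    if (f == NULL) {
        f = calloc(1, sizeof *f);
        if (f != NULL) {
            atomic_init(&f->ref_count, 1);      /* the cache's reference */
            strncpy(f->name, name, sizeof f->name - 1);
            f->next = cache_head;
            cache_head = f;
        }
    }
    /* Every entry in the list is kept alive by the cache's reference
     * while we hold the lock, so adding the caller's one is safe here. */
    if (f != NULL)
        atomic_fetch_add(&f->ref_count, 1);
    pthread_mutex_unlock(&cache_lock);
    return f;
}

/* Evict an entry: unlink it under the lock, then drop the cache's
 * reference.  Callers that already hold a reference are unaffected. */
static void font_cache_evict(const char *name)
{
    font_t **p, *f = NULL;

    pthread_mutex_lock(&cache_lock);
    for (p = &cache_head; *p != NULL; p = &(*p)->next) {
        if (strcmp((*p)->name, name) == 0) {
            f = *p;
            *p = f->next;
            break;
        }
    }
    pthread_mutex_unlock(&cache_lock);
    if (f != NULL)
        font_destroy(f);
}
```

An eviction that races with a lookup either runs first, in which case
the lookup no longer finds the entry, or runs second, in which case
the caller's reference keeps the object alive.  The user never needs
to know about cache_lock.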

I guess this does break API, so I'm not really sure what to do here,
but I thought I'd bring it up.

cheers,
Kristian

