[cairo-bugs] [Bug 90303] New: ATOMIC_OP_NEEDS_MEMORY_BARRIER doesn't play well with ThreadSanitizer

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Mon May 4 08:38:48 PDT 2015


            Bug ID: 90303
           Summary: ATOMIC_OP_NEEDS_MEMORY_BARRIER doesn't play well with
                    ThreadSanitizer
           Product: cairo
           Version: unspecified
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: general
          Assignee: chris at chris-wilson.co.uk
          Reporter: froydnj at gmail.com
        QA Contact: cairo-bugs at cairographics.org

ThreadSanitizer (TSan, https://code.google.com/p/thread-sanitizer/) is a tool
for detecting data races.  Unlike Helgrind
(http://www.valgrind.org/info/tools.html), TSan works by having the compiler
instrument the code as it is compiled, which means that it natively
understands things like atomic instructions and requires far fewer manual
annotations than Helgrind.

TSan's instrumentation-based approach doesn't work very well with Cairo's
atomic ops implementations, particularly _cairo_atomic_{int,ptr}_get:


For TSan's race detection to understand atomic loads and stores, it really
wants to see compiler intrinsics used to handle loads and stores to memory
locations.  The above implementations of _cairo_atomic_{int,ptr}_get fail in
two ways:

- For ATOMIC_OP_NEEDS_MEMORY_BARRIER implementations, TSan doesn't see that
there's a memory barrier (__sync_synchronize) prior to the load, so it
considers the loads after the memory barrier to be unsynchronized, causing
(possibly spurious) data races to be reported.  (It's not immediately obvious
to me why the barrier comes before the load; most (all?) non-x86 architectures
put a memory barrier *after* the load for a load-acquire, and memory barriers
both before and after the load for sequentially consistent loads.)

- For architectures where ATOMIC_OP_NEEDS_MEMORY_BARRIER is false, the loads
are completely unsynchronized, again causing spurious data races.  TSan would
be happy here if the load went through something like __atomic_load_n(x,
__ATOMIC_RELAXED) or similar on newer GCC/Clang.

Unfortunately, the __sync* family of intrinsics doesn't provide enough control
to solve the first issue without some small speed penalty: there are no direct
load instructions in the __sync* builtins, so you'd have to do something like
__sync_fetch_and_add(x, 0).

If the compiler is new enough, it would be great if cairo-atomic-private.h
used the __atomic* family of intrinsics, specifying the intended memory
model for these get operations, and let the compiler take care of
generating efficient code for all architectures, rather than the fragile
hand-rolled barrier code it has now.

You are receiving this mail because:
You are the QA Contact for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.cairographics.org/archives/cairo-bugs/attachments/20150504/45c3f1f1/attachment.html>
