[cairo] Overhead reduction
Jonathan Morton
jonathan.morton at movial.com
Mon May 18 00:27:31 PDT 2009
> > Okay, that's useful to know. An atomic-test-and-inc routine should
> > be roughly the same speed as a plain i++ on most modern hardware, it
> > just needs to be implemented properly, which is an utter pain.
> In a real-world benchmark, I lost about 90 cycles for cmpxchg based lock+unlock.
> On an old P4 it's around a few hundred cycles ;)
But that's for a full mutex, right? I just need one atomic operation,
not a whole critical section. Cmpxchg might not be the right operation
- the RISC-oriented load/store-exclusive trick would be ideal.
In any case 90 cycles would *still* be substantially faster than
malloc() or free(). I gave up caring about P4s many years ago...
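
For illustration, here is a minimal sketch of a single atomic
test-and-increment with no mutex around it, assuming a C11 compiler with
<stdatomic.h>. The helper name test_and_inc and its limit parameter are
hypothetical and are not taken from cairo. On x86 the compare-exchange
typically compiles to a lock cmpxchg, while on ARM or PowerPC the same
loop is usually built from load/store-exclusive instructions.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper, not part of cairo: increment *counter only while
 * it is still below limit. Returns true if the increment happened. */
static bool
test_and_inc (atomic_int *counter, int limit)
{
    int seen = atomic_load_explicit (counter, memory_order_relaxed);

    while (seen < limit) {
        /* On failure, seen is reloaded with the current value, so the
         * loop condition retests it before trying again. */
        if (atomic_compare_exchange_weak_explicit (counter, &seen, seen + 1,
                                                   memory_order_acq_rel,
                                                   memory_order_relaxed))
            return true;
    }
    return false;
}
```

The whole operation is one retry loop around a single atomic instruction,
so there is no separate lock and unlock step to pay for.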
--
From: Jonathan Morton
jonathan.morton at movial.com