mike.steinert at gmail.com
Mon Apr 29 10:52:55 PDT 2013
On Thu, Apr 25, 2013 at 1:59 PM, Siarhei Siamashka <siarhei.siamashka at gmail.com> wrote:
> I tried to look for some documentation about Broadcom 7420, but it
> appears to be scarce. This was possibly the best information I could
> And it says "This CPU implements the MIPS32 v2 instruction set with
> Broadcom-specific DSP and multi-threading extensions."
> Pixman library (cairo software rendering backend) supports some
> assembly optimizations for MIPS32r2 and DSP ASE thanks to the
> ongoing work of Nemanja Lukic. These optimizations may potentially
> provide some performance improvement on your hardware, but are only
> enabled for MIPS 74K cores in the runtime detection code:
> Tweaking the runtime detection code, you can also enable it on your
> CPU. I'm a bit worried about the "broadcom-specific" description of
> the DSP extensions in BCM7420. But it might still make sense to give
> it a try. Also some of the optimizations (simple blits and fills) in
> fact do not use DSP instructions and should also run fine on any
> MIPS32 core.
Thanks for your suggestions! Unfortunately, your concerns about the chip are
valid: it's actually a MIPS32 r1 processor. Broadcom has supposedly implemented
some of the DSP extensions on a separate DSP chip, although I haven't had
any luck enabling them so far. I'll have to talk with support and
see what I can figure out. For the record, I did enable this code on my
board, but it crashed horribly with illegal instructions. I'll have to
sort out which code will work on plain MIPS32.
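
For anyone following along, the kind of runtime check being discussed can be
sketched roughly as below. This is only an illustration, not pixman's actual
code: it assumes a Linux /proc/cpuinfo that exposes a "cpu model" field, and
the function names cpuinfo_text_has_model and cpu_model_matches are made up
for this example.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a pixman function): returns true if any line of
 * cpuinfo-style text that starts with "cpu model" contains `needle`. */
bool cpuinfo_text_has_model(const char *text, const char *needle)
{
    const size_t key_len = strlen("cpu model");
    const size_t needle_len = strlen(needle);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);

        if (len >= key_len && strncmp(line, "cpu model", key_len) == 0) {
            /* Search for the needle inside this one line only. */
            for (size_t i = 0; needle_len > 0 && i + needle_len <= len; i++) {
                if (strncmp(line + i, needle, needle_len) == 0)
                    return true;
            }
        }
        line = end ? end + 1 : NULL;
    }
    return false;
}

/* Hypothetical wrapper: reads the first few KB of /proc/cpuinfo (Linux only)
 * and applies the check above. Returns false if the file cannot be read. */
bool cpu_model_matches(const char *needle)
{
    char buf[4096];
    FILE *f = fopen("/proc/cpuinfo", "r");
    size_t n;

    if (f == NULL)
        return false;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return cpuinfo_text_has_model(buf, needle);
}
```

A caller would only select the DSP fast paths when something like
cpu_model_matches("MIPS 74K") returns true. As my crash shows, loosening such
a check to match another core is only safe if that core really executes the
instructions the fast paths use.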
> Mixing software rendering and hardware acceleration sometimes may not
> provide the best performance. If you have to do cache flush/invalidate
> operations too often, they can diminish any benefits of hardware
> accelerated solid fills and simple non-scaled blits.
> If you want a fair comparison of the hardware acceleration vs. software
> rendering backend, using assembly optimized blits and fills (with optimal
> prefetch, etc.) is quite important. Generic C implementation of blits
> and fills may be severely underperforming.
I agree, this is good advice. I've run similar tests on previous platforms
using the generic C implementations. In those cases using the 2D hardware
for blits and fills was necessary to get any reasonable performance from
Cairo, at least for my use case. I think the key will be for me to figure
out which parts of the ASM fast-path code I can use on my chip.