[cairo-bugs] [Bug 105294] pdftocairo -pdf inverts image color in this PDF

bugzilla-daemon at freedesktop.org
Fri Jun 15 19:03:12 UTC 2018


https://bugs.freedesktop.org/show_bug.cgi?id=105294

--- Comment #11 from Allan Haldane <ealloc at gmail.com> ---
So, I've figured out that almost all image viewers invert all CMYK JPEGs,
regardless of the Adobe header. This includes eog/GDK/ImageMagick/Pillow/GIMP.
The exception is poppler, which does not invert. This is the opposite of what I
expected. Here's how I figured it out:

I started with one of the CMYK JPEGs we are discussing, which appears with a
white background when inside a PDF without the "Decode" line. When I copy the
raw JPEG data from the PDF to a file and view it with eog/GIMP/Pillow, it is
inverted and has a black background.
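
For context, here is a minimal Python sketch of how a PDF "Decode" array
remaps image samples. It assumes the "Decode" line in question is the
inverting array [1 0 1 0 1 0 1 0] (the default for DeviceCMYK being
[0 1 0 1 0 1 0 1]); the function name apply_decode is hypothetical and this is
not code from cairo or poppler.

```python
def apply_decode(samples, decode, bits=8):
    """Map raw integer samples to color values using a PDF Decode array.

    decode holds one (Dmin, Dmax) pair per component. A raw sample x in
    0..2**bits - 1 maps linearly to Dmin + x * (Dmax - Dmin) / (2**bits - 1).
    """
    max_sample = (1 << bits) - 1
    out = []
    for i, x in enumerate(samples):
        dmin, dmax = decode[2 * i], decode[2 * i + 1]
        out.append(dmin + x * (dmax - dmin) / max_sample)
    return out


DEFAULT_CMYK = [0, 1, 0, 1, 0, 1, 0, 1]   # no inversion
INVERTED_CMYK = [1, 0, 1, 0, 1, 0, 1, 0]  # assumed form of the "Decode" line

raw = [0, 0, 0, 0]  # raw CMYK samples of a background pixel
print(apply_decode(raw, DEFAULT_CMYK))   # no ink on any channel
print(apply_decode(raw, INVERTED_CMYK))  # full ink on every channel
```

With the default array a raw 0/0/0/0 pixel stays at zero ink; with the
inverting array the same pixel becomes full ink on all four channels.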

Next I looked at the raw output buffer of libjpeg, which all of these programs
use. The libjpeg documentation has a big warning that it absolutely never
inverts CMYK, and that it is up to the user to invert Adobe CMYK JPEGs. The raw
libjpeg output for my image has CMYK values 0/0/0/0, which is plain white. So,
surprisingly, my raw JPEG actually has non-inverted colors! However, all the
image viewers show it as black, so they must be inverting.
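
To make the 0/0/0/0 reasoning concrete, here is a minimal Python sketch of a
naive CMYK to RGB conversion, applied with and without inversion. The function
names and the simple ink formula are assumptions for illustration only; this is
not the code path of libjpeg or of any viewer, and real viewers may apply color
profiles instead.

```python
def invert_cmyk(cmyk):
    """Invert each 8-bit CMYK component (what an 'inverted CMYK' reader does)."""
    return tuple(255 - v for v in cmyk)


def cmyk_to_rgb(c, m, y, k):
    """Naive 8-bit conversion where 0 means no ink and 255 means full ink."""
    r = (255 - c) * (255 - k) // 255
    g = (255 - m) * (255 - k) // 255
    b = (255 - y) * (255 - k) // 255
    return (r, g, b)


raw = (0, 0, 0, 0)  # raw decoder output for the background pixel

# Taken at face value, no ink on any channel is white.
print(cmyk_to_rgb(*raw))

# A viewer that assumes inverted CMYK flips the samples first and gets black.
print(cmyk_to_rgb(*invert_cmyk(raw)))
```

Under these assumptions the same raw pixel is white when read directly and
black when read as inverted CMYK, which matches the black background seen in
the viewers.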

I was able to understand the GDK-pixbuf (used by eog) and Pillow (Python
Imaging Library) code. It is clear in both cases that all CMYK JPEGs are
interpreted as "CMYK;I", that is, CMYK inverted. GDK even has a comment in the
code: /* We now assume that all CMYK JPEG files use inverted CMYK, as Photoshop
does See https://bugzilla.gnome.org/show_bug.cgi?id=618096 */.

Poppler is the odd one out because it apparently does not invert. In other
words, CMYK JPEGs embedded in PDFs have no inversion. When you extract them
(e.g. using pdfimages), you obtain an uninverted JPEG, but when you view it in
any JPEG viewer I have tried, the viewer inverts it! It is still a mystery to
me why only CMYK JPEGs in PDFs are uninverted, but for whatever reason, they
are.

I think the lesson of all of this is: we should never invert based on the Adobe
header, since all other imaging programs I looked at ignore it (for the purpose
of inversion, at least). Significantly, this includes poppler. I think we
should revert the added Decode line.
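
For reference, the "Adobe header" is the APP14 segment of the JPEG. Below is a
minimal Python sketch that looks for it and reports its color transform byte
(commonly documented as 0 = unknown/none, 1 = YCbCr, 2 = YCCK). The function
name adobe_transform is hypothetical and this is not code from cairo or
poppler. It only reports whether the header is present; per the findings above,
its presence should not be used to decide on inversion.

```python
import struct


def adobe_transform(jpeg_bytes):
    """Return the Adobe APP14 transform byte, or None if no Adobe segment
    appears before the start-of-scan marker."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = jpeg_bytes[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        if marker == 0xDA:  # SOS: compressed data follows, stop scanning
            break
        # The segment length is big-endian and counts its own two bytes.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        # APP14 payload: "Adobe", version (2), flags0 (2), flags1 (2), transform (1)
        if marker == 0xEE and payload[:5] == b"Adobe" and len(payload) >= 12:
            return payload[11]
        pos += 2 + length
    return None
```

Usage would look like adobe_transform(open("image.jpg", "rb").read()), where
"image.jpg" stands in for a JPEG extracted with pdfimages.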
