[cairo] Lots of text API pushed
Behdad Esfahbod
behdad at behdad.org
Fri Aug 8 09:04:50 PDT 2008
On Sat, 2008-08-09 at 00:05 +0930, Adrian Johnson wrote:
> Behdad Esfahbod wrote:
> > Things that need a fix before releasing 1.7.2 (hopefully tomorrow):
> >
> > - test/user-font text is coming out with the wrong color in PDF. Only
> > happens with this test. So, should be some bug with Type3 fonts.
>
> This was recently fixed in poppler. I am using the latest git poppler
> and it is working for me.
Right. Now I remember.
> > - test/user-font-proxy text is coming out as bitmap glyphs in PDF.
>
> I made text in user-font glyphs use the fallback path, as cairo currently
> cannot add glyphs to subsets while the subsets are being
> embedded in the PDF. Some refactoring is needed to make this possible.
Ah, I see.
> > - Decide what to do with zero-glyph clusters. PDF tries to generate
> > ActualText for them, but that can't really work without something inside
> > the ActualText. We can disallow zero-glyph clusters and then I will
> > handle them in Pango using a user-font with no drawings.
>
> I thought I had zero-glyph clusters working but on further investigation
> it appears it only works for extracting text with pdftotext (which I did
> most of my testing with). Copying and pasting the text from evince or
> acroread ignores the ActualText spans with no glyphs.
>
> I have a sample PDF file created by Adobe InDesign that uses ActualText
> to insert tabs and newlines in the extracted text to ensure tables are
> correctly extracted. Looking at it more closely the ActualText for each
> tab is inline with the content (like we do in cairo) and prints a space
> glyph. The ActualText for the newline is part of a tagged text structure
> and has no glyphs.
>
> So it looks like acroread only supports zero-glyph ActualText entries
> that are part of the tagged text structure. With some work the PDF
> backend could be changed to use tagged text for zero-glyph ActualText
> clusters. There are benefits to supporting tagged text in cairo and
> poppler such as allowing text reflow in PDF viewers. However I do not
> have any plans to do this work in time for 1.8.0.
>
> Could Pango print a space glyph in each zero-glyph cluster and adjust
> the position of the next glyph? This would use a lot less space in the
> PDF file than changing the font twice and would potentially be more
> efficient for viewers as well.
I'll document that zero-glyph clusters don't work well, then.
Hmm, a space glyph doesn't work, as the zero-glyph clusters have varying widths.
We would need a new glyph for each width, right?
Also, does this commit look right to you:
http://cgit.freedesktop.org/cairo/commit/?id=38c5f0d49b2ce1a6146cbea5ec3376a52cac8e68
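To make the zero-glyph cluster case above concrete, here is a rough sketch (not cairo code) of how a caller such as Pango might detect those clusters before handing text to the PDF backend. The names text_cluster and count_zero_glyph_clusters() are hypothetical; the struct only mirrors the num_bytes/num_glyphs pair found in cairo_text_cluster_t, so the sketch builds without the cairo headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for cairo_text_cluster_t: each cluster maps
 * num_bytes of UTF-8 text to num_glyphs consecutive glyphs. */
typedef struct {
    int num_bytes;
    int num_glyphs;
} text_cluster;

/* Count the clusters that carry text but no glyphs.
 * Returns -1 if the clusters do not cover exactly utf8_len bytes
 * and num_glyphs glyphs, or if any count is negative. */
int
count_zero_glyph_clusters (const text_cluster *clusters,
                           int                 num_clusters,
                           int                 utf8_len,
                           int                 num_glyphs)
{
    int bytes = 0, glyphs = 0, zero = 0;

    assert (clusters != NULL || num_clusters == 0);

    for (int i = 0; i < num_clusters; i++) {
        if (clusters[i].num_bytes < 0 || clusters[i].num_glyphs < 0)
            return -1;

        bytes  += clusters[i].num_bytes;
        glyphs += clusters[i].num_glyphs;

        /* This is the case ActualText cannot wrap: nothing is drawn. */
        if (clusters[i].num_glyphs == 0)
            zero++;
    }

    if (bytes != utf8_len || glyphs != num_glyphs)
        return -1;

    return zero;
}
```

A caller that gets a nonzero count back could then decide whether to substitute a placeholder glyph or to drop the cluster mapping for that run; which of the two is right is exactly the open question in this thread.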
--
behdad
http://behdad.org/
"Those who would give up Essential Liberty to purchase a little
Temporary Safety, deserve neither Liberty nor Safety."
-- Benjamin Franklin, 1759