[cairo] Lots of text API pushed
ajohnson at redneon.com
Fri Aug 8 18:15:46 PDT 2008
Behdad Esfahbod wrote:
>> Could Pango print a space glyph in each zero-glyph cluster and adjust
>> the position of the next glyph? This would use a lot less space in the
>> PDF file than changing the font twice and would potentially be more
>> efficient for viewers as well.
> I'll document that zero-glyph clusters don't work great then.
> Humm, space doesn't work as the zero-glyph clusters have varying width.
> We need a new glyph for each width. Right?
I don't understand what the issue is. cairo_show_text_glyphs()
specifies the position of each glyph so you can set the position of the
next glyph after the zero-glyph cluster to make the zero-glyph cluster
whatever width you want.
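To make the positioning argument concrete, here is a minimal sketch in C. It does not call cairo: glyph_t and text_cluster_t are local stand-ins that mirror the fields of cairo_glyph_t and cairo_text_cluster_t, and layout_clusters(), zero_width and advances are hypothetical names made up for this illustration. A cluster with num_glyphs == 0 places no glyph; it only moves the pen, so the next glyph starts wherever the caller wants.

```c
#include <assert.h>

/* Local stand-ins mirroring cairo_glyph_t and cairo_text_cluster_t,
 * so the sketch compiles without cairo (an assumption of this example). */
typedef struct { unsigned long index; double x, y; } glyph_t;
typedef struct { int num_bytes; int num_glyphs; } text_cluster_t;

/* Assign a position to every glyph, walking the clusters in order.
 * zero_width[c] is the width to reserve for cluster c when it has no
 * glyphs; it is ignored for clusters that do have glyphs.
 * advances[g] is the horizontal advance of glyph g.
 * Returns the pen x position after the last cluster. */
double layout_clusters(const text_cluster_t *clusters,
                       const double *zero_width, int num_clusters,
                       glyph_t *glyphs, const double *advances,
                       int num_glyphs, double x0, double y0)
{
    double pen = x0;
    int g = 0;

    for (int c = 0; c < num_clusters; c++) {
        if (clusters[c].num_glyphs == 0) {
            /* Zero-glyph cluster: nothing to draw, just skip ahead. */
            pen += zero_width[c];
            continue;
        }
        for (int k = 0; k < clusters[c].num_glyphs; k++) {
            assert(g < num_glyphs);
            glyphs[g].x = pen;
            glyphs[g].y = y0;
            pen += advances[g];
            g++;
        }
    }
    assert(g == num_glyphs); /* clusters must cover every glyph */
    return pen;
}
```

With clusters {2,2}, {1,0}, {2,2} for the UTF-8 text "abcde", the second cluster carries the byte for "c" but no glyph. The filled-in glyph and cluster arrays are what would then be handed to cairo_show_text_glyphs().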
For example, if we are displaying the glyphs "abde" but want the text
extracted to be "abcde", using a zero-glyph cluster to insert the "c" in
the extracted text, the PDF would be:
(ab) Tj /Span << /ActualText (c) >> BDC EMC (de) Tj
If we instead use a cluster with one space glyph that maps to the "c",
and adjust the position of the "d" glyph so that "abde" is displayed
correctly, the PDF would be:
(ab) Tj /Span << /ActualText (c) >> BDC ( ) Tj EMC [250(de)] TJ
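As a rough sketch of where the number in the TJ array comes from: TJ adjustments are in thousandths of a text space unit, and a positive value moves the following glyph back (left in horizontal writing). The C function below is hypothetical (format_space_cluster() and its parameters are not cairo code) and assumes the space glyph advances 250 units, which is what the 250 above would correspond to. It does not escape PDF string characters; it rejects them instead.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* PDF literal strings need '(', ')' and '\' escaped. This sketch
 * refuses such input rather than escaping it. */
static int needs_escaping(const char *s)
{
    return strpbrk(s, "()\\") != NULL;
}

/* Write the text operators for: glyphs `before`, then one space glyph
 * tagged with ActualText `actual`, then glyphs `after`.
 * space_advance and cluster_width are in thousandths of a text space
 * unit. Returns the number of characters written, or -1 if the input
 * needs escaping or buf is too small. */
int format_space_cluster(char *buf, size_t size,
                         const char *before, const char *actual,
                         const char *after,
                         int space_advance, int cluster_width)
{
    assert(buf != NULL && size > 0);
    if (needs_escaping(before) || needs_escaping(actual) ||
        needs_escaping(after))
        return -1;

    /* Pull the next glyph back by whatever part of the space advance
     * the cluster is not supposed to occupy. */
    int adjust = space_advance - cluster_width;

    int n = snprintf(buf, size,
                     "(%s) Tj /Span << /ActualText (%s) >> BDC ( ) Tj EMC"
                     " [%d(%s)] TJ",
                     before, actual, adjust, after);
    return (n < 0 || (size_t) n >= size) ? -1 : n;
}
```

Calling it with "ab", "c", "de", a space advance of 250 and a cluster width of 0 reproduces the line above. A nonzero cluster width just shrinks the adjustment, so one space glyph can serve clusters of any width.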
I tested this and it works perfectly in acroread. Poppler does not
extract this correctly (it drops the "c") but Poppler bugs can be fixed.
This is probably the same bug Poppler has with accented characters
created from two glyphs.
> Also, does this commit look right to you:
The second part that fixes the "subset_glyph->utf8_is_mapped = ..." is
correct and fixes the problem where ActualText was being used for
The first part, which only calls _cairo_sub_font_glyph_lookup_unicode() if
utf8_len < 0, does not look right.
What I intended the code to do is to always use the index_to_ucs4 for
toUnicode if it is available. This is to ensure the scenario you
describe in the commit message does not occur.