[cairo] Improving PDF output

Bill Spitzak spitzak at d2.com
Tue Jan 9 12:53:01 PST 2007


What Apple is doing is exactly what I proposed, which is to use Unicode 
for the glyph ids whenever possible, and then allocate ids from the 
private-use area for glyphs that have no Unicode code point. I still 
feel this is going to make things a lot easier to use, and is probably 
the only way cut & paste of presentation forms is going to work at all.

It would appear Apple is allocating the ids in order as needed, which 
means that decoding the result is impossible without extra information. 
I think it may be possible to make decoding more likely by trying to 
allocate the same id for a glyph on each render. One way to do this is 
to hash together the most likely Unicode text that contributed to the 
glyph to generate the id, rehashing on any collision with an existing 
glyph or a previous allocation.
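
A minimal sketch of that hashing idea, in Python for brevity rather 
than cairo's C. Everything here is an assumption for illustration: the 
GlyphIdAllocator class, the glyph names, and the use of SHA-1 as a 
stable hash are mine, not anything Apple or cairo actually does. It 
only draws hashed ids from the BMP private-use area.

```python
import hashlib

# BMP Private Use Area, used here as the pool for hashed ids (assumption).
PUA_FIRST = 0xE000
PUA_LAST = 0xF8FF
PUA_SIZE = PUA_LAST - PUA_FIRST + 1


def hash_to_pua(source_text, attempt):
    """Map source text plus a rehash counter to a private-use code point."""
    data = f"{attempt}:{source_text}".encode("utf-8")
    value = int.from_bytes(hashlib.sha1(data).digest()[:4], "big")
    return PUA_FIRST + value % PUA_SIZE


class GlyphIdAllocator:
    """Hypothetical allocator: Unicode where possible, hashed PUA otherwise."""

    def __init__(self):
        self.glyph_to_id = {}
        self.id_to_glyph = {}

    def _claim(self, glyph, glyph_id):
        self.glyph_to_id[glyph] = glyph_id
        self.id_to_glyph[glyph_id] = glyph
        return glyph_id

    def allocate(self, glyph, source_text, code_point=None):
        # A glyph that already has an id keeps it.
        if glyph in self.glyph_to_id:
            return self.glyph_to_id[glyph]
        # Prefer the glyph's own Unicode code point when it has one.
        if code_point is not None and code_point not in self.id_to_glyph:
            return self._claim(glyph, code_point)
        # Otherwise hash the contributing text, rehashing on collisions.
        for attempt in range(PUA_SIZE):
            candidate = hash_to_pua(source_text, attempt)
            if candidate not in self.id_to_glyph:
                return self._claim(glyph, candidate)
        raise RuntimeError("no free private-use id found")


alloc = GlyphIdAllocator()
fi_id = alloc.allocate("f_i", "fi", code_point=0xFB01)  # has a code point
pp_id = alloc.allocate("p_p", "pp")                     # hashed into the PUA
```

Because the hash depends only on the contributing text, a fresh 
allocator gives the pp ligature the same id on the next render, as long 
as nothing else claimed that slot first. Code points the font already 
uses should be claimed before any hashed allocation, so the rehash step 
steps around them.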

My impression of the Unicode standard is that it would be pretty safe to 
allocate the codes from the 0xD800 through 0xFFFF range, which covers 
the UTF-16 surrogates, the private use area, and a lot of compatibility 
and presentation-form characters. This should be large enough that 
collisions are avoided, so the hashed glyph id may stay the same for 
quite a while. Another possibility is to use 0xF0000 through 0x10FFFF, 
which are the private-use planes, but then the backend must handle more 
than 16 bits.
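
To put rough numbers on "large enough", here is a small sketch that 
just counts the code points in each candidate range and checks whether 
they fit in 16 bits. The dictionary keys and helper names are made up 
for this example; the range boundaries are the ones given above, plus 
the BMP private use area (0xE000 through 0xF8FF) for comparison.

```python
# Candidate id pools, as inclusive (first, last) code point ranges.
CANDIDATE_RANGES = {
    "bmp_d800_ffff": (0xD800, 0xFFFF),
    "bmp_private_use": (0xE000, 0xF8FF),
    "private_use_planes": (0xF0000, 0x10FFFF),
}


def pool_size(first, last):
    """Number of code points in an inclusive range."""
    return last - first + 1


def fits_in_16_bits(code_point):
    """True if a backend limited to 16-bit glyph ids can store this value."""
    return code_point <= 0xFFFF


pool_sizes = {name: pool_size(*r) for name, r in CANDIDATE_RANGES.items()}
needs_wide_ids = {
    name: not fits_in_16_bits(last)
    for name, (first, last) in CANDIDATE_RANGES.items()
}
```

So the 0xD800 through 0xFFFF range gives 10240 slots, the private use 
area alone 6400, and the two private-use planes 131072, at the cost of 
ids wider than 16 bits.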

Baz wrote:
> On 09/01/07, Baz <brian.ewins at gmail.com> wrote:
>> ligatures on and off, then saved it as PDF. Copy-and-paste of the text
>> only worked with ligatures off ("The fifty spiffy apples." twice came
>> out as "The fifty spiffy a The fifty spiffy apples.". The pp ligature
>> seemed to be the point of failure)
> 
> Forgot there's fi, ff ligatures in there too. There's a Unicode code
> point for fi and ff, but not for pp. If you map glyph->text by
> inverting the font's cmap table you'd get something like 'The
> \uFB01fty spi\uFB00y a', because the pp glyph id only appears in the
> mort table. That sounds like exactly what you did, Alp? It suggests
> that Apple aren't keeping the original text around to do the
> glyph->text mapping, though, which is what we wanted to know.
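
The cmap inversion Baz describes can be sketched as follows. This is a 
toy model, not real font parsing: TOY_CMAP, the glyph names, and the 
choice to stop at the first glyph with no cmap entry (to mirror the 
truncated paste he saw) are all assumptions for illustration.

```python
def invert_cmap(cmap):
    """Build glyph -> code point; the lowest code point wins on ties."""
    inverse = {}
    for code_point, glyph in sorted(cmap.items()):
        inverse.setdefault(glyph, code_point)
    return inverse


def glyphs_to_text(glyphs, inverse):
    """Recover text from a glyph run, stopping at the first unmapped glyph."""
    chars = []
    for glyph in glyphs:
        code_point = inverse.get(glyph)
        if code_point is None:
            break  # e.g. a ligature that only exists in the mort table
        chars.append(chr(code_point))
    return "".join(chars)


# Toy cmap: single characters map to glyphs named after themselves,
# and the fi and ff ligatures have their own Unicode code points.
TOY_CMAP = {ord(c): c for c in "The fityspal."}
TOY_CMAP[0xFB01] = "f_i"
TOY_CMAP[0xFB00] = "f_f"

# "The fifty spiffy apples." as shaped with ligatures on; "p_p" has no
# cmap entry in this toy font.
glyph_run = (
    list("The ") + ["f_i"] + list("fty spi") + ["f_f"]
    + list("y a") + ["p_p"] + list("les.")
)

recovered = glyphs_to_text(glyph_run, invert_cmap(TOY_CMAP))
```

Here recovered comes out as 'The \uFB01fty spi\uFB00y a', which is the 
kind of result Baz predicts for a pure cmap inversion.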

