[cairo] [PATCH] script: More compatible with C89 and VC++.

Tor Lillqvist tml at iki.fi
Fri Jun 25 00:22:24 PDT 2010


>> The so-called double-byte code pages used in East Asian Windows
>> localisations have characters that use two bytes of which the trailing
>> one might be '/' or '\\'.

> True but is anybody using this any more?

We are talking about code that is to be run on Windows only, and that
gets plain char strings representing existing file names as input,
aren't we? What makes you think these code pages wouldn't be in use
any longer? Every C program on Windows that calls functions like
open(), stat(), fopen() uses the same old so-called "ANSI" code pages
for the file names passed to such functions. There is no UTF-8
involved at the C library level if that is what you are thinking of.
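
To make the trail-byte problem concrete, here is a minimal sketch, not
the code from the patch. It assumes code page 932 (Shift-JIS), where
for instance the character U+8868 is encoded as the bytes 0x95 0x5C,
so its trail byte is identical to '\\'. The helper names
is_cp932_lead_byte(), naive_basename() and dbcs_basename() are
hypothetical and exist only for this illustration. Real Windows code
would more likely ask the system which bytes are lead bytes (for
example with IsDBCSLeadByteEx()) than hard-code the ranges.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: lead-byte ranges of code page 932 (Shift-JIS).
 * Hard-coded here only to keep the sketch portable. */
static int is_cp932_lead_byte(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Naive scan: treats every '/' or '\\' byte as a path separator,
 * even when that byte is the trail byte of a double-byte character. */
static const char *naive_basename(const char *path)
{
    const char *base = path;
    const char *p;

    assert(path != NULL);
    for (p = path; *p != '\0'; p++) {
        if (*p == '/' || *p == '\\')
            base = p + 1;
    }
    return base;
}

/* DBCS-aware scan: a lead byte and the byte after it are skipped as
 * one unit, so a trail byte is never mistaken for a separator. */
static const char *dbcs_basename(const char *path)
{
    const char *base = path;
    const char *p = path;

    assert(path != NULL);
    while (*p != '\0') {
        if (is_cp932_lead_byte((unsigned char) *p) && p[1] != '\0') {
            p += 2;             /* skip lead byte and trail byte */
            continue;
        }
        if (*p == '/' || *p == '\\')
            base = p + 1;
        p++;
    }
    return base;
}
```

Given the path "dir\\\x95\\.txt" (written as a C string literal), the
naive version cuts the file name in the middle of the double-byte
character and returns ".txt", while the DBCS-aware version returns the
whole name starting at the 0x95 lead byte.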

> I would do the one-character-at-a-time api instead of allocating a buffer.
> This will allow you to skip errors in the encoding,

But where would such encoding errors come from? Aren't we talking
about file names representing existing files? Then the actual
operating system keeps file names (and practically everything else) as
wide strings, and converts them to "narrow" (multibyte, i.e. in the
system code page) strings only when passing them to a program that
called a "narrow" API. If the actual file name contains characters not
representable in the system code page, the program won't see some
wrongly encoded garbage, it will either see a question mark or not get
that file name at all.

>> If you start with a multibyte string, it can be translated to a wide
>> string and back without loss.

> Not if the multibyte string contains an encoding error!

True, but is the use case here such that one would throw random
garbage bytes at the basename() function?

--tml

