[cairo] Unicode error causing Cairo to crash.

Bill Spitzak spitzak at d2.com
Wed Apr 27 12:29:36 PDT 2005


There seems to be a misunderstanding of what I am requesting. I 
ABSOLUTELY DO NOT want any extra interface for ASCII or any kind of 
"character set id". I DO NOT want to use "iconv". The purpose of my 
request is to make it possible to use a UTF-8 API only, and *encourage* 
its use.

Apparently my suggestion that it print ISO8859-1 and CP1252 legibly has 
hit a nerve. It is not a requirement, though I still think it is the 
obvious and best solution. So I am going to change my request to 
something that is politically correct:

*PLEASE* make the UTF-8 drawing print one visible glyph (such as an 
error box) for each byte of misencoded UTF-8, and *continue to parse and 
print the rest of the string*. Do NOT print nothing. Do NOT truncate at 
the error. Do NOT skip the error so it's invisible. Do NOT print ASCII 
(<128) characters in place of the bad bytes (for security reasons). Do 
NOT think the string is something other than UTF-8. Do NOT return an 
"error". NO, NO, NO, NO!!!
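
To make the request concrete, here is a minimal sketch of the decoding behaviour described above. It is not Cairo code. The name decode_lenient is hypothetical, and U+FFFD stands in for whatever "error box" glyph a renderer would actually draw; both are assumptions made for illustration.

```python
ERROR_GLYPH = "\ufffd"  # assumed stand-in for a visible "error box" glyph


def decode_lenient(data: bytes) -> str:
    """Decode UTF-8, emitting one ERROR_GLYPH per bad byte and never stopping."""
    out = []
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        if b < 0x80:
            # Plain ASCII byte: always valid on its own.
            out.append(chr(b))
            i += 1
            continue
        # Pick the sequence length and the smallest code point allowed for it.
        # 0xC0 and 0xC1 are excluded because they can only start overlong forms.
        if 0xC2 <= b <= 0xDF:
            length, cp, min_cp = 2, b & 0x1F, 0x80
        elif 0xE0 <= b <= 0xEF:
            length, cp, min_cp = 3, b & 0x0F, 0x800
        elif 0xF0 <= b <= 0xF4:
            length, cp, min_cp = 4, b & 0x07, 0x10000
        else:
            # Stray continuation byte or a byte that can never start a sequence.
            out.append(ERROR_GLYPH)
            i += 1
            continue
        tail = data[i + 1:i + length]
        if len(tail) == length - 1 and all((t & 0xC0) == 0x80 for t in tail):
            for t in tail:
                cp = (cp << 6) | (t & 0x3F)
            # Reject overlong forms, surrogates, and values past U+10FFFF,
            # so bad bytes can never turn into ASCII or other valid text.
            if min_cp <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF):
                out.append(chr(cp))
                i += length
                continue
        # Bad sequence: mark only this byte and resume at the next one, so
        # every byte of the misencoded run gets its own visible glyph.
        out.append(ERROR_GLYPH)
        i += 1
    return "".join(out)
```

With this sketch, an ISO8859-1 string such as b"caf\xe9 ok" comes out as "caf", one error glyph, then " ok", and the overlong pair b"\xc0\x80" comes out as two error glyphs rather than a NUL character. The caller never has to validate the string first and always gets visible output.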

For some reason graphics API implementers do not realize that we MUST 
get output no matter what piece of garbage is sent as the string. If I 
am forced to examine the string to make sure it is "legal UTF-8" before 
printing, then you have removed every single bit of incentive there is 
to use UTF-8 internally or in my files, since it is no harder to convert 
from my own data than to do this examination. This just means we will 
not see UTF-8 adopted by enough programmers and we will have another 
decade (at least) of multiple character sets, "wide character" 
duplicates of every interface, and programs that cannot handle anything 
except ISO8859-1.

Sorry about ranting, but it is very frustrating to see the exact same 
mindset I have been fighting for 15 years, that has prevented real I18N 
from working on Unix, and has polluted Windows with 16-bit "wide 
characters" and triplicate interfaces (including MB). I believe that 
because of politically-correct guilt that the US and Europe don't have to 
rewrite their software, the simple and obvious change from ASCII to a 
single UTF-8 interface is rejected, while solutions that require 
*everybody* to rewrite their software are fair and thus more acceptable. 
The fact that this discourages use of I18N entirely is unimportant, 
because only appearances matter, not results.


