Re: Surrogates and character representation John.Cowan 24 Jul 2005 22:12 UTC
Alan Watson scripsit:

> Hmm. That would seem to prevent an implementation representing strings
> internally using UTF-8. This is convenient in some contexts as Scheme
> strings can be trivially converted to UTF-8 C strings.

Not at all. There is a well-defined UTF-8 encoding for every Unicode
code point (which is not the case for UTF-16). See Table 3-6 in the
Unicode Standard 4.0.

-- 
Here lies the Christian,        John Cowan
judge, and poet Peter,          http://www.reutershealth.com
Who broke the laws of God       http://www.ccil.org/~cowan
and man and metre.              xxxxxx@reutershealth.com
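
To make the point about UTF-8 concrete, here is a minimal sketch in Python (not from the original message) that applies the usual one-to-four-byte UTF-8 bit layout to a single code point. The function name `utf8_encode_code_point` is hypothetical, and the sketch deliberately performs no surrogate check, so it yields a byte sequence for every value from 0 to 0x10FFFF, including D800 through DFFF.

```python
def utf8_encode_code_point(cp: int) -> bytes:
    """Apply the UTF-8 bit layout to one code point (illustrative only).

    Assumption: surrogate code points are passed through unchecked,
    which is the behaviour the message above relies on.
    """
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if cp < 0x80:
        # 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


if __name__ == "__main__":
    for cp in (0x41, 0xE9, 0x20AC, 0xD800, 0x1F600):
        print(f"U+{cp:04X} -> {utf8_encode_code_point(cp).hex(' ')}")
```

For comparison, Python's own strict `utf-8` codec refuses a lone surrogate; the `surrogatepass` error handler is needed to get the same three bytes for U+D800 that the sketch produces.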