Re: Surrogates and character representation Alan Watson 24 Jul 2005 18:14 UTC
Okay, thanks for clearing up my misunderstanding.

> but in general using UTF-8 as an internal representation is
> a bad idea.

Using UTF-8 internally for a Scheme on a Plan 9 system is not obviously a bad idea. Sure, you don't have direct indexing, but you avoid conversion when you talk to the C library and OS.

Using UTF-16 internally doesn't give you direct indexing either, because of characters outside the BMP, but it might make sense on Windows boxes for precisely the same reason: no conversion when you talk to the OS.

Using UTF-32 internally in these cases would involve translation to talk to the library and OS and would further make my emacs use about four times as much memory as it does now (which brings us back to the representation for infinity).

In general, any single representation is a bad idea in some circumstances.

Regards,

Alan
--
Dr Alan Watson
Centro de Radioastronomía y Astrofísica
Universidad Astronómico Nacional de México
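
To make the trade-offs above concrete, here is a minimal sketch in Python (not Scheme, and not taken from any implementation discussed in this thread). It compares the storage cost of the three encodings for mostly-ASCII text and shows why UTF-16 code units cannot be indexed as characters once a character outside the BMP appears. The helper names `encoded_sizes` and `utf16_units` and the sample strings are made up for illustration.

```python
def encoded_sizes(text):
    """Return the byte length of text in each encoding, without a BOM."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


def utf16_units(text):
    """Return the number of 16-bit code units needed to store text."""
    return len(text.encode("utf-16-le")) // 2


# Hypothetical sample: ASCII-only source text, as in a typical editor buffer.
ascii_text = "(define (square x) (* x x))"
sizes = encoded_sizes(ascii_text)

# For ASCII-only text, UTF-32 needs four times the bytes of UTF-8.
ratio = sizes["utf-32"] // sizes["utf-8"]

# U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the BMP, so UTF-16
# stores it as a surrogate pair: two code units for one character.
non_bmp = "a\U0001D11Eb"
chars = len(non_bmp)
units = utf16_units(non_bmp)

print(sizes, ratio, chars, units)
```

The factor of four holds only for text that is entirely ASCII; text with many non-ASCII characters narrows the gap, since UTF-8 then spends two to four bytes per character.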