Re: Unicode surrogates Tom Emerson 13 Mar 2006 17:16 UTC

bear writes:
> That doesn't matter, really.  The fact that it's in violation of
> the unicode standard does not make it cease to exist or solve the
> problem it creates.

True enough.

> To put it another way, Windows allows characters that are not part
> of Unicode to be used to name files.  If we restrict our character
> set for filenames to Unicode-only, we will not be able to open
> those files.  That problem is real.

The problem is real, but how often does it happen? The question is
whether the character representation for the language should be
dictated by the broken behavior of a particular operating system,
regardless of how ubiquitous that OS is.

To my mind an unpaired surrogate used in a file name is an
exception. As long as a method exists to specify the name explicitly,
this can be handled.
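
As a rough illustration of treating these names as the exceptional case, here is
a minimal Python sketch that finds unpaired surrogates in a file name given as a
sequence of UTF-16 code units. The function name and the choice of a list of
integers as input are assumptions made for this example, not anything proposed
in this thread.

```python
def unpaired_surrogates(units):
    """Return the indices of unpaired surrogates in a sequence of
    UTF-16 code units (integers in the range 0..0xFFFF)."""
    bad = []
    i = 0
    while i < len(units):
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:
            # High surrogate: well formed only if a low surrogate follows.
            if i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2
                continue
            bad.append(i)
        elif 0xDC00 <= u <= 0xDFFF:
            # Low surrogate that was not consumed by a preceding high one.
            bad.append(i)
        i += 1
    return bad
```

A name for which this returns an empty list can be handled as ordinary Unicode
text; anything else would have to go through whatever explicit mechanism is
provided.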

> Hmmm.... can we use read-byte and write-byte to read and write
> filenames?

I doubt it. I think the pathname type in PLT and Common Lisp may be
the way to go to handle these cases.

--
Tom Emerson                                          Basis Technology Corp.
Software Architect                                 http://www.basistech.com
 "You can't fake quality any more than you can fake a good meal." (W.S.B.)