Re: Unicode surrogates Robby Findler 13 Mar 2006 14:03 UTC

At Mon, 13 Mar 2006 08:55:49 -0500, John Cowan wrote:
> It is indeed invalid Unicode.  Unfortunately, Win32 filenames are not Unicode
> strings; they are vectors of almost-arbitrary 16-bit values (certain values
> are prohibited).  Similarly, Posix filenames are not strings either; they
> are vectors of almost-arbitrary 8-bit values.
>
> Vectors, though, are not a sensible interface to file systems; filenames are
> thought of as strings, accessed as strings, and almost always do correspond
> to strings.   The occasional deficiencies in this model just have to be
> swallowed.
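
To make the quoted point concrete, here is a small sketch in Python (not
Scheme, and not part of any SRFI text) showing that a sequence of 16-bit or
8-bit values can be a perfectly storable filename yet fail to decode as
Unicode. The helper names `is_valid_utf16` and `is_valid_utf8` are made up
for this illustration, and the sketch assumes little-endian UTF-16 for the
16-bit case and UTF-8 for the 8-bit case.

```python
def is_valid_utf16(data: bytes) -> bool:
    """Return True if the 16-bit units in `data` form well-formed UTF-16 (LE)."""
    try:
        data.decode("utf-16-le")
        return True
    except UnicodeDecodeError:
        return False


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the 8-bit values in `data` form well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# An ordinary name decodes fine in both encodings.
ordinary_16 = "a.txt".encode("utf-16-le")
ordinary_8 = "a.txt".encode("utf-8")

# The single 16-bit unit 0xD800 is an unpaired high surrogate: a legal
# 16-bit value, but not a valid Unicode string.
lone_surrogate = b"\x00\xd8"

# "caf\xe9" is "cafe" with a Latin-1 e-acute: legal bytes, invalid UTF-8.
latin1_name = b"caf\xe9"

print(is_valid_utf16(ordinary_16), is_valid_utf16(lone_surrogate))
print(is_valid_utf8(ordinary_8), is_valid_utf8(latin1_name))
```

The point is only that the string view of filenames covers the ordinary
cases and breaks on the odd ones, which is the deficiency being described.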

PLT Scheme uses a `path' type, distinct from the string type (but
nearly always convertible to and from it), to deal with this problem
(and to help support platform-independent path construction). I'm not
sure of the scope of this SRFI, but it may be worth having a look to
see how we dealt with this exact problem:

http://download.plt-scheme.org/doc/301/html/mzscheme/mzscheme-Z-H-11.html#node_sec_11.3
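
For readers who want the general shape of the idea without reading the
manual, the following is a rough sketch in Python of a path value that is
kept separate from strings. It is an analogy only, not the PLT Scheme API:
the class `FsPath` and its methods are hypothetical, and it assumes
Posix-style byte paths, UTF-8 for string conversion, and `/` as the
separator.

```python
from typing import Optional


class FsPath:
    """Hypothetical path value: holds raw bytes, converts to str when possible."""

    def __init__(self, raw: bytes) -> None:
        self.raw = raw

    @classmethod
    def from_string(cls, text: str) -> "FsPath":
        # Assumption: strings are turned into paths via UTF-8.
        return cls(text.encode("utf-8"))

    def to_string(self) -> Optional[str]:
        # Most paths convert cleanly; the rare undecodable one yields None
        # instead of a damaged string.
        try:
            return self.raw.decode("utf-8")
        except UnicodeDecodeError:
            return None

    def join(self, other: "FsPath") -> "FsPath":
        # Path construction works on the raw bytes, so it never needs the
        # components to be valid text. Assumes a Posix '/' separator.
        return FsPath(self.raw.rstrip(b"/") + b"/" + other.raw)


good = FsPath.from_string("docs").join(FsPath.from_string("a.txt"))
odd = FsPath.from_string("docs").join(FsPath(b"caf\xe9.txt"))

print(good.to_string())  # the path is representable as a string
print(odd.to_string())   # None: still a usable path, just not a string
print(odd.raw)           # the bytes are preserved exactly
```

The design choice this illustrates is that file operations take the path
value, so an unrepresentable name can still be opened, joined, or deleted,
and conversion to a string is an explicit step that is allowed to fail.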

Robby