Re: Surrogates and character representation

Re: Surrogates and character representation Ken Dickey 23 Jul 2005 15:05 UTC

On Saturday 23 July 2005 00:19, Thomas Bushnell BSG wrote:
> Tom Emerson <xxxxxx@basistech.com> writes:
> > Surrogate codepoints have a character property. They should be usable
> > in a string, and individually can be considered a character.
>
> This is exactly part of the reason why char=codepoint is such a lose.
> Most code doesn't *want* to see this kind of garbage; it's an encoding
> issue.  I want chars where the *computer* takes care of the coding.  I
> want chars that are fully-understood characters, not little pieces of
> a character.

This points out a tension underlying this thread.

There are two discussions intertwined here: [1] access to and use of
Unicode within Scheme (e.g. to process internationalized web pages), and [2]
bringing Unicode into Scheme (extending the Symbol & String datatypes).

SRFI-75 specifically addresses the second of these goals and (wisely) states
that the first goal is left to another SRFI.

I for one would be satisfied to be able to portably manipulate Unicode using
Scheme source encoded in ASCII (or UTF-8). In particular, I would be willing
to have a separate datatype (or datatypes) and libraries to accomplish this.

Would anyone care to post a Unicode Encoding & I/O SRFI, so that the *other*
discussion can be moved from this thread to that one?

$0.02,
-KenD
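
For readers who have not run into the surrogate issue quoted above, here is a
minimal sketch of the UTF-16 arithmetic behind it. It is written in Python
purely for illustration; it is not part of SRFI-75 or of any Scheme
implementation, and the function names are hypothetical. It shows how one
code point outside the Basic Multilingual Plane is split into two surrogate
code units, which is the sense in which a lone surrogate is only a "piece of
a character".

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a supplementary code point (U+10000..U+10FFFF) into a
    (high, low) pair of UTF-16 surrogate code units."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits -> U+D800..U+DBFF
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits -> U+DC00..U+DFFF
    return high, low


def from_surrogate_pair(high: int, low: int) -> int:
    """Recombine a (high, low) surrogate pair into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


if __name__ == "__main__":
    # U+1D11E MUSICAL SYMBOL G CLEF lies outside the BMP.
    high, low = to_surrogate_pair(0x1D11E)
    print(hex(high), hex(low))
    print(hex(from_surrogate_pair(high, low)))
```

Whether a string type should expose the two halves individually (char =
code point or code unit) or only the recombined value is exactly the design
question the thread is arguing about; the sketch takes no side.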