RESET [was Re: Encodings] Ken Dickey 14 Feb 2004 07:27 UTC

I'm not explaining myself well.

Let me try to define a category by examples.

[1]

Let's say there are two Scheme source files, each of which uses the "same"
identifier in the same global (module global) scope/context.  We say that in
an RnRS Scheme the identifier names or denotes the same value.

Let's say the two files are stored in different encodings (say utf-8 and
ucs-2) and processed by different but conforming Unicode systems (text
editors, Scheme read/write, whatever) so that the identifiers still appear
the same when displayed but differ in their stored bytes.

A Scheme implementation which properly reads the two files should end up with
the identifier occurrences denoted above represented by symbols which are eq?
(NB: _not_ eqv?) to each other.  If not, I term this "broken".
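
A rough sketch of what [1] asks for, written in Python rather than Scheme
and not taken from any implementation.  The identifier is made up,
"utf-16-be" stands in for ucs-2 (the two agree for characters in the Basic
Multilingual Plane), sys.intern plays the part of the reader's symbol table,
and "is" plays the part of eq?.

```python
import sys

# Hypothetical identifier containing one non-ASCII character (Greek lambda).
name = "lambda-\u03bb"

# The "same" identifier as stored in two differently encoded source files.
utf8_bytes = name.encode("utf-8")
ucs2_bytes = name.encode("utf-16-be")  # assumption: stands in for ucs-2

# On disk the two spellings are different byte sequences.
bytes_differ = utf8_bytes != ucs2_bytes


def read_symbol(raw, encoding):
    """Decode to abstract characters first, then intern the result."""
    return sys.intern(raw.decode(encoding))


sym_a = read_symbol(utf8_bytes, "utf-8")
sym_b = read_symbol(ucs2_bytes, "utf-16-be")

# Identity comparison, the analogue of eq? on symbols.
same_symbol = sym_a is sym_b
```

The point of the sketch is only that the encoding is discarded at the decode
step, before interning, so the symbol table never sees bytes.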

[2]

[In the absence of reflection] one should be able to consistently replace all
occurrences of an identifier in the same scope without changing the meaning/
behavior of a program.  If not, I term the situation "broken".

[3]

There are many concepts which come in paired/binary parts: on/off, up/down, et
cetera, which have no meaning without both parts.  Up without down does not
make sense.

[Q: If you call a tail a leg, how many legs does a dog have?
 A: Four. Calling a tail a leg does not make it one].

[Q: I have a pencil which is at 60 degrees C.  What is the temperature of an
atom in the pencil?
 A: The question is meaningless.  Temperature is an aggregate property of
molecules; it is not applicable to a single atom].

So if a glyph/character does not have a case variant, considering it to be
lower case makes no logical sense.  I view this as an abuse of terminology.
Being outside of normal logic, I term this "bizarre" and if pressed, probably
"broken" as well.
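
To make the case point concrete, here is a small Python sketch.  The helper
has_case_variant is a made-up name and only a rough heuristic (it compares
the results of the built-in lower/upper mappings); it is not a definition
taken from the Unicode Standard.

```python
import unicodedata


def has_case_variant(ch):
    """Heuristic: a character is cased if lowering and uppering it differ."""
    return ch.lower() != ch.upper()


latin_a = "a"
cjk_middle = "\u4e2d"  # a CJK ideograph, which has no upper/lower pair

latin_cased = has_case_variant(latin_a)
cjk_cased = has_case_variant(cjk_middle)

# General categories: "Ll" is "Letter, lowercase"; "Lo" is "Letter, other".
latin_category = unicodedata.category(latin_a)
cjk_category = unicodedata.category(cjk_middle)

# Python reports the caseless letter as neither lower nor upper case.
cjk_is_lower = cjk_middle.islower()
cjk_is_upper = cjk_middle.isupper()
```

In this sketch the caseless letter is simply left alone by both mappings
and is classified as neither lower nor upper case.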

So in all this discussion of multiple canonical forms (another misuse of
terminology, IMHO), multiple normal forms, et cetera, I am looking for a
description of how to keep [1] and [2] from being broken.

If satisfying the Unicode Standard means breaking [1], then I say "Don't do
that!".

Scheme is a programming language, not a "natural" language.  Define a single
acceptable canonical/normal form in which the "same" identifiers, represented
as symbols, are always eq? to each other.  Or define what an acceptable
encoding is, only accept that form, and let external tool(s) do the
processing.
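
One way the "single canonical form" option might look, again as a Python
sketch under stated assumptions: NFC is picked arbitrarily as the one
accepted normal form, intern_identifier is a made-up helper name, and
sys.intern with "is" stands in for the symbol table and eq?.

```python
import sys
import unicodedata


def intern_identifier(text, form="NFC"):
    """Normalize to one chosen form, then intern (assumption: NFC)."""
    return sys.intern(unicodedata.normalize(form, text))


# The "same" identifier spelled two ways: precomposed e-acute versus
# a plain "e" followed by a combining acute accent.
composed = "caf\u00e9"
decomposed = "cafe\u0301"

# As raw code point sequences the two spellings are not equal.
raw_equal = composed == decomposed

sym_a = intern_identifier(composed)
sym_b = intern_identifier(decomposed)

# After normalization and interning, both denote one object.
same_symbol = sym_a is sym_b
```

The other option described above (accept only one form and reject the rest)
would replace the normalize call with a check that the input is already in
that form.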

I have not yet developed an operational model (a story) of how the above
works.  This is the source of my confusion.

Can you help me out?

Thanks,
-KenD