Re: case mappings Alex Shinn 14 Jul 2005 07:41 UTC

On 7/14/05, Thomas Bushnell BSG <xxxxxx@becket.net> wrote:
>
> So please, just deal with the reality.  There is no such thing as
> character-by-character case mapping.  Please do not say "everyone will
> want one even though it's buggy, so we'll require it."  Everyone will
> not want one.  If it's not standard, then programmers will use the
> string-by-string procedures, and be quite happy.

We're really not arguing here; we want exactly the same thing with
respect to Unicode case mappings.  I don't think character-level case
mappings should be provided at all.

However, if I'm to parse MIME and HTML and perhaps 90% of the
network protocols out there, I do need the simple, consistent case
mapping they use for ASCII characters.  This level of case mapping
is so prevalent in computing that R6RS would be foolish not to
provide it, no matter what we decide regarding Unicode.  I just
want to make this clear to the authors, in case they decide to drop
Unicode-aware case mappings.
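
As a rough illustration of the ASCII-level mapping meant here (a
minimal sketch, not a proposed API; the names ascii-downcase and
ascii-string-ci=? are hypothetical, and the code assumes char->integer
yields ASCII codes for these characters, as virtually every
implementation does):

```scheme
;; ASCII-only case folding, independent of any Unicode support.
;; Hypothetical names; assumes ASCII-compatible character codes.

(define (ascii-downcase c)
  ;; Map #\A..#\Z to #\a..#\z; leave every other character alone.
  (let ((n (char->integer c)))
    (if (and (>= n 65) (<= n 90))
        (integer->char (+ n 32))
        c)))

(define (ascii-string-ci=? s1 s2)
  ;; ASCII folding never changes length, so unequal lengths never match.
  (and (= (string-length s1) (string-length s2))
       (let loop ((i 0))
         (or (= i (string-length s1))
             (and (char=? (ascii-downcase (string-ref s1 i))
                          (ascii-downcase (string-ref s2 i)))
                  (loop (+ i 1)))))))

;; Protocol-style comparisons of header names and tokens:
(ascii-string-ci=? "Content-Type" "content-type")   ; => #t
(ascii-string-ci=? "text/html" "TEXT/HTML")         ; => #t
(ascii-string-ci=? "text/html" "text/plain")        ; => #f
```

Because the mapping is one character to one character and ignores
locale, it gives the same answer everywhere, which is what a protocol
parser needs.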

The difference for Unicode case mapping is that it is used as
a linguistic utility, which is only meaningful at the string level.  Any
algorithm that uses Unicode case mappings at the character level
either really wants ASCII-level case mappings (as in
the above examples) or is fundamentally broken and
can never correctly perform Unicode string-level case mappings.
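
One well-known case shows why: U+00DF (German sharp s) uppercases to
the two-character string "SS", so no char -> char procedure can express
it.  The sketch below is only an illustration with a hand-written table
covering ASCII letters plus that one special case; the names
full-upcase-char and full-string-upcase are hypothetical, and it
assumes integer->char accepts 223.

```scheme
;; Why full case mapping must be string -> string.
;; Hypothetical helpers; only ASCII letters and U+00DF are handled.

(define sharp-s (integer->char 223))   ; U+00DF LATIN SMALL LETTER SHARP S

(define (full-upcase-char c)
  ;; Returns a STRING, because some mappings expand to several characters.
  (cond ((char=? c sharp-s) "SS")
        ((and (char>=? c #\a) (char<=? c #\z))
         (string (integer->char (- (char->integer c) 32))))
        (else (string c))))

(define (full-string-upcase s)
  ;; Map each character to its replacement string, then concatenate.
  (let loop ((i 0) (acc '()))
    (if (= i (string-length s))
        (apply string-append (reverse acc))
        (loop (+ i 1)
              (cons (full-upcase-char (string-ref s i)) acc)))))

(define strasse (string #\s #\t #\r #\a sharp-s #\e))

(string-length strasse)                        ; => 6
(full-string-upcase strasse)                   ; => "STRASSE"
(string-length (full-string-upcase strasse))   ; => 7
```

The result is one character longer than the input, which a
character-by-character mapping cannot produce.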

One option is to provide only the string-level operations, and to
require them to work with ASCII.  These operations could optionally
provide the full Unicode mappings, special cases and all.

It would be nice to provide at least a placeholder for locales, but
this does open another can of worms.  What is a locale?  In the
implementation I provide for Chicken and Gauche it's just a string,
but some Schemes might want locale objects.  Furthermore, there's
probably a (current-locale).  Given that, does

  (string-ci=? s1 s2)

mean the same thing as

  (string-ci=? s1 s2 (current-locale))

or the same as

  (string-ci=? s1 s2 (independent-locale))
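
To make the question concrete, here is a minimal sketch under loud
assumptions: locales are plain strings, (current-locale) is pretended
to return "tr", (independent-locale) returns "", and the procedure is
named locale-string-ci=? to avoid clobbering the standard one.  The
Turkish rule is an ASCII-only approximation: I and i are simply left
unfolded, since in Turkish they are not a case pair.

```scheme
;; Hypothetical sketch of an optional locale argument.
;; All names and the string representation of locales are assumptions.

(define (independent-locale) "")   ; assumed "no locale" value
(define (current-locale) "tr")     ; assumed: pretend the user is Turkish

(define (fold-char c locale)
  ;; ASCII-only approximation.  In Turkish, I pairs with dotless i and
  ;; i pairs with dotted I, so I and i are left unfolded here.
  (cond ((and (string=? locale "tr")
              (or (char=? c #\I) (char=? c #\i)))
         c)
        ((and (char>=? c #\A) (char<=? c #\Z))
         (integer->char (+ (char->integer c) 32)))
        (else c)))

(define (locale-string-ci=? s1 s2 . maybe-locale)
  ;; The open design question is which default to pick on this line.
  (let ((locale (if (null? maybe-locale)
                    (independent-locale)
                    (car maybe-locale))))
    (and (= (string-length s1) (string-length s2))
         (let loop ((i 0))
           (or (= i (string-length s1))
               (and (char=? (fold-char (string-ref s1 i) locale)
                            (fold-char (string-ref s2 i) locale))
                    (loop (+ i 1))))))))

(locale-string-ci=? "TITLE" "title")                    ; => #t
(locale-string-ci=? "TITLE" "title" (current-locale))   ; => #f
```

With the independent default, the two-argument form stays predictable
for protocol work; with the current locale as default, the same call
could change its answer from one user's machine to another.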

--
Alex