case mappings Alex Shinn 13 Jul 2005 03:57 UTC

I agree with Bear that case-mappings are poorly defined on single
codepoints.

Michael Sperber wrote:
> I don't quite understand what you're saying: the locale-independent
> case mappings in UnicodeData.txt always map a single scalar value to a
> single scalar value.  Sure it doesn't always do what your locale
> thinks (as you point out), but this case mapping doesn't require
> "multi-codepoint characters."

This isn't just a "locale-awareness" problem.  True, UnicodeData.txt
lists only the 1-1 mappings, for simplicity, but SpecialCasing.txt
includes a large number of mappings that aren't 1-1 regardless of
locale.  The Unicode concept of locale-independent case-mapping
includes these special cases.  Without handling them, R6RS would be
using an incomplete case-mapping rule, which is therefore not usable
in the general sense.  I don't think anyone wants 90% compatibility
thrown into the core language.
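
As a rough illustration of the non-1-1 cases, here is a small sketch in
Python, chosen only because its str.upper() applies the unconditional
full case mappings.  The helper name expanding_uppercase and the sample
list are made up for this example and are not part of any proposal.

```python
def expanding_uppercase(chars):
    """Return {char: uppercased} for every character whose uppercase
    form is longer than one code point (hypothetical helper)."""
    return {c: c.upper() for c in chars if len(c.upper()) > 1}


# a, sharp s, the fi ligature, and n preceded by apostrophe
SAMPLES = ["a", "\u00df", "\ufb01", "\u0149"]

if __name__ == "__main__":
    for ch, up in expanding_uppercase(SAMPLES).items():
        # Show the source code point and the code points it expands to.
        targets = " ".join(f"U+{ord(c):04X}" for c in up)
        print(f"U+{ord(ch):04X} -> {targets}")
```

None of these three expansions depends on locale, and a procedure that
maps one char to one char cannot return any of them.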

The proper definition is complicated and slow, yet many computer
languages and protocols need only strict ASCII case mapping, so I
think it makes sense to define the core case-mapping procedures as
ASCII-specific.  Full linguistic case-handling should be provided by
specialized library procedures that optionally accept a locale and
work only at the string level, since single-char case-mappings are
ill-defined.
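
A minimal sketch of what "ASCII-specific" could mean in practice, again
in Python purely for illustration.  The names ascii_upcase,
ascii_downcase and ascii_ci_equal are hypothetical and are not proposed
procedure names; the assumption is that everything outside a-z and A-Z
passes through unchanged.

```python
def ascii_upcase(s):
    """Uppercase only a-z; every other character is left untouched."""
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in s)


def ascii_downcase(s):
    """Lowercase only A-Z; every other character is left untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def ascii_ci_equal(a, b):
    """Case-insensitive comparison of the kind protocol keywords need."""
    return ascii_downcase(a) == ascii_downcase(b)


if __name__ == "__main__":
    print(ascii_upcase("content-type"))
    print(ascii_ci_equal("Content-Type", "CONTENT-TYPE"))
```

Because each character maps to exactly one character, the mapping is
well-defined on single chars and never changes the length of a string,
which is what header names, keywords and similar protocol tokens rely
on.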

char-title-case? would then no longer be needed.

--
Alex