Re: the discussion so far Matthew Flatt 16 Jul 2005 13:21 UTC

At Sat, 16 Jul 2005 15:05:06 +0200, Jorgen Schaefer wrote:
> In contrast, case folding is available for Unicode as a simple
> table which maps codepoints to the case-folded variant. There are
> two tables: The simple case folding maps a single codepoint to a
> single codepoint, while the full case folding table maps a single
> codepoint to one or more codepoints.

Thank you for this clarification (for repeating and expanding it,
actually; I had not yet worked through your earlier message).

So, the `char-ci' operations should use the "simple case folding" table
from CaseFolding.txt, and the `string-ci' operations should use the
"full case folding" table from CaseFolding.txt. After folding, the
comparison result is determined character-by-character.
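
To make the distinction concrete, here is a rough Python sketch of the two
comparisons. It is only an illustration under stated assumptions: the
embedded EXCERPT is a small hand-copied sample in the CaseFolding.txt line
format, not the full file, and the names load_tables, char_ci_eq and
string_ci_eq are hypothetical helpers rather than any proposed API. The
Turkic (T) entries are ignored.

```python
# A few entries in the CaseFolding.txt format: <code>; <status>; <mapping>; # <name>
# Status: C = common, S = simple only, F = full only. This is an excerpt,
# so any character not listed here is treated as folding to itself.
EXCERPT = """\
0041; C; 0061; # LATIN CAPITAL LETTER A
00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S
03A3; C; 03C3; # GREEK CAPITAL LETTER SIGMA
03C2; C; 03C3; # GREEK SMALL LETTER FINAL SIGMA
1E9E; F; 0073 0073; # LATIN CAPITAL LETTER SHARP S
1E9E; S; 00DF; # LATIN CAPITAL LETTER SHARP S
FB00; F; 0066 0066; # LATIN SMALL LIGATURE FF
"""

def load_tables(text):
    """Build the simple (C+S) and full (C+F) folding tables."""
    simple, full = {}, {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop the trailing comment
        if not line:
            continue
        code, status, mapping = [f.strip() for f in line.split(";")[:3]]
        src = chr(int(code, 16))
        dst = "".join(chr(int(cp, 16)) for cp in mapping.split())
        if status in ("C", "S"):
            simple[src] = dst                  # always a single codepoint
        if status in ("C", "F"):
            full[src] = dst                    # one or more codepoints
    return simple, full

SIMPLE, FULL = load_tables(EXCERPT)

def char_ci_eq(a, b):
    """Sketch of a `char-ci' equality test: simple case folding."""
    return SIMPLE.get(a, a) == SIMPLE.get(b, b)

def string_ci_eq(a, b):
    """Sketch of a `string-ci' equality test: full case folding,
    then a character-by-character comparison of the folded strings."""
    def fold(s):
        return "".join(FULL.get(c, c) for c in s)
    return fold(a) == fold(b)
```

With this excerpt, string_ci_eq("a\u00df", "Ass") holds because the sharp s
folds to "ss" in the full table, while char_ci_eq("\u00df", "s") does not,
since the simple table maps one codepoint to exactly one codepoint.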

Meanwhile, `string-upcase' and `string-downcase' get the same kind of
improvement at the string level (compared to the character level) by
using SpecialCasing.txt in addition to UnicodeData.txt.
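
As a similarly hedged sketch of the upcasing side, in Python: the two
dictionaries below are tiny hand-picked samples standing in for the simple
uppercase mappings of UnicodeData.txt and the unconditional uppercase
mappings of SpecialCasing.txt, and char_upcase and string_upcase are
hypothetical names used only for illustration. Conditional (context- or
language-sensitive) SpecialCasing.txt entries are not modelled.

```python
# Simple one-to-one uppercase mappings (sample of UnicodeData.txt).
# U+00DF has no simple uppercase mapping there, so it is absent.
UNICODE_DATA_UPPER = {"a": "A", "e": "E", "r": "R", "s": "S", "t": "T"}

# One-to-many uppercase mappings (sample of SpecialCasing.txt).
SPECIAL_CASING_UPPER = {"\u00df": "SS", "\ufb00": "FF"}

def char_upcase(c):
    """Character level: one codepoint in, one codepoint out."""
    return UNICODE_DATA_UPPER.get(c, c)

def string_upcase(s):
    """String level: prefer the special mapping, else the simple one."""
    return "".join(SPECIAL_CASING_UPPER.get(c) or char_upcase(c) for c in s)
```

Under these sample tables, char_upcase("\u00df") leaves the sharp s
unchanged, whereas string_upcase("stra\u00dfe") yields "STRASSE", so the
string operation can change the length of its argument.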

Have I got that right?

> Since Unicode support requires such lookup tables for about
> anything - including downcasing -, using the case folding table is
> not much of an extra burden.

Yes, I agree.

Thanks,
Matthew