Re: the discussion so far Jorgen Schaefer 16 Jul 2005 13:05 UTC

Matthew Flatt <xxxxxx@cs.utah.edu> writes:

> A similar line of reasoning applies to the other operations. In
> contrast, a `string-ci=?' based on the Unicode collation algorithm,
> while certainly a better approximation, seems like too much of an
> implementation burden to be in the SRFI.

Note that collation is for string sorting - i.e. STRING<? and
friends - while STRING-CI=? should use case folding.

String collation is very complex, as the "preferred" order of
characters depends on the locale. But since STRING<? and friends
are often used for things like binary search trees where the exact
order is irrelevant and the only important thing is the existence
of any kind of total order, defining them the way this SRFI does -
by using the codepoint sequence - is good, because it is fast. If
the implementation wants to provide the locale-dependent string
collation, fine, but that's not useful for this SRFI to define.
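
To illustrate the point about ordering, here is a minimal sketch in
Python (not Scheme, and not part of the SRFI). It assumes only that
comparing strings by their codepoint sequence gives a total order; the
helper names `codepoints`, `string_less` and `contains` are made up
for this example.

```python
import bisect

def codepoints(s):
    """Return the string as a list of Unicode codepoint numbers."""
    return [ord(ch) for ch in s]

def string_less(a, b):
    """Locale-independent ordering: lexicographic comparison of codepoints."""
    return codepoints(a) < codepoints(b)

def contains(sorted_words, word):
    """Binary search; it only needs the order to be total and consistent."""
    i = bisect.bisect_left(sorted_words, word)
    return i < len(sorted_words) and sorted_words[i] == word

# Python's built-in string comparison is also codepoint based, so
# sorted() agrees with string_less. "\u00e9" is a precomposed e-acute.
words = sorted(["zebra", "apple", "Apple", "\u00e9clair"])
```

The resulting order puts "zebra" before "\u00e9clair", which no
dictionary would do, but a lookup such as `contains(words, "apple")`
works regardless, and no locale data is consulted.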

In contrast, case folding is available for Unicode as a simple
table which maps codepoints to the case-folded variant. There are
two tables: The simple case folding maps a single codepoint to a
single codepoint, while the full case folding table maps a single
codepoint to one or more codepoints.
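
The difference between the two tables can be sketched as follows, again
in Python. The tables here are tiny hand-written excerpts for
illustration only (ASCII letters plus U+00DF); a real implementation
would load the complete tables from the Unicode data files. The names
`fold`, `string_ci_equal`, `SIMPLE_TABLE` and `FULL_TABLE` are
hypothetical.

```python
# ASCII uppercase letters fold to lowercase in both tables.
ASCII = {chr(c): chr(c + 32) for c in range(ord("A"), ord("Z") + 1)}

# Simple folding: one codepoint maps to exactly one codepoint.
# U+00DF (sharp s) has no single-codepoint folding, so it is left alone.
SIMPLE_TABLE = dict(ASCII)

# Full folding: one codepoint may map to several codepoints.
FULL_TABLE = {**ASCII, "\u00df": "ss"}

def fold(s, table):
    """Case-fold a string by looking up each codepoint in the table."""
    return "".join(table.get(ch, ch) for ch in s)

def string_ci_equal(a, b, table):
    """Case-insensitive equality: compare the folded forms."""
    return fold(a, table) == fold(b, table)

# Full folding equates these two strings; simple folding does not.
full_result = string_ci_equal("Stra\u00dfe", "STRASSE", FULL_TABLE)
simple_result = string_ci_equal("Stra\u00dfe", "STRASSE", SIMPLE_TABLE)
```

Either way the operation is a per-codepoint table lookup. For
comparison, Python's built-in `str.casefold` applies full case
folding, so `"Stra\u00dfe".casefold()` gives `"strasse"`.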

Since Unicode support requires such lookup tables for just about
everything, including downcasing, using the case folding table is
not much of an extra burden.

Greetings,
        -- Jorgen

--
((email . "xxxxxx@forcix.cx") (www . "http://www.forcix.cx/")
 (gpg   . "1024D/028AF63C")   (irc . "nick forcer on IRCnet"))