Re: String comparison under Latin-1 and Unicode

Re: String comparison under Latin-1 and Unicode Dave Mason 10 Mar 2000 20:00 UTC

>>>>> On Fri, 10 Mar 2000 14:43:05 -0500, "Sergei Egorov" <xxxxxx@informaxinc.com> said:

> I don't agree with this proposal: it seems to me that STRING<? and
> others are better left for trivial tasks like sorting strings of
> digits; they have simple definition based on CHAR<? that, in its
> turn, is based on internal encoding (ASCII or UNICODE). It is still
> very useful as ordering predicate with no language-dependent
> meaning; for example, if you want to implement string sets as sorted
> lists, it's much better to use fast ordering predicate, even if the
> induced ordering doesn't make any sense. From the other hand, some

A reasonable argument.

> I would suggest using new names for collation predicates, especially
> because collation is actually a complex process involving generation
> of "collation keys" which can be reused:

> (string->collation-key str language-specifier) => c-key
> (collation-key<? c-key1 c-key2) => bool
> (collation-key<=? c-key1 c-key2) => bool
> ...  and then you can define your own collation predicates:

I would much prefer:
	(collation->predicate language-specifier ordering) -> pred?
	(pred? string1 string2) -> bool

where LANGUAGE-SPECIFIER is as Ben Goetter <xxxxxx@angrygraycat.com>
suggested and ORDERING is one of the strings "<", "<=", or "=".

This seems far more useful, and more efficient, than converting any
string you want to compare to a collation-key!
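
A minimal sketch of what COLLATION->PREDICATE might look like, for
illustration only. To stay self-contained it is layered on toy stand-ins
for the quoted collation-key procedures: the stand-in key function is an
assumption that ignores LANGUAGE-SPECIFIER and merely folds case, and
the "en" specifier is a made-up value. An implementation could just as
well compare the two strings directly without ever building keys, which
is the efficiency point above.

```scheme
;; Toy stand-ins for the quoted collation-key interface (assumptions).
;; A real version would consult LANGUAGE-SPECIFIER; this one ignores it
;; and uses the case-folded string as the key.
(define (string->collation-key str language-specifier)
  (list->string (map char-downcase (string->list str))))

(define collation-key<?  string<?)
(define collation-key<=? string<=?)
(define collation-key=?  string=?)

;; Hypothetical COLLATION->PREDICATE: select the comparison once, then
;; return a closure that takes two strings. ORDERING values other than
;; "<", "<=" and "=" are not handled in this sketch.
(define (collation->predicate language-specifier ordering)
  (let ((key-compare
         (cond ((string=? ordering "<")  collation-key<?)
               ((string=? ordering "<=") collation-key<=?)
               ((string=? ordering "=")  collation-key=?))))
    (lambda (string1 string2)
      (key-compare (string->collation-key string1 language-specifier)
                   (string->collation-key string2 language-specifier)))))

;; Usage: build the predicate once, then call it like STRING<?.
(define en<? (collation->predicate "en" "<"))
(en<? "apple" "Banana")   ; => #t with the toy case-folding key
```

With the closure in hand, the caller never sees a collation key, and the
predicate can be passed wherever STRING<? would be.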

../Dave