Re: String comparison under Latin-1 and Unicode
Dave Mason 10 Mar 2000 20:00 UTC
>>>>> On Fri, 10 Mar 2000 14:43:05 -0500, "Sergei Egorov" <xxxxxx@informaxinc.com> said:
> I don't agree with this proposal: it seems to me that STRING<? and
> others are better left for trivial tasks like sorting strings of
> digits; they have simple definition based on CHAR<? that, in its
> turn, is based on internal encoding (ASCII or UNICODE). It is still
> very useful as ordering predicate with no language-dependent
> meaning; for example, if you want to implement string sets as sorted
> lists, it's much better to use fast ordering predicate, even if the
> induced ordering doesn't make any sense. From the other hand, some
A reasonable argument.
> I would suggest using new names for collation predicates, especially
> because collation is actually a complex process involving generation
> of "collation keys" which can be reused:
> (string->collation-key str language-specifier) => c-key
> (collation-key<? c-key1 c-key2) => bool
> (collation-key<=? c-key1 c-key2) => bool
> ... and then you can define your own collation predicates:
I would much prefer:
(collation->predicate language-specifier ordering) -> pred?
(pred? string1 string2) -> bool
where LANGUAGE-SPECIFIER is as Ben Goetter <xxxxxx@angrygraycat.com>
suggested and ORDERING is one of the strings "<", "<=", or "=".
This seems far more useful, and more efficient, than converting every
string you want to compare into a collation-key!
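
For illustration only, here is a minimal sketch of how such a procedure
might look in portable R5RS Scheme. Only the name COLLATION->PREDICATE
and its two arguments come from the proposal above. The table
COLLATION-TABLE, the language names "C" and "en", and the use of the
case-insensitive string predicates as a stand-in for a real collation
are all assumptions made for the example, as is returning #f for an
unrecognized ORDERING.

```scheme
;; Sketch only: a real implementation would consult proper
;; language-specific collation rules instead of this toy table.

;; Hypothetical table: language specifier -> (less-than . equal)
(define collation-table
  (list (cons "C"  (cons string<? string=?))          ; raw character order
        (cons "en" (cons string-ci<? string-ci=?))))  ; crude stand-in

(define (collation->predicate language-specifier ordering)
  (let* ((entry (assoc language-specifier collation-table))
         ;; Fall back to raw character order for unknown languages.
         (less? (if entry (cadr entry) string<?))
         (same? (if entry (cddr entry) string=?)))
    (cond ((string=? ordering "<") less?)
          ((string=? ordering "=") same?)
          ((string=? ordering "<=")
           (lambda (s1 s2) (or (less? s1 s2) (same? s1 s2))))
          (else #f))))  ; unrecognized ORDERING (assumed behaviour)

;; Usage: the predicate is built once and then applied directly
;; to strings, with no intermediate key objects.
(define en<? (collation->predicate "en" "<"))
(en<? "apple" "Banana")      ; => #t under the case-folding stand-in
(string<? "apple" "Banana")  ; => #f in ASCII, where #\a follows #\B
```

The point of the design is that the language lookup happens once, when
the predicate is created, and each comparison afterwards is an ordinary
two-argument call.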
../Dave