Re: predicate->char-set considered harmful
shivers@xxxxxx 18 Dec 2000 15:50 UTC
> From: xxxxxx@math.purdue.edu
> Date: Sun, 17 Dec 2000 22:44:02 -0500 (EST)
>
> What about predicate->char-set on large (Unicode or larger) character sets?
> I'd certainly not want to call a function 65536 times (or 2^32 times) just
> to construct a char-set. And a user may not know that a Scheme
> implementation has two-byte or four-byte characters. (How many people know
> that Gambit has 2-byte chars by default?) I just don't see how it's really
> helpful to have this function, and I think it should be eliminated.
>
> I have similar, but less strongly pronounced, difficulties with
> char-set-invert.
Valid concerns, but it can't be helped. People frequently describe sets of
things by predicate; you need a way to convert that into a true set.
You absolutely need a complete set of set operators, and set complement
is no more inefficient than set intersection or union or any of the other
ops.
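
To make the cost comparison concrete, here is a minimal sketch in Python
(not Scheme, so it does not depend on any draft's procedure names). It
assumes a bit-vector representation over a 256-character universe; the
names make_set, union, intersection, complement and contains are invented
for illustration and are not SRFI procedures. Under that assumption,
complement is one pass over the bits, exactly like union or intersection.

```python
# Assumption: an 8-bit character universe, with each set stored as one
# integer used as a bit vector (bit i is set when code point i is a member).
UNIVERSE_BITS = 256
FULL = (1 << UNIVERSE_BITS) - 1  # every character in the universe


def make_set(chars):
    """Build a bit-vector set from an iterable of characters."""
    mask = 0
    for ch in chars:
        mask |= 1 << ord(ch)
    return mask


def union(a, b):
    return a | b      # one pass over the bits


def intersection(a, b):
    return a & b      # one pass over the bits


def complement(a):
    return FULL ^ a   # also one pass: flip every bit in the universe


def contains(s, ch):
    return bool((s >> ord(ch)) & 1)
```

Other representations (sorted ranges, hash tables) would shift the costs,
so this only illustrates the bit-vector case.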
There is a role for these ops, even in implementations where they are
expensive. For example, they can define sets that are essentially
read-only once built. It's acceptable to crunch a little bit defining the
char set at program initialisation time.
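
A rough sketch of that usage pattern, again in Python rather than Scheme:
the predicate is called once per code point at start-up, and every later
membership test is a cheap lookup. The names predicate_to_set, DIGITS and
is_digit are hypothetical, and the 0x10000 limit is an assumption chosen
to match the 65536-call case raised above.

```python
import sys


def predicate_to_set(pred, limit=sys.maxunicode + 1):
    """Call pred once per code point below limit; collect the matches."""
    return frozenset(cp for cp in range(limit) if pred(chr(cp)))


# Built once at program initialisation: 65536 predicate calls, paid up front.
DIGITS = predicate_to_set(str.isdigit, limit=0x10000)


def is_digit(ch):
    # Afterwards the set is read-only and each test is a hash lookup,
    # not a call to the original predicate.
    return ord(ch) in DIGITS
```

The one-time scan is the expensive part; whether that is acceptable
depends on the size of the character universe and how often the set is
consulted afterwards.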
This is simply more evidence that textual computing in a rich-text world
is not an easy thing.
-Olin