words, punctuation, and whitespace Aubrey Jaffer (20 Jul 2005 02:50 UTC)
Re: words, punctuation, and whitespace Thomas Bushnell BSG (20 Jul 2005 03:09 UTC)
Re: words, punctuation, and whitespace Alex Shinn (20 Jul 2005 03:32 UTC)
Re: words, punctuation, and whitespace John.Cowan (20 Jul 2005 04:02 UTC)

Re: words, punctuation, and whitespace Thomas Bushnell BSG 20 Jul 2005 03:09 UTC

Aubrey Jaffer <xxxxxx@alum.mit.edu> writes:

> The first task in writing text-processing programs is to separate the
> input text into words, punctuation, and whitespace.  Could R6RS deal
> with Unicode text as words, punctuation, and whitespace?
>   Unicode-read port
> would return a word, punctuation, or whitespace object; or an
> eof-object.

An interesting idea.  But I hope you aren't assuming that all text
consists of words separated by whitespace and/or punctuation.  Some
languages are written with essentially no whitespace between words.
(Japanese books, for example, are traditionally printed this way.)
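
To make the concern concrete, here is a minimal sketch in Python of the
naive approach: classify each character by its Unicode general category
and group runs of the same class.  The names classify and tokenize are
hypothetical, chosen only for illustration; this is not the proposed
Unicode-read interface, and lumping everything that is neither
whitespace nor punctuation into "word" is a simplifying assumption.

```python
import unicodedata
from itertools import groupby


def classify(ch):
    """Return 'whitespace', 'punctuation', or 'word' for one character."""
    if ch.isspace():
        return "whitespace"
    # General categories beginning with 'P' (Po, Ps, Pe, ...) are punctuation.
    if unicodedata.category(ch).startswith("P"):
        return "punctuation"
    # Simplification: letters, digits, symbols, and marks all count as 'word'.
    return "word"


def tokenize(text):
    """Group consecutive characters of the same class into (class, run) pairs."""
    return [(kind, "".join(run)) for kind, run in groupby(text, key=classify)]


print(tokenize("Hello, world!"))
# [('word', 'Hello'), ('punctuation', ','), ('whitespace', ' '),
#  ('word', 'world'), ('punctuation', '!')]

print(tokenize("吾輩は猫である。"))
# [('word', '吾輩は猫である'), ('punctuation', '。')]
```

The English input splits as one would expect, but the Japanese sentence,
which contains several words, comes back as a single "word" followed by
the full stop.  Finding the word boundaries there takes a dictionary or
some other language-specific knowledge, not just character classes.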