Re: Review of draft #2

Re: Review of draft #2 Lassi Kortela 20 Sep 2019 09:41 UTC

Thanks for a thoughtful review.

> 1) Because the term "fixnum" means something different in R6RS and
> R7RS-large, namely an exact integer whose smallest range is 24 bits and
> whose largest practical range is 63 bits at present, I think it should be
> avoided here.  Rather, the term "ASCII codepoint", analogous to "Unicode
> codepoint" should be employed.  This would be defined as an exact integer
> in the range 0 to 127 inclusive.

Not guaranteeing that they are fixnums is a bit problematic because the
library mixes integers and characters freely: the transformation
procedures take integer offset arguments, and integers are accepted in
place of characters. The offsets can be arbitrary fixnums; for example,
the following maps lower-case ASCII letters to Unicode Mathematical
Alphanumeric Symbols:

(string-for-each
 (lambda (char)
   (display (integer->char (ascii-lower-case-value char #x1D400 26))))
 "helloworld")

In the R6RS code of the sample implementation, I use fx+ and the other
fx procedures for arithmetic and comparison.
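
To make the fixnum point concrete, here is a minimal sketch of how a
transformation procedure could be written with the fx procedures. It is
not the code from the sample implementation: the name
sketch-lower-case-value is hypothetical, and the semantics (return
offset plus the letter's position when that position is below limit,
else #f) are my assumption about what the draft intends.

```scheme
(import (rnrs))

;; Hypothetical sketch, not the sample implementation.
;; CHAR may be a character or a fixnum. Returns OFFSET plus the letter's
;; position (0..25) when CHAR is an ASCII lower-case letter whose
;; position is below LIMIT; otherwise returns #f.
(define (sketch-lower-case-value char offset limit)
  (let* ((cc (if (char? char) (char->integer char) char))
         (pos (fx- cc (char->integer #\a))))
    (and (fx<=? 0 pos 25)
         (fx<? pos limit)
         (fx+ offset pos))))
```

Because every operation here is an fx procedure, the sketch only works
if char, offset and limit are all fixnums, which is why I would like the
text to guarantee that.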

> 2) Remove all the repetitions of "char can be a fixnum or character object"
> and put it at the top:  "Unless otherwise specified, char can be a fixnum
> or character object" once and for all.

OK.

> 3) The remaining uses of "fixnum" in the section on transformation
> procedures should be changed to "exact integer" or "exact non-negative
> integer" as appropriate.  All other references should be changed to "ASCII
> codepoint".

Since R6RS has fixnum-specialized arithmetic procedures, I'd like to
make sure people can rely on using them. They raise an exception when
given bignums.

> 4) The sentence "The first ASCII standard was published in 1963" refers to
> a character encoding incompatible with today's ASCII.  Leave it out or
> change it to "The present ASCII standard was first published in 1967."

Good catch. Will change.

> 5) Add predicates ascii-codepoint? and ascii-string?.  The latter will be
> very valuable.  ASCII strings should not be a disjoint datatype, but simply
> strings containing ASCII characters only.

How would `ascii-codepoint?` differ from the current `ascii-char?`?

I think it's a bit dubious that the current `ascii-char?` doesn't
recognize an integer argument as an ASCII character. I was never
entirely happy with it.
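
For discussion, here is one possible reading of the proposal. This is
only a sketch under assumptions: it takes "ASCII codepoint" to mean an
exact integer from 0 to 127 as defined in point 1, keeps ascii-char?
restricted to character objects as it is now, and treats an ASCII string
as an ordinary string whose characters all pass ascii-char?. None of
these definitions are settled.

```scheme
(import (rnrs))

;; Assumed definition: an exact integer in the range 0 to 127 inclusive.
(define (ascii-codepoint? obj)
  (and (integer? obj) (exact? obj) (<= 0 obj 127)))

;; Assumed to accept character objects only, never integers.
(define (ascii-char? obj)
  (and (char? obj) (fx<? (char->integer obj) 128)))

;; An ordinary string that contains ASCII characters only.
(define (ascii-string? obj)
  (and (string? obj)
       (let loop ((i 0))
         (or (fx=? i (string-length obj))
             (and (ascii-char? (string-ref obj i))
                  (loop (fx+ i 1)))))))
```

Under this reading the two single-value predicates differ only in the
type of argument they accept.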

> 6) Change the name of ascii-space-or-tab? to ascii-horizontal-whitespace?.
> This term is heavily used in RFCs.

I like the term "horizontal whitespace" and have used it in many of my
own programs (as the abbreviation "horz-white"). But I think the name
"ascii-horizontal-whitespace?" is far too long; hence it should remain
"space-or-tab?" or be abbreviated somehow.