On Sun, 26 Dec 2004, Aubrey Jaffer wrote:

> Arrays are a fundamental data organizing paradigm from the origins of
> computing; FORTRAN has arrays; APL has arrays.  I hope arrays will
> become part of Scheme in R6RS.  For a construct which generalizes two
> of Scheme's three aggregate data types, a succinct read-syntax does
> not seem overly burdensome.

Need it be so succinct as to add eleven new octothorpe reader macros,
each dispatching further for the large number of different types of
arrays?  It would be much simpler, I think, and it would not lose much
brevity, to use SRFI 10; indeed, SRFI 10 was designed in response to
this issue as it arose in SRFI 4.

> | In particular, I suggest that it be:
> |
> |   #,(ARRAY [<rank>] <type> <elements> ...)
>
> Rank cannot be deduced from <element> nesting for heterogeneous
> arrays.  I suggest that <rank> be required.

Sorry, I was not sufficiently clear there.  I meant to specify that the
rank defaults to 1, like #Axxx(...) in the current proposal.

> | So, for example, the two-by-two array of unsigned 16-bit integers from
> | the document might be written as #,(ARRAY 2 u16 (0 1) (2 3)).
> | General object arrays' types would be OBJECT (so #(FOO 1 #T ())
> | could also be written #,(ARRAY OBJECT FOO 1 #T ())) and character
> | arrays' types would be CHAR (so "foo" could alternatively be
> | written #,(ARRAY CHAR #\f #\o #\o)).
>
> This appears to introduce type symbols like U16 and CHAR which are not
> part of srfi-47.  The prototype functions in srfi-47 return arrays.
>
> | [...]
>
> I am not opposed to also having SRFI-10 syntax for arrays.  This would
> seem to require reserving a set of symbols for type specification,
> which is an unschemely way of doing things.  Scheme goes to some
> lengths to avoid using symbols as cookies; witness NULL? and
> EOF-OBJECT?

Perhaps I'm confused, but I don't see much difference between my usage
of symbols -- which exist only at read-time, never at run-time, unlike
nil and the EOF object -- and your usage of the suffixes of the new #A
syntax.  Could you elaborate on how my proposal is any worse in that
respect than yours?

> | (I'd also prefer that the names be longer & much more descriptive, like
> | UNSIGNED16 or BOOLEAN, but I suppose that's a little too late, now that
> | SRFI 47 has already been finalized & the incomprehensible abbreviations
> | of array types have been set into stone...)
>
> SRFI-47 defines procedures to return prototype arrays.  Additional
> procedures can be added to alias the abbreviated ones.

This works for SRFI 47, but not necessarily for this SRFI: one cannot
define one's own aliases for existing array types in the reader syntax.

> But explicitly
> complete descriptions for numeric types are rather long:
>
>   [...long list...]
>
> These long names present more of a burden for the memories of
> non-English-speakers than the short names, which are the same for
> everyone.

I'm not suggesting names so long that they induce tedium in typists,
but rather names just long enough that they are not excessively
obscure, such as INTEGER-U16, COMPLEX-64, BIT, et cetera.  Furthermore,
the single-character mnemonics are derived from English, and there is
certainly the possibility that their names would begin with different
initial letters in other languages; however, everything in Scheme is
from English anyway, so I see nothing wrong with using English words
for array element type names.

> There is Scheme precedent for abbreviated names in
> identifiers like CADR and CDADAR and in the radix and exactness
> prefixes #B, #O, #D, #X, #E, #I.

For very fundamental primitives such as CAR & CDR that are frequently
used, and where the ability to stack them is convenient (in the case of
CAR & CDR, not, for example, HEAD & TAIL or FIRST & REST), this is
quite reasonable; however, arrays are much less fundamental to Scheme,
and, even if one wishes to debate that, literal arrays are much less
frequently written than CAR & CDR.  A better analogue would be
ARRAY-REF, but I haven't seen any objections to that as opposed to
AREF, and I much prefer ARRAY-REF to AREF.

Regarding prefixes for radices & exactness: I still dislike them, but
numbers are so concisely expressed anyway that expanding the prefixes
for radices & exactness would bloat their significance in a literal
number.  On the other hand, literal arrays' contents will usually be
much larger than just the initial characters denoting the element type,
so the prefix would not become disproportionately significant if it
were lengthened slightly.

Let me also point out here that many of Scheme's naming conventions and
lexemes originated in T.  In T, there was no built-in facility for
multi-dimensional arrays, but there were still object representation
names used by Orbit's representation analyzer and for the C & Pascal
FFIs.  These were named semi-verbosely, as I suggest above; e.g., the
representation descriptor of unsigned, sixteen-bit integers was named
REP/INTEGER-16-U.  Many of the names in T were intended to be long
enough to be understandable and not obscure, but not so long as to be
excessive; this has tended to hold in Scheme as well.  I think it would
be good to preserve that in the array element type names as well.

> | Also, one more comment on the draft: it doesn't actually say, as far
> | as I can tell, anything about the actual syntax of arrays.  It just
> | gives an example & a reader.  This is a rather glaring omission.
>
> Thanks for pointing this out.  I have replaced the example with:

Thanks.  That is much better.

> [...]
>
> A two-by-three array of unsigned 16-bit integers is written:
>
> #2au16((0 1 2) (3 5 4))
>
> This array could have been created by (make-array (Au16) 2 3).

Insignificant point: I think it would probably be a bit better to
follow that call to MAKE-ARRAY with code to initialize the new array.
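
For instance, the initialization might look something like the sketch
below.  It is untested and rests on a few assumptions: that a SRFI 47
implementation is loaded, that ARRAY-SET! takes the new value before
the indices (as I read SRFI 47), and that the prototype procedure is
spelled (Au16) as in the draft.  The variable name A is just mine, for
illustration.

```scheme
;; Assumes SRFI 47 is available, providing MAKE-ARRAY, ARRAY-SET!,
;; and the Au16 prototype procedure (spelling taken from the draft).
(define a (make-array (Au16) 2 3))

;; ARRAY-SET! is assumed to take the value first, then the indices:
;; (array-set! array value row column)
(array-set! a 0 0 0)   ; row 0: 0 1 2
(array-set! a 1 0 1)
(array-set! a 2 0 2)
(array-set! a 3 1 0)   ; row 1: 3 5 4
(array-set! a 5 1 1)
(array-set! a 4 1 2)
```

With the elements stored this way, A should hold the same contents as
the literal #2au16((0 1 2) (3 5 4)) in the example.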