| Date: Sun, 26 Dec 2004 23:14:00 -0800 (PST)
| From: xxxxxx@autodrip.bloodandcoffee.net
|
| On Sun, 26 Dec 2004, Aubrey Jaffer wrote:
|
| > Arrays are a fundamental data organizing paradigm from the origins of
| > computing; FORTRAN has arrays; APL has arrays.  I hope arrays will
| > become part of Scheme in R6RS.  For a construct which generalizes two
| > of Scheme's three aggregate data types, a succinct read-syntax does
| > not seem overly burdensome.
|
| Need it be so succinct as to add eleven new octothorpe reader macros,
| each dispatching further for the large number of different types of
| arrays?  It would be much simpler, I think, and it would not lose much
| brevity, to use SRFI 10; indeed, SRFI 10 was designed in response to
| this issue as it arose in SRFI 4.
|
| > | In particular, I suggest that it be:
| > |
| > |   #,(ARRAY [<rank>] <type> <elements> ...)
| >
| > Rank cannot be deduced from <element> nesting for heterogeneous
| > arrays.  I suggest that <rank> be required.
|
| Sorry, I was not sufficiently clear there.  I meant to specify that the
| rank defaults to 1, like #Axxx(...) in the current proposal.

In the updated srfi-58.html I sent to the editor I have eliminated the
#Axxx syntax.  The rank digit(s) will be required.

| > | So, for example, the two-by-two array of unsigned 16-bit integers from
| > | the document might be written as #,(ARRAY 2 u16 (0 1) (2 3)).
| > | General object arrays' types would be OBJECT (so #(FOO 1 #T ())
| > | could also be written #,(ARRAY OBJECT FOO 1 #T ())) and character
| > | arrays' types would be CHAR (so "foo" could alternatively be
| > | written #,(ARRAY CHAR #\f #\o #\o)).
| >
| > This appears to introduce type symbols like U16 and CHAR which are not
| > part of srfi-47.  The prototype functions in srfi-47 return arrays.
| >
| > | [...]
| >
| > I am not opposed to also having SRFI-10 syntax for arrays.
| > This would seem to require reserving a set of symbols for type
| > specification, which is an unschemely way of doing things.  Scheme
| > goes to some lengths to avoid using symbols as cookies; witness
| > NULL? and EOF-OBJECT?
|
| Perhaps I'm confused, but I don't see much difference between my usage
| of symbols -- which exist only at read-time, never at run-time, unlike
| nil and the EOF object -- and your usage of the suffixes of the new #A
| syntax.  Could you elaborate on how my proposal is any worse in that
| respect than yours?

That symbols-as-cookies have been kept out of Scheme to this point
probably means that at least one RRRS author is severely allergic to
them.  I want arrays in R6RS.  I don't want to jeopardize arrays'
chances by making a proposal which looks like symbols-as-cookies, even
if it is not exactly that in a technical sense.

SRFI-10 mandates parentheses (e.g. #,(infinity) instead of
#,infinity).  This makes SRFI-10 objects look like expressions to be
evaluated.  SRFI-58 objects will be used as prototype array objects in
calls to MAKE-ARRAY:

  (make-array '#1Ar64(1.0) 2 3)             ; Current SRFI-58 syntax
  (make-array '#,(Array 1 ar64 [1.0]) 2 3)  ; SRFI-10 style
  (make-array '#,(Ar64 [1.0]) 2 3)          ; compact-SRFI-10 style.
                                            ; [] nesting gives rank.
  (make-array (Ar64 1.0) 2 3)               ; Current SRFI-47 functions
  ==> #2Ar64((1.0 1.0 1.0) (1.0 1.0 1.0))

The SRFI-10 style above looks like symbols-as-cookies.  The
compact-SRFI-10 style does not.  Do you like the compact-SRFI-10
style, or would it take too much of SRFI-10's namespace?

Having the read prefix use the same coding as the prototype functions
halves the (human) memory load.
If we move to nomenclature like REAL-64, then I want prototype
functions to be available with those names:

  (make-array '#,(Array 1 real-64 [0.0]) 2 3)  ; longer SRFI-10 style
  (make-array '#,(real-64 [0.0]) 2 3)          ; longer compact-SRFI-10
  (make-array (real-64 0.0) 2 3)               ; analogous SRFI-47 function

| > | (I'd also prefer that the names be longer & much more descriptive, like
| > | UNSIGNED16 or BOOLEAN, but I suppose that's a little too late, now that
| > | SRFI 47 has already been finalized & the incomprehensible abbreviations
| > | of array types have been set into stone...)
| >
| > SRFI-47 defines procedures to return prototype arrays.  Additional
| > procedures can be added to alias the abbreviated ones.
|
| This works for SRFI 47, but not necessarily this SRFI: one cannot
| define one's own aliases for existing array types in the reader
| syntax.

Yes.  That is why we are discussing this now, before SRFI-58 is
finalized.

| > But explicitly complete descriptions for numeric types are rather
| > long:
| >
| >   [...long list...]
| >
| > These long names present more of a burden for the memories of
| > non-English-speakers than the short names, which are the same for
| > everyone.
|
| I'm not suggesting names so long that they induce tedium in typists,
| but rather names somewhat longer than are excessively obscure, such as
| INTEGER-U16, COMPLEX-64, BIT, et cetera.

These names require users to internalize the assumptions that integers
are exact and that reals and complexes are not.  Scheme has a strong
propensity for calling things exactly what they are; witness
CALL-WITH-CURRENT-CONTINUATION, EOF-OBJECT?, LIST?, and PAIR?.

| Furthermore, the single-character mnemonics are derived from
| English, and there is certainly the possibility that their names
| would begin with different initial letters in other languages;
| however, everything in Scheme is from English anyway, so I see
| nothing wrong with using English words for array element type
| names.
English doesn't much help in remembering Scheme's exponent markers:

  The letters `s', `f', `d', and `l' specify the use of SHORT, SINGLE,
  DOUBLE, and LONG precision, respectively.

I don't usually think of a DOUBLE as shorter than a LONG.  And where
did `f' for SINGLE come from?  Maybe it is a C-ism.  In any case, it
is one of five characters (with `e') rather than one of five longer
sequences to remember.

| > There is Scheme precedent for abbreviated names in identifiers
| > like CADR and CDADAR and in the radix and exactness prefixes #B,
| > #O, #D, #X, #E, #I.
|
| ... A better analogue would be ARRAY-REF, but I haven't seen any
| objections to that as opposed to AREF, and I much prefer ARRAY-REF
| rather than AREF.

I am not opposed to longer names, but they must work together and they
must integrate well with Scheme.

| Let me also point out here that much of Scheme's naming conventions
| and lexemes originated from T.  In T, there was no built-in
| facility for multi-dimensional arrays, but there were still object
| representation names used by Orbit's representation analyzer and
| for the C & Pascal FFIs.  These were named semi-verbosely, as I
| suggest above; e.g., the representation descriptor of unsigned,
| sixteen-bit integers was named REP/INTEGER-16-U.  Many of the names
| in T were intended to be long enough to be understandable and not
| obscure, but not so long as to be excessive; this has tended to
| hold in Scheme as well.  I think it would be good to preserve that
| in the array element type names as well.

I found my T2.7 manual, but it doesn't have FFIs in it.

If I come up with longer names and they aren't better than the current
system (used by SCM for many years), then I would be making a
straw-man.  Please replace the first column of this table with a set
of better names, so we can discuss this change in more concrete terms.

  prototype
  procedure  exactness  element-type
  =========  =========  ============
  vector                any (conventional vector)
  ac64       inexact    64-bit+64-bit complex
  ac32       inexact    32-bit+32-bit complex
  ar64       inexact    64-bit real
  ar32       inexact    32-bit real
  as64       exact      64-bit signed integer
  as32       exact      32-bit signed integer
  as16       exact      16-bit signed integer
  as8        exact      8-bit signed integer
  au64       exact      64-bit unsigned integer
  au32       exact      32-bit unsigned integer
  au16       exact      16-bit unsigned integer
  au8        exact      8-bit unsigned integer
  string                char (string)
  at1                   boolean (bit-vector)
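
For reference while comparing candidate names, the pattern behind the
abbreviated column can be sketched as a small decoder.  This is only
an illustration: DESCRIBE-PROTOTYPE is a hypothetical helper, not part
of SRFI-47 or SRFI-58, and it assumes nothing beyond R5RS string and
number procedures.

```scheme
;; Hypothetical helper (not defined by SRFI-47 or SRFI-58): decode an
;; abbreviated prototype name into a list (exactness bits element-kind).
;; For complex types, BITS is the width of each of the two components.
;; Returns #f for names that do not follow the a<letter><bits> pattern,
;; such as "vector" and "string".
(define (describe-prototype name)
  (let* ((len (string-length name))
         ;; KIND is the second character, but only if the name starts with `a'.
         (kind (and (> len 2)
                    (char=? (string-ref name 0) #\a)
                    (string-ref name 1)))
         ;; BITS is the numeric suffix, or #f if the suffix is not a number.
         (bits (and kind (string->number (substring name 2 len)))))
    (if (and bits (integer? bits) (exact? bits) (positive? bits))
        (case kind
          ((#\c) (list 'inexact bits 'complex))
          ((#\r) (list 'inexact bits 'real))
          ((#\s) (list 'exact bits 'signed-integer))
          ((#\u) (list 'exact bits 'unsigned-integer))
          ((#\t) (list #f bits 'boolean))   ; exactness does not apply
          (else #f))
        #f)))

(describe-prototype "au16")    ; => (exact 16 unsigned-integer)
(describe-prototype "ar64")    ; => (inexact 64 real)
(describe-prototype "vector")  ; => #f
```

Any replacement naming scheme would need a rule at least this regular
if the read syntax and the prototype functions are to share one
coding.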