Re: Procedures (interfaces)

Re: Procedures (interfaces) Bradd W. Szonye 22 Oct 2003 22:15 UTC
> Bradd W. Szonye wrote:
>> Set! compatible with R5RS - The original specification for *-set! was
>> incompatible with R5RS vector-set!, which modifies the vector in-place
>> and returns an unspecified value. I changed *-set! to match that, and I
>> changed *-set (the functional version) to return the modified
>> collection
>> and the previous value.

Taylor Campbell wrote:
> This doesn't work at all, because *-SET! is _linear_update_, not
> purely mutable.  I am _adamantly_ against removing linear update
> procedures and replacing them with purely mutable ones ....

OK, that's fine, but then the *-set! procedure needs a different name:

1. In R5RS, *-set! is always purely mutable.
2. Vector-set! already exists with pure-mutable semantics.

>> *=? and <=?: I added question marks for consistency with functions
>> like string=?. I also removed the elt= argument and changed the
>> definition so that it was no longer necessary. I was reluctant to
>> make this change, but I felt that the original version just wasn't
>> viable.

> I don't like this at all.  Removing ELT= is removing a _lot_ of
> functionality.  The reason the ? wasn't there was _because_ the
> procedures were not comparable to STRING=?, CHAR=?, et cetera.

OK, I agree. How about this instead:

1. Rename the procedure to *-equal? so that you don't end up with both
   string= (SRFI-44) and string=? (R5RS). In general, it's best to avoid
   identifiers that differ by only one character.

2. For dictionaries, if elt=? is a procedure, use it to compare both
   keys and values, and if elt=? is a pair of procedures, use (car
   elt=?) to compare keys and (cdr elt=?) to compare values. For all
   other collections, use elt=? or (cdr elt=?).

(Using the dictionary's key-equivalence function wouldn't work --
consider what happens when you try to compare two dictionaries with
different KEFs.)

By the way, one of the reasons I wanted to get rid of elt=? was to make
this more convenient for higher-order functions. If somebody can come up
with a reasonable default for elt=? and a syntax for omitting it, that'd
be great. If not, we can always use (lambda a (apply *-equal? eqv? a)).

> I don't understand the rationale for having ordering functions on
> _collections_.  Can you explain this in more detail?  (or in any
> detail at all: I can't find any)

Mainly because sets, sequences, and ordered collections all have
well-defined total ordering functions (proper subsets and
lexicographical comparisons). Also because it gives users a built-in
ordering function for creating ordered collections of collections (which
is fairly common in some programming styles).

>> procedure: % {key<? | key=?} {val<? | val=?} value ... => %
>> defined for: all collections

> The ordering functions here sort of make sense, I guess...but I don't
> like how they work with these two constructors.  It's ambiguous what
> argument corresponds to what; for example, is (list = 1 2 3) a list
> with = for a comparison function and (1 2 3) for contents, or a list
> (#{Procedure} 1 2 3)?  (where '#{Procedure}' indicates an abstract
> procedure value)

Sorry, I didn't explain the syntax. The {opt} indicates an optional part
of the interface, not an optional argument. For example:

    procedure: bag val=? value ... => bag
    procedure: dictionary key=? val=? pair ... => dictionary
    procedure: ordered-dict key<? val=? pair ... => ordered-dict

The arguments aren't optional, but not all constructors use them.

>> procedure: *-copy * => *
>> defined for: all collections
>>
>> Returns a new collection of the same type as the argument with the same
>> contents but distinct storage. The copy is shallow; that is, it copies
>> values in a way that preserves object identity.

> I should like a *-COPY for sequences that also accepts optional start
> and end arguments.

Yes, that would be good. If provided, I'd prefer "start to end-1"
semantics, like C arrays and C++ iterators.

>> TYPE PREDICATES
>>
>> procedure: collection? value => value
>> procedure: %? value => value
>> defined for: all collections
>>
>> Returns a true value iff the provided value supports the interface for
>> the type (%) named in the predicate.

> Why %? and not *??

I didn't think it made a difference in this case. It's defined for all
collections, and it has "isa" semantics. If you think * makes more
sense, feel free to change it.

>> CARDINALITY OPERATORS
>>
>> procedure: *-empty? collection => boolean

> Don't forget the 'defined for: all collections' bit.

Oops, thanks.

>> procedure: *-size collection => exact integer
>> defined for: all collections
>>
>> If the collection has a concept of size, this function returns the
>> number of values or mappings in the collection. If it does not, #f must
>> be returned. If possible, this should return the number of times
>> collection-fold will call the fold-function before halting.

> Have we decided on the issues of infiniteness and circularity yet?

I think so, but I'm not certain. We did decide that *-size was a
sufficient replacement for finite-collection?. I think we also agree
that *-size should return the number of enumerations and that a circular
data structure is not necessarily infinite, but I'm not sure.

>> procedure: *-equivalence-function * => procedure
>> defined for: bags, sets, and dictionaries
>>
>> procedure: *-ordering-function * => procedure
>> defined for: ordered bags, sets, and dictionaries

> Where's the rationale for defining these _and_ [the keyed versions] on
> dictionaries?

To support the "simple" version of *=. Therefore, you can drop
dictionaries from the list if you stick with the original *= interface.

However, it just occurred to me that there's some ambiguity for
ordered-collection -- dictionaries are usually in key order, but other
collections are in value order. I think we actually need to separate
them into "ordered-collection" and "key-ordered-collection." Otherwise,
you'll have problems with polymorphism.

>> procedure: *-update * position f [thunk] => *
>> procedure: *-update! * position f [thunk] => unspecified
>> defined for: sequences and ordered collections
>>
>> procedure: *-update-key * key f [thunk] => *
>> procedure: *-update-key! * key f [thunk] => unspecified
>> defined for: sequences and ordered collections

> I don't understand why you're differentiating between these two, and
> why they aren't defined for dictionaries.

First, there's a typo: *-update is for sequences, *-update-key is for
dictionaries.

The reason for having two different functions is because key lookup is
*not* the same thing as indexing. I gave them similar names because
they're similar, but I didn't merge them because they're orthogonal.

Specifically, consider a dictionary that is also a sequence -- hash
tables, binary search vectors, and alists are good examples. By having
two functions, we can look up values by index or by key.

>> procedure: *-add-key[!] * new-key mapped-value [thunk] => *
>> defined for: dictionaries
>>
>> Returns a collection of the same type with the new key mapped to the
>> given value.

> Why have *-ADD with key/value pairs _and_ this?

Syntactic sugar, mostly. Also for consistency with the other -key
functions.

>> procedure: *-remove-every[!] * value => * list
>> procedure: *-remove-each-from[!] * source-collection [thunk] => % list
>> procedure: *-remove-every-from[!] * source-collection [thunk] => % list
>> procedure: *-remove-key[!] * key [thunk] => * pair

> Why differentiate between removing and removing keys?

Because they're two different operations: *-remove takes a value out of
a bag, and *-remove-key takes a key out of a dictionary. If you have a
dictionary that is also a bag, those need to be two different methods.

In other words, this is the same issue as update/update-key above.

>> procedure: *-remove-every-key[!] * key [thunk] => * list
>> procedure: *-remove-each-key-from[!] * source [thunk] => * list
>> procedure: *-remove-every-key-from[!] * source [thunk] => * list

> Er, why all of the name changes?

Hm, thought I already explained this. I couldn't get my head around
"all" and "any" -- kept thinking that "all" was the one that got rid of
all the copies. I dunno whether "every" and "each" is better, but I
*strongly* dislike any & all. I chose "each" to mimic "for-each."

>> procedure: *-insert-right * new-value [thunk] => *
>> defined for: reversible sequences
>>
>> procedure: *-delete-right * [thunk] => *
>> defined for: reversible sequences and reversible ordered collections

> Don't you mean 'reversible _flexible_ sequences?'

The original version allowed insert-right (but not insert-left) even for
non-flexible sequences. And the definition of non-flexible (limited)
sequences does permit addition at the end for some sequences.

My assumption here would be that fixed-size sequences (and others which
do not support end-extension) would call thunk.
--
Bradd W. Szonye
http://www.szonye.com/bradd