Re: Splitting foreign-error:code for SRFI 198, Foreign Errors Lassi Kortela 28 Jul 2020 09:10 UTC

> Personal experience in designing many APIs, and some user interfaces
> as well, which in their own way are APIs.  I like a rule of thumb of 3s,
> ideally no more than 3 choices at every level, and they nest no more
> than 3 deep.

Very interesting. I always wonder whether these preferences are just a
matter of habit or prior exposure, or whether people fundamentally
process information in different ways which would make one or the other
kind of representation easier.

>> On a related note, when you have a data structure, adding more functions
>> to work on that data structure makes it stronger. Perlis: "It is better
>> to have 100 functions operate on one data structure than 10 functions on
>> 10 data structures."
>
> This is perhaps a difference between an API, *of which there are
> many*, and pretty much each with its own organization, and a language.
> In the latter, it's reasonable to make the Perlis argument as Clojure does.
> On the other hand, I prefer quite a bit less choice in both, one major
> reason I'm back to Scheme after several years with Clojure.
>
> I like to be able to learn and use as little of an API as possible,
> the investment in a language is more generally useful.

One of the key discoveries that were enabled by Lisp is that languages
and APIs are somewhat interchangeable. An API is just a language and a
language is just an API.

It's true that a general-purpose programming language is useful for more
things than most APIs. But companies are growing gigantic APIs nowadays,
and some of them will eventually be more useful than some programming
languages. Mathematica / Wolfram Alpha is a particularly good example of
the blending of language and API into one.

> One other factor: by inclination, I'm a scientist, external factors
> plus not being bad at software and systems is why I'm here today.
> So arranging things into hierarchies is more natural to me than
> making them flat lists.

It's great to have more people from a scientific background in software.
We don't have enough diligent people. Maybe our field still has hope :)

More below on science (theory) vs design.

> John mentioned the lower numbers have been stable for decades.  Not
> that we can depend on that....

I have the errno numbers for multiple OSes in a CSV file so I crunched
it: there are at least 84 conflicting numbers (i.e. the same errno
number refers to a different EFOOBAR identifier on different OSes).

John is probably right that the low numbers are the same everywhere.
Errno 35 is the lowest conflicting number in my data set.

<https://misc.lassi.io/2020/errno-conflicts.text>
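
A minimal sketch of this kind of check, not the script that produced
the file above. The table layout and the name `conflicting-numbers`
are assumptions made for illustration, and the sample holds only four
rows.

```scheme
;; Hypothetical sample rows: (os number name).
;; Errno 2 is ENOENT on both systems; errno 35 is EDEADLK on Linux
;; and EAGAIN on FreeBSD.
(define errno-table
  '((linux   2  ENOENT)
    (freebsd 2  ENOENT)
    (linux   35 EDEADLK)
    (freebsd 35 EAGAIN)))

;; Return every number that is bound to more than one name,
;; each listed once, in order of first conflict.
(define (conflicting-numbers table)
  (let loop ((rows table) (seen '()) (conflicts '()))
    (if (null? rows)
        (reverse conflicts)
        (let* ((number (cadr (car rows)))
               (name   (caddr (car rows)))
               (known  (assv number seen)))
          (cond ((not known)
                 ;; First time we see this number: remember its name.
                 (loop (cdr rows) (cons (cons number name) seen) conflicts))
                ((or (eq? (cdr known) name) (memv number conflicts))
                 ;; Same name as before, or conflict already recorded.
                 (loop (cdr rows) seen conflicts))
                (else
                 (loop (cdr rows) seen (cons number conflicts))))))))

(conflicting-numbers errno-table) ; => (35)
```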

> I'm a BIG proponent of REPLs, which have
> enabled some of my biggest time constrained programming feats.

Same here. They are indispensable for exploring a new system and new
problems. Things like (apropos "foo") and a good pretty-printer help a
lot. That's why the error-set enumeration is nice too -- it's like
`apropos` for errors.

>> So the example from the current draft:
>>
>> (make-foreign-error
>>    '((error-set . errno)
>>      (code . ((number . 2)
>>               (symbol . errno/ENOENT)))
>>      (scheme-procedure . open-file)
>>      (foreign-interface . open)
>>      (message . "open-file called open: ...")
>>      (data . ((arguments . ("not-a-valid-filename" 0 428))
>>               (heritage . "SRFI 170")))))
>
> I claim the above two levels are OK because the first level
> simply tells make-foreign-error which slot to put the following
> data in.  If a key other than error-set is missing, it gets #f.
>
> This is something only the programmer using SRFI 198 has to master,
> the user of his library will only see one flat level per slot.
> Something I've been doing implicitly, since as you note it's the
> right thing, but that we should probably make explicit in the SRFI.

I don't fully understand. E.g. if a user of SRFI 170 (POSIX API) wants
to get the errno symbol for a foreign error, wouldn't he need to do the
following (when we have nested alists):

(let ((e (let ((pair (assoc 'symbol
                            (or (foreign-error:data ferr 'code) '()))))
           (and pair (cdr pair)))))
   ...do something with e...)

instead of this (when we have one flat alist):

(let ((e (foreign-error:data ferr 'symbol)))
   ...do something with e...)

It's true that SRFI 198 can always deal with just one flat alist, and
not care whether the programmers constructing error objects put nested
alists or some other types of data in there.

But abstraction is all about service, and if SRFI 198 returns alists to
users, it provides less of a service than if it were to return leaf
values that the caller can manipulate directly without first traversing
a collection to get at them.

A key tenet of interface design is that indirection is not abstraction.
An interface that offers a choice of things (e.g. different data types
to put in the error slots) is not more abstract than those things by
themselves. A good abstraction is conceptually simpler than the things
it hides. Hence the user has the luxury of keeping fewer things in mind,
which is what it means to provide a service.

For example, (foreign-error:data ferr 'key) -> value or #f is an
abstraction since missing keys and keys whose value is #f are conflated
into the same return value #f and the caller doesn't need to care which
it is.
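
A minimal sketch of that behaviour over a plain alist. `data-ref` is a
hypothetical stand-in for the accessor, not the SRFI 198 procedure
itself, and the sample fields are made up for illustration.

```scheme
;; Hypothetical accessor: the value stored under key, or #f when the
;; key is absent. A key whose stored value is #f looks the same.
(define (data-ref alist key)
  (let ((pair (assq key alist)))
    (and pair (cdr pair))))

(define fields '((message . "open failed") (symbol . #f)))

(data-ref fields 'message) ; => "open failed"
(data-ref fields 'symbol)  ; => #f (key present, value is #f)
(data-ref fields 'number)  ; => #f (key absent)
```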

If lambdas can be stored in error fields and the accessor transparently
calls those lambdas and returns their return values, that's a further
abstraction since the user doesn't have to care about lambdas vs
non-lambdas.
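
One way this could look, as a sketch only. The name `data-ref/lazy`
and the convention that any stored procedure is a zero-argument thunk
are assumptions, not part of any draft.

```scheme
;; Hypothetical accessor: if the stored value is a procedure, call it
;; with no arguments and return its result; otherwise return the value.
(define (data-ref/lazy alist key)
  (let ((pair (assq key alist)))
    (cond ((not pair) #f)
          ((procedure? (cdr pair)) ((cdr pair)))
          (else (cdr pair)))))

(define fields
  (list (cons 'number 2)
        (cons 'message
              (lambda ()
                (string-append "errno " (number->string 2))))))

(data-ref/lazy fields 'number)  ; => 2
(data-ref/lazy fields 'message) ; => "errno 2"
```

The caller uses the same call for both fields and cannot tell which one
was computed on demand.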

A (foreign-error:data ferr 'code 'number) accessor to traverse nested
alists would be an abstraction since it would cope gracefully with a
missing 'code slot and just return #f. But if we don't use nested
alists, both the interface and the implementation can be simpler.
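
A sketch of such a path accessor over plain alists, to show the extra
machinery nesting requires. `data-ref*` is a hypothetical name, and the
code assumes every intermediate value on the path is itself an alist.

```scheme
;; Hypothetical accessor: follow keys through nested alists.
;; Returns #f as soon as a key is missing or a non-list is reached.
(define (data-ref* alist . keys)
  (let loop ((value alist) (keys keys))
    (cond ((null? keys) value)
          ((not (list? value)) #f)
          (else
           (let ((pair (assq (car keys) value)))
             (and pair (loop (cdr pair) (cdr keys))))))))

(define fields
  '((error-set . errno)
    (code . ((number . 2) (symbol . errno/ENOENT)))))

(data-ref* fields 'code 'number)   ; => 2
(data-ref* fields 'status 'number) ; => #f (no status slot)
(data-ref* fields 'error-set)      ; => errno
```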

> One thing this does is place a greater emphasis on the exact alist you
> feed to make-foreign-error, or having it reorder the alist so for
> example 'error-set is always first.

Order of keys shouldn't matter; instead of an alist, a hash-table could
be used internally.

Maybe it's nice to have a particular order for display, but that should
be the printer's concern IMHO.

> Are there cases where two or more
> items are logically tightly bound, and should be grouped together??

Yes, there are. For example, the 'code key is necessarily coupled with
the 'error-set key. Different error sets can have different meanings for
the same code (e.g. the number 2 can be a completely different error).

A rigorous approach would indeed have the data structure shaped like a
dependency graph, which nested alists resemble. But here we get to the
difference between theory and design. While theory prizes faithful
representation, design simply has to be good to use. Good design is
usually principled, but there comes a point of diminishing returns where
principle turns into pedantry and gets in the way of the user's goals.
It's always debatable where exactly this point is reached, but it seems
to be a hallmark of good design that the designer knows something to be
true without making a point of it. (Impure functional languages are a
good illustration of this principle - the designers know that side
effects are not pure, but faithfully representing all effects would
necessitate decades of type system research and lots of type annotations.)

In this case we know that some fields are coupled but should keep them
separate anyway, because encoding all coupling in the structure would
require a nested instead of flat structure, which complicates the
interface and gets in the way of providing a service which is ultimately
always the goal of abstraction.

(I wish programming courses simply taught students that "abstraction is
like going to a restaurant instead of cooking your own meal" instead of
confusing the hapless students with the nuts and bolts of indirection
and encapsulation in an object-oriented language. Some Java courses are
particularly abysmal in this respect. They not only fail to teach that
indirection is not abstraction, but actively teach that indirection is
the only kind of abstraction there is. /tangential rant :p)

> I'm not sure that's a real issue, or at least it's subsumed by having
> conventions require a certain order for human consumption of the
> resultant single alist.  We could add that to the API fairly easily,
> for example have the symbol that's the value of the error-set
> contain a list of top level symbols in the correct order.

A list in the printer giving the right display order is probably a good
idea.
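
A sketch of what that could look like. The particular key order and
the procedure names are assumptions chosen for illustration; keys that
are not in the list are kept, after the listed ones.

```scheme
;; Hypothetical display order used only by the printer.
(define display-order '(error-set symbol number message))

;; Pairs whose keys appear in display-order, in that order.
(define (listed-pairs alist)
  (let loop ((keys display-order) (acc '()))
    (if (null? keys)
        (reverse acc)
        (let ((pair (assq (car keys) alist)))
          (loop (cdr keys) (if pair (cons pair acc) acc))))))

;; Pairs whose keys are not in display-order, in their original order.
(define (unlisted-pairs alist)
  (let loop ((rest alist) (acc '()))
    (cond ((null? rest) (reverse acc))
          ((memq (caar rest) display-order) (loop (cdr rest) acc))
          (else (loop (cdr rest) (cons (car rest) acc))))))

(define (in-display-order alist)
  (append (listed-pairs alist) (unlisted-pairs alist)))

(in-display-order
 '((number . 2) (heritage . "SRFI 170") (error-set . errno)))
;; => ((error-set . errno) (number . 2) (heritage . "SRFI 170"))
```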

In general, by cultural convention, alist keys are not expected to be in
a particular order. `assoc` gets the first matching key, but that only
matters in the rare case that there are duplicate keys. If the order of
non-duplicate keys matters, that would probably surprise most Lispers.
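
A small illustration with made-up fields, using the standard `assq`:

```scheme
;; Sample alist with a duplicate message key.
(define fields
  '((message . "first") (number . 2) (message . "second")))

;; With duplicate keys, the first match wins.
(cdr (assq 'message fields))           ; => "first"

;; For a key that occurs once, position makes no difference.
(cdr (assq 'number fields))            ; => 2
(cdr (assq 'number (reverse fields)))  ; => 2
```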

> See above, the end resulting error object has nesting of slots
> containing alists, but no nested alists.

We use an alist to provide the service of a key-value collection - the
same service that the `(foreign-error:data ...)` accessor provides. A
concrete error object and a concrete alist are the same abstraction from
this point of view, so an alist in the object's slot is morally
equivalent (:p) to a nested alist.

> I've already decided all slots are optional except error-set....

Enthusiastically agreed, "everything optional" is the right principle.

Even error-set can probably be optional. It just makes it hard to
interpret any other info, but you can still display the message, etc.

Enforcing things is not that simple in dynamic languages, so forcing
users to give an error-set might not work in practice anyway.

The principle in static languages is "don't violate the types". The
equivalent principle in dynamic languages is something like "don't go
against the grain". When in Rome, do as the Romans do, etc. On the whole,
dynamic languages don't grow good systems by enforcing constraints but
by building things that have good synergy with existing things.

> I predict I'm going to like this.

Thank you very much for listening to the arguments. We could at least
see how it goes. At least it has nice synergy with other Lisp customs
like property lists (as per John's pre-SRFI), which again are somewhat
symmetrical to keyword arguments.

>>> [ hey, it's your API design you're throwing in the trash can ^_^. ]

"Write one to throw away" is one design principle the Unix and Lisp
communities happily agree on :) Hindsight is better than foresight.