On Fri, Sep 27, 2019 at 12:24 PM Lassi Kortela <xxxxxx@lassi.io> wrote:
 
What the CL reader can have is a parameter to pick which representations
of ()/NIL/false/null will be returned at each call to (read). Since we ship
the reader in our own library, it's not much effort to add options.

Remember, though, that while we provide a reader, we don't require it to be used, which is why we need an API standard as well as a format standard, at least for Scheme.  The rewrite-on-read procedure I described earlier achieves the right result if given some context.
 
Yes. The reader might also need user choice for what to do upon
encountering a non-existent package.

Which is why it's a restartable exception in CL.
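
A rough sketch of the idea, in Python rather than CL so it is easy to run. Python has no restarts, so the closest approximation is a policy callback that the reader consults at the point where the package lookup fails, instead of unwinding first. Every name here (`resolve_symbol`, `fail`, `create_package`, `use_default`, the `"default"` package) is hypothetical and not part of any proposed API.

```python
class MissingPackage(Exception):
    """Signalled when a token names a package the reader does not know."""


def fail(package, packages):
    # Policy 1: give up, like declining every restart.
    raise MissingPackage(package)


def create_package(package, packages):
    # Policy 2: create the package on the fly and continue reading.
    packages.add(package)
    return package


def use_default(package, packages):
    # Policy 3: silently fall back to the default package.
    return "default"


def resolve_symbol(token, packages, on_missing=fail):
    """Split 'pkg:name' and return (package, name).

    The on_missing callback plays the role of a restart: it is called
    where the problem is detected and decides how reading continues.
    """
    package, sep, name = token.partition(":")
    if not sep:
        return ("default", token)  # no package prefix at all
    if package not in packages:
        package = on_missing(package, packages)
    return (package, name)


known = {"default", "cluser"}
print(resolve_symbol("cluser:foobar", known))
print(resolve_symbol("nosuch:foobar", known, on_missing=use_default))
```

The point of the design is that the caller, not the reader, owns the decision; a real CL restart additionally lets the choice be made interactively.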
 
I would read |cluser:foobar| as a symbol named "cluser:foobar" in the
default package (i.e. the symbol has no package prefix).

The trouble is that ||s are now semantic instead of just syntactic.  cluser:foobar and |cluser:foobar| are the same on all systems except CL, but different on CL.  Maybe we do need a separate type for package-qualified symbols, but I'm not personally interested in spending time on it, as they mostly appear only in code, not in general data.
 
Again, I'll argue that Lisp can simply give you an uninterned symbol from
somewhere. It's easier to support them as an extension than have people
filter them out from all data they ever write.

Can you give examples of data that contain them, other than the output of macroexpansion?
 
What about using underscore as the digit separator? Dash brings to mind
subtraction and Lisp symbols / Scheme identifiers, though there is
probably no serious risk of confusion.

I was thinking of UUIDs, but I don't care that much.  Overall I think hyphens are more human-readable.
 
Well, I hate to admit this but base64 is still convenient. One of the
latest examples is storing image files in data: URIs.

Again because URIs were ASCII-only until fairly recently and are still not 8-bit clean.
 
Ah, ok. I'll wait for your #name syntax before evaluating the #u8"".

I talked about this above and you LGTMed it.  The format is # followed by name or hexcode, followed by list, string, or bytevector. 
 
OK, maybe loss of float precision is fine.

Actually there is no loss.  Scheme guarantees that when you read and write binary floats as text you don't lose precision, much less get incorrect results.  Many libraries now guarantee the same properties (I believe glibc does).  There is no exact representation of 1.1 in binary32 floats, but there is a particular float that is closer to mathematical 1.1 than any other float is, and it may be safely printed as "1.1".
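
A minimal sketch of that round-trip property, in Python rather than Scheme so it is easy to run. Python floats are binary64, so binary32 is emulated with `struct`; the helper names `to_binary32` and `shortest_binary32_text` are made up for this illustration. It parses through binary64 first, which is fine for a demonstration but is not how a careful implementation would read binary32.

```python
import struct


def to_binary32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def shortest_binary32_text(x: float) -> str:
    """Shortest decimal text that reads back as the same binary32 value.

    Assumes x is finite and within binary32 range.
    """
    target = to_binary32(x)
    # 9 significant digits are always enough to identify a binary32 value.
    for digits in range(1, 10):
        text = f"{target:.{digits}g}"
        if to_binary32(float(text)) == target:
            return text
    return repr(target)  # fallback; not expected for finite inputs


nearest = to_binary32(1.1)
print(nearest == 1.1)               # False: binary32 cannot hold 1.1 exactly
print(shortest_binary32_text(1.1))  # 1.1
```

The stored value is not mathematical 1.1, yet "1.1" is enough text to recover exactly the same binary32 float, so writing and re-reading loses nothing.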
 
The big concern is representing exact quantities of money. Databases
have had to deal with this problem for ages; what do they do?

They use integers with decimal scaling factors.  We can do that by representing 1.1 as 11/10.  It gets lengthy if the scaling factor is big, but not that lengthy.
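
To make the arithmetic concrete, here is a small sketch using Python's standard `fractions` module as a stand-in for Scheme exact rationals. The `(unscaled, scale)` pair is only an illustration of the database-style representation, not a proposed syntax.

```python
from fractions import Fraction

# Database-style decimal: an integer plus a decimal scaling factor.
unscaled, scale = 110, 2            # means 110 / 10**2 = 1.10
price = Fraction(unscaled, 10 ** scale)

print(price)                        # 11/10 (reduced to lowest terms)
print(price == Fraction("1.1"))     # True: same exact value

# Exact arithmetic: three items at 1.1 each is exactly 33/10.
print(price * 3 == Fraction(33, 10))  # True

# The same sum in binary floating point is not exactly 3.3.
print(1.1 * 3 == 3.3)               # False

# A large scaling factor makes the text longer, but still manageable.
print(Fraction(1, 10 ** 18))        # 1/1000000000000000000
```

The ratio form grows by one digit per decimal place of scale, which is the "lengthy, but not that lengthy" cost mentioned above.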



John Cowan          http://vrici.lojban.org/~cowan        xxxxxx@ccil.org
Female celebrity stalker, on a hot morning in Cairo:
"Imagine, Colonel Lawrence, ninety-two already!"
El Auruns's reply:  "Many happy returns of the day!"