I'd like you (Lassi, but anyone else who wants to, of course) to closely review my ASN.1 LER pre-pre-SRFI. This should make it clear that I am *not* against binary encoding in all contexts. I'm cc-ing Schemepersist faute de mieux. (The API is trivial: one procedure to write a Scheme object to a port, one to read a Scheme object from a port, and a predicate for "uncodable object".)

LER is *almost* a superset of X.690 ASN.1 DER (Distinguished Encoding Rules). Each object is encoded as follows:

1) Type. A 1- or 2-byte code that identifies the type of the serialized object and whether its content is raw bytes or sub-objects. Some codes are X.690 standard; others are "private specification" codes (as distinct from "private use" codes, which we don't define, leaving them to application programmers). There is only one 2-byte code (ISO 8601 duration), and I wouldn't weep too hard if we left it out.

2) Length. A length of 0-127 inclusive is encoded in one byte. A greater length is encoded as one byte with value 128+k, where k is the number of bytes (between 1 and 126 inclusive) that represent the actual length. The next k bytes are the actual length as a big-endian base-256 value. Practical values of k are probably 1, 2, 4, 8.

3) The content itself, either raw bytes or contained objects. Note that all numeric content is big-endian.
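To make the type/length/content layout concrete, here is a rough Python sketch (Python only for illustration; the SRFI itself would of course be in Scheme). The names encode_length, decode_length and encode_object are hypothetical and not part of the proposal. The sketch always picks the smallest k, as DER does; whether LER would also accept fixed widths such as 1, 2, 4, 8 is not something this message settles.

```python
def encode_length(n):
    """Encode a content length in the short or long form described above."""
    if n < 0:
        raise ValueError("length must be non-negative")
    if n < 128:
        # Short form: the length itself fits in a single byte.
        return bytes([n])
    # Long form: one byte 128+k, then k big-endian bytes (smallest k chosen).
    k = (n.bit_length() + 7) // 8
    if k > 126:
        raise ValueError("length too large to encode")
    return bytes([128 + k]) + n.to_bytes(k, "big")


def decode_length(data, pos=0):
    """Return (length, position of the first content byte)."""
    first = data[pos]
    if first < 128:
        return first, pos + 1
    k = first - 128
    if not 1 <= k <= 126:
        raise ValueError("invalid length prefix")
    end = pos + 1 + k
    if end > len(data):
        raise ValueError("truncated length field")
    return int.from_bytes(data[pos + 1:end], "big"), end


def encode_object(type_code, content):
    """Concatenate type code, length, and content (all given as bytes)."""
    return type_code + encode_length(len(content)) + content
```

For example, a 300-byte content gets the length prefix 82 01 2C (hex): 128+2, followed by 300 as two big-endian bytes.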

The Google Sheet at http://tinyurl.com/asn1-ler gives the proposed type codes and what they mean.

The only deviation from DER is that sets do not have to be sorted into binary lexicographic order.
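A small Python illustration of that difference, assuming (for the sake of the example only) that LER keeps the X.690 universal tags 0x02 for INTEGER and 0x31 for SET; the actual codes are whatever the spreadsheet says. The helper names tlv and encode_set are made up for this sketch.

```python
def tlv(tag, content):
    """Type byte, short-form length, content; enough for this illustration."""
    if len(content) >= 128:
        raise ValueError("sketch handles short-form lengths only")
    return bytes([tag, len(content)]) + content


def encode_set(elements, sort_for_der):
    """Encode small integers (0..127) as a set of INTEGERs."""
    encoded = [tlv(0x02, bytes([n])) for n in elements]
    if sort_for_der:
        # DER: sort the encoded elements into bytewise lexicographic order.
        encoded.sort()
    # Without sorting, elements stay in the order the writer produced them.
    return tlv(0x31, b"".join(encoded))


der_bytes = encode_set([3, 1, 2], sort_for_der=True)
ler_bytes = encode_set([3, 1, 2], sort_for_der=False)
print(der_bytes.hex())  # 3109020101020102020103
print(ler_bytes.hex())  # 3109020103020101020102
```

Both outputs decode to the same set; only the DER one is guaranteed to be byte-for-byte identical regardless of the order the elements were supplied in.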