Limits, symbols and bytevectors, ASN.1 branding Lassi Kortela (26 Sep 2019 08:23 UTC)
>> And the reason the text and binary formats should have 100% equal data
>> models, is simplicity for users - the proper aim of abstraction.
>
> Agreed.
>
>> So I would like the formats to provide "mechanism, not policy".
>
> I agree with this policy. :-)

We are in agreement about the most important points :) I value these
discussions a lot. They take a lot of time and energy, but few complex
inventions bear fruit without getting a number of people in agreement.
It's just not possible to do much for the world by oneself.

> All right. The numerical stuff is only a warning anyway; I'm willing to
> make similar recommendations/warnings for the others. "You can ignore
> this, but things may go wrong at the other end; there are no guarantees."
> A similar recommendation that strings and symbols not be longer than 2^31-1
> characters would be good as well.
>
> The only thing that continues to trouble me is the symbol nil
> (case-insensitive). The overload of #f and () is bad enough without a
> symbol that normally nobody ever uses *as* a symbol.

Nil is indeed going to be a problem no matter what we do. I already
ported the current reader to Common Lisp
(https://github.com/lispunion/universal-encoding-cl), and plan to keep
the port up to date with our Scheme work, porting the binary format as
well.

Perhaps the CL writer should write out NIL and T as something like
#!cl:nil and #!cl:t. Or, if many kinds of symbols are problematic, there
could be a way to encode a symbol with a "From:" field, and the writer
would fill it in for the problematic ones. E.g. #symbol-from{cl NIL} and
#symbol-from{cl T}. Something like that.

As we discussed, CL symbols also have packages. I think they should be
representable on "throw some code into the format at the end of a tired
workday" grounds. But again it can be a special field:
#package-symbol{"CL-USER" FOOBAR}. Uninterned symbols could be
#package-symbol{#!null FOOBAR} or #uninterned-symbol{FOOBAR}.
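
To make the symbol notations floated above concrete, here is a minimal
Python sketch of a writer for them. The Symbol class, the write_symbol
name and the convention that package=None means "uninterned" are
assumptions made for illustration only; the tag names are the tentative
ones from this message, not settled syntax.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Symbol:
    """Hypothetical stand-in for a CL-style symbol."""
    name: str
    package: Optional[str] = None  # assumption: None means uninterned


def write_symbol(sym: Symbol) -> str:
    """Render a symbol using the tentative tagged notations."""
    if sym.package is None:
        return "#uninterned-symbol{%s}" % sym.name
    # Note: no escaping of quotes inside the package name is attempted.
    return '#package-symbol{"%s" %s}' % (sym.package, sym.name)


print(write_symbol(Symbol("FOOBAR", "CL-USER")))
print(write_symbol(Symbol("FOOBAR")))
```

A real writer would also need escaping rules for unusual symbol and
package names, which this sketch leaves out.
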
Some Schemes also have uninterned symbols, so a common solution needs to
be found. Keywords could be :keyword or #keyword{FOOBAR}. The :keyword
syntax is nice but it's a bit misleading because in CL, packaged symbols
are package:symbol. That looks as if keywords are in the package whose
name is "". But they are actually in a package named "KEYWORD".

The Scheme writer can just write #t and #f (or #!true and #!false or
whatever we pick, as well as ordinary booleans for the binary format).

> I continue to think that not letting (read) limit the amount of input is
> Very Bad Indeed. Not all programming languages are memory safe, far from
> it. Not even all Scheme or CL implementations if you set the compiler
> options correctly.

Agreed. I was probably unclear: options are great, but they should be
options, not required :) If the format can represent anything, that lets
us offer simple (read) and (write). If the reader and writer can take
lots of options via some unobtrusive means like keyword arguments,
that's a good thing. Limits on recursion depth, number size, etc. are
probably good for production apps. For example, PHP's standard JSON
parser has a depth limit:
<https://www.php.net/manual/en/function.json-decode.php>.

> Here's my current idea.
>
> First of all, I want a more compact syntax for bytevectors. My current
> notion is for them to match /\[([0-9A-Fa-f][0-9A-Fa-f][-])*\]/. That is,
> hex digits with optional hyphens between each byte so you can group things
> as you like, and then wrapped in square brackets. I'm not particular about
> the square brackets.

I like the hex digits thing. I might even go with the base64; quite
neutral on it. What's your opinion of simply using strings for the hex?
#u8"abcdef1234"

> After that, the content of each ASN.1 LER object is one of three things:

This comes across like nitpicking but there's a deeper point behind it:
I'd change the name "LER" to something else.
I know it's consistent with other ASN.1 names like BER and DER, but to a
normal person those are hard to remember and (correctly) give a
bureaucratic impression. If you want to spread ASN.1 to mortals, every
aspect of it needs to be made more approachable than it currently is,
even if it means breaking with convention. One natural name would be
"Lisp ASN.1".

The name "ASN.1" itself is oddly cool for something that was presumably
birthed in large conference rooms. It sounds like Formula 1, itself a
cool name for an auto-racing contest. They did something right :)

> bytes, characters, or sub-objects. So let's write # followed by either a
> registered name or hex digits that represent the type code, followed by one
> of a string, a bytevector, or a list. So a vector would be #vec(1 2 3) or
> #20(1 2 3), a duration would be #dur"1Y2M35D" or #1F22"1Y2M35D", and float
> 0.0 would be #float[0000-0000-0000-0000] or #DBt[0000-0000-0000-0000],
> although a decimal float would be more interoperable. I have some
> registered names in the new column B of <http://tinyurl.com/asn1-ler>, but
> this would allow private-use typecodes, which don't have registered names,
> to be encoded as text.

I like this. I don't mind if vectors are #vec(...) instead of #(...).

Both binary and decimal floats would be nice to have. It's good to have
binary floats in the text format too, since the binary format has them.
Likewise, people are going to write 123.45 in the text format, so it's
good to have decimal floats in the binary format.

The problem with the [0000-0000] encodings is that we need to introduce
extra square-bracket lexical syntax for something that could already be
represented as a string: "0000-0000". 0000-0000 could also be lexed as a
symbol, but if we want to forbid non-vertical-bar-escaped symbols that
start with a digit, that will present a problem.
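
As a rough illustration of how little separates the two spellings, here
is a Python sketch that decodes hyphen-grouped hex from either a
square-bracketed or a string-quoted form. The function name
parse_hex_bytes is made up for this example. Two things are assumptions
rather than anything decided in this thread: hyphens are treated as
optional (following the prose description, not the literal regex), and
the float payload is read as a big-endian IEEE 754 double.

```python
import re
import struct

# Assumption: a hyphen may follow any byte but is never required.
_HEX_BODY = re.compile(r"(?:[0-9A-Fa-f]{2}-?)*")


def parse_hex_bytes(text: str) -> bytes:
    """Decode '[00ff-12]' or '"00ff-12"' into raw bytes."""
    if len(text) < 2 or (text[0], text[-1]) not in {("[", "]"), ('"', '"')}:
        raise ValueError("expected [...] or \"...\" around the hex digits")
    body = text[1:-1]
    if not _HEX_BODY.fullmatch(body):
        raise ValueError("malformed hex digits: %r" % body)
    return bytes.fromhex(body.replace("-", ""))


bracketed = parse_hex_bytes("[0000-0000-0000-0000]")
quoted = parse_hex_bytes('"0000-0000-0000-0000"')
assert bracketed == quoted

# Assumption: the payload is a big-endian IEEE 754 double.
print(struct.unpack(">d", bracketed)[0])
```

Both forms go through the same decoder once the delimiters are stripped,
so the choice between them is mainly a question of how much lexical
syntax the reader has to know about.
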
> To make this work on the procedure side, read can be passed a procedure
> that accepts a type code and a bytevector/string/list and returns the proper
> internal representation; on the write side, it would accept an object and
> return two values, type code and bytevector/string/list. The invocations
> would have to be bottom-up.

LGTM.
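
For what it's worth, the hook idea quoted above could look roughly like
this in Python. Everything named here (Tagged, convert_bottom_up,
decode_hook, encode_hook) is hypothetical, and the mapping of "vec" to a
tuple and "float" to a big-endian double is only an assumption for the
sake of the example. The point is just the shape: the read-side hook
takes a type code plus payload, the write-side hook returns a type code
plus payload, and children are converted before their parents.

```python
import struct
from dataclasses import dataclass
from typing import Any, Callable, Tuple, Union

Payload = Union[bytes, str, list]


@dataclass
class Tagged:
    """A raw '#name payload' form as the lexer might hand it over."""
    type_code: str
    payload: Payload


def convert_bottom_up(node: Any, decode: Callable[[str, Payload], Any]) -> Any:
    """Apply the read-side hook to the innermost tagged forms first."""
    if isinstance(node, Tagged):
        payload = node.payload
        if isinstance(payload, list):
            payload = [convert_bottom_up(child, decode) for child in payload]
        return decode(node.type_code, payload)
    if isinstance(node, list):
        return [convert_bottom_up(child, decode) for child in node]
    return node


def decode_hook(type_code: str, payload: Payload) -> Any:
    if type_code == "vec":
        return tuple(payload)
    if type_code == "float":
        return struct.unpack(">d", payload)[0]  # assumed byte order
    return Tagged(type_code, payload)  # unknown codes pass through


def encode_hook(obj: Any) -> Tuple[str, Payload]:
    """Write side: return the two values, type code and payload."""
    if isinstance(obj, tuple):
        return "vec", list(obj)
    if isinstance(obj, float):
        return "float", struct.pack(">d", obj)
    raise TypeError("no type code registered for %r" % type(obj))


tree = Tagged("vec", [1, Tagged("float", bytes(8)), Tagged("vec", [2, 3])])
print(convert_bottom_up(tree, decode_hook))
```

Passing unknown type codes through untouched keeps private-use codes
readable even when the caller has no handler for them.
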