SQLite subprocess working Lassi Kortela (17 Sep 2019 17:50 UTC)
Buffaloed and dogpiled (was: SQLite subprocess working) John Cowan (17 Sep 2019 19:07 UTC)
Re: Buffaloed and dogpiled Lassi Kortela (17 Sep 2019 21:01 UTC)
(missing)
Re: Buffaloed and dogpiled Lassi Kortela (19 Sep 2019 09:09 UTC)
Re: Buffaloed and dogpiled John Cowan (20 Sep 2019 15:25 UTC)

>> I think the root of our disagreement, to the extent there is such, is
>> whether or not text formats are simple. Personally I'm of the opinion
>> that there is no such thing as a simple text format
>
> There are, actually. Microsoft .INI format is simple, for example, though
> it has grown a lot of optional cruft over the years. MicroXML is far more
> capable, is specified in 10 pages of prose at
> <https://dvcs.w3.org/hg/microxml/raw-file/tip/spec/microxml.html>, and is
> implemented in 428 SLOC of JavaScript at
> <https://github.com/jclark/microxml-js> (along with a lot of tests and test
> drivers).

With all due respect, since you are the editor of the MicroXML
specification, those two formats are good illustrations of where we
disagree. To a diligent implementor, both are far from simple. To take
.ini for example, some concerns off the top of my head:

Is this .ini section valid? What is its name:

    [section name with a bracket[]

What about backslashes:

    [section name with a backslash\]

What about whitespace before or within the brackets?

    [ section ]

Whitespace before variable name, beside equals sign:

    name= value
    name = value

Is trailing whitespace trimmed from values or kept in?

What if the value contains an equals sign?

    name==value

What if the value contains quotes?

    name="value "

What if there's text after the quotes?

    name="value" and more value, or is there?

What if the name contains quotes?

    "name is this" = value is that

Are comments permitted?

    # is this a comment = or a name-value pair?

What about after the value?

    name = value # comment or value?

etc.

There are answers to all of the above questions, and they depend on who
you ask. But my point is that none of the above concerns should arise in
the first place if we are talking about a program sending data to
another program. There is no reason why a program should be concerned
with byte-order marks, whitespace, comments, escaping, line and token
delimiters to such a degree.
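
To make "they depend on who you ask" concrete, here is a minimal Python sketch of two made-up .ini dialects applied to some of the sample lines above. Both parsers and their rules are hypothetical illustrations, not the behavior of any particular .ini implementation.

```python
def parse_keep_everything(line):
    """Hypothetical dialect A: split on the first '=' and strip only the
    whitespace around the name and around the whole value."""
    name, _, value = line.partition("=")
    return name.strip(), value.strip()


def parse_with_inline_comments(line):
    """Hypothetical dialect B: like A, but ' #' starts a trailing comment,
    and a value wrapped in double quotes has the quotes removed."""
    name, _, value = line.partition("=")
    value = value.split(" #", 1)[0].strip()
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        value = value[1:-1]
    return name.strip(), value


SAMPLES = [
    "name = value # comment or value?",
    'name="value "',
    "name==value",
]


def compare(lines=SAMPLES):
    """Return (line, result under A, result under B) for each sample line."""
    return [
        (line, parse_keep_everything(line), parse_with_inline_comments(line))
        for line in lines
    ]


if __name__ == "__main__":
    for line, a, b in compare():
        print(repr(line), a, b, "agree" if a == b else "DISAGREE")
```

The two dialects disagree on the first two samples and agree on the third, so a sender cannot know what a receiver will see without knowing which dialect it implements.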

Delimiters in any format (text or binary) are suspect by default. Why
doesn't the format say up front how much data is about to come? Most of
the time, the sender knows well how much data it's sending.

.ini is simple, and MicroXML is moderately simple, if you start from the
assumption that text formats are simple to begin with. I start from bits
and bytes; from that perspective, text brings a great deal of complexity
for no reason.

For specific tasks where people need to read and write the stuff, there
can be an equivalent text format representing the same data model, as
for example with binary and text S-expressions. IPC doesn't have that
requirement, so text just brings more complexity. In the rare cases you
need to talk text to a binary-IPC program, just add a text filter to the
pipeline. The filter is optional complexity; I'd keep required
complexity to a minimum.

> But then James Clark is beyond brilliant, and even for him it
> isn't easy to design something simple.

This is the shortest proof of my point. By contrast, anyone who
understands varints can instantly design a simple binary format with
none of the above concerns that puzzle conscientious implementors of
something as pedestrian as .ini.

I want to thank you that this discussion has made me realize something
that I've never before realized in 15 years of thinking about this
stuff: a generic, repurposable binary format should have a text-based
dual with the same data model. That will satisfy scenarios where data
needs to be text-edited.

The dual of text and binary S-expressions made it a breeze to implement
the IPC in the current database subprocesses. The C side can be simple,
dealing only with binary, and yet Scheme code in text can be translated
directly to IPC commands, and IPC responses can be printed as Scheme.
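
As an illustration of "say up front how much data is about to come", the following Python sketch frames each message as a varint length followed by that many raw bytes. The message does not say which varint encoding or framing the database subprocesses use; the unsigned LEB128-style encoding and the function names here are assumptions chosen for the example.

```python
import io


def encode_varint(n):
    """Encode a non-negative integer as an unsigned LEB128-style varint
    (7 data bits per byte, high bit set on every byte except the last)."""
    if n < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # final byte
            return bytes(out)


def decode_varint(stream):
    """Read one varint from a binary stream and return it as an int."""
    result = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream ended inside a varint")
        byte = chunk[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7


def write_frame(stream, payload):
    """Write the payload length as a varint, then the payload bytes as-is."""
    stream.write(encode_varint(len(payload)))
    stream.write(payload)


def read_frame(stream):
    """Read one length-prefixed payload; no delimiter scanning or escaping."""
    length = decode_varint(stream)
    payload = stream.read(length)
    if len(payload) != length:
        raise EOFError("stream ended inside a frame")
    return payload


if __name__ == "__main__":
    buf = io.BytesIO()
    write_frame(buf, b'name="value" # [not a comment]\n')
    buf.seek(0)
    print(read_frame(buf))
```

Because the reader knows the length before it sees the payload, bytes such as quotes, brackets, '#', '=' and newlines need no escaping and raise none of the .ini questions listed earlier.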