Re: NUL-terminated strings and eof-object-terminated generators

Re: NUL-terminated strings and eof-object-terminated generators Daphne Preston-Kendal 20 Feb 2022 08:24 UTC

On 20 Feb 2022, at 6:06, John Cowan wrote:

>> With strings, Large should either make its mind up, or force implementers
>> to make their minds up, between two options: NULs in strings are allowed;
>> or they are not allowed, but a catchable, distinguishable error is
>> signalled if you try to make one.
>
>
> Fair enough, and I'd be happy to add a SRFI that provides this guarantee.

Does this really need to be a SRFI? Trying to clean up *everything* about the small language which is not in line with the goals of the Large language by the SRFI procedure will take many years. Indeed, I think the SRFI-based process is unsuitable for such cleanup issues.

If we keep up the pace we’ve been at over the last few months, adopt more efficient ways to clean up consistency, clarity, and safety issues than always having to go through the whole SRFI process (or add procedurally-dodgy PFNs for such cleanups, something which has been happening far too often imo), and adopt a clearer idea of what proposals are in scope for R7RS, I think we could have the spec done and dusted by 2025 or 2026. If we have to go through the SRFI process for *everything*, we’ll make Perl 6’s development look fast. (Even the revised WG2 charter only suggests the use of SRFIs ‘insofar as practical’; and with a ‘should’, not a ‘must’.)

(I still intend to post constructive suggestions for improving the R7RS process sometime in the next week or two, especially given how the pace is picking up at the moment. I think the above points are a reasonable summary of the issues I intend to make proposals for.)

>> Silent failure, mysterious truncation, and nasal demons are unacceptable
>> in ‘mainstream software development’, where untrusted input from malicious
>> sources is a daily reality.
>
>
> But where do we stop?  Do we insist that all "is an error" conditions (of
> which only a fraction are actually marked with that phrase) and all "an
> error is signaled" conditions be converted into "an error is signaled
> satisfying"?

Pretty much, yes. Especially the latter. I consider it moderately catastrophic that we claim to be developing a practical language for mainstream software development, have got to the stage we’re at, and still don’t even have a condition system on the agenda.

> We can certainly do that, but then we re-create the R6RS
> situation, namely that no existing implementations can conform to
> R7RS-large, which stirred up huge amounts of disputation when R6RS came
> out?  I would consider that a failed outcome.  Or as Dennis Ritchie
> supposedly said when people complained about the lack of features in C: "If
> you want PL/I, you know where to find it."  The same is true of R6RS.

I would rather say to implementers: if you want the small language, you know where to find it. Full support only for the small language is a legitimate choice. You can’t make an omelette without breaking eggs; I’m increasingly convinced that the goal of trying to make a ‘practical language for mainstream software development’ while keeping the same laissez-faire attitude to error conditions, and making the same numerous compromises, as R7RS small is just not going to work.

Anyone seriously maintaining an implementation who isn’t willing to make major changes for the sake of Large support was alienated, at the latest, on Tuesday this week by the adoption of syntax-case. (I don’t see Chicken or MIT adopting more of Large now, for example.)

I prefer a long-term, carrot-based approach to getting implementers on board: at some point, I hope the number of R7RS Large-compatible portable libraries — especially ones which really, with good reason, *need* the Large language and its guarantees — will be sufficient to encourage holdouts to adopt the Large language.

That’s not to say I’m not concerned about implementations. It’s a problem that we don’t have an implementation which is committed to keeping up to date with implementing the whole thing; nothing, in other words, that takes the rôle Chibi had during the small language’s development process. (Alex said in the release notes to 0.9 that Tangerine would likely be the last edition to be fully supported by Chibi.) I’d be willing to start one if enough fundamental questions about language design and cleanup were already answered, but those questions aren’t on the horizon.

> Scheme is not a pure mathematical abstraction: it is a tool for
> programmers, and tools have uses.  What is the *use* of such a sequence?
> Nobody has answered that.

André Sá pointed one out: ‘If I have a list of ports `l` and want to read the contents of all of them, `(map (cute read-string #f <>) l)` is one "obvious" way to go.’ <https://srfi-email.schemers.org/srfi-discuss/msg/18005658/>

One thing I would point out is that there’s nothing wrong with in-band signalling of the end of stream; the problem is that the EOF object both (a) is accessible to all Scheme code and (b) has other meanings and may legitimately arise in other situations like the one pointed out above.
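
To make the truncation concrete, here is a minimal sketch. It assumes R7RS small's `read-string` argument order (`k`, then the port) rather than the one in the quoted snippet, and `list->gen` and `gen->list` are hypothetical helpers written in the style of eof-terminated generators, not procedures from any SRFI. An empty port legitimately yields an eof-object as its "contents", and the consumer cannot tell that apart from the end of the sequence.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical helper: a thunk that returns each element of LST in
;; turn and an eof-object once the list is exhausted.
(define (list->gen lst)
  (lambda ()
    (if (null? lst)
        (eof-object)
        (let ((x (car lst)))
          (set! lst (cdr lst))
          x))))

;; Hypothetical consumer: collect values until the first eof-object.
(define (gen->list g)
  (let loop ((acc '()))
    (let ((x (g)))
      (if (eof-object? x)
          (reverse acc)
          (loop (cons x acc))))))

(define ports
  (list (open-input-string "abc")
        (open-input-string "")       ; empty port
        (open-input-string "def")))

;; read-string returns an eof-object when no characters are available,
;; so the second element of CONTENTS is an eof-object.
(define contents
  (map (lambda (p) (read-string 100 p)) ports))

(display (length contents)) (newline)                        ; prints 3
(display (length (gen->list (list->gen contents)))) (newline) ; prints 1
```

The list has three elements, but the generator built from it appears to end after the first one; nothing is signalled.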

> (Note also that eof-objects are not magic in
> SRFI 41, so you *can* generate and consume such streams if you want them.)

But without the performance benefits which are supposedly the purpose of generators.

>> nor that generators can only return one value per iteration, etc. etc.
>>
>
> That would be easy to remove by erratum or PFN (change "return a value" to
> "return one or more values") but it would not mean that the SRFI's
> procedures should, much less must, be accommodated to such multiple-value
> situations.

Which would be another abuse of PFNs to make changes to the Large language by fiat of the SRFI authors, imo.

My compromise proposal is this: we define a new iteration protocol without the eof-object problem, with multiple values, and possibly with the option to traverse the sequence in pure-functional style. We keep generators but attach a warning to them that they are only designed for IO-based/parsing use cases, where the ultimate source of data is an input port. I would still rather see them gone entirely, but in cases where EOF really can only mean EOF, I guess they aren’t harmful.
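
As a rough illustration only, one possible shape for such a protocol is sketched below. Everything in it is an assumption made for the example (the name "cursor", the two-continuation calling convention, and the helpers `list->cursor`, `alist->cursor` and `cursor->list`); it is not a worked-out proposal. A cursor takes a `yield` procedure and a `done` procedure. It either calls `done` with no arguments, or calls `yield` with the next cursor followed by any number of values, so the end of the sequence is never represented by a value, and traversal needs no mutation.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical protocol: (cursor yield done) calls either
;;   (yield next-cursor value ...)  or  (done).
(define (list->cursor lst)
  (lambda (yield done)
    (if (null? lst)
        (done)
        (yield (list->cursor (cdr lst)) (car lst)))))

;; Two values per step: the key and the value of each association.
(define (alist->cursor alist)
  (lambda (yield done)
    (if (null? alist)
        (done)
        (yield (alist->cursor (cdr alist))
               (caar alist)
               (cdar alist)))))

;; Collect each step's values as a list; stops only when DONE is called.
(define (cursor->list cursor)
  (let loop ((c cursor) (acc '()))
    (c (lambda (next . vals) (loop next (cons vals acc)))
       (lambda () (reverse acc)))))

;; An eof-object is an ordinary element here and does not end traversal.
(display (length (cursor->list (list->cursor (list 1 (eof-object) 3)))))
(newline)                                            ; prints 3
(write (cursor->list (alist->cursor '((a . 1) (b . 2)))))
(newline)                                            ; prints ((a 1) (b 2))
```

Whether a design along these lines could match the performance of generators is a separate question that this sketch does not address.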

Daphne