On Fri, Dec 10, 2021 at 2:26 AM Marc Nieper-Wißkirchen <xxxxxx@gmail.com> wrote:
> If a "string" means a piece of text, then excluding the NUL character is reasonable. On the other hand, if "string" means a sequence of (Unicode) characters,
It doesn't, in fact; it only means a sequence of Unicode characters no longer than some implementation-dependent limit. The R6RS tries to defend against this difficulty in section 5.6:
As defined by this document, the Scheme programming
language is safe in the following sense: The execution of
a safe top-level program cannot go so badly wrong as to
crash or to continue to execute while behaving in ways
that are inconsistent with the semantics described in this
document, unless an exception is raised.
Violations of an implementation restriction must raise
an exception with condition type &implementation-restriction, as must all violations and errors that would
otherwise threaten system integrity in ways that might result in execution that is inconsistent with the semantics
described in this document.
But this demands more of a conforming implementation than it can possibly supply. Even apart from crude counterexamples like "if the computer catches on fire, it can't raise an exception even though the program crashed", on modern OSes a program is normally killed externally, with no chance of recovery, when global memory is scarce, and the process that is killed may not even be the one most at fault. So this passage is the most preposterous of all the preposterous MUSTard that appears in R6RS. A pure mathematical model can ignore such Real World issues, but the specification of a machine (which is what a programming language is) cannot.
In short, at most we have an *approximation* to the abstract idea of a sequence of Unicode characters. (Note that an R7RS implementation, unlike an R6RS one, may also support non-Unicode characters in strings, though I know of none that does so at present; Chicken, however, does support 983,040 non-Unicode characters with codepoints from #x110000 to #x1FFFFF.)
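
For anyone who wants to check the size of that codepoint range, here is a small sketch in Python (not Scheme, purely for the arithmetic). It assumes both bounds are inclusive; the names FIRST_EXTRA and LAST_EXTRA are mine, not anything from Chicken.

```python
import sys

# Codepoint range quoted above for the extra characters
# (assumption: both bounds are inclusive).
FIRST_EXTRA = 0x110000
LAST_EXTRA = 0x1FFFFF

# Unicode itself ends at U+10FFFF, so the range should start just past it.
starts_past_unicode = FIRST_EXTRA == sys.maxunicode + 1

# Number of codepoints in the inclusive range.
extra_count = LAST_EXTRA - FIRST_EXTRA + 1

print(starts_past_unicode, extra_count)
```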