Re: New draft of SRFI 130: Cursor-based string library John Cowan 23 May 2016 16:52 UTC

William D Clinger scripsit:

> Yes, texts would be a new and disjoint data type.  That's why I called
> them texts instead of strings.

So not merely disjoint, but separately built from the ground up.  My
spans were layered on top of strings.

> If Scheme programmers are rejecting all new data types,

That's obviously not the case.  The objection was to having two
different data structures which were both basically sequences of
characters and overlapped in almost all of their use cases, and still
less did they want such a thing baked into the standard.  But who knows
whether SRFI 13, SRFI 130, or neither will become part of R7RS-large?
Voters may rebel against cursors as they rebelled against spans.

> Although cursors might be the same as indexes, they might also be
> different.  The only correct way for application programs to use
> SRFI 130 is to treat cursors as a new disjoint data type, which is a
> major reason why it's so hard to use SRFI 130.  If you're doing your
> development in a system that makes cursors the same as indexes, you
> can't find your mistakes through testing in that system.

All quite true.  The only reason not to require cursors to be disjoint
is that the only portable way to create disjoint types is to wrap them
in records, and many Schemes make unwrapping records very slow.
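
[Editorial illustration, not part of the original message.]  The record
wrapping described above can be sketched as follows.  This is only a
sketch under stated assumptions: it is written in Python rather than
Scheme, it assumes cursors are byte offsets into a UTF-8 encoding
(SRFI 130 does not require that), and the names Cursor, cursor_ref, and
cursor_next are hypothetical, not SRFI 130 procedures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cursor:
    """Hypothetical disjoint cursor type: wraps a UTF-8 byte offset."""
    offset: int


def _width(lead: int) -> int:
    """Length in bytes of the UTF-8 sequence that starts with this lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    return 4


def cursor_ref(data: bytes, cur: Cursor) -> str:
    """Return the character at cur; a bare integer index is rejected."""
    if not isinstance(cur, Cursor):
        raise TypeError("expected a Cursor, got an index")
    n = _width(data[cur.offset])
    return data[cur.offset:cur.offset + n].decode("utf-8")


def cursor_next(data: bytes, cur: Cursor) -> Cursor:
    """Advance cur by one character (not by one byte)."""
    return Cursor(cur.offset + _width(data[cur.offset]))
```

With the wrapper, passing the index 2 where a cursor is expected fails
immediately.  Without it, index 2 and byte offset 2 would name the same
character in an ASCII-only string such as "abc" but different positions
in "aλb", so a test suite that used only ASCII data would not expose
the mistake.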

> Once the community understands the advantages of immutable texts
> with O(1) random access via character indexes, and how well they
> interoperate with mutable strings (because you can use the same
> indexes into each), I think the community would be willing to allow
> someone like you or me to write a new SRFI specifying immutable texts.

Maybe.  It won't be me: I have other plans, as noted in my previous
email.

> You're inventing an imaginary problem.

Yes, now that I understand that texts are distinct from the ground up,
not merely formally disjoint.

> From your comment above, I infer your belief is the opposite of that
> fact.

"That turns out not to be the case."  --Kevin Renner

> Yes, conversions between formats would be necessary, just as
> conversions between multiple formats are necessary in programs that
> use both Java and C (for example), or in Java programs that run on
> Linux systems where Unicode files tend to use UTF-8, or in C programs
> that run on Windows systems where Unicode files tend to use something
> approximating UTF-16.

Interoperation between Java and C is already so slow, for reasons
unrelated to strings, that the extra cost of conversion is probably
not important.  That's why the gcj compiler introduced the much more
efficient CNI, while still being required to support JNI.
Unfortunately, CNI never caught on.

> [The foof implementation] works now that I've fixed it.

Sure.  I meant "modulo bugs" and should have said so.  The version is
after all very new, but I wanted it in the repository precisely so that
it would be vetted by non-Chibi users.

> I suspect you yourself have done little testing of the foof
> implementation, and that what testing you may have done has been in
> non-R7RS systems where such bugs are easily overlooked.

Quite right.  I can't do everything, and fortunately the implementations
of SRFIs are not *reference* implementations (despite being called
that): it is the prose, not the implementation, that is definitive.  I
do appreciate all your assistance with both.

> SRFI 130 says it's "an error unless start and end are both cursors or
> both indexes", but SRFI 130 doesn't say the four optional parameters
> of (say) string-prefix-length have to be all cursors or all indexes.

That's an oversight I will fix.
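
[Editorial illustration, not part of the original message.]  One way an
implementation might check the "all cursors or all indexes" condition
for the four optional parameters is sketched below.  It is in Python
rather than Scheme, the Cursor class and check_same_kind are
hypothetical names, and None stands for an omitted optional argument.
Note that "it is an error" in a SRFI does not oblige an implementation
to signal anything; raising here is just one possible choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cursor:
    """Hypothetical disjoint cursor type wrapping an internal offset."""
    offset: int


def check_same_kind(*positions):
    """Raise TypeError unless the supplied positions are all Cursors
    or all integer indexes.  None marks an omitted optional argument."""
    supplied = [p for p in positions if p is not None]
    kinds = {type(p) for p in supplied}
    if not kinds <= {Cursor, int}:
        raise TypeError("positions must be cursors or indexes")
    if len(kinds) > 1:
        raise TypeError("cannot mix cursors and indexes")
    return True
```

A procedure shaped like string-prefix-length could call
check_same_kind(start1, end1, start2, end2) before doing any work.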

> I don't have any problem with the adoption of SRFI 130 as is, but I do
> have a problem with promoting it as though its use of cursors is going
> to make string processing run faster.

"I only said 'if'!"  --Alice in Through the Looking Glass

Actually, the non-normative rationale merely says it's *possible* to use
cursors to iterate more efficiently through strings.

--
John Cowan          http://www.ccil.org/~cowan        xxxxxx@ccil.org
The competent programmer is fully aware of the strictly limited size of his own
skull; therefore he approaches the programming task in full humility, and among
other things he avoids clever tricks like the plague.  --Edsger Dijkstra