Re: SRFI 130 - "span" prefix John Cowan 05 Dec 2015 20:00 UTC

Shiro Kawai scripsit:

> A simple-minded approach is to copy the string body on any mutation.
> That is what Gauche does: (string-set! foo ...) actually copies the
> 1000000-character string content, so the sharing is broken at that point.

Chicken with the utf8 egg installed does that only if the size in bytes
has changed, but then it doesn't do string sharing.  However, since
there is no indirection within the string object, it has to force a GC
so that all pointers to the string are changed.  (Without the utf8 egg,
you just have Latin-1 strings, no sharing, trivial mutation.)
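
The "only if the size in bytes has changed" rule can be sketched roughly as
follows. This is a Python illustration, not Chicken's actual code: the names
char_span and utf8_string_set are made up for the example, and returning a
fresh buffer stands in for the step where the real system has to update every
pointer to the string.

```python
def char_span(buf: bytearray, index: int) -> tuple[int, int]:
    """Return (start, end) byte offsets of the index-th character in UTF-8 buf."""
    if index < 0:
        raise IndexError("character index out of range")
    count = -1
    start = None
    for pos, byte in enumerate(buf):
        if (byte & 0xC0) != 0x80:      # not a continuation byte: a new character starts
            count += 1
            if count == index:
                start = pos
            elif count == index + 1:
                return start, pos
    if start is None:
        raise IndexError("character index out of range")
    return start, len(buf)             # the character runs to the end of the buffer


def utf8_string_set(buf: bytearray, index: int, ch: str) -> bytearray:
    """Hypothetical string-set! on a UTF-8 buffer.

    Returns the same buffer when the new character encodes to the same number
    of bytes, and a newly allocated buffer when the byte size changes.
    """
    start, end = char_span(buf, index)
    encoded = ch.encode("utf-8")
    if len(encoded) == end - start:
        buf[start:end] = encoded       # same size: overwrite in place
        return buf
    # Size changed: the body must be rebuilt, so callers get a new object.
    return bytearray(buf[:start] + encoded + buf[end:])
```

Replacing "é" with "è" in "héllo" keeps the same buffer, while replacing it
with "e" yields a new, shorter one and leaves the old buffer untouched.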

> I'm betting that in future string-set! will fade out and it won't be an
> issue.

I think everyone is.

> One possible trick is to flag the string body if it is ever shared.
> It works like 1-bit reference counting; if it is flagged, it *may* be
> shared so we copy.  Assumption is that majority of strings, especially
> transient ones, aren't shared at all so copying is avoided.

Guile supports all three cases:  unshared substrings, shared substrings with
copy-on-write semantics, and shared substrings where mutation shows up
on the other side.
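
The flag trick quoted above, which yields the copy-on-write case, might look
something like this. It is a minimal Python sketch under my own assumptions:
the classes StringBody and Str and the field maybe_shared are invented for
illustration and do not describe how Gauche or Guile actually lay out strings.

```python
class StringBody:
    """Character storage that one or more string objects may point at."""

    def __init__(self, chars):
        self.chars = list(chars)
        self.maybe_shared = False      # the one-bit flag; once set it is never cleared


class Str:
    """Hypothetical string object: a (body, start, end) view onto a StringBody."""

    def __init__(self, text="", *, _body=None, _start=0, _end=None):
        self.body = _body if _body is not None else StringBody(text)
        self.start = _start
        self.end = len(self.body.chars) if _end is None else _end

    def __len__(self):
        return self.end - self.start

    def __str__(self):
        return "".join(self.body.chars[self.start:self.end])

    def substring(self, start, end):
        """Share the body instead of copying it, and flag it as possibly shared."""
        if not 0 <= start <= end <= len(self):
            raise IndexError("substring range out of bounds")
        self.body.maybe_shared = True
        return Str(_body=self.body, _start=self.start + start, _end=self.start + end)

    def set(self, index, ch):
        """Mutate one character, copying first only if the body may be shared."""
        if not 0 <= index < len(self):
            raise IndexError("string index out of range")
        if self.body.maybe_shared:
            # Take a private copy of just our own slice; the new body is unflagged.
            self.body = StringBody(self.body.chars[self.start:self.end])
            self.start, self.end = 0, len(self.body.chars)
        self.body.chars[self.start + index] = ch
```

A string that was never passed to substring is mutated in place with no
copy. Because the flag is a single bit rather than a count, the original
string still copies on its next mutation even after every substring is gone,
which is the price of not tracking the sharers.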

> The span approach allows experimenting with string representation
> strategies in the Scheme layer without touching the underlying string
> representation.  That is technically good, but I'm afraid that it
> leaves too much cruft for the future.

Eh, considering this SRFI as just another SRFI, I don't think that's
an issue.  Considered as part of R7RS-large (which will depend on what
the WG decides, of course), it's not that large an overlap.

> But portable code must use spans

Unless it needs mutable strings, yes.

> Having multiple string-like thingies in one language has long
> repercussions; in C++ I still deal with multiple string classes,
> Python 2 with ascii and unicode strings dies hard.

It's a straight trade-off as mentioned above: mutable (but typically slow
and space-consuming) vs. immutable (but typically fast and space-saving).
The problem in Python 2 is that the regular strings can't do Unicode,
which is a difference in architecture, not merely in implementation.
I'm not really familiar with the C++ story; is it because strings weren't
standardized soon enough?

--
John Cowan          http://www.ccil.org/~cowan        xxxxxx@ccil.org
The Unicode Standard does not encode idiosyncratic, personal, novel,
or private use characters, nor does it encode logos or graphics.