Re: constant-time access to variable-width encodings Shiro Kawai 13 Jul 2005 20:15 UTC
From: Per Bothner <xxxxxx@bothner.com>
Subject: constant-time access to variable-width encodings
Date: Wed, 13 Jul 2005 11:12:57 -0700

> The proposal is to allow string-ref to return #\partial for some indexes
> representing non-initial bytes or low-surrogate values.

Interesting proposal, and I agree with the need for length-changing
mutation (see my other post).

I feel a bit uncomfortable, though, with the fact that indexes and
string-length would differ among implementations, or even within the
same implementation under different character encodings. It makes a
data structure that holds a string and its indexes non-portable, for
example.

I'd agree with the proposal if it introduced a separate means of
indexing, distinct from the character count used by string-ref. Call it
'offset' for now. string-offset-ref, substring-offset, etc. would
provide offset-based operations, while string-ref, substring, etc.
would remain character-based.

It may be too cumbersome for the core language, though. It is also an
API centered too much on variable-length characters, one that
fixed-length character implementations, or other implementations (such
as a tree of segments), wouldn't care much about.

--shiro
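
The split between offset-based and character-based operations suggested above can be illustrated with a minimal sketch. It is written in Python rather than Scheme so that it runs as-is, and it rests on assumptions the message does not state: strings are stored as UTF-8, an "offset" is a byte position, and the `PARTIAL` object stands in for the proposed `#\partial`. The class `Utf8String` and its method names are hypothetical; they only mirror the names floated in the message and are not part of any SRFI.

```python
# Hypothetical sketch: a string stored as UTF-8 with two ways of indexing.
# Offsets are byte positions (assumption); indexes are character counts.

PARTIAL = object()  # stands in for the proposed #\partial value


def _width(lead: int) -> int:
    """Number of bytes in the UTF-8 sequence that starts with `lead`."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    return 4


class Utf8String:
    def __init__(self, text: str) -> None:
        self.data = text.encode("utf-8")

    # Offset-based operations: locating a byte is constant time.
    def offset_length(self) -> int:
        return len(self.data)

    def string_offset_ref(self, offset: int):
        lead = self.data[offset]
        if lead & 0xC0 == 0x80:
            return PARTIAL  # continuation byte, not the start of a character
        return self.data[offset:offset + _width(lead)].decode("utf-8")

    def substring_offset(self, start: int, end: int) -> str:
        # Raises UnicodeDecodeError if a boundary splits a character.
        return self.data[start:end].decode("utf-8")

    # Character-based operations: a linear scan over the bytes.
    def string_length(self) -> int:
        return sum(1 for b in self.data if b & 0xC0 != 0x80)

    def char_index_to_offset(self, index: int) -> int:
        if index < 0:
            raise IndexError(index)
        count = -1
        for offset, b in enumerate(self.data):
            if b & 0xC0 != 0x80:  # start of a character
                count += 1
                if count == index:
                    return offset
        raise IndexError(index)

    def string_ref(self, index: int) -> str:
        return self.string_offset_ref(self.char_index_to_offset(index))
```

In this sketch, character indexes and `string_length` depend only on the characters, so a data structure that stores them stays valid under any encoding. Offsets are tied to the UTF-8 representation assumed here and would differ under, say, UTF-16, which is the portability concern raised in the message.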