Re: Octet vs Char (Re: strings draft) bear (27 Jan 2004 19:06 UTC)
On Tue, 27 Jan 2004, Ken Dickey wrote:

>On Tuesday 27 January 2004 09:32 am, bear wrote:
>> On Mon, 26 Jan 2004, Ken Dickey wrote:
>> >Well color me dumb, but I don't see why getting O(1) is such a big deal.
>...
>> O(1) reference or character setting comes at the expense of O(n)
>> insertions, deletions, and non-identical-sized replacements.
>>
>> EG, if I change "the" to "a" at the beginning of a long string, and
>> I've represented it as a vector to get O(1) reference time, the rest of
>> the string has to be copied to move it two character spaces in memory.

>I was puzzled by the ropes discussion here because it seemed to be orthogonal
>to the Unicode discussion. I now see that its because it _is_ orthogonal to
>the Unicode discussion.

The only thing that Unicode has to do with it is that Unicode makes non-identical-sized replacements more likely, and makes it more likely that the programmer will not realize that a given operation involves non-identical-sized replacements.

Replacing one codepoint with another may wind up being a replacement of a character that takes 1 octet of UTF-8 to express with a character that takes 3 octets of UTF-8 to express, or vice versa. This sort of thing is amenable to your proposed approach of indexed fallback into another vector.

But replacing a character with a combining sequence of multiple codepoints, or vice versa, is also likely; in fact the Unicode Consortium's canonicalization algorithms do this all the time. In this case you're looking at things like replacing

U+212B ANGSTROM SIGN

with

U+0041 LATIN CAPITAL LETTER A, U+030A COMBINING RING ABOVE

and if your implementation treats the former as one character and the latter as two characters, which most do, you wind up with the same need to copy the rest of the string that changing "a" to "the" caused in ASCII strings. This is not amenable to your proposed approach of indexed fallback into another vector.
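
The two cases above can be sketched in a few lines of Python (used here only for illustration; nothing in the thread is written in it). The helper names `utf8_octets` and `replace_codepoint` are hypothetical, and a plain Python list stands in for a flat codepoint vector. The sketch assumes NFD as the canonicalization that produces the two-codepoint form.

```python
import unicodedata


def utf8_octets(s):
    """Number of octets needed to express s in UTF-8."""
    return len(s.encode("utf-8"))


def replace_codepoint(vec, index, replacement):
    """Replace one element of a flat codepoint vector with a sequence.

    Returns how many trailing elements end up at a new index, which is
    the copying cost a flat vector pays for a size-changing replacement.
    """
    tail = len(vec) - (index + 1)
    vec[index:index + 1] = replacement
    return tail if len(replacement) != 1 else 0


# Case 1: same codepoint count, different octet count in UTF-8.
angstrom = "\u212B"                      # ANGSTROM SIGN
print(utf8_octets("a"), utf8_octets(angstrom))

# Case 2: canonical decomposition turns one codepoint into two.
decomposed = unicodedata.normalize("NFD", angstrom)
print([unicodedata.name(c) for c in decomposed])

# Replacing the first element with the decomposed form shifts the tail.
text = [ord(c) for c in angstrom + " is a unit of length"]
moved = replace_codepoint(text, 0, [ord(c) for c in decomposed])
print(moved, "elements moved")
```

In the first case the element count is unchanged, so a side table for wide elements can absorb the difference; in the second the element count itself changes, so every later index shifts.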
What this means is that, while on an absolute level Unicode and rope representation are orthogonal issues, Unicode has patterns of likely use that rely heavily on the most expensive operations of vector representations.

And of course both came up here because the first draft of the FFI SRFI wanted a C pointer to a mutable memory area containing the internal representation of a Scheme string, and has to know this kind of "detail" to even make sense of what it finds there.

As a result of the discussions here, I'm now considering adding more types of string values, each with its own read syntax and conversions. For example,

#,(Latin-1-vector "hello world")

would be an octet vector where each octet is a Latin-1 character. This would make binary I/O using string-like constructions possible and give C programs the kind of FFI value they wanted. No characters outside Latin-1 would be allowed, of course.

#,(UTF32-vector "hello world")

would be a "string" indexed by Unicode codepoint rather than by character. Handy for FFI, and also allows people to create invalid or non-canonical combining sequences, assign values that aren't even mapped codepoints to arbitrary locations, or do other linguistically wrong operations. However, converting it to a regular string would canonicalize it, and would fail if it contained non-characters.

				Bear
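
A rough Python model of the two proposed value types, offered only as a sketch. All function names are hypothetical. It assumes that "canonicalize" means NFC normalization and that "non-characters" covers surrogates, out-of-range values, and the Unicode noncharacter codepoints; the message does not pin down either point.

```python
import unicodedata


def latin1_vector(s):
    """One octet per character; raises UnicodeEncodeError outside Latin-1."""
    return s.encode("latin-1")


def utf32_vector(s):
    """One integer per codepoint; elements may later be set to anything."""
    return [ord(c) for c in s]


def is_noncharacter(cp):
    """True for U+FDD0..U+FDEF and codepoints ending in FFFE or FFFF."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def utf32_vector_to_string(vec):
    """Convert to a regular string: validate, then canonicalize (NFC assumed)."""
    for cp in vec:
        if not 0 <= cp <= 0x10FFFF:
            raise ValueError("not a codepoint: %r" % cp)
        if 0xD800 <= cp <= 0xDFFF or is_noncharacter(cp):
            raise ValueError("not a character: U+%04X" % cp)
    return unicodedata.normalize("NFC", "".join(chr(cp) for cp in vec))


octets = latin1_vector("hello world")
codepoints = utf32_vector("hello world")
codepoints[0:1] = [0x41, 0x30A]          # non-canonical sequence is allowed here
print(len(octets), len(codepoints))
print(len(utf32_vector_to_string(codepoints)))
```

The codepoint vector accepts arbitrary edits, and the checks run only at the conversion boundary, which matches the division of labor described in the message.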