Re: Strings/chars Shiro Kawai 23 Dec 2003 21:33 UTC

From: Michael Sperber <xxxxxx@informatik.uni-tuebingen.de>
Subject: Re: Strings/chars
Date: Tue, 23 Dec 2003 11:56:07 +0100

> What's your take on combining characters?

I don't have a clear idea at the application level, and can only
imagine that we need several layers.  As Tom Lord mentioned,
eventually we'd have such layers, and the R5RS character would fade
away in the long term.

Bear's approach (as far as I understand, each "character" consists
of a base character + zero or more combining characters; correct me
if I'm wrong) looks suitable for most linguistic text processing.
An application may need more data per character, such as
how it is represented in the original data, or which language
it belongs to.  That is application dependent, so if we ever want
to expose it to the C FFI, a "character" wouldn't just map to an
integer; instead, it would be an opaque object with a full set of APIs
to extract various information.
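
To make the "base character + zero or more combining characters" idea
concrete, here is a minimal sketch in Python.  It is not Bear's actual
design: the names Cluster and clusters are hypothetical, and the test for
"combining" is simplified to a nonzero canonical combining class, which
misses some marks (those with combining class 0, for example).

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical opaque "character": one base codepoint plus marks."""
    base: str
    marks: tuple = ()

    def text(self):
        # Reassemble the original codepoint sequence for this cluster.
        return self.base + "".join(self.marks)


def clusters(s):
    """Group codepoints into base + zero or more combining characters."""
    out = []
    for ch in s:
        # unicodedata.combining() returns the canonical combining class,
        # which is 0 for characters that are not combining marks.
        if unicodedata.combining(ch) and out:
            last = out[-1]
            out[-1] = Cluster(last.base, last.marks + (ch,))
        else:
            # Simplification: a leading mark with no base starts a cluster.
            out.append(Cluster(ch))
    return out


if __name__ == "__main__":
    for c in clusters("e\u0301a"):
        print(repr(c.base), [hex(ord(m)) for m in c.marks])
```

Per-character data such as the source encoding or the language would
become extra fields on such an object, which is why it would no longer
map to a single integer across an FFI.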

So far I haven't reached that stage; I'm still dealing with the
layer that tweaks codepoints, and the programs I'm writing haven't
required more than that (fortunately).  My next step will
be to build a layer similar to Bear's on top of the codepoint API.

--shiro