Re: Case-mapping, Unicode & internationalisation

Re: Case-mapping, Unicode & internationalisation Sergei Egorov 24 Jan 2000 17:10 UTC

I believe that UPCASE-STRING, DOWNCASE-STRING, and
TITLECASE-STRING belong to a separate domain of 'text processes'
that should be addressed in separate SRFIs. I think that the best approach
in a Unicode context is to treat Scheme strings as just arrays of characters
('code points') with no special well-formedness constraints; for example,
it should be legal to have a string consisting of combining characters
with no preceding base character, or a string with a high-half surrogate
character not followed by a low-half surrogate character.
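
As a rough illustration of the "array of code points" model, here is a small sketch in Python rather than Scheme, chosen only because Python's built-in str happens to behave this way: it stores code points and performs no well-formedness checks. The variable names are made up for the example.

```python
# Python's str is a sequence of code points with no well-formedness checks,
# which makes it a convenient stand-in for the model described above.

# A combining acute accent (U+0301) with no preceding base character.
lone_combining = "\u0301"

# A single surrogate code point (U+D800) with no partner.
lone_surrogate = "\ud800"

# Both are legal strings of length 1; indexing yields the code point itself.
assert len(lone_combining) == 1
assert len(lone_surrogate) == 1
assert ord(lone_surrogate[0]) == 0xD800

# Concatenation is plain array concatenation; nothing is validated or merged.
s = lone_combining + "e" + lone_surrogate
code_points = [ord(c) for c in s]
```

Under this model the string procedures never need to know what a "text element" is; they only index, slice, and concatenate code points.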
   A "string" library can contain relatively simple procedures that are
useful in traditional applications; it can also serve as a basis for
building 'text processes' described in the Unicode standard.
   A "char" library can contain procedures to access character properties
described in the Unicode database.
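
A "char" library of this kind might look roughly like the following sketch, written in Python and using its standard unicodedata module as a stand-in for the Unicode database. The helper name char_properties and the choice of properties are assumptions made for the example, not a proposed interface.

```python
import unicodedata

def char_properties(ch):
    """Return a few Unicode Character Database properties for one code point."""
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),          # general category, e.g. "Lu"
        "combining_class": unicodedata.combining(ch),  # canonical combining class
        "decomposition": unicodedata.decomposition(ch),
    }

props_a = char_properties("A")
props_acute = char_properties("\u0301")  # COMBINING ACUTE ACCENT
```

The point is that these are per-character lookups: they take one code point and need no knowledge of the surrounding string.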
  A "text" library can include the 'text' data type representing well-formed
character sequences and allowing effective implementation of text processes
plus all the necessary primitives to work with this data type.
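
One way to picture such a 'text' type is a wrapper that refuses ill-formed sequences at construction time. The Python sketch below is purely hypothetical: the class name Text, the function is_well_formed, and the two rules it enforces (no surrogate code points, no combining mark in first position) are assumptions chosen to mirror the examples of ill-formed strings given earlier, not a definition taken from the Unicode standard.

```python
import unicodedata

def is_well_formed(s):
    """Assumed rules: no surrogate code points, no leading combining mark."""
    # In a code-point string, any code point in D800..DFFF is an unpaired surrogate.
    if any(0xD800 <= ord(c) <= 0xDFFF for c in s):
        return False
    # General categories Mn, Mc and Me are the combining marks.
    if s and unicodedata.category(s[0]).startswith("M"):
        return False
    return True

class Text:
    """Hypothetical wrapper that only ever holds well-formed sequences."""

    def __init__(self, s):
        if not is_well_formed(s):
            raise ValueError("ill-formed character sequence")
        self._s = s

    def __len__(self):
        return len(self._s)

    def __str__(self):
        return self._s
```

With the check done once at the boundary, text processes built on top of the type could assume well-formed input instead of re-validating it.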
  A "basic text processes" library can contain specification/implementation
of canonical and compatibility decomposition based on text primitives.
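
For readers unfamiliar with the two decompositions, the following Python sketch shows the difference using unicodedata.normalize from the standard library. The wrapper names canonical_decompose and compatibility_decompose are invented for the example.

```python
import unicodedata

def canonical_decompose(s):
    """Canonical decomposition (normalization form NFD)."""
    return unicodedata.normalize("NFD", s)

def compatibility_decompose(s):
    """Compatibility decomposition (normalization form NFKD)."""
    return unicodedata.normalize("NFKD", s)

e_acute = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE
fi_lig = "\ufb01"   # LATIN SMALL LIGATURE FI

# Canonical decomposition splits the precomposed letter into base + accent.
assert canonical_decompose(e_acute) == "e\u0301"

# The ligature has only a compatibility decomposition, so NFD leaves it alone
# while NFKD expands it to two letters.
assert canonical_decompose(fi_lig) == fi_lig
assert compatibility_decompose(fi_lig) == "fi"
```

Both operations can change the length of the string, which is one reason to treat them as text processes rather than as simple string procedures.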
  Other libraries can implement other text processes, including case
mapping, locating text element boundaries, and collation for different
languages.
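
A small example of why case mapping fits better among the text processes: full Unicode case mapping is not a character-to-character function. The Python sketch below contrasts Python's built-in str.upper, which applies the full mapping, with a hypothetical per-character version (upcase_per_char, invented here) that can only substitute one code point for another.

```python
def upcase_per_char(s):
    """Hypothetical char-by-char upcase: use a mapping only if it is one code point."""
    out = []
    for ch in s:
        up = ch.upper()
        # A one-to-many mapping cannot be expressed per character, so keep ch.
        out.append(up if len(up) == 1 else ch)
    return "".join(out)

word = "stra\u00dfe"            # German "strasse" spelled with sharp s (U+00DF)
full = word.upper()             # full case mapping: sharp s becomes "SS"
limited = upcase_per_char(word)

assert full == "STRASSE"
assert limited == "STRA\u00dfE"
assert len(full) == len(word) + 1
```

A procedure that maps a char-level upcase over a string cannot produce the longer result, so it either gives a different answer or has to stop being a simple array operation.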

-- Sergei