Re: strings draft (premature, need first class type definition support first?) Tom Lord 24 Jan 2004 22:18 UTC


    > From: Paul Schlie <xxxxxx@comcast.net>

    > (maybe scheme should first be refined to enable the definition of new first
    >  class types/subtypes, prior to requesting any particular new type support?)

    > Upon further consideration of the proposed Unicode character support
    > enhancements, which I presume are desired to enable more generalized
    > language/script-system text processing;

That's a goal -- but only an indirect goal.  The proposed changes
should remove obstacles to writing Scheme programs that do Unicode
text processing using STRING? and CHAR? as basic types.  However, the
proposed changes are _not_ intended to provide sufficient mechanism to
write Unicode text processes in a portable way.   It will take
additional standards (for which SRFIs are appropriate, I think) to
enable portable implementation of Unicode text processes.

The proposed R6RS changes are also intended to clarify some points
that are arguably ambiguous in R5RS -- the contents of the portable
character set and identifier equivalence over that character set, for
example.

The proposed R6RS changes clarify the meaning of integer string
indexes over the portable character set and give guidance about how to
extend the meaning of string indexes in Unicode-supporting
implementations.   This point in particular is critical for a portable
FFI (and also for data-exchange between Scheme environments).
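
To illustrate why the meaning of an index matters at an FFI boundary,
here is a minimal sketch. It is written in Python rather than Scheme
and is not part of the proposal; the helper name index_units is
hypothetical. It counts the same text under three indexing
conventions. For ASCII-only text (standing in here for a small
portable character set, which is an assumption) all three agree; once
the text leaves that set they diverge, so an integer index is only
meaningful if both sides agree on what it counts.

```python
def index_units(text):
    """Return how many index positions `text` spans under three conventions."""
    return {
        # One index per Unicode code point (what Python's own str indexing uses).
        "code_points": len(text),
        # One index per 16-bit code unit, as a UTF-16 based host would count.
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        # One index per byte, as a C library handed UTF-8 would count.
        "utf8_bytes": len(text.encode("utf-8")),
    }


ascii_text = "lambda"
# GREEK SMALL LETTER LAMDA followed by a character outside the BMP.
mixed_text = "\u03bb\U0001D54F"

print(index_units(ascii_text))
print(index_units(mixed_text))
```

For the ASCII text every count is 6. For the second string the counts
are 2 code points, 3 UTF-16 code units, and 6 UTF-8 bytes, so "index 1"
names a different position under each convention.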

The proposed R6RS changes are intended to ensure the possibility of
writing portable Scheme programs that can process source texts of
their host implementations where those source texts consist of
standard syntax but an extended character set.

    > Where the interface level of abstraction I would guess should be
    > more capable of being able to manipulate words, sentences,
    > capitalization, punctuation, justification, etc., and determine
    > if whole words are lexically and/or syntactically equivalent,
    > plural, etc. within a given language and script system for
    > example (including the SCHEME language);

Things such as words and sentences are linguistic concepts.  R6RS
should enable people to write, for example, SRFIs about these things
-- it should not provide those things itself.

The kinds of processing needed to handle Scheme source texts are quite
a bit simpler than text processing.  It would be weird and needlessly
restrictive to try to define Scheme source text processing as a
specific example of a more general linguistic text processing.

    > This I suspect is possibly really what folks should be spending their time
    > to refine, because if scheme more natively supported the ability to define
    > new first-class data types/sub-types, and correspondingly extend its core
    > procedures to be aware of them; numerous new facilities and features could
    > be experimented with and refined, without having to require a language
    > revision or new implementation to enable it.

I think you are on the scent of a red herring.

-t