Re: Should SRFI-115 character sets match extended grapheme clusters? John Cowan (12 May 2014 03:06 UTC)

Alex Shinn scripsit:

> Normalization was in the early issues and dismissed because of lack
> of implementation support and unclear costs in new implementations.
> I think good recommended practice for now is to just normalize both
> inputs and patterns separately.

Okay, I can live with that.  But normalizing an SRE is not a matter of
normalizing the strings in the SRE: indeed, that will break it.  So at
the very least I think a normalize-sre procedure must be provided that
takes an SRE and does the nitty-gritty of selectively expanding charsets
into disjunctions of sequences.  That would not be incompatible
with PCRE, because its effect is global.
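
To illustrate the point, here is a minimal sketch in Python rather than
Scheme.  Everything in it is an assumption made for illustration: the
name normalize_sre is taken from the proposal above, but the data
representation (strings as literals, frozensets as charsets, tuples as
operator forms) is hypothetical and is not the SRFI-115 API.  It shows
that a charset member whose normalized form is more than one code point
has to leave the charset and become a literal sequence in a disjunction.

```python
import unicodedata

def normalize_sre(sre, form="NFD"):
    """Normalize a toy SRE (hypothetical representation, not SRFI-115).

    str       -> literal string
    frozenset -> charset of single-code-point strings
    tuple     -> (operator, arg, ...), e.g. ("seq", ...) or ("or", ...)
    """
    if isinstance(sre, str):
        # Literal strings can simply be normalized.
        return unicodedata.normalize(form, sre)
    if isinstance(sre, frozenset):
        singles, sequences = set(), set()
        for ch in sre:
            n = unicodedata.normalize(form, ch)
            # A member that normalizes to several code points can no
            # longer live in a charset; it becomes a literal sequence.
            (singles if len(n) == 1 else sequences).add(n)
        if not sequences:
            return frozenset(singles)
        # Longer sequences first, so a leftmost-first matcher tries
        # "e" + combining mark before the bare "e".
        alternatives = sorted(sequences, key=lambda s: (-len(s), s))
        if singles:
            alternatives.append(frozenset(singles))
        return ("or", *alternatives)
    # Operator form: keep the operator, normalize the arguments.
    op, *args = sre
    return (op, *(normalize_sre(a, form) for a in args))

# Charset containing "e" and precomposed e-acute (U+00E9).
charset = frozenset({"e", "\u00e9"})

# Naive approach: normalize the characters and rebuild the charset.
# The result wrongly accepts a lone combining acute accent (U+0301).
naive = frozenset(unicodedata.normalize("NFD", "".join(charset)))

# Selective expansion: the charset becomes a disjunction.
expanded = normalize_sre(("seq", "caf", charset))
```

Here naive contains U+0301 as a member in its own right, which is the
breakage described above, while expanded keeps "e" in the charset and
moves "e" followed by U+0301 into an alternative.  The sketch ignores
complemented charsets, ranges, and numeric repetition arguments, all of
which a real normalize-sre would have to handle.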

--
John Cowan          http://www.ccil.org/~cowan        xxxxxx@ccil.org
I Hope, Sir, that we are not mutually Un-friended by this Difference
which hath happened betwixt us.
     --Thomas Fuller, Appeal of Injured Innocence (1659)