Re: the "Unicode Background" section
bear 22 Jul 2005 15:45 UTC
On Thu, 21 Jul 2005, Thomas Lord wrote:
>You are concerned about sequences containing isolated (unpaired)
>surrogates and their implications for string algebra. Your
>concerns are entirely reducible to a concern with UTF-16 --
>in all other encodings, there is no ambiguity.
I want to know something: what does a string containing an
unpaired surrogate mean? What is represented by it? How
can anything handle it sensibly when rendering, reading, or
writing it?
As far as I can tell, the only use for a string containing
an unpaired surrogate is an abuse, where you're using strings
to store some other kind of data.
So I don't regard it as being at all important, or even
appropriate, to allow unpaired surrogates in strings.
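
To make the question concrete, here is a minimal sketch in Python of
what "unpaired" means at the UTF-16 code unit level. The function name
find_unpaired_surrogates is made up for illustration, and the input is
assumed to be a plain list of 16-bit integers rather than any
particular string type.

```python
def find_unpaired_surrogates(units):
    """Return the indices of surrogate code units that are not part of
    a well-formed high/low pair.

    `units` is assumed to be a sequence of ints in range 0..0xFFFF.
    """
    bad = []
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:
            # High surrogate: valid only if a low surrogate follows.
            if i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2  # skip the whole pair
                continue
            bad.append(i)
        elif 0xDC00 <= u <= 0xDFFF:
            # Low surrogate with no high surrogate in front of it.
            bad.append(i)
        i += 1
    return bad


def encodes_as_strict_utf8(s):
    """Report whether Python's strict UTF-8 encoder accepts the string."""
    try:
        s.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False
```

A well-formed pair such as [0xD83D, 0xDE00] yields no indices, while a
lone 0xD83D followed by 0x0041 is reported at index 0. Python's strict
UTF-8 encoder likewise rejects a string holding a lone surrogate code
point, which is one example of a writer that has no sensible output
for such a string.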
Bear