Re: Unicode surrogates

Re: Unicode surrogates bear 22 Mar 2006 23:51 UTC

On Mon, 13 Mar 2006, Per Bothner wrote:

>bear wrote:
>> To put it another way, Windows allows characters that are not part
>> of Unicode to be used to name files.  If we restrict our character
>> set for filenames to Unicode-only, we will not be able to open
>> those files.  That problem is real.
>
>That does not mean it's a problem we need to solve.
>
>If you make it easy to create filenames containing unpaired surrogates,
>that just means you make it easy to create files with garbage filenames.

At first I disagreed with the idea that it wasn't a problem that
we needed to solve, but as I think about it you're right...  If
a particular implementation wants to be useful for systems work
on Windows, it needs to solve this problem.  But the standard need
not do so.  For the standard it's entirely reasonable to solve
only the problem of opening and using files that have valid Unicode
filenames and leave methods of working with other files unspecified.
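
For concreteness, here is a rough sketch of what "valid" would mean
for a filename stored as UTF-16 code units: no surrogate appears
without its partner.  This is illustrative Python, not proposed
standard text; the name has_unpaired_surrogate and the choice to
take a plain list of integers are my own assumptions.

```python
def has_unpaired_surrogate(units):
    """Return True if a sequence of UTF-16 code units has a lone surrogate.

    `units` is assumed to be a sequence of integers in 0..0xFFFF.
    """
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:
            # High surrogate: only valid when a low surrogate follows.
            if i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2  # Skip the whole pair.
                continue
            return True
        if 0xDC00 <= u <= 0xDFFF:
            # Low surrogate with no preceding high surrogate.
            return True
        i += 1
    return False
```

An implementation that follows only the standard could refuse any
name for which this returns true, while a Windows-oriented one would
need some other representation to pass such names through.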

>I don't see that as a feature.  Any such filenames are presumably
>unintentional and due to bugs.

Actually, they are not.  They are being used on purpose, the same
way 8-bit extended characters were used in conjunction with comm
programs that supported only seven-bit ASCII, back in 1983 or
thereabouts: to provide a final layer of "security
by obscurity."  When I operated a bulletin board system way back
when, I remember having the format utility (and a few others)
renamed to something with characters that people couldn't type
over the serial drivers I had installed.  The current situation on
Windows is similar in that the system protections from the current
user are generally inadequate and a "cheap trick" like this can
stop hostile scripts from running.

				Bear