Re: How would we use BCP 47 strings in a simple way?

Re: How would we use BCP 47 strings in a simple way? Lassi Kortela 02 Aug 2020 08:34 UTC

> en-US: "Get your new radials at the tire center!"
> en-GB: "Get your new radials at the tyre centre!"
> en-CA: "Get your new radials at the tire centre!"

lol. It did get it right!

> Then a lookup request for "en" will immediately fail, and a request for
> "en-IN" will fail, truncate the search string to "en", and fail again.
> In either case the anglophone user will probably get the original
> Finnish.  (If you use the filtering algorithm instead, you will get the
> three English versions with no indication of which one to use.)
>
> So to avoid this happening, we make sure that there is always a string
> tagged "en" as well as "en-*" strings.  Since Americans are the
> most numerous (and also the most ignorant about other people's
> spellings), it probably makes sense to add this additional localization:
>
> en: "Get your new radials at the tire center!"
>
> Then a request for "en" will succeed, and a request for "en-IN" will
> fail, truncate the search string to "en", and succeed.
>
> All this stuff is very carefully spelled out in the second half of BCP
> 47.  The detailed algorithm descriptions amount to implementations, so
> there is no point in rediscovering the wheel.

Specs like this are heavy reading for people who are not localization
experts: every paragraph reads like there must be subtleties and implied
contextual knowledge that we are bound to miss. By extension we also
have no confidence in our ability to correctly implement the spec or to
extract a sensible sub-spec out of it. That's why I try to spell things
out as if for a five-year-old and rely on you to sanity-check our work.

BCP 47 section 3.3.1. Basic Filtering
<https://tools.ietf.org/html/bcp47#section-3.3.1> says: "Basic filtering
is identical to the type of matching described in [RFC3066], Section 2.5
(Language-range)."

RFC 3066 section 2.5 Language-range
<https://tools.ietf.org/html/rfc3066#section-2.5> says: "A
language-range matches a language-tag if it exactly equals the tag, or
if it exactly equals a prefix of the tag such that the first character
following the prefix is "-"."

Is this what you have in mind?
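
To make the quoted rule concrete, here is a minimal sketch of basic
filtering in Python (not Scheme, only to keep the illustration short).
The function names are made up for this example. The case-insensitive
comparison and the "*" wildcard come from RFC 4647 section 3.3.1 rather
than from the sentence quoted above.

```python
def basic_filter_matches(language_range, tag):
    """Return True if language_range matches tag under basic filtering."""
    rng = language_range.lower()
    t = tag.lower()
    if rng == "*":
        # The special range "*" matches any tag.
        return True
    # Exact match, or a prefix of the tag that is followed by "-".
    return t == rng or t.startswith(rng + "-")


def basic_filter(language_range, tags):
    """Return every tag matched by language_range, in the original order."""
    return [t for t in tags if basic_filter_matches(language_range, t)]


available = ["fi", "en-US", "en-GB", "en-CA"]
english = basic_filter("en", available)
```

With the tags from the tire example, filtering on "en" yields all three
English variants and says nothing about which one to prefer, which is
the problem described in the quoted text.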

As for prioritizing variants of the same language, would this be handled
by the procedure that returns the localizations? I.e. its internal list
would be sorted in some sensible order, and the search would simply
return the first match.
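
A rough sketch of that idea, again in Python for brevity. The list
layout, the ordering (en-US first) and the name first_match are
assumptions for illustration, not anything the spec prescribes.

```python
# Localizations kept as (tag, text) pairs, in priority order.
LOCALIZATIONS = [
    ("en-US", "Get your new radials at the tire center!"),
    ("en-GB", "Get your new radials at the tyre centre!"),
    ("en-CA", "Get your new radials at the tire centre!"),
]


def first_match(language_range, localizations):
    """Return the first (tag, text) pair whose tag matches language_range.

    Matching follows the basic filtering rule: the range equals the tag,
    or equals a prefix of the tag that is followed by "-".
    Returns None when nothing matches.
    """
    rng = language_range.lower()
    for tag, text in localizations:
        t = tag.lower()
        if t == rng or t.startswith(rng + "-"):
            return tag, text
    return None


best = first_match("en", LOCALIZATIONS)
```

Here a request for "en" picks en-US only because it is listed first, so
the ordering of the list carries the priority decision.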

At this point we should write some code to prove that things are not too
hard.
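
As a first stab, here is a sketch of the lookup-with-truncation
behaviour from the quoted example, in Python for brevity. It assumes the
localizations live in a plain dictionary keyed by tag. The name lookup
and the placeholder standing in for the Finnish original are
hypothetical. Dropping a trailing single-character subtag together with
the subtag after it follows my reading of RFC 4647 section 3.4.

```python
def lookup(language_range, localizations, default=None):
    """Find the best localization by progressively truncating the range."""
    # Compare tags case-insensitively.
    table = {tag.lower(): text for tag, text in localizations.items()}
    subtags = language_range.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in table:
            return table[candidate]
        # No match: drop the last subtag and try the shorter range.
        subtags.pop()
        # A single-character subtag (such as "x") left at the end is
        # removed in the same step.
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return default


ORIGINAL = "<original Finnish text>"  # placeholder, not from the thread

WITHOUT_EN = {
    "en-US": "Get your new radials at the tire center!",
    "en-GB": "Get your new radials at the tyre centre!",
    "en-CA": "Get your new radials at the tire centre!",
}
WITH_EN = dict(WITHOUT_EN, en="Get your new radials at the tire center!")

fallback = lookup("en-IN", WITHOUT_EN, default=ORIGINAL)
found = lookup("en-IN", WITH_EN, default=ORIGINAL)
```

Without a plain "en" entry the request for "en-IN" falls through to the
default, and with it the truncated range "en" succeeds, matching the
two cases in the quoted text.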