What to use for language/locale identifiers?
Lassi Kortela 27 Jul 2020 05:55 UTC
John, as the resident i18n expert, do you have a recommendation for where
we should get the locale IDs for SRFI 198 and other APIs?
POSIX locales are like "cs_CZ.UTF-8" (language_country.encoding). If we
leave out the encoding we get "cs_CZ", or the symbol 'cs_CZ or 'cs-CZ or
'cs-cz. Would it be comprehensive enough if we use symbols like that?
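To make the variants concrete, here is a rough sketch in Python (not Scheme, just for illustration) of splitting a POSIX locale name and deriving the shorter IDs from it. The function names `split_posix_locale` and `locale_id` are made up for this example, and dropping the optional "@modifier" suffix is a simplifying assumption.

```python
def split_posix_locale(name):
    """Split 'language_country.encoding' into (language, country, encoding).

    Parts that are absent come back as None. Any '@modifier' suffix is
    dropped, which is a simplification made for this sketch.
    """
    name = name.split("@", 1)[0]
    base, _, encoding = name.partition(".")
    language, _, country = base.partition("_")
    return language or None, country or None, encoding or None


def locale_id(name, separator="-", lowercase=False):
    """Build an ID such as 'cs-CZ', 'cs_CZ' or 'cs-cz', ignoring the encoding."""
    language, country, _ = split_posix_locale(name)
    tag = separator.join(part for part in (language, country) if part)
    return tag.lower() if lowercase else tag


print(split_posix_locale("cs_CZ.UTF-8"))
print(locale_id("cs_CZ.UTF-8"))
print(locale_id("cs_CZ.UTF-8", separator="_"))
print(locale_id("cs_CZ.UTF-8", lowercase=True))
```

Whichever spelling is chosen, the encoding is the only part that gets thrown away; the language and country survive unchanged apart from separator and case.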
We could also use just the language 'cs. But should we keep the country?
It probably makes sense to have en-US and en-GB be different, for
example. But it's also nice to be able to ask for en-*.
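As a sketch of what asking for en-* could mean, assuming IDs are written as language-country with a hyphen, the following Python compares a pattern and an ID part by part. The name `locale_matches` is hypothetical, and the rule that en-* does not match a bare "en" is only one possible choice.

```python
def locale_matches(pattern, locale_id):
    """Return True if locale_id fits pattern, where '*' matches any one part.

    Comparison ignores case. The pattern and the ID must have the same
    number of hyphen-separated parts, so 'en-*' does not match plain 'en'.
    """
    want = pattern.lower().split("-")
    have = locale_id.lower().split("-")
    if len(want) != len(have):
        return False
    return all(w == "*" or w == h for w, h in zip(want, have))


for candidate in ("en-US", "en-GB", "cs-CZ", "en"):
    print(candidate, locale_matches("en-*", candidate))
```

Keeping the country in the ID is what makes this kind of query possible; if only 'en were stored, en-US and en-GB could no longer be told apart.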
The POSIX language and country codes look like they're based on ISO
standards (ISO 639 for languages, ISO 3166 for countries). Wikipedia has
a list of the language codes, of which there are two-letter and
three-letter variants, as you know:
<https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes>.
Here's a list of the Windows locales:
<https://www.science.co.il/language/Locale-codes.php>. It's unofficial,
so it may be incomplete.