Note also that there’s another typo in the definition, namely the category between N* and M*.
This typo is still in the document though I see that you fixed the other typo.
On Dec 9, 2019, 11:58 AM -0800, John Cowan wrote:
Good catch. The prose is right, the rules are wrong.  I'm preparing a PR that will fix this along with some minor typos.

On Mon, Dec 9, 2019 at 2:49 PM Chris Hanson wrote:

Unicode uses the term graphic characters to include whitespace, but here we follow Posix and call them printable characters, restricting the former term to exclude whitespace.

char-set:graphic      = char-set:printing + char-set:whitespace

char-set:printing     = category L* + category N* + category + category M*
                        category S* + category P*
On Dec 9, 2019, 11:13 AM -0800, John Cowan wrote:

On Mon, Dec 9, 2019 at 1:38 PM Arthur A. Gleckler wrote:
Back in April, John pointed out that SRFI 14 Charset Definitions is based on old definitions from Java 1.0, and that Unicode has changed quite a bit since then:

John has since proposed some revisions to SRFI 14 to bring it up to date with modern Unicode, but we haven't been able to get Olin Shivers, the SRFI author, to weigh in, so I've added a link to John's notes as a "post-finalization note," not an erratum, in the Status section.

Here's John's description of the proposed revisions:
The "Unicode, Latin-1 and ASCII definitions of the standard character sets" section below reflects Java 1.0, which in turn reflects Unicode 2.0.  Unicode's definitions of these groups of characters have been substantially revised and updated since then, although the ASCII and Latin-1 definitions are frozen and will always be correct.
This note recommends that implementers of new implementations and maintainers of existing ones update their implementations to use the current Unicode definitions, as detailed in the supplementary CharsetDefinitions file.
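
For implementers following along, here is a rough sketch (in Python, not Scheme, purely for illustration) of how membership in these sets could be derived from Unicode general categories. It follows the prose quoted earlier in the thread, where "graphic" excludes whitespace and "printing" includes it, rather than the rules as written. The function names are hypothetical, and the whitespace test is an assumption: it approximates the Unicode White_Space property as category Z* plus a handful of control characters, so check it against the CharsetDefinitions file before relying on it.

```python
import unicodedata

# Assumption: White_Space is approximated as category Z* plus the control
# characters U+0009..U+000D and U+0085. This is not taken from the SRFI text.
_WHITESPACE_CONTROLS = frozenset("\t\n\v\f\r\x85")


def is_whitespace(ch: str) -> bool:
    """True if ch is a separator (Z*) or one of the whitespace controls."""
    return unicodedata.category(ch).startswith("Z") or ch in _WHITESPACE_CONTROLS


def is_graphic(ch: str) -> bool:
    """True if ch is in category L*, N*, M*, S* or P* (no whitespace)."""
    return unicodedata.category(ch)[0] in "LNMSP"


def is_printing(ch: str) -> bool:
    """Graphic characters plus whitespace, per the prose definition."""
    return is_graphic(ch) or is_whitespace(ch)
```

With these definitions a space is printing but not graphic, while a control character such as U+0000 is neither.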