String literals inside bytevector literals Lassi Kortela 25 Sep 2019 10:45 UTC
>>> (2) makes bits of a bytevector that happen to also be valid ascii or
>>> utf-8 text "readable", but is more complicated to generate/parse and
>>> ends up as a worse form of (1) for very unprintable stuff.
>>
>> The thing is that character encoding is easily messed up by running
>> "iconv". (This problem concerns ordinary strings too.)
>
> Yes. On a subtler level, it conflates the model of the s-expression as a
> sequence of *characters* - and it being a sequence of bytes. A bit of a
> layer violation!

Exactly - you express it much better than I could :)

The particular byte-level encoding of characters in a text file is not
reliable. And with fancy encodings like UTF-8, lots of glyphs look
identical to each other while being encoded differently or are even
composed of different characters. So Scheme's full string syntax is a
particularly unreliable foundation for putting things into bytevectors
where presumably every byte-value matters.

Since ASCII is so simple and universal, we could make a concession for
trusting the byte-values of ASCII graphic characters.
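
To make the concern concrete, here is a small sketch in Python (not Scheme, and not part of any proposal in this thread). The helper names `byte_forms` and `is_ascii_graphic` are invented for the illustration, and "ASCII graphic" is assumed to mean the byte range 0x21 to 0x7E. It shows one glyph with two different UTF-8 byte sequences, the same character changing bytes under a re-encoding of the kind an "iconv" run performs, and ASCII graphic text staying put across ASCII-compatible encodings.

```python
import unicodedata

# Two strings that render as the same glyph but are different characters.
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

utf8_precomposed = precomposed.encode("utf-8")  # b'\xc3\xa9'
utf8_decomposed = decomposed.encode("utf-8")    # b'e\xcc\x81'


def byte_forms(text, encodings=("utf-8", "latin-1")):
    """Return the bytes `text` becomes under each encoding (hypothetical helper)."""
    return {enc: text.encode(enc) for enc in encodings}


def is_ascii_graphic(data):
    """True if every byte is an ASCII graphic character (assumed 0x21..0x7E)."""
    return all(0x21 <= b <= 0x7E for b in data)


# Same character, different bytes once the file has been re-encoded.
accented = byte_forms(precomposed)

# ASCII graphic text is byte-for-byte identical in both encodings.
plain = byte_forms("#u8(abc)")

print(utf8_precomposed, utf8_decomposed)
print(accented)
print(plain, is_ascii_graphic(plain["utf-8"]))
print(unicodedata.normalize("NFC", decomposed) == precomposed)
```

The last line shows that a normalization pass can silently turn one byte sequence into the other, which is another way the bytes inside a string literal can change without the visible text changing. Note that the stability of the ASCII case only holds for ASCII-compatible encodings; a file converted to UTF-16, for instance, would not keep those byte values.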