Note also that many texts are quite short, which is why some
systems have been able to get by with UTF-8 or UTF-16. The
binary logarithm of 32 is 5, which puts O(lg n) representations
at a five-fold disadvantage even on a text of only 32 units, and
they can't make up for that in simplicity because the O(1)
algorithms are already pretty simple.
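
To make the arithmetic concrete, here is a rough sketch that counts
lookup steps under a deliberately simple model. The function names
are made up for illustration, and the model is an assumption: 32 is
read as a text length, the O(1) case is a flat fixed-width array,
and the O(lg n) case is a balanced binary tree with one unit per
leaf. Real tree representations usually pack many units into each
leaf, so their constants will differ.

```python
def flat_steps(n: int) -> int:
    """Indexing a fixed-width array: one address computation, whatever n is."""
    return 1


def tree_steps(n: int) -> int:
    """Levels descended to reach one of n leaves in a balanced binary tree.

    Assumed model: one unit per leaf, so the depth is ceil(lg n).
    (n - 1).bit_length() computes that exactly, without floating point.
    """
    return (n - 1).bit_length()


if __name__ == "__main__":
    for n in (32, 1024, 1_000_000):
        print(f"n={n}: flat={flat_steps(n)} tree={tree_steps(n)}")
```

Under this model the gap is already a factor of 5 at n = 32, and it
only widens as the text grows.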