Format strings are wrong Marc Feeley 25 Nov 2003 21:07 UTC

Ken, I believe that format strings, as used in the proposed "format"
procedure as well as C's "printf", are very wrong.

The first objection is that they are difficult to type check statically,
because doing so requires parsing the format string, which may be
computed at run time.

Secondly, it is hard to remember the meaning of the one-letter
escapes.  Why does "~y" do pretty-printing and not "~p"?  Why does
"~%" print a newline instead of "~n"?  And so on.

Thirdly, the logical link between a particular escape in the format
string and the parameter it formats is not reinforced.  They are too
far apart in the source and it is easy to get the order wrong or
omit a parameter.

Finally, format strings are not composable.  Scheme is a nice functional
language where we can define functions for each operation we need to do,
and we can easily compose one function with another to get new
functionality.  With format strings these operations are hidden
behind escape codes that need the "format" interpreter to work.  Why
not expose those operations as true Scheme functions (that produce
strings or produce text on the current output port)?
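
To illustrate the kind of composition meant here, the following is a
minimal sketch that uses only standard procedures.  The names
"pad-left", "two-digits" and "clock" are hypothetical helpers invented
for this illustration; they are not part of any proposal.

```scheme
(import (scheme base))   ; assumes an R7RS system

;; Hypothetical helper: left-pad string s with character c up to width.
(define (pad-left s width c)
  (string-append (make-string (max 0 (- width (string-length s))) c) s))

;; Compose number->string with pad-left to get a new formatter.
(define (two-digits n)
  (pad-left (number->string n) 2 #\0))

;; Compose two-digits with string-append to get another one.
(define (clock h m s)
  (string-append (two-digits h) ":" (two-digits m) ":" (two-digits s)))

(clock 9 5 3)   ; => "09:05:03"
```

Each layer is an ordinary function, so it can be passed to "map",
reused, or tested on its own without going through an interpreter for
escape codes.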

Here is a quick example to show what I mean (this is not a complete
proposal...).  Suppose we had a function called "field" (it could even
be called "format" or simply "!" depending on your tastes) that
converts any object (symbol, boolean, number, etc) to a string.  This
function takes the object as the first argument, and accepts optional
arguments.  The first optional argument "field-width" gives the
minimal width of the resulting string (it defaults to 0).  If the
object's external representation requires more characters than this
then the resulting string is longer than the minimal width.  If the
object's external representation requires fewer characters then spaces
are used as padding, either on the left if field-width is positive, or
on the right if field-width is negative.  The next optional argument,
which only has meaning for numbers in the object, gives the number of
decimal digits after the period (it defaults to #f which means use as
many as number->string would use).  Then we can do things like:

  (string-append "Hello, " (field "World!" 10))
  =>  "Hello,   \"World!\""

  (string-append "list: " (field '(one "two" 3)))
  =>  "list: (one \"two\" 3)"

  (string-append "list: " (field '(one "two" 3) 0 1))
  =>  "list: (one \"two\" 3.0)"

  (string-append (field 1/2 6 3) " ^ 2 = " (field 1/4 6 3))
  =>  " 0.500 ^ 2 =  0.250"
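
The behaviour described above could be sketched roughly as follows.
This is only one possible reading of the description, not a reference
implementation: it assumes R7RS string ports, it applies the digit
count to real numbers inside proper lists only, and the helpers
"fixed" and "repr" are hypothetical names introduced here.

```scheme
(import (scheme base) (scheme write))   ; assumes an R7RS system

;; Hypothetical helper: render real number x with `digits` decimals.
;; Works on the exact value, so ties round to even, and it does not
;; handle infinities or NaN.
(define (fixed x digits)
  (let* ((scale (expt 10 digits))
         (n     (round (* (abs (exact x)) scale)))   ; exact integer
         (frac  (number->string (remainder n scale)))
         (zeros (make-string (max 0 (- digits (string-length frac))) #\0)))
    (string-append (if (negative? x) "-" "")
                   (number->string (quotient n scale))
                   (if (> digits 0) (string-append "." zeros frac) ""))))

;; Hypothetical helper: external representation of obj as a string.
;; When digits is not #f, real numbers (also inside proper lists)
;; are rendered with `fixed`.
(define (repr obj digits)
  (cond ((and digits (real? obj)) (fixed obj digits))
        ((and digits (pair? obj) (list? obj))
         (let loop ((rest (cdr obj)) (acc (repr (car obj) digits)))
           (if (null? rest)
               (string-append "(" acc ")")
               (loop (cdr rest)
                     (string-append acc " " (repr (car rest) digits))))))
        (else (let ((p (open-output-string)))
                (write obj p)
                (get-output-string p)))))

;; (field obj [field-width [digits]])
;; Positive width pads on the left, negative width pads on the right.
(define (field obj . opts)
  (let* ((width   (if (pair? opts) (car opts) 0))
         (digits  (if (and (pair? opts) (pair? (cdr opts))) (cadr opts) #f))
         (s       (repr obj digits))
         (padding (make-string (max 0 (- (abs width) (string-length s)))
                               #\space)))
    (if (negative? width)
        (string-append s padding)
        (string-append padding s))))
```

With these definitions the four expressions above should evaluate to
the strings shown.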

To make printing easier, a general purpose function called "print"
could be added with this definition:

  (define (print . lst) (for-each display lst))

allowing

  (print "list: " (field '(one "two" 3)))

Anyway, I don't want to get lost in details.  My main point is that a
functional interface is much more elegant and versatile than format
strings.

Marc