Curly-infix-expressions
Alan Manuel K. Gloria
This is a draft Scheme Request for Implementation (SRFI) for SRFI 105. To see an explanation of each status that a SRFI can hold, see here.
To provide input on this SRFI, please mail to <srfi minus 105 at srfi dot schemers dot org>. See instructions here to subscribe to the list. You can access previous messages via the archive of the mailing list.
This SRFI contains all the required sections, including an abstract, rationale, specification, and reference implementation.
None
Lisp-based languages, like Scheme, are almost the only
programming languages in modern use that do not support infix notation.
In addition, most languages allow infix expressions to be combined
with function call notation of the form f(x).
This SRFI provides these capabilities, both for
developers who already use Scheme and want these conveniences,
and also for other developers who may choose to use other languages
in part because they miss these conveniences.
Scheme currently reserves {...} “for possible future extensions to the language”. We propose that {...} be used to support “curly-infix” notation as a homoiconic infix abbreviation, as a modification of the Scheme reader. It is an abbreviation in much the same way that 'x is an abbreviation for (quote x).
A curly-infix list introduces a list whose visual presentation is in infix order instead of prefix order. For example, {n > 2} ⇒ (> n 2), and {a + b + c} ⇒ (+ a b c). By intent, there is no precedence, but e.g., {x + {y * z}} maps cleanly to (+ x (* y z)). Forms with mixed infix operators and other complications have “nfx” prepended to enable later macro processing, e.g., {2 + 3 * 5} ⇒ (nfx 2 + 3 * 5). Also, inside a curly-infix list (recursively), expressions of the form f(...) are simply an abbreviation for (f ...).
Note that this is derived from the “readable” project. We intend to later submit at least one additional SRFI that will build on top of this SRFI, but curly-infix-expressions are useful on their own.
Lisp-based languages, like Scheme, are almost the only programming languages in modern use that do not support infix notation. Even some Lisp advocates, like Paul Graham, admit that they “don’t find prefix math expressions natural” (http://www.paulgraham.com/pypar.html) even after decades of experience with Lisp-based languages. Paul Prescod has said, “I have more faith that you could convince the world to use Esperanto than prefix notation” (http://people.csail.mit.edu/gregs/ll1-discuss-archive-html/msg01571.html). Infix is not going away; standard mathematical notation uses infix, infix notation is taught to most people (programmers or not) in school, and nearly all new programming languages include infix.
Adding infix support to Scheme would be a useful convenience for some existing developers who use Scheme, and it would also eliminate a common complaint by developers who currently choose to use other languages instead.
Scheme currently reserves {...} “for possible future extensions to the language”. We propose that {...} be used to support “curly-infix” notation as a reader abbreviation, just as 'x is an abbreviation for (quote x) and (x y z) is an abbreviation for (x . (y . (z . ()))).
This proposal is an extremely simple and straightforward technique for supporting infix notation. There is no complex precedence system, all other Scheme capabilities (including macros and quasiquoting) work unchanged, any symbol can be used as an infix operation where desired, and Scheme remains general and homoiconic. Curly-infix-expressions (also known as c-expressions) are just a convenient reader abbreviation for infix notation.
At its core, this SRFI provides the simple curly-infix list, which represents a list of expressions just as a regular list does, with the difference that its values are written in a different order. The simple curly-infix list {operand-1 operator operand-2 operator operand-3 operator ...} is mapped to (operator operand-1 operand-2 operand-3 ...) so that more than two operands are handled cleanly. E.g., {a + b + c} ⇒ (+ a b c).
Many previous systems have implemented “infix” systems as a named macro or procedure (e.g., INFIX). This looks ugly, and it does the wrong thing — the resulting list always has INFIX at the beginning, not the actual infix operator, so this approach can interfere with quoting, macros, and other capabilities. In particular, consider the following syntax-rules macro for function composition:
(define-syntax o
  (syntax-rules ()
    ({f o g}
      (lambda args (f (apply g args))))
    ({f o g o h o ...}
      {(lambda (x) (f (g x))) o h o ...})))
This example takes advantage of the fact that {f o g o h o ...} ⇒ (o f g h ...). Infix cannot be implemented as a macro alone, as the syntax-rules form has a particular treatment for the pattern. A macro for infix would very likely confuse the syntax-rules form.
Other systems build notations into the reader, but their infix notation is often radically different from normal Lisp notation. As a result, in some cases these notations lose Lisp’s abilities for quoting, quasiquoting, and so on, and they are not homoiconic.
In contrast, this curly-infix-expression proposal avoids these problems. For example, in curly-infix, `{,a + ,b} maps cleanly to `(+ ,a ,b), which works as expected with all macros.
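A small sketch of the quasiquote case follows. It is written in plain prefix form so that it runs on a reader without curly-infix support; the procedure name make-sum and its arguments are illustrative only. With a curly-infix reader, the template could be spelled `{,a + ,b} instead.

```scheme
; With a curly-infix reader, `{,a + ,b} is read as `(+ ,a ,b).
; The prefix spelling is used here so that any Scheme reader accepts it.
; make-sum is a hypothetical helper, not part of this SRFI.
(define (make-sum a b)
  `(+ ,a ,b))

(make-sum 1 2)          ; the list (+ 1 2)
(make-sum 'x '(* y z))  ; the list (+ x (* y z))
```

Because the result is an ordinary list with the operator first, macros and quoting see nothing unusual.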
Many past “infix” systems for Lisp build in precedence. However, Lisp systems often process other languages, and they may freely mix these different languages. Thus, the same symbol may have different meanings and precedence levels in different contexts. The symbol might not even be defined where it is being used, and allowing precedence definitions would create subtle errors if files are read in a different order. If users hook in their own precedence system into a reader, it could even become difficult to combine code written for different precedence systems. In short, building precedence into a Lisp reader creates many complexities.
Yet the complexity of precedence systems is often unnecessary. In practice, we’ve found that simple infix is all that’s needed most of the time in Lisp-based languages. Even in other languages, many developers unnecessarily use grouping symbols with infix operators to make their order clear. Thus, requiring grouping symbols is less of a hardship than it might appear.
By intentionally not building a precedence system into the reader, a very simple yet useful infix system results. We don’t need to register procedures, ensure that declarations of precedence precede their use, or anything like it. We also ensure that the notation is clearly homoiconic.
Instead, where precedence is desired, application and library writers can implement precedence by defining and controlling the scope of an “nfx” macro or procedure, or by later postprocessing of that symbol. Scheme macros are already quite powerful and capable of handling this; in these cases, {...} provides a more convenient notation.
The curly-infix approach, instead of trying to manage both infix
and precedence, handles simple cases and then
takes advantage of the existing Scheme scoping rules and macro system for
more complex cases (in the rare cases where they are needed).
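As a rough illustration of how a library might give meaning to mixed forms, the sketch below converts the operand/operator sequence that follows nfx into prefix form by precedence climbing. The precedence table, the procedure names (precedence, parse-expression, nfx->prefix), and the choice of left associativity are all assumptions made for this example; none of them is defined by this SRFI. It works on already-read data and assumes well-formed input (operands and operators strictly alternating).

```scheme
; Hypothetical precedence table: larger numbers bind more tightly.
(define precedence-table
  '((+ . 1) (- . 1) (* . 2) (/ . 2)))

; Return the precedence of op, or #f if op is not a known operator.
(define (precedence op)
  (let ((entry (assq op precedence-table)))
    (if entry (cdr entry) #f)))

; Parse tokens of the form (operand op operand op ...) by precedence
; climbing.  Returns a pair: (expression . remaining-tokens).
(define (parse-expression tokens min-prec)
  (let loop ((left (car tokens))
             (rest (cdr tokens)))
    (if (or (null? rest)
            (not (precedence (car rest)))
            (< (precedence (car rest)) min-prec))
        (cons left rest)
        (let* ((op (car rest))
               ; Operators are treated as left-associative, so the right
               ; operand may only absorb strictly tighter operators.
               (right (parse-expression (cdr rest)
                                        (+ (precedence op) 1))))
          (loop (list op left (car right))
                (cdr right))))))

; Convert the tokens that follow nfx into a prefix expression.
; Tokens after an unknown operator are ignored in this sketch.
(define (nfx->prefix tokens)
  (car (parse-expression tokens 0)))

(nfx->prefix (cdr '(nfx 2 + 3 * 5)))  ; the list (+ 2 (* 3 5))
(nfx->prefix '(a - b - c))            ; the list (- (- a b) c)
```

A library could call a procedure like this from its own “nfx” macro, which keeps the precedence rules scoped to the code that opts into them.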
It would be possible to extend curly-infix to provide a full fixed precedence system (e.g., if an expression is mixed, attempt to use various precedence rules). However, such capabilities would be extensions beyond this SRFI.
Note that curly-infix includes support for unary operators, but again, they are without precedence. As a result, they must be grouped separately. This does not lead to hard-to-read expressions, however. Examples of simple curly-infix lists combining infix and unary operations include {-(x) * -(y)} and {-{x} * -{y}} (the notation is designed so that both work).
At first David A. Wheeler, who started this project, considered reporting an error if a simple infix expression isn’t provided. However, prepending “nfx” is much more flexible.
Some past efforts tried to automatically detect infix operators, but this turns out not to work well. It’s hard to express good rules for detecting infix operators, and the rules become too complex for users (e.g., “punctuation-only symbols” doesn’t detect “and” or “or”). And in any case, if they were automatically detected, an escape mechanism would be needed anyway; consider (map - ns) for getting a new list with the numbers in ns negated. Allowing the user to expressly notate when infix was intended, using {...}, turns out to be far clearer and more intuitive. In particular, curly-infix allows the use of infix with any symbol, whenever you want... and where it’s not convenient, you don’t need to use it. It is also very backwards-compatible: Normal lists work normally, and if you want infix, use {...}.
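To make the escape-mechanism point concrete, the following sketch (plain prefix Scheme, with an illustrative list ns) passes - as an ordinary value. A reader that guessed infix from punctuation alone would need a special case for forms like this one; with explicit braces, no such case arises.

```scheme
; ns is an illustrative list of numbers.
(define ns '(1 -2 3))

; Here - is an argument to map, not an infix operator between map and ns.
(map - ns)   ; the list (-1 2 -3)
```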
The empty curly-infix list {} is intentionally mapped to (), as it is an empty list, and this is the likely user meaning (reducing unnecessary errors).
The one and two parameter cases are defined in part to reduce user error, and in part to provide better support:
{...} can conceptually be used for grouping: {{{a} + {b}}} is equivalent to (+ a b).
It ensures that the neoteric-expression f{x} becomes the likely-intended (f x), and also makes it easy to use prefix notation; e.g., { f(x) } is just another way to write (f x).
Finally, it also provides an easy escape mechanism in sweet-expressions for symbols that would otherwise have other meanings.
Operators are compared using equal? so that constructs like ,op are legal operators, e.g., {x ,op y ,op z}. Note that, unfortunately, if the operator construct contains a cycle, the comparison might not terminate if equal? does not terminate in the presence of cycles. This was specified this way so that implementors could use the normal Scheme equal? comparison instead of having to implement a special comparison operator just for this particular case.
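The following sketch shows why equal? is the natural comparison here. The reader turns each ,op into a list of the form (unquote op), so the two operators in {x ,op y ,op z} are structurally identical but are separately allocated objects. The variable names are illustrative.

```scheme
; Each ,op is read as a freshly built list (unquote op).
(define first-op  (list 'unquote 'op))
(define second-op (list 'unquote 'op))

(equal? first-op second-op)  ; #t: same structure
(eq? first-op second-op)     ; #f: two distinct pairs
```

An identity-based comparison would therefore treat {x ,op y ,op z} as a mixed form, which is not what the user wrote.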
Curly-infix requires that the infix operators be delimited (e.g., by
spaces). This is consistent with Lisp history and current practice.
Currently, in Lisp, operators are always
delimited in traditional s-expressions (typically by left
parentheses on the left, and by whitespace on the right).
It’s impractical to do otherwise today; most Lisps, including Scheme, allow and predefine symbols that include characters (like “-”) that are typically used for infix operators.
Many developers put space around infix operators even in languages
that don’t require them, so syntactically requiring them is no burden.
In short, it is difficult to allow infix operators
without delimiters, and the visual results are the same as many
real-world uses in other languages, so the result appears quite
customary to typical software developers.
We would like to have a convention for users to easily enable curly-infix everywhere, e.g., for the default reader (e.g., read), read-eval-print loop (REPL), and loader (e.g., load). Our ideal would be that implementations would enable curly-infix in their normal invocation, but some implementors may not want to do that. For example, if an implementation already uses braces for a different local extension, they may not want to immediately switch to curly-infix in their default invocation. Thus, if implementors choose to not enable curly-infix in their default reader, a conventional command line name “curly-foo” is defined for each implementation foo that enables it.
There is simply no single option flag (for example) that everyone could agree on to enable this. In practice, we expect that implementations will build this capability into their default readers and then control it via some special flag, but we do not want to mandate exactly how it is turned on or passed.
We would like implementations to always have curly-infix enabled. However, some implementations may have other extensions that use {...}.
We want a simple, standard way to identify code that uses curly-infix so that readers will switch to curly-infix if they need to switch. The #!srfi-105 marker was recommended during discussion of SRFI-105. After all, R6RS and R7RS (draft 6) already use #!fold-case and #!no-fold-case as special markers to control the reader. Using #!srfi-105 is a simple, similar-looking marker for a similar situation. What’s more, it implies a reasonable convention for reader extensions: markers that begin with #!, followed by an ASCII letter, should have the rest read as an identifier (up to a whitespace) and use that to control the reader, and srfi- should be the namespace for SRFIs.
This marker need not interfere with other uses of #!. SRFI-22 supports #! followed by space as a comment to the end of the line; this is supported by several implementations, but this is easily distinguished from this marker by the space. Guile, clisp, and several other Lisps support #!...!# as a multi-line comment, enabling scripts with mixed languages and multi-line arguments. But in practice the #! is almost always followed immediately by / or ., and other scripts could be trivially fixed to make that so. R6RS had a non-normative recommendation to ignore a line that began with #!/usr/bin/env, as well as a #! /usr/bin/env, but this is non-normative; an implementation could easily implement #! followed by space as an ignored line, and treat #! followed by / or . differently.
Thus, implementations could trivially support all of these simultaneously: markers such as #!srfi-105 to identify curly-infix, the SRFI-22 #!+space marker as an ignored line, and #!/ ...!# and #!. ...!# as multi-line comments.
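The coexistence argument can be sketched as a dispatch on the single character that follows #!. The procedure names and the returned symbols below are hypothetical labels chosen for this example; an actual reader would act on each case rather than return a label.

```scheme
; True if c is an ASCII letter (the convention above is limited to ASCII).
(define (ascii-letter? c)
  (or (and (char>=? c #\a) (char<=? c #\z))
      (and (char>=? c #\A) (char<=? c #\Z))))

; Classify a "#!" sequence by the character that follows it.
; The result symbols are hypothetical labels for this sketch.
(define (classify-hash-bang next-char)
  (cond ((eof-object? next-char) 'unknown)
        ((char=? next-char #\space) 'srfi-22-comment-line)
        ((or (char=? next-char #\/) (char=? next-char #\.))
         'multi-line-comment)
        ((ascii-letter? next-char) 'identifier-marker)
        (else 'unknown)))

(classify-hash-bang #\s)      ; identifier-marker, as in #!srfi-105
(classify-hash-bang #\space)  ; srfi-22-comment-line
(classify-hash-bang #\/)      ; multi-line-comment
```

In the identifier-marker case the reader would go on to read the rest of the identifier (for example srfi-105 or fold-case) and adjust its behavior accordingly.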
Note that this SRFI does not mandate support or any particular
semantics for #!fold-case, #!no-fold-case,
the SRFI-22 #!+space convention, or
#! followed by a slash or period;
it is merely designed so that implementations could
implement them all simultaneously.
We recommend that #!srfi-105 not be placed at the very beginning of a file, where its #! would be the first two characters (e.g., put a newline in front of it). If the file were made executable, and execution was attempted, this might confuse some systems into trying to run the program srfi-105.
By intent, this SRFI (including the enabling mechanism) doesn’t use or interact with any module system at all (including the R6RS and R7RS module systems). This is because some implementations won’t have a module system (or at least not a standard one). Curly-infix is an intentionally simple mechanism that can be built into even trivial Scheme implementations. Mandating module support is unnecessary and might inhibit its adoption.
Racket allows a notation called the “infix convention” with the form “(a . operation . b)”. An advantage of this alternative is that it does not use the braces, so it might be easier to implement in Schemes which already define {...} in a local extension. However, the Racket “infix convention” has many problems:
If you omit a “.” somewhere, you end up with the wrong result, and possibly without an error being flagged. This notation also makes it harder to see improper lists; improper lists are important but rare, so it’s good to make them obvious, yet the Racket infix convention makes improper lists hard to distinguish. The Racket documentation even goes out of its way to emphasize that infix convention use is unrelated to improper lists... which suggests that they are easily confused.
Without {...}, {x} is no longer useful as an escape mechanism for sweet-expressions (a notation that builds on curly-infix). An alternative would be to use (. x) as an escape mechanism, but at that point dots-in-lists become busy and confusing.
In short, cases where infix notation would be useful are extremely common, so its notation should be convenient. The Racket “infix convention” may be the next-best notation for infix notation after curly-infix, but it’s next-best, and we should strive for the best available notation for such a common need. Curly-infix does not conflict with the Racket infix convention; implementations could implement both. We recommend that an implementation that implements the Racket infix convention should also allow multiple operands and use curly-infix semantics for them, pretending that . op . is a single parameter. In that case, (a . + . b . + . c) would map to (+ a b c), and (a . + . b . * . c) would map to (nfx a + b * c). Note that the existence of the Racket “infix convention” is additional evidence of the need for a standard infix convention; many have separately created mechanisms to try to provide infix support.
The Gambit reader includes a notation called the “Scheme Infix eXtension (SIX)” that supports infix notation. SIX expressions begin with a backslash.
Like curly-infix, SIX is a reader extension. But SIX has a number of problems compared to curly-infix:
Like Racket, SIX does demonstrate that there is a need for an infix notation that can be used in Scheme.
A system could simultaneously implement curly-infix and SIX. However, curly-infix is far simpler, is more flexible (e.g., by allowing arbitrary symbols), and works much more easily with macros and quoting. Thus, we believe that curly-infix is the better system and more appropriate for standardization across Scheme implementations.
Guile 1.4.x at gnuvola.org is self-described as a “(somewhat amicable) fork from the official Guile”. It includes support for reading infix expressions. Once activated, infix expressions are surrounded by #[ and ]. Infix operators are surrounded by whitespace. It supports precedence, which sounds like an advantage, but operators must be registered before use (and few are predefined), creating an opportunity for terrible errors if an expression is read before its operators are registered. There is also the opportunity for serious problems if different programs are written assuming different precedence levels. Inside the infix notation a very different language is used (e.g., parentheses are used for grouping instead of necessarily creating lists, and parameters are separated by commas), so it is unclear how well it would work with other Scheme features such as quasiquotation.
The guile 1.4 reading infix module has a more complex grammar requiring a more complex implementation and understanding. Its registration system creates serious problems when trying to use it for larger systems. This infix notation has not been accepted into the version of guile used by most people, so it is very much not portable. But perhaps the biggest problem is that this notation is fundamentally not homoiconic; it is harder to determine where lists begin and end with it.
Like Racket and SIX, this module does demonstrate that there is a need for an infix notation that can be used in Scheme.
In contrast, curly-infix is simpler, requires no registration system or other complexities, works more clearly with macros and quasiquotation, and has the general advantage of being homoiconic.
Lisp’s standard notation is different from “normal” notation in that the parentheses precede the function name, rather than follow it. Others have commented that it’d be valuable to be able to say name(x) instead of (name x):
Neoteric-expressions allow users to use a more traditional-looking notation for function calls. Again, quoting rules and macros continue to work as usual.
The (. e) rule handles expressions like read(. port), ensuring that they map to (read . port). If (. x) didn’t mean x, then it would be easy to get this case wrong. Also, if someone wanted to build on top of an existing reader, they would have to reimplement parts of the list-processing system if this wasn’t handled. It is already true that (. x) is x in guile, so there was already a working example that this is a reasonable extension. In fact, in a typical implementation of a list reader, it takes extra effort to prevent this extension, so this is a relatively easy extension to include.
It would be possible to define neoteric-expressions to have comma-separated values in a function call; this would make it even more similar to traditional function call notation. A simple way would be to remove all commas, but this would interfere with ,-lifting, and thus was immediately rejected.
A better rule, that would indeed work, would be to require each parameter to end with a comma, and then remove that ending comma. However, this rule:
,-lifting (making them hard to find).
Many other languages do use commas, but they are required in those languages because infix operators are not surrounded by any marker. Since infix operations are already surrounded by {...} in our notation, there is no need for the additional commas for parameter separation.
Experimentation found that separating parameters solely by whitespace worked well, so that approach was selected.
Originally the prefix had to be a symbol or list. The theory was that by ignoring others, the sweet-reader would be backwards-compatible with some badly-formatted code, and some errors might not result in incorrectly-interpreted expressions. But this was an odd limitation, and in some cases other prefixes made sense (e.g., for strings). This was changed to eliminate the inconsistency.
Neoteric-expressions used to be called “modern-expressions”. But some people didn’t like that name, and the obvious abbreviation (“m-expression”) was easily confused with the original Lisp M-expression language. So the name was changed to neoteric, which has a similar meaning and abbreviates nicely. It wasn’t called “function-expressions” because “f-expressions” are previously used (and can sound bad if said quickly), and they weren’t called “prefix-expressions” because “p-expressions” sound like “pee-expressions”. It’s not called “name-prefix” because the prefix need not be a name. There is absolutely no truth to the rumor that the notation was developed by a secret technologically advanced species, so pay no attention to “Microcosmic God” by Theodore Sturgeon :-).
The neoteric rules do introduce the risk of someone inserting a space between the function name and the opening (. But whitespace is already significant as a parameter separator, so this is consistent with how the system works anyway... this is not really a change at all.
Obviously, this is trivial to parse. We don’t lose any power, because this is completely optional -- we only use it when we want to, and we can switch back to the traditional s-expression notation if we want to. It’s trivially quoted... if you quote a symbol followed by (, just keep going until its matching ), which is essentially the same rule as before!
There is no requirement that writers (e.g., “write” or a pretty-printer) write out curly-infix-expressions. They may choose to do so, e.g., for lists of length 3-6 whose car is the symbol “and”, the symbol “or”, or a punctuation-only symbol. However, it would probably be wise to wait until many implementations can handle c-expressions.
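A writer that wanted to follow the heuristic mentioned above could use a predicate along these lines. This is only a sketch: the names punctuation-only? and curly-infix-candidate? are invented here, and “punctuation-only” is assumed to mean “contains no letters or digits”, which this SRFI does not define.

```scheme
; Assumed definition: a symbol is punctuation-only if its name
; contains no alphabetic or numeric characters.
(define (punctuation-only? sym)
  (let loop ((chars (string->list (symbol->string sym))))
    (cond ((null? chars) #t)
          ((or (char-alphabetic? (car chars))
               (char-numeric? (car chars)))
           #f)
          (else (loop (cdr chars))))))

; True if datum is a proper list of length 3 to 6 whose car is
; and, or, or a punctuation-only symbol.
(define (curly-infix-candidate? datum)
  (and (list? datum)
       (<= 3 (length datum) 6)
       (symbol? (car datum))
       (or (eq? (car datum) 'and)
           (eq? (car datum) 'or)
           (punctuation-only? (car datum)))))

(curly-infix-candidate? '(+ a b))      ; #t, could be written {a + b}
(curly-infix-candidate? '(and p q r))  ; #t, could be written {p and q and r}
(curly-infix-candidate? '(f a b))      ; #f, f is an ordinary name
```

Lists that fail the test would simply be written in traditional prefix form.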
Curly-infix is designed so that it can work on other Lisps as well. We even have a working implementation in Common Lisp.
Curly-infix is an unusually simple mechanism, but like much of any Lisp-based language, its power comes from its simplicity.
“Curly-infix-expressions” or “c-expressions” are s-expressions with an additional notation: The curly-infix list. A curly-infix list is syntactically almost identical to a regular list, but it is surrounded by braces instead of by parentheses, and instead of a sequence of s-expressions it contains a sequence of neoteric-expressions (which add support for formats like f(x)). Once a curly-infix list is read, it is mapped differently than a regular list by a curly-infix reader:
The empty curly-infix list {} is mapped to ().
A curly-infix list with one parameter, {e}, is mapped to e.
A curly-infix list with two parameters, {e f}, is mapped to (e f).
A simple curly-infix list {a op b op c ...}, which has an odd number of parameters, at least three of them, and even-position parameters that are all equal? to each other, is mapped to (op a b c ...).
Any other curly-infix list is mapped to the same list with the symbol nfx prepended, e.g., {2 + 3 * 5} ⇒ (nfx 2 + 3 * 5).
Here is the precise definition of a curly-infix list (which is nearly identical to a traditional list):
curly-infix-list → “{” <whitespace>* [ <n-expression> [ <whitespace>+ <n-expression> ]* [ <whitespace>+ . <whitespace>+ <n-expression> ] <whitespace>* ] “}”
A “neoteric-expression” or “n-expression” is a curly-infix-expression, with the following modifications where e is any datum expression:
e(...) is mapped to (e ...), e.g., f(x) ⇒ (f x).
e{} is mapped to (e); otherwise e{...} is mapped to (e {...}), where {...} is itself processed as a curly-infix list.
e[...] is mapped to (bracketaccess e ...).
(. e) is mapped to e.
There must be no whitespace between e and the opening character, and the rules apply repeatedly from left to right, so f(x)(y) ⇒ ((f x) y).
Here are some examples of c-expressions (note all operators in curly-infix are delimited):
{n > 2} ⇒ (> n 2)
{a + b + c} ⇒ (+ a b c)
{x + {y * z}} ⇒ (+ x (* y z))
{2 + 3 * 5} ⇒ (nfx 2 + 3 * 5)
{-(x) * -(y)} ⇒ (* (- x) (- y))
{f(x)} ⇒ (f x)
{} ⇒ ()
`{,a + ,b} ⇒ `(+ ,a ,b)
A curly-infix reader is a datum reader that can correctly read and map curly-infix-expressions. A curly-infix reader must include the braces “{” and “}” as delimiters.
An implementation of this SRFI must accept the marker #!srfi-105 followed by a whitespace character. This marker (including the trailing whitespace character) is consumed and considered whitespace; after reading this marker, the reader must accept curly-infix expressions in subsequent datums until it reaches an end-of-file or some other conflicting marker (no conflicting marker is specified here). We encourage implementations to always implement curly-infix expressions, even when the marker is not received. However, portable applications must include this marker before any curly-infix expressions. We recommend that portable applications not use this marker as the first characters of a file (e.g., they should precede it with a newline).
The “standard readers” are the datum reader used by the REPL, the datum readers defined by the relevant Scheme standards (such as “read” and where applicable “get-datum”), and the readers used to load user-supplied code as defined by the relevant Scheme standards (e.g., the reader used by “load” and module-loading mechanisms for user code). The standard readers are curly-infix enabled if the standard readers are curly-infix readers.
An implementation of this SRFI must have its standard readers be curly-infix enabled. We encourage implementations’ default invocation to have their standard readers be curly-infix enabled, but this is not required. If the standard readers are not curly-infix enabled in an implementation’s default invocation, then if it can be invoked from a command line via the command “foo”, the implementation must provide an alternative command “curly-foo” (the command prefixed with “curly-”) in which its standard readers are curly-infix enabled. In addition, if the implementation is invokable as a graphical user interface (GUI), it must provide a documented means to ensure that its standard readers are curly-infix enabled.
An implementation must not, by default, bind the symbol “nfx” to a procedure, macro, or syntax that cannot be overridden. This symbol is reserved for use by library writers (in the case of a library-based implementation of this SRFI, this symbol is reserved for use by other libraries) and application writers.
However, an implementation may provide a default library that binds the “nfx” symbol (as it is then a library, this case actually falls under the “reserved for use by library writers” clause above). Application writers and other library writers using that implementation are then free to use or not use the implementation’s provided “nfx”. An implementation may even provide more than one, if they so desire. For a library-based implementation of this SRFI, any “nfx” implementation or implementations may be provided, as long as the “nfx” symbol can be rebound by users of that library.
Implementations may provide the procedure (curly-infix-read . port) as a curly-infix reader.
Note that, by definition, this SRFI modifies lexical syntax.
The implementation below is portable, with the exception that Scheme provides no standard mechanism to override {...} in its built-in reader. Thus, implementations will typically have a modified reader that detects “{”, starts reading a list until its matching “}”, and then calls process-curly defined below. We recommend that implementations always do this, but an implementation must at least activate this behavior when “curly-foo” is invoked or when they read #!srfi-105.
This reference implementation is SRFI type 2: “A mostly-portable solution that uses some kind of hooks provided in some Scheme interpreter/compiler. In this case, a detailed specification of the hooks must be included so that the SRFI is self-contained.”
; Return true if lyst has an even # of parameters, and the (alternating) ; first parameters are "op". Used to determine if a longer lyst is infix. ; If passed empty list, returns true (so recursion works correctly). (define (even-and-op-prefix? op lyst) (cond ((null? lyst) #t) ((not (pair? lyst)) #f) ((not (equal? op (car lyst))) #f) ; fail - operators not the same ((not (pair? (cdr lyst))) #f) ; Wrong # of parameters or improper (#t (even-and-op-prefix? op (cddr lyst))))) ; recurse. ; Return true if the lyst is in simple infix format ; (and thus should be reordered at read time). (define (simple-infix-list? lyst) (and (pair? lyst) ; Must have list; '() doesn't count. (pair? (cdr lyst)) ; Must have a second argument. (pair? (cddr lyst)) ; Must have a third argument (we check it ; this way for performance) (even-and-op-prefix? (cadr lyst) (cdr lyst)))) ; true if rest is simple ; Return alternating parameters in a list (1st, 3rd, 5th, etc.) (define (alternating-parameters lyst) (if (or (null? lyst) (null? (cdr lyst))) lyst (cons (car lyst) (alternating-parameters (cddr lyst))))) ; Not a simple infix list - transform it. Written as a separate procedure ; so that future experiments or SRFIs can easily replace just this piece. (define (transform-mixed-infix lyst) (cons 'nfx lyst)) ; Given curly-infix lyst, map it to its final internal format. (define (process-curly lyst) (cond ((not (pair? lyst)) lyst) ; E.G., map {} to (). ((null? (cdr lyst)) ; Map {a} to a. (car lyst)) ((and (pair? (cdr lyst)) (null? (cddr lyst))) ; Map {a b} to (a b). lyst) ((simple-infix-list? lyst) ; Map {a OP b [OP c...]} to (OP a b [c...]) (cons (cadr lyst) (alternating-parameters lyst))) (#t (transform-mixed-infix lyst)))) ; ------------------------------------------------ ; Key procedures to implement neoteric-expressions ; ------------------------------------------------ ; Read the "inside" of a list until its matching stop-char, returning list. 
; stop-char needs to be closing paren, closing bracket, or closing brace.
; This is like read-delimited-list of Common Lisp.
; This implements a useful extension: (. b) returns b.
(define (my-read-delimited-list stop-char port)
  (let* ((c (peek-char port)))
    (cond
      ((eof-object? c) (read-error "EOF in middle of list") '())
      ((eqv? c #\;)
        (consume-to-eol port)
        (my-read-delimited-list stop-char port))
      ((my-char-whitespace? c)
        (read-char port)
        (my-read-delimited-list stop-char port))
      ((char=? c stop-char)
        (read-char port)
        '())
      ((or (eq? c #\)) (eq? c #\]) (eq? c #\}))
        (read-char port)
        (read-error "Bad closing character"))
      (#t
        (let ((datum (neoteric-read port)))
          (cond
            ((eq? datum '.)
              (let ((datum2 (neoteric-read port)))
                (consume-whitespace port)
                (cond
                  ((eof-object? datum2)
                    (read-error "Early eof in (... .)\n")
                    '())
                  ((not (eqv? (peek-char port) stop-char))
                    (read-error "Bad closing character after . datum"))
                  (#t
                    (read-char port)
                    datum2))))
            (#t
              (cons datum
                (my-read-delimited-list stop-char port)))))))))

; Implement neoteric-expression's prefixed (), [], and {}.
; At this point, we have just finished reading some expression, which
; MIGHT be a prefix of some longer expression.  Examine the next
; character to be consumed; if it's an opening paren, bracket, or brace,
; then the expression "prefix" is actually a prefix.
; Otherwise, just return the prefix and do not consume that next char.
; This recurses, to handle formats like f(x)(y).
(define (neoteric-process-tail port prefix)
  (let* ((c (peek-char port)))
    (cond
      ((eof-object? c) prefix)
      ((char=? c #\( ) ; Implement f(x).
        (read-char port)
        (neoteric-process-tail port
          (cons prefix (my-read-delimited-list #\) port))))
      ((char=? c #\[ ) ; Implement f[x]
        (read-char port)
        (neoteric-process-tail port
          (cons 'bracketaccess
            (cons prefix
              (my-read-delimited-list #\] port)))))
      ((char=? c #\{ ) ; Implement f{x}. Balance }
        (neoteric-process-tail port
          (let ((tail (neoteric-read port)))
            (if (eqv? tail '())
              (list prefix) ; Map f{} to (f), not (f ()).
              (list prefix tail)))))
      (#t prefix))))

(define (neoteric-read . port)
  (if (null? port)
    (neoteric-read-real (current-input-port))
    (neoteric-read-real (car port))))

; This is the "real" implementation of neoteric-read
; (neoteric-read just figures out the port and calls neoteric-read-real).
; It implements an entire reader, as a demonstration, but if you can
; update your existing reader you should just update that instead.
; This is a simple R5RS reader, with a few minor (common) extensions.
; The key part is that it implements [] and {} as delimiters, and
; after it reads in some datum (the "prefix"), it calls
; neoteric-process-tail to see if there's a "tail".
(define (neoteric-read-real port)
  (let*
    ((c (peek-char port))
     (prefix
       ; This cond is a normal Scheme reader, puts result in "prefix"
       ; This implements "read-expression-as-usual" as described above.
       (cond
         ((eof-object? c) c)
         ((char=? c #\;)
           (consume-to-eol port)
           (neoteric-read-real port))
         ((my-char-whitespace? c)
           (read-char port)
           (neoteric-read-real port))
         ((char=? c #\( )
           (read-char port)
           (my-read-delimited-list #\) port))
         ((char=? c #\) )
           (read-char port)
           (read-error "Closing parenthesis without opening")
           (neoteric-read-real port))
         ((char=? c #\[ )
           (read-char port)
           (my-read-delimited-list #\] port))
         ((char=? c #\] )
           (read-char port)
           (read-error "Closing bracket without opening")
           (neoteric-read-real port))
         ((char=? c #\{ )
           (read-char port)
           (process-curly
             (my-read-delimited-list #\} port)))
         ((char=? c #\} )
           (read-char port)
           (read-error "Closing brace without opening")
           (neoteric-read-real port))
         ((char=? c #\") ; Strings are delimited by ", so can call directly
           (default-scheme-read port))
         ((char=? c #\')
           (read-char port)
           (list 'quote (neoteric-read-real port)))
         ((char=? c #\`)
           (read-char port)
           (list 'quasiquote (neoteric-read-real port)))
         ((char=? c #\,)
           (read-char port)
           (cond
             ((char=? #\@ (peek-char port))
               (read-char port)
               (list 'unquote-splicing (neoteric-read-real port)))
             (#t
               (list 'unquote (neoteric-read-real port)))))
         ((ismember? c digits) ; Initial digit.
           (read-number port '()))
         ((char=? c #\#) (process-sharp port))
         ((char=? c #\.) (process-period port))
         ((or (char=? c #\+) (char=? c #\-)) ; Initial + or -
           (read-char port)
           (if (ismember? (peek-char port) digits)
             (read-number port (list c))
             (string->symbol (fold-case-maybe port
               (list->string (cons c
                 (read-until-delim port neoteric-delimiters)))))))
         (#t ; Nothing else.  Must be a symbol start.
           (string->symbol (fold-case-maybe port
             (list->string
               (read-until-delim port neoteric-delimiters))))))))
    ; Here's the big change to implement neoteric-expressions:
    (if (eof-object? prefix)
      prefix
      (neoteric-process-tail port prefix))))

; Modify the main reader so that [] and {} are also delimiters, and so
; that when #\{ is detected, read using my-read-delimited-list
; any list from that port until its matching #\}, then process
; that list with "process-curly", like this:
;   (process-curly (my-read-delimited-list #\} port))
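As a rough illustration of the list-building step that neoteric-process-tail performs, separate from any character-level reading, the following sketch applies already-parsed tails to a prefix. The names apply-tail and apply-tails, and the tagged-pair representation of a tail, are hypothetical and are not part of the reference implementation. The sketch assumes that the content of each tail has already been read and, for braces, already passed through process-curly.

```scheme
; Hypothetical sketch, not part of the reference implementation.
; Each tail is a pair (kind . content), where kind is 'paren, 'bracket,
; or 'brace, and content is the datum the reader would have produced
; for the text between the delimiters.
(define (apply-tail prefix tail)
  (let ((kind (car tail))
        (content (cdr tail)))
    (cond
      ((eq? kind 'paren)      ; f(x ...) => (f x ...)
        (cons prefix content))
      ((eq? kind 'bracket)    ; f[x ...] => (bracketaccess f x ...)
        (cons 'bracketaccess (cons prefix content)))
      ((eq? kind 'brace)      ; f{...}: content is the curly-infix datum
        (if (null? content)
          (list prefix)       ; f{} => (f), not (f ())
          (list prefix content)))
      (else prefix))))

; Apply the tails left to right, so that f(x)(y) => ((f x) y).
(define (apply-tails prefix tails)
  (if (null? tails)
    prefix
    (apply-tails (apply-tail prefix (car tails)) (cdr tails))))

(apply-tails 'f (list (cons 'paren '(x)) (cons 'paren '(y)))) ; => ((f x) y)
(apply-tails 'f (list (cons 'bracket '(i))))                  ; => (bracketaccess f i)
(apply-tails 'f (list (cons 'brace '(+ a b))))                ; => (f (+ a b))
(apply-tails 'f (list (cons 'brace '())))                     ; => (f)
```

The left-to-right accumulation mirrors the way neoteric-process-tail passes each newly built list back to itself as the next prefix.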
The readable project website has more information: http://readable.sourceforge.net
We thank all the participants on the “readable-discuss” mailing list.
Copyright (C) 2012 David A. Wheeler and Alan Manuel K. Gloria. All Rights Reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.