Lassi Kortela <xxxxxx@lassi.io> writes:

| Is there any kind of standard for HTTP server middleware procedures?
| This would make a great SRFI - nowadays an almost universally useful
| abstraction, with a simple interface permitting numerous
| implementation strategies.

This would be a fantastic contribution. I'd like to help.

I have been running my own HTTP/1.0 web server, Shuttle, on top of MIT
Scheme for well over a decade. I run it behind Nginx now, but I ran it
on its own for years. I've never published it because I never got
around to adding support for HTTP/1.1 or 2.0. However, the
request-response API isn't tied to the HTTP version, and it works well,
so I'll describe it here in case it helps with your efforts. Most of
the text below is drawn from comments in the code.

A "dispatcher" defines how each request is handled. To add a new
dispatcher to a server, one calls:

  (web-server/add-dispatchers! web-server . web-dispatchers)

Dispatchers are created using the macro `make-web-dispatcher', which
defines:

* which incoming requests this dispatcher handles
* how to parse variable elements of the path, e.g. <month> and <year>
  in "/calendar/<year>/<month>"
* how to parse query parameters
* what code should run when a matching request arrives.

The syntax is:

  (make-web-dispatcher ((request method)
                        path-pattern
                        (query-parameter ...)
                        body))

* `Request' is the name of a variable which will be bound to the
  `http-request' object in `body'. Here's the definition of
  `http-request', slightly simplified:

    (define-record-type http-request
        (make-http-request headers
                           http-version
                           method
                           peer-address
                           port
                           request-arrival-time
                           uri
                           web-server)
        http-request?
      (headers http-request/headers)
      (http-version http-request/http-version)
      (method http-request/method)
      (peer-address http-request/peer-address)
      (port http-request/port)
      (request-arrival-time http-request/request-arrival-time)
      (uri http-request/uri)
      (web-server http-request/web-server))

* `Method' is a symbol naming an HTTP method, e.g. GET or POST.

* `Path-pattern' is a list whose elements are either strings, each
  representing a literal part of the path, or lists whose first element
  is the symbol `?' and whose other element is another symbol,
  representing a variable part of the path. If an element is a
  literal, the path at that point must match the string exactly. If
  it's a variable, the path at that point can be any value that doesn't
  match another pattern at the same point. The symbol names a variable
  that will be bound to this element of the path in `body'.

  For example, the sequence of path elements `("foo" (? bar))'
  specifies that the path must start with "foo", and may then contain
  any single element. So "/foo/alpha" will match that pattern, binding
  `bar' to "alpha", but only if there is no other dispatcher whose path
  is literally `("foo" "alpha")'.

  `Path-pattern' may end with a list of the form `(? . bar)'. In that
  case, the symbol names a variable, and its value will be a list of
  the strings in the rest of the path. So `("foo" (? . bar))' will
  match "/foo/alpha/beta", and `bar' will be bound to
  '("alpha" "beta").

* `Query-parameter ...' is the list of query parameters that this
  dispatcher matches. Each query parameter may be either a symbol,
  `(list symbol)', or `(optional symbol)'. The symbol names a variable
  that will be bound to the value or values of this query parameter
  inside `body'.

  If a query parameter is a symbol, then it is required. It is an
  error if the path matches and this query parameter is not included in
  the URI. It is also an error if the path matches but more than one
  value is supplied for this query parameter.

  If a query parameter is of the form `(list symbol)', then it allows
  multiple values and is optional. Its default value is the empty
  list. All its values will be collected into a list.

  If a query parameter is of the form `(optional symbol)', then it
  allows zero values or one value. If no value is supplied, its value
  is #f.
  For example, the query parameter list `(a (list b) (optional c))'
  matches the following URI query strings:

    URI query string    a     b            c
    ----------------    ---   ---------    ---
    ?a=1                "1"   ()           #f
    ?a=1&b=2            "1"   ("2")        #f
    ?a=1&b=2&b=3        "1"   ("2" "3")    #f
    ?a=1&c=4            "1"   ()           "4"
    ?a=1&b=2&c=4        "1"   ("2")        "4"
    ?a=1&b=2&b=3&c=4    "1"   ("2" "3")    "4"

  If a query parameter appears in a URI but without a value
  (i.e. without "=value"), its value is the empty string.

Here are the rules for handler lookup using dispatchers:

* Paths are matched by patterns that contain strings, which must match
  exactly, and variables, expressed as `(? variable)', which match
  anything between slashes.

* Only one handler may match any particular pattern.

* If two patterns match the same path up to some point, but one ends
  with a string and the other ends with a variable, then the one that
  ends with a string is considered the only match. Matching continues
  with it.

* The handler for any path pattern can take only one combination of
  query parameters, although some query parameters may be optional and
  some required. No dispatching is done on what combination of
  parameters is present. Different paths should be used for this.

Every handler should return three values:

  code:     response code, a number, e.g. 200 for HTTP OK
  headers:  an alist of headers beyond the basic headers
            Content-Length and Content-Type
  output:   a thunk that will write the output, or false if there is
            no output. Content-Length will be computed from this
            automatically.

For example, here's a dispatcher used to handle some RSS feeds for my
blog:

  (make-web-dispatcher ((request get)
                        ("blog" "label" (? label) "rss")
                        ())
    (let ((symbol (intern-soft label)))
      (if (and symbol (blog-label-exists? symbol))
          (values ok-response-code      ; 200
                  '(("Content-Type" . "application/rss+xml"))
                  (lambda ()
                    (invoke-writer
                     (write-blog-rss-feed
                      (blog-posts-with-label symbol)
                      (format #f
                              "Blog posts on ~A by Arthur A. Gleckler"
                              (string-capitalize (symbol->string symbol)))
                      (format #f
                              "https://speechcode.com/blog/label/~A/rss"
                              symbol)))))
          (signal-page-not-found 404))))

I'm not wedded to any of this, but I've found some aspects of it
particularly useful, e.g.:

* automatic parsing of path components

* automatic parsing of query parameters

* passing a port as part of the request, allowing the dispatcher to
  decide what to do with any content that may be arriving with the
  request, e.g. form data or a huge file upload

* returning a thunk so that the actual work can be done later,
  allowing the containing server to decide on things like whether to
  chunk results

* the fact that it's easy to write procedures that use composition to
  add things like authentication to requests. (Logging is built into
  Shuttle, but it could be added this way, too.)

I hope this is helpful. I'm excited by the idea of standardizing web
request handlers. It would be great to make it easy for Scheme hackers
to share web code across Scheme implementations and web servers.
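
To make the query-parameter rules described earlier concrete, here is a minimal sketch in portable R7RS Scheme. It is not Shuttle's code: the procedure names `values-for', `bind-parameter' and `bind-parameters' are hypothetical, and the sketch assumes the query string has already been parsed into an alist of (name . value) string pairs in URI order, with a value-less parameter represented by the empty string. Signalling an error when an `(optional symbol)' parameter receives more than one value is also an assumption; the description above only says that it allows zero values or one.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical helper: collect, in order, every value supplied for
;; NAME in QUERY, an alist of (name . value) string pairs.
(define (values-for name query)
  (let loop ((rest query) (found '()))
    (cond ((null? rest) (reverse found))
          ((string=? (caar rest) name)
           (loop (cdr rest) (cons (cdar rest) found)))
          (else (loop (cdr rest) found)))))

;; Hypothetical helper: return the value to bind for one parameter
;; spec, which is a symbol, (list symbol), or (optional symbol).
(define (bind-parameter spec query)
  (cond ((symbol? spec)
         ;; Required: exactly one value must be present.
         (let ((vals (values-for (symbol->string spec) query)))
           (if (and (pair? vals) (null? (cdr vals)))
               (car vals)
               (error "expected exactly one value for" spec))))
        ((eq? (car spec) 'list)
         ;; Zero or more values, collected into a list.
         (values-for (symbol->string (cadr spec)) query))
        ((eq? (car spec) 'optional)
         ;; Zero values yields #f; one value yields that value.
         ;; Treating two or more as an error is an assumption.
         (let ((vals (values-for (symbol->string (cadr spec)) query)))
           (cond ((null? vals) #f)
                 ((null? (cdr vals)) (car vals))
                 (else (error "expected at most one value for"
                              (cadr spec))))))
        (else (error "unknown query parameter spec" spec))))

;; Bind every spec in SPECS, returning the values in the same order.
(define (bind-parameters specs query)
  (map (lambda (spec) (bind-parameter spec query)) specs))

;; The query string ?a=1&b=2&b=3 from the table above:
(write (bind-parameters '(a (list b) (optional c))
                        '(("a" . "1") ("b" . "2") ("b" . "3"))))
(newline)
;; writes ("1" ("2" "3") #f)
```

A macro like `make-web-dispatcher' could presumably expand into something along these lines and then bind the resulting values to the named variables around `body'; keeping the rules in an ordinary procedure makes them easy to test against the table.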