GraphQL parser mostly working - Lassi Kortela (31 Jul 2019 09:55 UTC)
| Re: GraphQL parser mostly working - Peter Bex (31 Jul 2019 11:35 UTC)
| Re: GraphQL parser mostly working - John Cowan (31 Jul 2019 11:39 UTC)
| Re: GraphQL parser mostly working - Lassi Kortela (31 Jul 2019 12:05 UTC)

Re: GraphQL parser mostly working
Lassi Kortela
31 Jul 2019 12:05 UTC
> Cool! Just a quick note: that timing seemed to be a bit on the slow side
> so I checked and it seems you forgot to compile read-char-if into a
> library. With ./test-read -:d you see these lines:
>
> ; loading ./read-char-if.import.scm ...
> ; including ./read-char-if.scm ...
>
> Running it in the profiler (add -:p to the invocation) also shows the
> majority of time being spent in <eval>, which is generally a bad sign :)
>
> With that eval, I get these timings here:
>
> ./test-read ../schema.public.graphql > /dev/null  2.78s user 0.05s system 70% cpu 4.012 total
>
> If I compile it into a .so file, I get better times:
>
> ./test-read ../schema.public.graphql > /dev/null  1.63s user 0.02s system 82% cpu 2.007 total
>
> Still far from great. Most of the time is spent in string-index
> in srfi-13.

Awesome, thanks a lot for the help! I have no idea how to use the Chicken
compiler yet, so I didn't know any of these things :)

string-index is only used to test whether a character belongs to an ASCII
span like A..Z. I changed those to test the character code using `cond` and
`<=`, which shaved 0.2 seconds off the original times. That, together with
compiling the libraries with:

$ csc -library read-char-if.import.scm
$ csc -library net.graphql.read.import.scm

brings the runtime down to 0.85 seconds.
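For anyone curious, the `cond`/`<=` replacement for string-index looks roughly like this. This is a hypothetical sketch, not the actual code from graphql-read.scm; the predicate name and the exact set of ranges are illustrative only:

```
;; Sketch: classify an ASCII character by direct code comparison
;; instead of searching a span string with srfi-13 string-index.
;; (Names and ranges are illustrative assumptions.)
(define (char-kind c)
  (let ((n (char->integer c)))
    (cond ((<= 65 n 90)  'upper)   ; A..Z
          ((<= 97 n 122) 'lower)   ; a..z
          ((<= 48 n 57)  'digit)   ; 0..9
          (else #f))))
```

The point is that string-index scans a string linearly on every call, while the numeric comparisons are a handful of fixnum operations that the compiler can inline.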
Here's the current profile:

(|##sys#write-char/port| 1 10.000000)
(|read-char-if.scm:20: loop| 1 10.000000)
(|graphql-read.scm:28: scheme#string->symbol| 1 10.000000)
(|graphql-read.scm:89: read-char-if#read-char?| 1 10.000000)
(|graphql-read.scm:40: read-char-if#read-char?| 1 10.000000)
(|##sys#read-char/port| 6 70.000000)
(|graphql-read.scm:39: scheme#string->symbol| 1 10.000000)
(|read-char-if.scm:2: k| 2 20.000000)
(|graphql-read.scm:28: scheme.base#char=?| 1 10.000000)
(|graphql-read.scm:210: packrat#make-result| 3 100.000000)
(|read-char-if.scm:14: chicken.base#open-output-string| 3 30.000000)
(|graphql-read.scm:28: scheme.base#string-map| 1 10.000000)
(|read-char-if.scm:9: match-char?| 11 110.000000)
(|test-read.scm:7: scheme#write| 1 40.000000)
(|graphql-read.scm:95: scheme#append| 1 10.000000)
(|<syntax>| 1 20.000000)
(|graphql-read.scm:210: g1285| 1 10.000000)
(|graphql-read.scm:105: scheme#append| 7 110.000000)
(|read-char-if.scm:9: scheme#peek-char| 16 170.000000)
(|read-char-if.scm:17: chicken.base#get-output-string| 3 30.000000)
(|graphql-read.scm:79: read-char-if#read-char?| 1 10.000000)
(|read-char-if.import.scm:2: chicken.load#load-extension| 1 10.000000)

John:

> You can merge the lexer into the packrat grammar: because of its infinite
> lookahead, there's no reason to have a separate lexer.

In the official GraphQL grammar, the lexical syntax is indeed merged into the
parse grammar. I thought about doing that, but the lexer was already done
recursive-descent style, and I wondered whether the error messages would
remain intelligible if the lexer is in the same grammar as the parser. Of
course, the lexer could be placed in its own packrat parser. In case anyone
wants to take a stab at this approach, be my guest :)

From the profile, the time spent in 'append', 'get-output-string' and
'peek-char' suggests that lexing with packrat could easily make it faster.
Then again, I don't know whether packrat's internal data structures are in
fact faster than these.
packrat#make-result takes up a surprising amount of run time now. Some of the
'string->symbol' calls are spurious right now (single-char symbols; we could
just test the chars).

It is so nice to be back using a language where you can optimize only the
slow parts, instead of being forced to write "optimized" code for your
entire program from the start (I've been writing a lot of C lately).

I can try the packrat lexer myself later if there's time, but the current
solution is fast enough for real work (GitHub's grammar is huge), so I'd
like to hook this up to Peter's Spiffy web server next.
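Eliminating the spurious string->symbol calls for single-character tokens could look something like the following. This is purely an illustrative sketch, assuming the lexer currently builds a one-character string and interns it; the function name and the token set shown are made up:

```
;; Sketch (hypothetical): map a known one-character punctuator
;; directly to a constant symbol instead of going through
;; string -> string->symbol on every occurrence.
(define (punctuator->symbol c)
  (case c
    ((#\{) '|{|)
    ((#\}) '|}|)
    ((#\[) '|[|)
    ((#\]) '|]|)
    ((#\() '|(|)
    ((#\)) '|)|)
    ((#\:) '|:|)
    (else #f)))
```

Since the symbols are quoted constants, they are interned once at load time rather than on every token.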