json-stream-read should validate json too
Duy Nguyen (21 Jan 2020 09:15 UTC)

Re: json-stream-read should validate json too
Amirouche Boubekki (21 Jan 2020 13:46 UTC)
On Tue, Jan 21, 2020 at 13:44, Duy Nguyen <xxxxxx@gmail.com> wrote:
>
> On Tue, Jan 21, 2020 at 5:47 PM Amirouche Boubekki
> <xxxxxx@gmail.com> wrote:
> > > Alternatively maybe we can wrap user-provided 'proc' in our own proc
> > > that does validation on top, something like a stripped down version of
> > > %json-read that does nothing but validate? For example,
> > > make-json-validator takes a proc and returns a new proc also performs
> > > validation.
> >
> > I will look at it, it seems to me if one can validate inside
> > json-stream-read, it will be more useful.
>
> Yes it's definitely more useful inside json-stream-read to me. I was
> just worried some people value performance and may be ok with no
> validation (e.g. you have verified it at some point before). I don't
> know if such a use case exist though.

My goal is to have a reader and writer implementation that is as conformant as possible, and a specification that describes a library covering most uses. This includes:

- Parsing JSON text: covered by the json-read procedure.
- Parsing bigger-than-memory JSON text: hence the streaming parser, json-stream-read.
- Making it possible to adapt JSON types to custom Scheme types, e.g. using records. The streaming parser is one solution, but it is not the easiest to use; we could imagine another procedure that makes it easier to customize the output.
- Not crashing on bad input such as deeply nested JSON.

And possibly:

- Printing bigger-than-memory JSON text.
- Parsing JSON Lines.

If one drops the "conformant" requirement from the reader, it is possible to make it faster: e.g. one can assume that once `t` is read, the following letters are `rue`, that is, `true`. At the moment, by contrast, there is an explicit test that checks that the parser does raise an error on input such as `txyz` [0]. That is a small nit. Another small nit is the use of a Scheme regexp to validate numbers instead of passing them directly to string->number [1].
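For what it is worth, the wrapping approach discussed above (a make-json-validator-style procedure that decorates the user-provided 'proc') could be sketched roughly like this, here counting nesting depth. This is only a sketch under assumptions: the event symbols (array-start, array-end, object-start, object-end) and the calling convention of json-stream-read's proc are guesses about the draft API, and limit-nesting is a hypothetical name, not part of the implementation.

```scheme
;; Sketch only: wrap a user-provided proc so that it raises once the
;; nesting depth of arrays/objects exceeds LIMIT. Event symbols and
;; the proc calling convention are assumptions about the draft API.
(define (limit-nesting proc limit)
  (let ((depth 0))
    (lambda (event)
      (case event
        ((array-start object-start)
         (set! depth (+ depth 1))
         (when (> depth limit)
           (error "maximum nesting level exceeded" limit)))
        ((array-end object-end)
         (set! depth (- depth 1))))
      (proc event))))

;; Hypothetical usage, with the proposed default of 501 levels:
;; (json-stream-read port (limit-nesting my-proc 501))
```

The same shape would work for validation proper: the wrapper keeps whatever state the check needs (here a depth counter, for a validator a small state machine) and forwards every event to the original proc unchanged.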
So, there is room to improve performance, if one wants to read JSON very fast. Note: the specification does not prescribe a conformant reader; it uses the term "should".

[0] https://github.com/scheme-requests-for-implementation/srfi-180/blob/master/srfi/json-checks.sld#L467-L474
[1] https://github.com/scheme-requests-for-implementation/srfi-180/blob/master/srfi/json.scm#L18

> > Also, I was thinking about adding a parameters like
> > `json-maximum-nesting-level` that would be 501 by default. And that
> > will control the reader, in case there is 501 or more nested JSON
> > array or object, json-stream-reader will raise a json-error? What do
> > you think?
>
> Do we really have any problem with nesting level though? I think the
> streaming code itself does not, and the way 'proc' is currently
> implement, we don't call it recursively either. This reminds me of a
> hacker news thread [1]. Anyway, because it's quite easy to count depth
> from user code (and if 'proc' composes well), and (I assume) we don't
> have any limits regarding nesting level, I think it's best leave it
> out.
>
> [1] https://news.ycombinator.com/item?id=21483256

Thanks for the link.

I have no proof as of yet, but I think it would be faster to parse JSON text without streaming; to stay safe, though, such a parser must have a nesting level limit. So maybe there is a place for a `json-read-fast` procedure?

> --
> Duy

--
Amirouche ~ https://hyper.dev