test-apply and on-test-begin callback
Tomas Volf
(24 Jul 2024 21:32 UTC)
|
Re: test-apply and on-test-begin callback
Per Bothner
(24 Jul 2024 22:10 UTC)
|
Re: test-apply and on-test-begin callback Tomas Volf (25 Jul 2024 13:31 UTC)
|
Thank you very much for your response. To keep this succinct, in the rest of the message `skip list' refers to the "currently active skip specifiers" and `run list' refers to the specifiers passed to test-apply (naming taken from the reference implementation).

On 2024-07-24 15:10:18 -0700, Per Bothner wrote:
> I'm not sure what I intended for test-apply.
> I think it is meant as a top-level tool for running a test-suite,
> and thus "currently active skip specifiers" would normally not be a factor.
> However, I suppose it could be used in more complex situations; I just
> don't remember if I had anything in mind.

I acknowledge that re-using stateful specifiers across both the skip list and the run list is quite an edge case, and the specification just saying that the order is unspecified would be fine with me. I have no immediate use case for sharing the specifiers, but as I am writing an implementation of SRFI-64, I sadly need to think about what a user might want to do (and is allowed to do by the specification), not what I would consider sane.

> In this case I think "the reference implementation is the specification,
> if not explicitly contradicted". Assuming you can figure out what the
> reference implementation does. (Doesn't seem that complicated,
> but I haven't dug into it in a long time.)

Per your advice I took the reference implementation and tried to verify its behavior (without looking at the source too closely, as I would like to avoid copyright taint). It seems that the SRFI-64 distributed with GNU Guile is based on the reference implementation (I am not sure whether any changes were made), so I know that it does not adhere to the specification all that well. That is the reason I started writing my own implementation: as a GNU Guile user, I wanted to have an SRFI-64 compliant library available. Here I will focus just on test-apply, and only on the part relevant to my original question; there is no need to dig into the other cases of non-compliance.
I used the following test program (`pk' just prints its arguments):

    (let ((r (test-runner-null)))
      (test-runner-on-test-begin! r
        (λ (r)
          (pk 'on-test-begin
              'name (test-runner-test-name r)
              'kind (test-result-kind r))))
      ;; Note: this callback also prints the label on-test-begin, so the
      ;; second line printed for each test comes from on-test-end.
      (test-runner-on-test-end! r
        (λ (r)
          (pk 'on-test-begin
              'name (test-runner-test-name r)
              'kind (test-result-kind r))))
      (test-apply r (test-match-name "test-a")
        (λ ()
          (test-begin "xx")
          (test-assert "test-a" #t)
          (pk (test-result-kind))
          (test-assert "test-b" #t)
          (pk (test-result-kind))
          (test-end))))

I got this output:

    ;;; WARNING: compilation of /home/wolf/Downloads/testing.scm failed:
    ;;; Unbound variable: %test-source-line2
    ;;; (on-test-begin name "test-a" kind #f)
    ;;; (on-test-begin name "test-a" kind pass)
    ;;; (pass)
    ;;; (on-test-begin name "test-b" kind skip)
    ;;; (on-test-begin name "test-b" kind skip)
    ;;; (skip)

Ignoring the failed compilation (as I said, I do not want to dig into the sources too much), we can see that the run list behaves just like a negated skip list. For any test *not* matching the run list, it sets both the preliminary and the final result kind to 'skip.

I am not sure that complies with the specification. I see a few relevant parts. For the preliminary result we have:

> If we've started on a new test, but don't have a result yet, then the result
> kind is 'xfail if the test is expected to fail, 'skip if the test is supposed
> to be skipped, or #f otherwise.

Skipping is defined in the description of `test-skip':

> Before each test (or test-group) the set of active skip-specifiers are applied
> to the active test-runner. If any specifier matches, then the test is skipped.

And for test-apply:

> A test is executed if it matches any of the specifiers in the test-apply and
> does not match any active test-skip specifiers.
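
To make the comparison concrete, the test-apply sentence quoted above and the behavior seen in the output can be modeled as below. This is only a minimal sketch under stated assumptions, not code from SRFI-64 or its reference implementation: all names (`match-name', `any-match?', `executed?', `observed-preliminary-kind') are hypothetical, specifiers are simplified to predicates on the test name rather than on a test runner, the run list is assumed to be non-empty, and expected failures ('xfail) are left out.

```scheme
;; Hypothetical model: a specifier is a one-argument predicate on the
;; test name. (Real SRFI-64 specifiers are applied to a test runner.)
(define (match-name name)
  (lambda (test-name) (equal? test-name name)))

;; #t if any specifier in the list accepts the test name.
(define (any-match? specifiers test-name)
  (let loop ((rest specifiers))
    (cond ((null? rest) #f)
          (((car rest) test-name) #t)
          (else (loop (cdr rest))))))

;; The test-apply sentence taken literally: the test must match the
;; run list and must not match the skip list. Assumes a non-empty
;; run list.
(define (executed? run-list skip-list test-name)
  (and (any-match? run-list test-name)
       (not (any-match? skip-list test-name))))

;; Preliminary result kind as seen in the output above: a test that is
;; not on the run list is reported exactly like a skipped test.
(define (observed-preliminary-kind run-list skip-list test-name)
  (if (executed? run-list skip-list test-name) #f 'skip))
```

With the run list from the test program and an empty skip list, the model gives the same preliminary kinds as the output:

```scheme
(define run-list (list (match-name "test-a")))
(observed-preliminary-kind run-list '() "test-a") ; => #f
(observed-preliminary-kind run-list '() "test-b") ; => skip
```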
Since test-apply explicitly treats the run list and the skip list as two separate concepts (instead of saying something like, for example, "If one or more specifiers are listed, they are (in negated form) treated as part of currently active skip specifiers."), I think the preliminary result is mandated to be #f.

It also says "test is executed", not "test is not skipped". So it could be argued that even the final result of 'skip is wrong (and should stay #f). Now that I think about it, that is probably the most reasonable reading of the standard. The test was not even considered for execution, so the result is not available, hence #f.

What is your opinion here?

> On the other hand, if you think some other semantics would be
> more useful or clear, that would probably be OK too.

I think the semantics as I understand the standard are fine. It is still a bit unclear how (or whether) run lists compose in the case of nested test-apply calls. I *think* that per the specification they do not compose but replace each other, so only the inner-most run list is taken into account. Is that a correct reading?

As far as the current reference implementation goes, I think it is a bit sub-optimal that there is no way to tell whether a test was skipped due to the run list or the skip list. I can imagine a test runner preferring not to report tests that are not on the run list at all. If I, for example, have a test file with 100s of tests, and I run (test-apply (test-match-name "t-foo") ...), I would expect only t-foo in the output, not the 100s of other tests with a 'skip result. Or rather, I would expect to at least be able to write such a test runner, but in the reference implementation that is not possible as far as I can tell. But in the end I can just get by with `grep', I guess?

Thank you and have a nice day,
Tomas Volf

--
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.