testing inexacts
Aubrey Jaffer
(29 Jan 2005 02:43 UTC)
|
Re: testing inexacts Per Bothner (29 Jan 2005 03:56 UTC)
|
Aubrey Jaffer wrote:
> | The test-name is a string that names the test case. It is used when
> | reporting errors, and also when skipping tests, as described below.
>
> Must TEST-NAMEs be unique?

No. After all, they're optional: a missing name is equivalent to "".

> If not, then aren't calls to TEST-END ambiguous?

I don't believe so. test-begin/test-end have to be properly bracketed.
The name in the test-end is mainly for readability and to catch
test-suite errors. I also intend that if the test-end name doesn't
match the current name (from the previous test-begin), but it matches
an earlier one, then extra implicit test-end calls would be added.
However, the implementation doesn't yet do that. This would primarily
be for recovering from test-suite errors, or exceptions that aren't
caught. I.e. like recovering from a syntax error, in that the test
suite would fail, but we try to fail a little more elegantly.

> | *Rationale:* In some ways using symbols would be preferable.
> | However, we want human-readable names, and standard Scheme does
> | not provide a way to include spaces or mixed-case text in literal
> | symbols.
>
> Writing tests should be about the tests; and not about making
> capitalization consistent.

The point is *reporting* the results of tests. The report should be
human-readable, and allowing mixed case and spaces in test names helps
that.

> Please allow symbols as well.

Using symbols allows us to match test names using eq?. Using strings
requires matching test names using equal?. That is an acceptable price
to pay for more readable test names.

> And while we are at it, R5RS sections are hierarchically numbered.
> Why not allow integers?

Since these are just names, and we're not doing any operations on test
names except displaying them and comparing them, there is no particular
value to allowing integers. E.g. allowing the name 34 doesn't add much
compared to using "34". However, I have no strong opposition to
allowing numbers - or symbols.
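To make the eq?/equal? trade-off concrete, here is a minimal sketch of a name matcher. The helper name make-name-matcher is hypothetical and not part of the draft; the sketch only assumes that names are compared and never otherwise operated on, as described above.

```scheme
;; Hypothetical helper, not part of the draft: build a predicate that
;; takes a test name and reports whether it matches WANTED.
(define (make-name-matcher wanted)
  (lambda (name)
    ;; equal? compares strings by content and behaves like eqv? on
    ;; symbols and numbers, so one predicate covers all three kinds.
    (equal? name wanted)))

(define complex-group? (make-name-matcher "complex number tests"))

(complex-group? "complex number tests")  ; strings match by content
(complex-group? 'complex-number-tests)   ; a symbol is not the same name
((make-name-matcher 34) 34)              ; integer names compare correctly
((make-name-matcher 34) "34")            ; but 34 and "34" stay distinct
```

Under this assumption, allowing symbols or integers as names costs nothing in the comparison itself; the open question is only how such values would be interpreted in other positions.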
The concern I have is with "test specifiers". Those "evaluate" to
procedures, but it may be convenient to allow short-hands. The draft
allows "test-name" as a short-hand for (test-match-named "test-name").
I have considered allowing integers, perhaps as a short-hand for
test-match-nth. That wouldn't work if we allow integers as test-names.
Feedback on the "syntax" of test-specifiers would be welcome. Though I
guess I should post my ideas. If you look at the HTML source (section
"Skipping selected tests") you see some ideas I had before I settled on
defining specifiers as boolean functions; I'd like to combine the
convenient syntax for the commented-out specifiers with the simple and
general model of using procedures.

> | The following forms may be more convenient than using |test-assert|
> | directly:
> |
> | (test-eqv [test-name] test-expr expected)
>
> The EXPECTED is usually shorter to write than the TEST-EXPR. I
> recommend swapping TEST-EXPR and EXPECTED.

I don't feel strongly about it. However, the "flow" is that you first
evaluate test-expr, and then compare that to the expected result, so
having the latter last may be more natural.

> Also, putting the optional
> argument last is what Scheme programmers are accustomed to.

True, but I think the test-name should still come first. I think having
the name first, as we do for declarations, and as we do in documents
(like dictionaries), is more natural. Visually scanning quickly for a
test-name is also easier if the test-name is first.

> TEST-EQUAL is just as useful as TEST-EQV and should be provided.

It's in the draft, but just in passing:
  Similarly test-equal and test-eq are shorthand for test-assert
  combined with equal? or eq?, respectively.

> For
> testing inexact calculations, a TEST-APPROXIMATE procedure which
> accepts values within a small range of the expected number would be
> very useful.

That sounds useful. Should the error range be specified absolutely or
relatively?
The latter is presumably more general - except for "approximately
zero". How about:

  (test-approximate [test-name] test-expression expected [error])

where error defaults to (say) 0.01 and is relative to expected. I.e.

  (and (>= result (- expected (* expected error)))
       (<= result (+ expected (* expected error))))

  (test-zero [test-name] test-expression [error])

where error is absolute and defaults to (say) 0.01. I.e.

  (and (>= result (- error)) (<= result error))

> For extra points make TEST-APPROXIMATE recursively
> descend list and array structures, using its standard of approximate
> numerical match. The range (delta) should be a property of the test
> runner.

Perhaps the default delta should be a test runner property, but
test-approximate/test-zero could override it?

> Of course, having optional inexact tests in a testing file isn't
> portable to implementations lacking inexacts. R5RS requires those
> implementations to signal an error when inexact number syntax is
> encountered (macros don't help). "r4rstest.scm" goes through the
> hassle of replacing what would be literal inexact numbers with calls
> to STRING->NUMBER. I would really like a better way to do this.

Not all testing files are going to be portable. The goal is that the
API be portable, so it is easy to write portable tests, but presumably
not more portable than what you're testing. E.g. a test for complex
numbers is only going to work if the implementation supports complex
numbers. What you want is for the complex tests to be skipped (and for
the report summary to say so) if complex is unavailable. If the tests
depend on reader syntax one can always put them in a separate file and
load it.
E.g.:

  (if no-complex (test-skip "complex number tests"))
  (test-group "complex number tests"
    (load "complex-number-tests.scm"))

If we allow test-specifiers to be integers interpreted relatively,
this could be simplified to:

  (if no-complex (test-skip 1)) ;; skip following group
  (test-group "complex number tests"
    (load "complex-number-tests.scm"))

> | Additionally, if the matching |test-begin| installed a new test-runner,
> | then the |test-end| will de-install it, after reporting the accumulated
> | test results in an implementation-defined manner.
> |
> | (test-group suite-name decl-or-expr ...)
> |
> | Equivalent to:
> |
> | (if (not (test-to-skip% suite-name))
> |   (dynamic-wind
> |     (lambda () (test-begin suite-name))
> |     (lambda () decl-or-expr ...)
> |     (lambda () (test-end suite-name))))
>
> In a test system it is desirable to use the fewest possible features
> of Scheme, so that problems in the implementation are less likely to
> render the test system unusable. In this light, is the nesting of
> test-groups bringing benefits large enough to justify the use of
> complicated constructs like DYNAMIC-WIND?

We only use dynamic-wind to "cleanup" - i.e. "unwind-protect". In an
implementation without dynamic-wind it would be acceptable to replace
it with a macro that just calls the 3 thunks in sequence. In an
implementation that doesn't have full dynamic-wind but does have
"cleanups" (e.g. Kawa, since it doesn't yet have full continuations) it
would be nice to register the final thunk as a cleanup. I can change
the implementation to make it easier to tweak this part.

--
	--Per Bothner
xxxxxx@bothner.com http://per.bothner.com/
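As a rough illustration of the relative and absolute tolerance checks proposed above for test-approximate and test-zero, the predicates below are a sketch only. The names approximately=?, approximately-zero? and default-error are hypothetical; the 0.01 default is the value suggested in the message. The upper bound is checked with <=, and the use of abs is an added assumption so that the interval stays well-formed when expected is negative.

```scheme
;; Hypothetical names; only the 0.01 default comes from the message.
(define default-error 0.01)

;; Relative check: RESULT must lie within ERROR * |EXPECTED| of EXPECTED.
;; abs is an addition to the formula in the message: without it the
;; lower and upper bounds swap when EXPECTED is negative.
(define (approximately=? result expected error)
  (let ((delta (abs (* expected error))))
    (and (>= result (- expected delta))
         (<= result (+ expected delta)))))

;; Absolute check for results that should be close to zero.
(define (approximately-zero? result error)
  (and (>= result (- error))
       (<= result error)))

(approximately=? 3.14 3.14159 default-error)  ; within 1% of expected
(approximately=? 3.0 3.14159 default-error)   ; more than 1% away
(approximately=? -2.001 -2 default-error)     ; negative expected value
(approximately-zero? 0.0001 default-error)    ; close enough to zero
```

Note that the relative check collapses to exact equality when expected is 0, since the allowed delta is then 0; that is the case the separate absolute check is meant to cover.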