Re: arithmetic issues Aubrey Jaffer 23 Oct 2005 03:58 UTC
| Date: Sat, 22 Oct 2005 20:39:01 -0500 | From: Alan Watson <xxxxxx@astrosmo.unam.mx> | | Aubrey Jaffer wrote: | > | > Flonums often are the most difficult feature to port to new | > | > architectures. | > | | > | Why do you say that? | > | >From the experience of porting SCM to dozens of C compilers. | | Okay, when you said "architecture" I thought you refered to CPU | instructions and data formats. Yes, compilers are a pain and there | are frequent bugs in the standard library. Sorry for the confusion; platform would have been a better word. | > | That is, I would mandate only unlimited size integers in the | > | core. The rest of the tower should be moved to the library. | ... | I would distinguish "moving the rest of the tower to a library in | the *language* *definition*" and "moving the rest of the tower to a | library in an *implementation*". | | That is, moving all but exact integers out of the core of the | language definition simplifies the language definition and keeps it | grounded in things that are generally agreed to be correct. I agree; exact integers are certainly the best basis for formal specification. | However, that does not prohibit implementors from including | important aspects of other numbers in the core of their | implementation. For example, the core of their implementation could | have representation and garbage collection for flonums, ratnums, | and whatever else. You would probably end up duplicating some | arithmetic routines, but that's about it. | | > [In SCM] The arithmetic subrs test first for INUMs, then bignums, | > then flonums. The type dispatch for bignums and flonums is very | > similar. It would be good to find what causes the difference. | | This is my point. The branches for inums and bignums are probably | predicted as taken. Thus, when you use these generic operators on | flonums, you incur two mispredicted branches. flonum-specific | operators would save those. Point taken. | > I tested SCM and SCMLIT (fixnums only), both compiled with gcc | > -O3, computing 2000 digits of pi 4 digits at a time on a Pentium | > 4 3.00GHz. The benchmark uses only small integers. | > | > SCM took 5330.ms, while SCMLIT took 3330.ms, a substantial | > savings. | | However, your results suggest to me that perhaps some of the | branches are not predicted as they should be. It might be worth | using the "__builtin_expect" feature of GCC to hint to the compiler | that numbers are expected to be inums. I tried adding __builtin_expect() to several of the type dispatch macros. The one which sped up was used mainly for testing assertions. It reduces the times to 4480.ms (SCM) and 2970.ms (integer-only). Thanks! | Of course, type-specific operators are "just" a performance hack to | get around a lack of type analysis in many implementations. Hats | off to stalin. There are some other possible uses. Algebraic Petrofsky points out that a real-only EXPT can do (expt -27 1/3) ==> -3, which R5RS EXPT won't.