Hello,
> | Concerning complexity, you are right:
Log2-binary-factors needs
> | time O(n) for an n-bit argument, whether in Scheme or on the
bare
> | metal.
>
> Iterating from the LSB, if the density of ones is d, then the
> probability of k zero LSBs is d^k. The average number of iterations
> for long numbers would be O(1/log(d)). So it is faster than
linear!
Well, nearly. Please let me correct:
1. You assume i.i.d. random bits with
density of ones d, where 0 < d < 1.
Then the number K_b of bits to inspect
until the first '1' is found
is geometrically distributed, in this
case Pr{K_b = k} = (1 - d)^(k-1) d, for k >= 1.
The expected number of bits inspected
is E{K_b} = 1/d.
2. Now consider words of w bits each,
and again the bits are i.i.d.
random variables with density of ones
d, where 0 < d < 1.
Then the probability that a given word
is non-zero (any bit is '1') is p = 1 - (1 - d)^w.
The number K_w of words to inspect until
the first non-zero word is found
is again geometrically distributed,
this time Pr{K_w = k} = (1 - p)^(k-1) p, k >= 1.
The expected number of words inspected
is E{K_w} = 1/p, which tends
to 1/(w d) as d tends to zero and w
is a constant.
This tells us a few things:
a) For i.i.d. random bits the runtime
for long strings only depends on the
density of ones (no surprise); in this
sense it is "faster than linear".
b) Processing w bits in parallel gives
about factor w speed-up.
c) The "1/log(d)" you mention
is not correct.
The problem with the "fast than
linear" thing is that it only holds if the bits
are i.i.d. random. That is hard to justify
in most applications. (In fact,
I wouldn't go near it.)
Anyhow, in the worst-case (numbers of
the form 2^k, k >= 0) the algorithm inspects
all bits, but most of the time I would
like to think of it as being constant time.
> I grepped through gambc30 and gambc40b12 without
finding
> first-bit-set.
Oops, I had simply scanned the documentation
for the name and found it at:
gambc40b12.tar.gz
> gambit-c.pdf > p. 44
(Remember: "If all else has failed, read the manual.")
Ah! I wrote FIRST-BIT-SET but Gambit
really calls it FIRST-SET-BIT.
> How about COUNT-BINARY-FACTORS?
I still like FIRST-BIT-SET best, or
something as LSB-BIT-SET.
Actually, I do not care too much about
the name.
See you,
Sebastian.
----
Dr. Sebastian Egner
Senior Scientist
Philips Research Laboratories
Prof. Holstlaan 4 (postbox WY63, kamer WY 6-55)
5656 AA Eindhoven
The Netherlands
tel: +31 40 27-42069
fax: +31 40 27-42566
email: xxxxxx@philips.com