Re: Weighted boolean | Simplelists

Weighted boolean Lassi Kortela (05 May 2020 13:39 UTC)
(missing)
Weighted alphabets Lassi Kortela (05 May 2020 13:47 UTC)
Re: Weighted boolean Marc Nieper-Wißkirchen (05 May 2020 13:49 UTC)
Re: Weighted boolean John Cowan (05 May 2020 15:09 UTC)
Re: Weighted boolean Bradley Lucier (05 May 2020 19:27 UTC)
Re: Weighted boolean John Cowan (05 May 2020 21:15 UTC)

Re: Weighted boolean Bradley Lucier 05 May 2020 19:27 UTC

On 5/5/20 11:09 AM, John Cowan wrote:
> On Tue, May 5, 2020 at 9:39 AM Lassi Kortela <xxxxxx@lassi.io> wrote:
>
>     Should `make-random-boolean-generator` take an optional argument giving
>     the probability with which #t is returned?
>
>
> I suppose we could, but gweighted-sampling is meant to solve that
> problem in a general way.  For example, an unfair coin can be modeled
> like this:
>
> (gweighted-sampling 0.55 (circular-generator #t)
>                     0.45 (circular-generator #f))
>
> And of course it could equally well generate the symbols `heads` and
> `tails` instead.

A few points:

1.  I don't understand either the description or the implementation of
the gweighted* routines.

2.  Although I hate suggesting that other people do more work, I'd
recommend having both

(a) a Bernoulli distribution, returning 0 and 1:

https://en.wikipedia.org/wiki/Bernoulli_distribution

The values that this distribution returns can be used to index a vector
of any other values (including booleans) if you want to return other values.

(b) a Categorical distribution:

https://en.wikipedia.org/wiki/Categorical_distribution

As Lassi mentioned, this could be implemented using Huffman coding

https://en.wikipedia.org/wiki/Huffman_coding

or one of the sampling methods described at

https://en.wikipedia.org/wiki/Categorical_distribution#Sampling

I don't recommend the method using binomial sampling; I know of no
really "good" way to sample from a binomial distribution.
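
To make (a) and (b) concrete, here is a minimal sketch, assuming SRFI 27's
random-real as the uniform source. The names make-bernoulli-generator and
make-categorical-generator are hypothetical and not taken from any SRFI
draft. The categorical sampler uses a plain linear scan over cumulative
weights (the simplest inverse-CDF approach), not Huffman coding.

```scheme
(import (scheme base)
        (srfi 27))   ; random-real: uniform real in the open interval (0,1)

;; Hypothetical name, not from any SRFI draft.
;; Returns a thunk yielding 1 with probability p and 0 otherwise.
;; The optional second argument is a thunk producing reals in (0,1);
;; it defaults to SRFI 27's random-real.
(define (make-bernoulli-generator p . maybe-uniform)
  (let ((uniform (if (null? maybe-uniform) random-real (car maybe-uniform))))
    (lambda ()
      (if (< (uniform) p) 1 0))))

;; Hypothetical name, not from any SRFI draft.
;; weights: non-empty vector of nonnegative reals with a positive sum.
;; Returns a thunk yielding index i with probability weights[i] / sum.
;; Each sample costs O(n): a linear scan over the cumulative weights.
(define (make-categorical-generator weights . maybe-uniform)
  (let* ((uniform (if (null? maybe-uniform) random-real (car maybe-uniform)))
         (n (vector-length weights))
         (total (let loop ((i 0) (sum 0))
                  (if (= i n)
                      sum
                      (loop (+ i 1) (+ sum (vector-ref weights i)))))))
    (lambda ()
      (let ((u (* total (uniform))))
        (let loop ((i 0) (acc (vector-ref weights 0)))
          ;; The (= i (- n 1)) guard absorbs floating-point round-off
          ;; in case u rounds up to the total.
          (if (or (< u acc) (= i (- n 1)))
              i
              (loop (+ i 1) (+ acc (vector-ref weights (+ i 1))))))))))
```

The 0/1 result can then index a vector of arbitrary values, for example:

```scheme
(define coin (make-bernoulli-generator 0.55))
(vector-ref (vector 'tails 'heads) (coin))   ; heads with probability 0.55
```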