The scope of the srfi Vladimir Nikishkin (17 Jul 2020 02:11 UTC)
Re: The scope of the srfi Linas Vepstas (17 Jul 2020 02:26 UTC)
Re: The scope of the srfi Vladimir Nikishkin (17 Jul 2020 02:33 UTC)
Re: The scope of the srfi Arthur A. Gleckler (17 Jul 2020 03:00 UTC)
Re: The scope of the srfi Vladimir Nikishkin (17 Jul 2020 03:53 UTC)
Re: The scope of the srfi Arthur A. Gleckler (17 Jul 2020 04:42 UTC)
Re: The scope of the srfi Linas Vepstas (17 Jul 2020 04:43 UTC)
Re: The scope of the srfi Vladimir Nikishkin (17 Jul 2020 05:01 UTC)
Re: The scope of the srfi Linas Vepstas (17 Jul 2020 05:14 UTC)
Re: The scope of the srfi Vladimir Nikishkin (17 Jul 2020 05:20 UTC)
Re: The scope of the srfi Linas Vepstas (17 Jul 2020 05:43 UTC)

Re: The scope of the srfi Vladimir Nikishkin 17 Jul 2020 05:20 UTC

No.

I am not speaking of equality in the sense of distributional
identity. That is actually my PhD topic, and it is completely
irrelevant here.

I mean literal Scheme eq? (in the sense of _the same object in computer
memory_) between the two (actually one) procedures: one is used to
encode the distribution inside the sample generator, and the other is
given to the user for various other purposes. Which procedure that is
depends on the sampling algorithm, but if the fairly standard inverse
transform method (an "inverse function generator") is used, it would be
the inverse of the distribution function, i.e. the quantile function.
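
To make this concrete, here is a minimal sketch of what I have in mind,
using the exponential distribution because its quantile function has a
closed form. Every name here (make-exponential-sampler, uniform, gen, q)
is made up for illustration and is not taken from the srfi; uniform is
assumed to be any thunk that returns a real number in [0, 1).

```scheme
;; Hypothetical sketch: none of these names come from the SRFI.
;; RATE is the rate parameter (lambda) of the exponential distribution.
;; UNIFORM is assumed to be a thunk returning a real number in [0, 1).
(define (make-exponential-sampler rate uniform)
  ;; Quantile function (inverse CDF): Q(p) = -ln(1 - p) / rate.
  (define (quantile p)
    (- (/ (log (- 1 p)) rate)))
  ;; Inverse transform sampling: push a uniform draw through QUANTILE.
  (define (generator)
    (quantile (uniform)))
  ;; Hand back both, so the user holds the very procedure the
  ;; generator calls, not a reimplementation of it.
  (cons generator quantile))

;; Usage with a stub "random" source that always returns 0.5.
(define sampler (make-exponential-sampler 2.0 (lambda () 0.5)))
(define gen (car sampler))
(define q (cdr sampler))
(= (gen) (q 0.5)) ; => #t, both evaluate the same procedure on 0.5
```

The point is only that the quantile procedure returned to the user is
the same object the generator closes over, so there is nothing to keep
in sync.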

On Fri, 17 Jul 2020 at 13:14, Linas Vepstas <xxxxxx@gmail.com> wrote:
>
>
>
> On Fri, Jul 17, 2020 at 12:01 AM Vladimir Nikishkin <xxxxxx@gmail.com> wrote:
>>
>> >You're done
>>
>> Yes, and everyone will be doing it slightly differently, and the
>> sampling from this curve will turn out to be just _slightly_ different
>> from the one implemented in the srfi, so the user will have to learn
>> a whole lot of funky debugging techniques to find out where the
>> problem is.
>>
>> If you're implementing a sampler from a distribution there should be a
>> way to get _the same_ distribution that is used to provide the
>> samples, in the sense of (eq?).
>
>
> That is what the Student's t-test does. It tells you how close you are. Again, it's impossible to hit eq? precisely, for many complex reasons (see Wikipedia). Unfortunately, Student's is widely misunderstood and widely misused, most commonly by the medical and pharmaceutical industries (and, of course, psychology and athletic performance academics), and is one of the contributing factors to the replicability crisis in science. The idea of providing an eq? for a statistical library is scary, because I know several neuroscience PhDs who would happily make use of it in their code, publish the results, and then wonder why the idiots at Whatsamata U. cannot get the same results. Such concepts must be banished to the ninth circle of hell.
>
> -- Linas
>
>>
>>
>> On Fri, 17 Jul 2020 at 12:43, Linas Vepstas <xxxxxx@gmail.com> wrote:
>> >
>> >
>> >
>> > On Thu, Jul 16, 2020 at 10:53 PM Vladimir Nikishkin <xxxxxx@gmail.com> wrote:
>> >>
>> >>
>> >> It's just strange to see a library providing a "normally-distributed
>> >> random number" generator, but not an actual function to compute the
>> >> value of a gaussian bell at, e.g., those particular points where the
>> >> generator has produced some samples. An inverse value is very
>> >> often needed in statistical analysis.
>> >
>> >
>> > But this is almost trivial for the gaussian:
>> >
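>> > ;; lis is a non-empty list of samples; fold is from SRFI 1
>> > (define mean (/ (fold + 0 lis) (length lis)))
>> > (define rms (sqrt (/ (fold (lambda (x s) (+ s (* x x))) 0 lis) (length lis))))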
>> >
>> > That's it. You're done. That's the gaussian bell. Did you want the Student t-distribution instead? If you want to get fancier than that... the standard advice is "use GNU R" or "use SciPy", but re-imagining either of those two is a whole other thing. (There is a Jupyter for Scheme, somewhere, but it is not obviously maintained, and besides, Jupyter is... problematic for record-keeping and publishing. Lacks long-term stability. Depends on super-messy undebuggable JavaScript. Cannot be checked into git... etc. There are several competitors to Jupyter that are more rational, but they are not popular, and do not provide Scheme.)
>> >
>> >> Furthermore, a normal distribution spans all the real line from -inf
>> >> to inf.
>> >
>> >
>> > Yes. Although it is impossible to ever produce those values... values greater than six sigma are exceptionally rare. Assuming you can generate one random value per nanosecond, you will generate a value greater than +11 or less than -11 sigma approximately once per age of the universe. (Recall that the universe is a mere 4e26 nanoseconds old, a number that easily fits in a single-precision float.)
>> >
>> >> How does this work with Scheme numbers?
>> >
>> >
>> > single-precision-float is more than good enough for this particular example...
>> >
>> > Although I will once again call for a srfi that wraps up either GnuMP or MPFR because some applications (number theory) need more than 56 bits of precision. More than 600 decimal places is very common.
>> >
>> > > I think it is not what I find aesthetically pleasing
>> >
>> > Aesthetics is always important, but I cannot guess at the sense of aesthetics at play, here... ?
>> >
>> > -- linas
>>
>>
>>
>> --
>> Yours sincerely, Vladimir Nikishkin
>
>
>
> --
> Verbogeny is one of the pleasurettes of a creatific thinkerizer.
>         --Peter da Silva
>

--
Yours sincerely, Vladimir Nikishkin