Re: [scheme-reports-wg2] Threads and futures

Thread index:
Linas Vepstas (19 Mar 2022 05:54 UTC)
Marc Nieper-Wißkirchen (19 Mar 2022 08:24 UTC)  [this message]
John Cowan (19 Mar 2022 18:57 UTC)
Linas Vepstas (19 Mar 2022 20:04 UTC)
Marc Nieper-Wißkirchen (19 Mar 2022 22:11 UTC)
Linas Vepstas (20 Mar 2022 07:50 UTC)
John Cowan (20 Mar 2022 22:34 UTC)
Marc Feeley (21 Mar 2022 04:49 UTC)
Marc Nieper-Wißkirchen (21 Mar 2022 06:28 UTC)
Marc Nieper-Wißkirchen (21 Mar 2022 06:54 UTC)
Ray Dillinger (21 Mar 2022 19:00 UTC)
Dr. Arne Babenhauserheide (21 Mar 2022 16:54 UTC)
Marc Nieper-Wißkirchen (21 Mar 2022 16:04 UTC)
Taylan Kammer (23 Mar 2022 14:20 UTC)
Marc Nieper-Wißkirchen (23 Mar 2022 14:28 UTC)
Dr. Arne Babenhauserheide (23 Mar 2022 15:40 UTC)
Marc Nieper-Wißkirchen (23 Mar 2022 15:34 UTC)
Dr. Arne Babenhauserheide (23 Mar 2022 22:08 UTC)
Ray Dillinger (01 Apr 2022 23:04 UTC)
Per Bothner (01 Apr 2022 23:21 UTC)
Ray Dillinger (01 Apr 2022 23:28 UTC)
Per Bothner (01 Apr 2022 23:50 UTC)
Ray Dillinger (02 Apr 2022 00:01 UTC)
Per Bothner (02 Apr 2022 00:35 UTC)
Linas Vepstas (10 Apr 2022 23:46 UTC)
Linas Vepstas (27 Mar 2022 16:54 UTC)
It should be noted that SRFI 226 parameter objects are not meant as a replacement for thread locals. These form a different concept; there will have to be an extra set of procedures for fast TLS. While SRFI 18's thread-specific-set! is certainly a primitive on top of which TLS could be implemented, it is not sufficient for interoperability between independent libraries accessing it.

On Sat, 19 Mar 2022 at 06:50, Linas Vepstas <xxxxxx@gmail.com> wrote:
>
> Replying to Marc, and Nala,
>
> It doesn't seem appropriate to talk about C/C++ here, so I will skip that discussion, except to say: it's all about performance. Flexibility is great, but if an API cannot be implemented so that it is fast, then it's not a good API. It's very hard to determine whether an API is fast without measuring it. Thinking about it is often misleading; reading the implementation easily leads to false conclusions. Sad but true.
>
> And so, an anecdote. Nala notes: "guile parameters are built on top of guile fluids". That is what the docs say, but there may be some deep implementation issues. A few years ago, I performance-tuned a medium-sized Guile app (https://github.com/MOZI-AI/annotation-scheme/issues/98) and noticed that code that mixed parameters with call/cc was eating approximately half the CPU time. For a job that took hours to run, "half" is a big deal. I vaguely recall that parameters were taking tens of thousands of CPU cycles, or more, to be looked up.
>
> Fluids/parameters need to be extremely fast: dozens of cycles, not tens of thousands. Recall how this concept works in C/C++:
> -- A global variable lives in the data segment, and looking up its current value takes a handful of CPU cycles: it sits at a fixed address there.
> -- A per-thread global variable lives in a block reached through the thread control block (TCB), which the running thread can locate immediately (on most ABIs through a dedicated register). So, a few cycles to find the TCB, a few more to compute the offset, and maybe a pointer chase.
>
> Any sort of solution for per-thread storage in Scheme, whether fluids or parameters, needs to be no more complex than the above. The Scheme equivalent of the TCB for the currently running thread needs to be instantly available, and must not require traversing an a-list or hash table. The location of the parameterized value should be no more than a couple of pointer chases away, and dereferencing it should not require locks or atomics. It needs to be fast.
>
> It needs to be fast to avoid the conclusions of the earlier-mentioned "Curse of Lisp" essay: most Scheme programmers are going to be smart enough to cook up their own home-grown, thread-safe parameter object, but their home-grown thing will almost surely have mediocre performance. If "E" is going to be an effective interface to the OS, it needs to be fast. If you can't beat someone's roll-your-own system that they cooked up in an afternoon, what's the point?
>
> Conclusion: SRFI 226 should almost surely come with a performance-measurement tool suite that can spit out hard numbers for parameter-object lookups per microsecond while running 12 or 24 threads. If implementations cannot get these numbers into the many-dozens-per-microsecond range, then ... something is misconceived in the API.
>
> My apologies, I cannot make any specific, explicit recommendations beyond the above.
>
> -- Linas.
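A minimal sketch of the kind of measurement tool Linas asks for, assuming an R7RS system that also provides SRFI 18; the names bench and lookup-loop, the thread count, and the iteration count are just illustration, not anything specified by SRFI 226:

  ;; Sketch of a parameter-lookup micro-benchmark (assumes R7RS + SRFI 18).
  (import (scheme base) (scheme time) (scheme write) (srfi 18))

  (define p (make-parameter 0))

  (define (lookup-loop n)                  ; perform n parameter lookups
    (let loop ((i 0) (acc 0))
      (if (= i n) acc (loop (+ i 1) (+ acc (p))))))

  (define (bench n-threads n-lookups)
    (let* ((start (current-jiffy))
           (threads
            (let spawn ((k 0))
              (if (= k n-threads)
                  '()
                  (cons (thread-start!
                         (make-thread
                          (lambda ()
                            (parameterize ((p k))
                              (lookup-loop n-lookups)))))
                        (spawn (+ k 1)))))))
      (for-each thread-join! threads)
      (let ((seconds (/ (- (current-jiffy) start) (jiffies-per-second))))
        (display "lookups per microsecond: ")
        (display (/ (* n-threads n-lookups) (* seconds 1e6)))
        (newline))))

  (bench 12 1000000)                       ; e.g. 12 threads, a million lookups each

Whether the reported figure lands in the many-dozens-per-microsecond range Linas demands will of course depend on how the implementation stores the dynamic environment.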
> On Fri, Mar 18, 2022 at 4:51 PM Marc Nieper-Wißkirchen <xxxxxx@gmail.com> wrote:
>>
>> On Fri, 18 Mar 2022 at 19:22, Linas Vepstas <xxxxxx@gmail.com> wrote:
>> >
>> > Nala,
>> >
>> > On Fri, Mar 18, 2022 at 12:52 PM Nala Ginrut <xxxxxx@gmail.com> wrote:
>> >>
>> >> Please also consider Scheme for embedded systems. Pthreads are widely used in RTOSes and could be the low-level implementation of the API,
>> >
>> > After a quick skim, srfi-18 appears to be almost exactly a one-to-one mapping into the pthreads API.
>>
>> It was also influenced in parts by Java and the Windows API, I think, but Marc Feeley will definitely know better.
>>
>> > The raw thread-specific-set! and thread-specific in srfi-18 might be "foundational", but they are unusable without additional work: at a minimum, the specific entries need to be alists or hash tables or something -- and, as mentioned earlier, the Guile concept of "fluids" seems (to me) to be the superior abstraction, the abstraction that is needed.
>>
>> Please add your thoughts about it to the SRFI 226 mailing list. As far as Guile fluids are concerned, from a quick look at them, aren't they mostly what parameter objects are in R7RS?
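To make that "additional work" concrete, here is one possible sketch of thread-local cells layered on the SRFI 18 specific field, keeping an alist of (key . value) pairs per thread. The procedure names are invented for illustration, and the scheme only works if every library in the image agrees on this convention for the specific field -- which is exactly the interoperability gap noted at the top of this message:

  ;; Illustrative sketch only: per-library thread-local cells built on
  ;; SRFI 18's thread-specific field.  make-thread-local, thread-local-ref
  ;; and thread-local-set! are hypothetical names, not part of any SRFI.
  (import (scheme base) (srfi 18))

  (define (make-thread-local default)
    (list default))                   ; fresh object used as an eq?-unique key

  (define (specific-alist t)          ; the per-thread alist, or '() if unset
    (let ((s (thread-specific t)))
      (if (pair? s) s '())))

  (define (thread-local-ref key)
    (let ((entry (assq key (specific-alist (current-thread)))))
      (if entry (cdr entry) (car key))))        ; fall back to the default

  (define (thread-local-set! key value)
    (let ((t (current-thread)))
      (thread-specific-set! t (cons (cons key value) (specific-alist t)))))

  ;; Two independent "libraries" can coexist only because both happen to
  ;; store their state in this shared alist; a library that uses the
  ;; specific field for something else would clobber it.
  (define lib-a-state (make-thread-local 'a-default))
  (define lib-b-state (make-thread-local 0))
  (thread-local-set! lib-a-state 'ready)
  (thread-local-ref lib-a-state)      ; => ready
  (thread-local-ref lib-b-state)      ; => 0

Note that every lookup here walks an alist, which is precisely the per-access cost Linas argues a real TLS or parameter mechanism must avoid.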
>> >> and green threads are not a good choice for compact embedded systems.
>> >
>> > My memory of green threads is very dim; as I recall, they were ugly hacks to work around the lack of thread support in the base OS, but otherwise offered zero advantages and zillions of disadvantages. I can't imagine why anyone would want green threads in this day and age, but perhaps I am criminally ignorant on the topic.
>>
>> For SRFI 18/226, it doesn't matter whether green threads or native threads underlie the implementation. There can even be a mixture, such as mapping N Scheme threads onto one OS thread.
>>
>> >>> Actual threading performance depends strongly on proprietary (undocumented) parts of the CPU implementation. For example, locks are commonly implemented on cache lines, in L1, L2 or L3. Older AMD CPUs seem to have only one lock for every 6 CPUs (I think that means the lock hardware is in the L3 cache? I dunno), so it is very easy to stall on locked cache-line contention. The very newest AMD CPUs seem to have one lock per CPU (so I guess they moved the lock hardware to the L1 cache?) and so are more easily parallelized under heavy lock workloads. Old PowerPCs had one lock per L1 cache, if I recall correctly. So servers work better than consumer hardware.
>> >>>
>> >>> To be clear: mutexes per se are not the problem; atomic ops are. For example, in C++ the reference counts on shared pointers use atomic ops, so if your C++ code uses lots of shared pointers, you will be pounding the heck out of the CPU lock hardware, and all of the CPUs will be snooping on the bus, checkpointing, invalidating and rolling back like crazy, hitting a very hard brick wall on some CPU designs. I have no clue how much srfi-18 or fibers depend on atomic ops, but these are real issues that hurt real-world parallelizability. Avoid splurging on atomic ops.
>>
>> If you use std::shared_ptr, many uses of std::move are your friend, or you are probably doing it wrong. :)
>>
>> >>> As to hash tables: lock-free hash tables are problematic. Facebook has the open-source "folly" C/C++ implementation of lockless hash tables. Intel has one too, but the documentation for the Intel code is... well, I could not figure out what Intel was doing. There's some cutting-edge research coming out of Israel on this, but I did not see any usable open-source implementations.
>> >>>
>> >>> In my application, the lock-less hash tables offered only minor gains; my bottleneck was in the atomics/shared-pointers. YMMV.
>>
>> Unfortunately, we don't have all the freedom of C/C++. While it is expected that a C program will crash when a hash table is modified concurrently, most Scheme systems are expected to handle errors gracefully and not crash. We may need a few good ideas to minimize the amount of locking needed.
>
> --
> Patrick: Are they laughing at us?
> Sponge Bob: No, Patrick, they are laughing next to us.
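On Marc's closing point about minimizing locking: the baseline that any cleverer scheme has to beat is simply guarding every table operation with a single SRFI 18 mutex -- safe against concurrent modification, but fully serialized. A rough sketch, assuming SRFI 69 hash tables and using invented names (make-locked-table and friends):

  ;; Baseline sketch only: one SRFI 18 mutex guarding an SRFI 69 hash
  ;; table.  Concurrent mutation cannot corrupt the table, but every
  ;; access is serialized.  A robust version would also protect the
  ;; unlock against non-local exits (e.g. with dynamic-wind).
  (import (scheme base) (srfi 18) (srfi 69))

  (define (make-locked-table)
    (cons (make-mutex) (make-hash-table)))

  (define (with-table-lock table thunk)
    (mutex-lock! (car table))
    (let ((result (thunk)))
      (mutex-unlock! (car table))
      result))

  (define (locked-table-ref table key default)
    (with-table-lock table
      (lambda () (hash-table-ref/default (cdr table) key default))))

  (define (locked-table-set! table key value)
    (with-table-lock table
      (lambda () (hash-table-set! (cdr table) key value))))

Anything cleverer -- striped locks, per-thread caching, lock-free structures -- would have to be measured against this, which loops back to Linas's demand for a benchmark suite.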