Re: Discussion with the creator of Lojban, and editor of R7RS-large Amirouche 02 May 2021 11:26 UTC

On 2021-05-02 03:29, John Cowan wrote:
> On Sat, May 1, 2021 at 7:35 PM Amirouche <xxxxxx@hyper.dev> wrote:
>
> I'll answer a few points publicly that I think deserve it.
>
>> What is the difference between ontological engineer and ontologist?
>
> Basically the difference between a computer programmer and a computer
> scientist.  The latter has a PhD and is most interested in theory; the
> former probably doesn't have as deep a theoretical understanding but
> has a good grasp of practical work.  Most top-level ontologies have
> classes like Quality, Endurant, Perdurant, Achievement,
> Accomplishment, etc.  The Gist top-level, which is meant for
> ontological engineers, has things like BuildingAddress,
> TelephoneNumber, EmailAddress, GovernmentOrganization, Language, etc.
>
> Discriminator question:  "What's a meronym?"  (I actually know the
> answer to this, however!)
>

I looked up the definition of meronym. It is unclear to me whether
an ontologist with a PhD would really not know what a meronym is.
Maybe I am missing the point of what you are explaining.

>> I need something like that, except since my target is JavaScript,
>> so far string will be immutable, also I will only document
>> primitives
>> (things that can not be computed otherwise), and other stuff.
>
> Will you be compiling to JS or writing a sub-Scheme interpreter in JS?
>

The compiler is written in Scheme and relies on Chez Scheme. It produces
a subset of Scheme that is transpiled to a subset of JavaScript. The
produced JavaScript is tested with Node.js. The output is JavaScript
code. There is no Scheme procedure or JavaScript function `eval`
involved at all in the compiler.

Given BiwaScheme does not support tail calls (still?) it was a no-go.

Regarding Gambit, the JS backend is very good, like the rest of Gambit.

There are small usability problems, and my lack of knowledge of the
Gambit C FFI is a giant tree across the road. For my search engine, and
probably many other applications, full-stack Gambit is the best solution
(see below the discussion about parallelism and concurrency). Though,
given my time constraints and all the work I have done before with Chez,
it is easier for *me* to investigate a subset-of-Scheme-to-JavaScript
compiler and use my Chez Scheme knowledge and libraries. The big problem
is FoundationDB drivers. In other words, Gambit has everything better
(network support, TLS, cooperative threads, parallel threads).
R7RS-large support is tedious but possible. Also, Gambit macro support
outside define-macro is unclear to me.

Now that I have written the above, I realize that I can work around my
lack of knowledge of the Gambit C FFI, and hence the lack of
FoundationDB bindings, by relying on the JS backend for both the
frontend and the backend!

>> Ok, I think I know, but maybe I do not know the difference.
>> Interoperability
>> is re-using libraries, not necessarily running the same program on
>> another Scheme?
>
> Here's the relevant part of Clinger's paper:
>
> Portability and interoperability are two different things, and often
> come into conflict with each other. When programming languages are
> specified, portability can be increased (and interoperability
> diminished) by inventing some specific semantics for all corner cases
> while forbidding extensions to standard syntax and procedures.
> Interoperability can be increased (and portability diminished) by
> leaving unimportant corner cases unspecified while allowing
> implementations to generalize syntax and semantics in ways that
> simplify interoperation with other standards and components.
>
> Consider, for example, the port-position procedure of R6RS Scheme. Its
> first draft specification, based on a Posix feature that assumes every
> character is represented by a single byte, would have precluded
> efficient interoperation with UTF-8, UTF-16, and Windows. Subsequent
> drafts weakened that specification, increasing interoperability at the
> expense of allowing port positions to behave somewhat differently on
> different machines and in different implementations of the standard.
>
> The R6RS tended to resolve those conflicts in favor of portability,
> however, while the R7RS tended to favor interoperability [17, 21, 23].
> It is therefore fairly easy for an R7RS-conforming implementation to
> preserve backward compatibility with R6RS and SRFI libraries and to
> provide convenient access to libraries and components written in other
> languages, but it is considerably more difficult for R6RS-conforming
> implementations to avoid creating barriers to interoperation with R7RS
> and SRFI libraries. That asymmetry between the R7RS and R6RS standards
> often goes unremarked and its importance under-appreciated when
> incompatibilities between those two standards are discussed.
>

Thanks for the explanation and the quoted reference.  I understand
the words, and the idea, but without proper guidance I will not be
able to tell myself "Here we favor interoperability".

Portability would be "bug-for-bug compliance", whereas
"interoperability" gives room to fix, improve, or extend the standard.

In an industrial / commercial setting, if *portability* is a strong
requirement, it should be settled from the start. But I do not see
a situation where given the same code base I would like e.g. to run
Gambit in production, and Chez in my dev setup.

>>>> I do not know whether threads are necessary in the standard.
>
> More and more people want intra-process concurrency.

I am not sure they really want intra-process concurrency. What people
want, me included, is better use of the available hardware, hence better
performance, hence reduced hardware cost. Case in point, I benchmarked
an asyncio prototype (gunicorn + uvicorn + asyncpg) vs. the legacy app
(gunicorn + Django stack): the throughput and latency of the asyncio
prototype are astonishing, with more than 10 times the requests per
second at half the latency. And the Django app failed to respond to all
requests when I increased the workload (even after tweaking the number
of POSIX processes or threads).
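
To make the mechanism behind that benchmark concrete, here is a minimal
sketch, not the prototype itself. It assumes only the Python standard
library; `fake_request`, `serve` and `demo` are hypothetical names, and
`asyncio.sleep` stands in for waiting on a socket or a database. It
shows a single thread keeping many io-bound requests in flight at once:

```python
import asyncio


async def fake_request(stats, delay):
    # Hypothetical handler: the sleep stands in for network or database io.
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(delay)  # the event loop runs other requests meanwhile
    stats["in_flight"] -= 1
    return delay


async def serve(n, delay=0.01):
    # Start n requests on one thread and wait for all of them.
    stats = {"in_flight": 0, "peak": 0}
    results = await asyncio.gather(
        *(fake_request(stats, delay) for _ in range(n))
    )
    return stats["peak"], len(results)


def demo(n=50):
    # Returns (peak number of requests in flight, number of responses).
    return asyncio.run(serve(n))
```

Calling `demo(50)` reports that all 50 requests were waiting at the same
time on a single thread, which is where the better hardware usage comes
from when the work is mostly waiting on io.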

> I myself prefer inter-process concurrency, but that's not the way
> it seems to be going.

I will expand the conversation a little into a bigger picture that
is related to building SaaS products that want to be scalable,
both in terms of number of engineers and backend infrastructure.

The primary constraints I set myself in my thinking, which I try
to put into practice with the Babelia search engine, are:

- easy to develop: it should be painless to reproduce the production
   setup locally, possibly with well-tested "polyfills" to bridge
   the gap with production. I have an idea how to pull only the missing
   data (like rsync or git pull) from production, with something like
   Datomic.

- easy to deploy: The idea here is to take the deployment strategy
   into account before deployment, so that it is not an afterthought.
   In particular, I think about database migrations, and the necessity
   to coordinate service or microservice deployments. Babelia stores
   the original HTML, so if the filtering / scoring mechanics change,
   there is no need to re-index everything. Now that I think about it
   again, I can even support CJK.

- easy to scale: my idealistic goal can be summarized as an input
   box where you set your money budget, then the allocation of nodes
   is done for you by the computer across the infrastructure.

So the big picture, according to me, is that there is volatile memory
(RAM), persistent memory such as SSD, and compute power.

Google published several papers describing that idea: the Borg paper
focuses on compute, Spanner focuses on durable storage, and there is
a third one about a giant memory array.

The summary is a question:

   How to reproduce a von Neumann architecture with several machines?

And:

   What is the OS of a distributed architecture?

That is: a giant memory array, a giant disk, and a giant pool of
computation power, where the bus is the network, which should be
considered unreliable.

At this stage, I think FoundationDB solved the giant RAM problem
(modulo [0]) and the giant disk problem, in a way that is easy to use
in a dev setup and possible at web scale.
[0]
https://forums.foundationdb.org/t/multiple-clusters-for-the-same-application-memory-backend-and-rpc-layer/2685

What remains is the computation part, which brings us back to the
conversation about inter-process and intra-process concurrency. Since
disk io is factored into FDB (possibly with an S3-compatible layer),
what remains is the network io that MAY happen as part of the
computation. So far I think there are the following cases:

1) Do a lot of network io, cpu is mostly waiting => intra-process
   concurrency is necessary to make the best use of the cpu, hence
   synchronization may be relevant if multiple green threads share
   state (which is rare and factored into dedicated libraries, e.g.
   the database connection pool). This is the most common case I have
   seen in SaaS products.

2) There is the case where computation is fire-and-forget and io
   bound, such as sending an email. My point is that it does not
   necessarily require a full cpu to handle those cases (in other
   words, in that case the Python backend scheduler called Celery is
   overkill. In a microservice approach, a service might be dedicated
   to running all those io-bound tasks in the background).

3) There is the case where computation is fire-and-check-much-later,
   such as recomputing a machine learning model. It requires a lot of
   CPU, possibly several cores. In that case, inter-process
   communication is not necessary, but parallelism is. There is a big
   problem with the check-much-later part, because the computation
   might be part of a bigger workflow (that is the name used in
   Celery), and companies are built to guarantee that the full workflow
   is properly executed. I think the Borg paper mentions that; AFAIK it
   is not part of Kubernetes. Also, solving the problem of a
   distributed priority queue with dynamic scheduling is a requirement
   to solve that problem in the general case. AFAIK it is an open
   problem, and it is NOT an easy grab with FDB: since the priority may
   need to be recomputed several times per unit of time, it creates
   so-called hot keys and blocks the FDB nodes responsible for those
   keys, because they cannot serve other keys. Related problems include
   at-least-once / at-most-once delivery or execution... The problem
   here is to have the equivalent of a kernel scheduler. There are
   narrow solutions available such as
   https://github.com/spotify/luigi
   It is different from regular inter-process communication: a single
   thread cannot manage the workflow, because if it dies, the workflow
   dies with it (without non-volatile persistence).

4) Another case I know of is computation as part of the
   request-response cycle that must be done in parallel and requires
   inter-process communication.

There is a 5) where latency is of paramount importance, but I never met
that case in production. In that case, the request-response cycle must
be handled by a single CPU core.
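
The remark in case 3, that a workflow dies with its thread unless its
progress is persisted, can be sketched as follows. This is only an
illustration under stated assumptions: `run_workflow`, `demo`, the step
names and the JSON checkpoint file are all hypothetical, and a local
file stands in for whatever durable store (FDB or otherwise) a real
system would use.

```python
import json
import os
import tempfile


def run_workflow(steps, checkpoint_path, fail_at=None):
    # Run named steps in order, recording each completed step on disk.
    done = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)
    executed = []
    for name, func in steps:
        if name in done:
            continue  # already completed by a previous worker
        if name == fail_at:
            # Simulates the worker dying before the step runs.
            raise RuntimeError(f"worker died during {name}")
        func()
        executed.append(name)
        done.append(name)
        with open(checkpoint_path, "w") as f:
            json.dump(done, f)  # persist progress after every step
    return executed


def demo():
    log = []
    names = ("extract", "train", "publish")  # hypothetical step names
    steps = [(n, (lambda n=n: log.append(n))) for n in names]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "workflow.json")
        try:
            run_workflow(steps, path, fail_at="train")  # first worker dies
        except RuntimeError:
            pass
        resumed = run_workflow(steps, path)  # a second worker resumes
    return log, resumed
```

In this sketch the second worker skips `extract` and runs only `train`
and `publish`. Because the checkpoint is written after the step runs, a
crash between the two would re-run that step, so this gives
at-least-once execution, not at-most-once.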

Long story short: AFAICT CML and the actor model are useful for both
inter-process and intra-process communication.

CML and the actor model both help to reduce (maybe even eliminate?) the
need to deal with synchronization primitives directly (mutexes,
condition variables), hence reducing the possibility of race conditions
/ deadlocks, while allowing one to take full advantage of the CPU or
CPUs, possibly spread over multiple servers.
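
As an illustration of the actor half of that claim, here is a minimal
sketch using Python's asyncio. It is an assumption-laden toy and not
CML or a real actor library: `counter_actor`, `client`, `main` and the
message names are hypothetical. The state is owned by a single
coroutine and everything else talks to it through a mailbox, so no
mutex appears anywhere:

```python
import asyncio


async def counter_actor(mailbox):
    count = 0  # private state: only this coroutine ever touches it
    while True:
        msg, reply = await mailbox.get()
        if msg == "incr":
            count += 1
        elif msg == "get":
            reply.set_result(count)  # answer through the reply future
        elif msg == "stop":
            return


async def client(mailbox, n):
    # Clients never see the counter, they only send messages.
    for _ in range(n):
        await mailbox.put(("incr", None))


async def main(clients=10, per_client=100):
    mailbox = asyncio.Queue()
    actor = asyncio.create_task(counter_actor(mailbox))
    await asyncio.gather(*(client(mailbox, per_client) for _ in range(clients)))
    reply = asyncio.get_running_loop().create_future()
    await mailbox.put(("get", reply))
    total = await reply
    await mailbox.put(("stop", None))
    await actor
    return total


def demo():
    return asyncio.run(main())
```

Ten clients each send 100 increments and the actor reports 1000 without
any lock. The same message-passing shape could in principle span
processes or machines by replacing the in-memory queue with a network
transport, which is the inter-process side of the argument.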

The original question is:

>>>> I do not know whether threads are necessary in the standard.
>
> More and more people want intra-process concurrency.

What I know is that CML and the actor model are much easier to work
with (compared to mutexes and condition variables, or promises /
futures, and callbacks) and help with performance both in the backend
and the frontend, for intra-process and inter-process communication.

What I do not know is whether you can do everything with the actor model
that you can do with CML, and what the difference is (beyond the
surface API).

I am not familiar with gochan. AFAIU, it will move the computation to a
parallel thread if it is deemed necessary by the language runtime. In
the cases I have in mind, I know in advance whether I want a dedicated
CPU or not.