Type strategy for Scheme Lassi Kortela (09 Nov 2022 22:54 UTC)
Re: Type strategy for Scheme Marc Nieper-Wißkirchen (10 Nov 2022 07:07 UTC)
Re: Type strategy for Scheme Marc Nieper-Wißkirchen (10 Nov 2022 07:42 UTC)
Re: Type strategy for Scheme Peter Bex (10 Nov 2022 08:05 UTC)
Re: Type strategy for Scheme Philip McGrath (10 Nov 2022 08:54 UTC)
Re: Type strategy for Scheme Lassi Kortela (10 Nov 2022 09:49 UTC)
Re: Type strategy for Scheme Lassi Kortela (10 Nov 2022 09:08 UTC)
Re: Type strategy for Scheme Marc Feeley (10 Nov 2022 23:34 UTC)
Re: Type strategy for Scheme Lassi Kortela (11 Nov 2022 19:17 UTC)
Re: Type strategy for Scheme Marc Feeley (11 Nov 2022 21:34 UTC)
Re: Type strategy for Scheme Lassi Kortela (12 Nov 2022 21:58 UTC)
Re: Type strategy for Scheme Marc Nieper-Wißkirchen (12 Nov 2022 22:52 UTC)
Re: Type strategy for Scheme Lassi Kortela (12 Nov 2022 23:14 UTC)
Re: Type strategy for Scheme Marc Nieper-Wißkirchen (13 Nov 2022 09:41 UTC)
Re: Type strategy for Scheme Lassi Kortela (13 Nov 2022 11:59 UTC)
Re: Type strategy for Scheme John Cowan (13 Nov 2022 19:52 UTC)
Re: Type strategy for Scheme Lassi Kortela (13 Nov 2022 20:22 UTC)
Re: Type strategy for Scheme Marc Nieper-Wißkirchen (13 Nov 2022 20:35 UTC)
Re: Type strategy for Scheme Lassi Kortela (13 Nov 2022 21:41 UTC)
Re: Type strategy for Scheme Marc Feeley (14 Nov 2022 00:03 UTC)
Re: Type strategy for Scheme Lassi Kortela (14 Nov 2022 09:36 UTC)
Re: Type strategy for Scheme Marc Feeley (14 Nov 2022 16:27 UTC)
Re: Type strategy for Scheme Lassi Kortela (14 Nov 2022 19:58 UTC)
Re: Type strategy for Scheme John Cowan (13 Nov 2022 20:40 UTC)
Re: Type strategy for Scheme Marc Nieper-Wißkirchen (13 Nov 2022 20:42 UTC)
Re: Type strategy for Scheme Panicz Maciej Godek (20 Nov 2022 21:59 UTC)
Re: Type strategy for Scheme Per Bothner (20 Nov 2022 22:58 UTC)

Re: Type strategy for Scheme Marc Feeley 14 Nov 2022 16:27 UTC

> On Nov 14, 2022, at 4:36 AM, Lassi Kortela <xxxxxx@lassi.io> wrote:
>
>> The thing is that “static typing” is not a “feature” that is orthogonal to other features.  When static typing is added to an otherwise dynamically typed language it “cripples” the language… some things can’t be expressed anymore because they violate the rules of the typing discipline (mind you it cripples the expressiveness but does give some nice properties in return, so it is not black and white).  For example, the Y combinator works fine in Scheme, but many static typing disciplines will not accept it.  So adding static typing puts constraints on the other features, making some of them unusable.
>
> Yes. But it helps to make a distinction between features that are incompatible at the source code level, and those with incompatible execution models.
>
> The encouraging thing is that it's hard to think of examples of the latter. Probably some flavors of concurrency cannot be mixed with tractable semantics. Synchronous and asynchronous code are probably hard to interface due to a different model of time, but synchronous programming is rare outside embedded systems.
>
>> My understanding of your position is that you work your way around this issue by creating different “worlds” for example parts of the program that are statically typed and others that are dynamically typed.
>
> Exactly!
>
> The words "language" and "program" occlude thinking in CS. Lisp and Smalltalk show that you can just have a big world. Code enters and exits this world as time goes on. You can divide it mentally into different "programs" as a convention, but you don't have to, and often shouldn't.
>
> ST and Lisp systems have one world with the same language semantics across all of it. That's a clean design, but doesn't scale into large systems where human nature and organizational inertia dictate that different languages have to coexist. Conventional operating systems solve the problem by putting each language in its own walled garden ("process"). That achieves clean coexistence of languages but throws away the benefits of Lisp/ST where parts of the world can access all other parts at a fine granularity, using a common conduit. (In Lisp, classically the procedure call - in ST, the message send.)
>
> We often implement one language on top of another. But there tends to be a mismatch since the host language was designed as an opinionated language that's good for writing applications and not as an un-opinionated toolkit of language constructs. Lisp is a much better foundation for language implementation than most languages since it's dynamic and doesn't stray far from the lambda calculus (this is much of the point of Racket). But we can do better. The recent work on types has demonstrated that we can think of the fabric of a language as a structure with mathematical properties, and strive to re-define that structure in ways that give nicer properties for reasoning while supporting similar code as before. This mindset should be extended beyond the use of types in source code, and there is indeed a long tradition of formalizing runtime semantics. It's just not unified yet.
>
> Type theorists are currently making formal models for individual languages. But assume a system like the JVM or Racket that mixes a dozen languages. In this case, a unified mathematical structure is much more tractable for machine processing (not to mention, simpler to implement with less code) than a dozen separate structures with "ad hoc, informally specified, bug-ridden" connections between them. That's a quote from Greenspun's Tenth Rule. At the moment, we are all greenspunning ad hoc FFI's between languages, just as people were and are greenspunning ad hoc interpreters.
>
> Take a look at the current and past projects in our field which do multi-language virtual machines, or machine translation of code between languages. They demonstrate that these problems are intractable to solve cleanly without a unified model. The pile of special cases is too big to keep under intellectual control.
>
> The status quo is that each language is a semantic island. It would be a huge improvement to have e.g. five competing unified models of varying sophistication and comprehensiveness. From this competition, a standard model should eventually emerge, as it has in other scientific fields. There will always be some cutting-edge things that don't fit into the model, but the 80% case is solved handily.
>
> A standard model of programming languages is a giant effort that takes maybe 100 committed researchers or more. (IIRC the GraalVM or Racket team is about 50, give or take, with some people more committed than others.) But here, as with other problems, Lisp's expressiveness gives us an advantage, and we should do our part.
>
> So the Racket language toolkit approach is right. But they're saying: "Hey, you can come in and make your own new language!" Instead they should be saying, "We made your existing languages run better!" At the moment they can't say that, because they concentrate so much on individual languages as silos instead of knitting them out of the same fabric. An obvious starting point would be, how to make Racket a great implementation of Common Lisp or Standard ML. Those languages are close to Scheme and are wonderfully well specified. If they can do that, they can branch out to more exotic or more poorly specified languages.
>
> Once a few existing languages run well using a common framework, the obvious realization falls out of the design that you can make new languages by taking an existing one and tweaking a few things. E.g. take call/cc out of Scheme, or add it to Clojure. It becomes apparent that a language is just a gatekeeper to an ocean of computational features. Maybe your favorite language is a wise gatekeeper, maybe not. In any case, languages become an issue of aesthetics, not theory. Programmers who are oriented toward abstract math should make the theory solid, and programmers with a literary bent should make pretty conventions ("languages") out of the theorists' work.
>
> I suspect that today's "languages" will be seen as clumsy and inconvenient instruments in the future. This can already be seen in the way Lisp coders look at Java. Lisp has macros, so you can configure the syntax, and Lisp coders go "You can't even add a new kind of loop to Java!" There's no reason this malleability has to be restricted to syntax. It should concern the semantics too. In a big program, it pays to use different computational and typing features to express different parts. Ada already lets you restrict parts of a program to a tailor made subset of the language, and people say this is a great help. PLT has isolated teaching subsets out of Scheme, also to good effect. This is the future.
>
>> But then isn’t that just saying that you accept the existence of different languages (i.e. these worlds) and what you are looking for is an “as seamless as possible” FFI for these languages to talk to each other, like nodes of a distributed system communicating with each other?>
>
> Exactly. The FFI and distributed metaphors are good ones.
>
> Concretely, gambit.h has the following:
>
> /*
> * Subtype tag assignment.
> *
> * These tags are stored in the head of memory allocated objects
> * (including pairs).
> *
> * ___SB = number of subtype tag bits
> * ___sVECTOR       = tag for vectors
> * ___sPAIR         = tag for pairs
> * ___sRATNUM       = tag for ratnums
> * ___sCPXNUM       = tag for cpxnums
> * ___sSTRUCTURE    = tag for structures
> * ___sBOXVALUES    = tag for box and multiple-values objects
> * ___sMEROON       = tag for Meroon objects
> * ___sJAZZ         = tag for Jazz objects
> * ___sSYMBOL       = tag for symbols
> * ___sKEYWORD      = tag for keywords
> * ___sFRAME        = tag for continuation frames
> * ___sCONTINUATION = tag for continuations
> * ___sPROMISE      = tag for promises
> * ___sWEAK         = tag for weak objects (wills and GC hash tables)
> * ___sPROCEDURE    = tag for procedures
> * ___sRETURN       = tag for returns
> * ___sFOREIGN      = tag for foreign data
> * ___sSTRING       = tag for strings
> * ___sS8VECTOR     = tag for 8-bit signed integer vectors
> * ___sU8VECTOR     = tag for 8-bit unsigned integer vectors
> * ___sS16VECTOR    = tag for 16-bit signed integer vectors
> * ___sU16VECTOR    = tag for 16-bit unsigned integer vectors
> * ___sS32VECTOR    = tag for 32-bit signed integer vectors
> * ___sU32VECTOR    = tag for 32-bit unsigned integer vectors
> * ___sF32VECTOR    = tag for 32-bit floating point number vectors
> * ___sS64VECTOR    = tag for 64-bit signed integer vectors
> * ___sU64VECTOR    = tag for 64-bit unsigned integer vectors
> * ___sF64VECTOR    = tag for 64-bit floating point number vectors
> * ___sFLONUM       = tag for flonums
> * ___sBIGNUM       = tag for bignums
>
> So in addition to standard Scheme, you already have hooks in the runtime for Meroon and JazzScheme objects. It would be natural to add more.
>
> But what we should look for is FFI between computational features, not between entire languages.
>
>> I think it will be hard to be seamless and this will cause users to pick a world and do their programming there, which is what is being done now.
>
> Sure. Programmers will always opt to choose a pre-made world as a starting point. (Advanced programmers will perhaps modify it to suit. E.g. on a team of coders, the team lead could configure the language.)
>
> Making things seamless is indeed extremely hard. This is to be expected when any field makes a "standard model" out of its theories. Concretely, you'd e.g. out the many idiosyncratic behaviors of Perl or JavaScript and replace them with something coherent. Perhaps you'd make "adapters" so that the old quirks are emulated faithfully, and people with legacy codebases would opt to use this "quirks mode".
>
> But most code doesn't really depend on a specific language and its quirks. It expresses an idea which requires a few computational concepts, and can be ported to a cleaner version of the same concepts.
>
>> I can’t help but make the link with my work on different FFIs between Gambit Scheme and other languages, namely JavaScript[1] and Python[2].  In those FFIs we have chosen to embed the JavaScript and Python syntax in Scheme so that it is as seamless as possible to use those languages from Scheme (note that all three are dynamically typed which simplifies things).  I reject the idea that expressing everything with parentheses is “better” even though I am a great fan of s-expressions for all the reasons we know.  Sometimes a different syntax is useful to indicate which of these worlds is running the code so that our mind doesn’t get confused.  Of course, in our work, there is a mapping from infix expressions to s-expressions that simplifies processing the foreign expressions, possibly using them in pattern matching or macros, but the programmer is not working with this directly.  A quick example:
>
> The future of S-expressions is to serve as an on-disk storage format for code and data. The user interface should be fully themeable. A subset of programmers are good at aesthetics and enjoy making things like UI themes. They should be given a way to make window dressing for S-expressions so that we can have nicer ways to read and edit code without over-burdening the textual syntax. (Observe the effort to add new syntax to languages like C++, Perl, and Haskell. It's an obvious dead end because they insist on using text.)
>

Generally speaking I agree with you.  I even have a research project I’m trying to start on a related topic (to allow the sharing of a codebase by programmers with different language preferences… email me privately if you want to know more).

The point I am less convinced with is that adding static typing to Scheme is the best next step towards this goal.

Also I see a problem with adoption by the community of a “universal language” that can express (most of) what can be expressed with existing languages. I expect the performance to suffer (because implementing a single specific language can take advantage of the missing features to enable various optimizations). So for the first 10-20 years it takes to develop and implement an efficient compiler for the universal language, few people will see a benefit in using it. “Worse is better”…

Your vision may be “the right thing”, but you may be 100 years ahead of the times. [I’m not saying the direction is bad, just that the road may be very lonely]

Marc