Towards XL3

Sunday, February 10, 2019

Today, I pushed a large-ish new commit to the XL native branch. It collects some ideas I had after FOSDEM and after reading some Rust-related books.

I think that I am slowly inching towards yet another iteration of XL, different enough from the earlier ones that it might need a new name.

A bit of history

For the record, in my mind, major iterations of XL included:

  1. 1995-1999: An initial compiler generating 68K assembly code, written in C++, where the language was not yet fully introspective.
  2. 2000-2002: The Mozart / Moka parenthesis, where I tried to build an abstract syntax tree (AST) that could support multiple languages. That iteration had a fully public AST and techniques to work on it, and introduced the notion of compiler plug-in.
  3. 2003-2007: The XL simple universal AST known as XL0 emerged. Only 8 node types, ability to represent practically any data type. The XL2 language, imperative with generics, with static compilation, was built following a clear bootstrap process, i.e. the compiler compiled itself. Compiler plug-ins were much easier to write, thanks notably to "aspect-like" translation statements (building a translation table from multiple source files) and explicit support for tree rewrites in the compiler. I believe this is also the time when I settled on the off-side rule (i.e. indentation being significant).
  4. 2007-2010: Largely influenced by Pure, XLR was an attempt at using the same XL0 for a purely functional, dynamically compiled language. I started using LLVM to JIT code.
  5. 2010-2015: Tao3D built on top of the XLR foundation, and demonstrated just how extensible the language was. Practice also led me to introduce a few syntactic rules I was a bit reluctant about, e.g. sensitivity to spaces to disambiguate A-B and write -N (the first one being an infix -, the second one being a prefix write applied to a prefix -). I'm not satisfied with many things in Tao3D, e.g. the module system, some frankly bizarre lookup rules, or architectural problems supporting multi-threaded execution (Tao3D does have multiple threads, but getting multiple threads to execute XL code is an entirely different topic).
  6. 2015: The ELIOT variant of the language (later renamed ELFE at the request of Legrand) explores the Internet of Things and distributed computing, showing quite a lot of promise. I'm doing extremely fun experiments having several Raspberry Pis execute parts of a program that is running on a MacBook, quite transparently.
  7. 2016: A very small ELFE interpreter (about 20kLOC) ends up being the most faithful implementation of XLR ever, and notably the first one to really implement the type system. That solidifies some ideas on how to represent XL objects in memory, that I will never really have time to turn into reality.
  8. 2016-2018: Not much happening, but Tao3D, XLR and ELFE bit rot because LLVM keeps pushing incompatible changes and I'm unable to really follow. Rust, Go and Haskell take more and more of my mental space. Various attempts at implementing a Haskell-like type inference in XLR prove very complicated, in large part because a value routinely belongs to multiple types in XL. I'm also concerned about some design choices precluding ahead-of-time compilation.

FOSDEM Minimalist Languages track

At FOSDEM, I have already seen the first talks in the virtualization track at DevConf.cz, so I settle for the minimalist languages track instead. Also, I buy programming books (Rust programming and Go concurrency), something I had not done in a long time. And I attend a talk about bootstrapping, and talks about microkernels, that also give me quite a few ideas.

The combination of all these events made me realize two things:

  1. XL still has a lot to say. It's elegantly solving dozens of problems I heard about in the minimalist languages track.
  2. Much like Rust and Go, XL was initially designed as a better system language. Right now, the XLR implementation is far off-track with respect to that goal, because it requires all of LLVM and, by design, is hard to compile ahead of time.

What about restarting?

Time to go back to a blank slate, and start writing the compiler and support libraries in a new variant of the language. See where this goes. Basically, the idea is to write XL code before I write the compiler, so that I know exactly what semantics I have in mind and how I think it should compile.

I find it exciting to explore a whole set of new areas. There are a number of changes that have been crystallizing in my mind for a very long time, and I think it's about time I start thinking about the language itself, rather than a specific implementation.

Here are a few random ideas and tidbits collected so far:

  1. I am definitely switching back towards the is syntax, in that sense going back to how things were in XL2. So now it's X is 0 rather than X -> 0.
  2. The module system has to go back to what I had in XL2, with clearly separate interface and implementation source files, and proper dependency tracking.
  3. I'm getting rid of builtins.xl or any startup file. The top-level XL module plays that role now. This makes the whole organization much more modular, which is good.
  4. The type system is under intense scrutiny. I believe it will still be based mostly on "shapes" or "pattern matching" much like in XLR / ELFE. I want some level of type inference, but the jury is still out on how much can be done practically.
  5. The Rust idea of "fat pointer" notably for traits and slices is, interestingly, very close to what I initially had in mind for XL2. So I think XL and Rust will converge on that front.
  6. The Rust ideas of mutability and ownership are good and efficient, but I don't want them hardcoded. So an interesting problem is: "how can I express the same thing in XL, in such a way that it's defined by the library and not the language".
  7. Interestingly, that brought back some old ideas from issues that arose with the definition of for loops in XL2, where I wanted for X in 1..4 loop... to define a variable named X, not simply reference it.
  8. A recent insight for me was that as far as the CPU goes, everything is "passed by value". The rest is built on top of that. A reference, a pointer, etc, are all constructed types.
  9. This in turn made me re-think some aspects that XL2 inherited from Ada, notably the in and out parameter modes. In XL2, I would have written out X : integer. Now I'm thinking that it's possible to construct an out integer type with the right effect, so that you would instead write X : out integer. This has relatively major implications.
  10. An aspect of Rust I like is that there can be as many impl blocks as you want. I started wondering about "distributed" interfaces and implementations.
  11. Partially as a result, I'm toying with changes in the syntax for interfaces. For example, I used to have module STUFF with ... to define an interface. I'm now shifting towards simply STUFF with..., or STUFF has .... Not sure yet about that.
  12. For types, this syntax does not work too well. I'm still looking for the proper syntax for what was called "data inheritance" in XL2. The non-orthogonal syntax type int is integer is one possibility. Or with type inference, int is integer is enough to know that int is a type. But is it readable? What about int : type? It's orthogonal, but does it look good?
  13. By experimenting, I came up with a few interesting generic type constructors. I used to have integer or real as a type constructor, but I found uses for not integer as well as for copy and number. I think that I want handle_type is new integer to create types that require an explicit conversion.
  14. I'd like to be able to support smart builds (like crates, Cargo,...) using only what is in the modules. This means I should be able to describe modules in depth, including dependencies. That's when I realized that XL would not support semantic versioning numbers, i.e. it's not really easy to support 1.0.0 as being part of a valid XL program today. So I'm thinking about how to fix that.
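
To make the "fat pointer" remark in item 5 concrete, here is a small Rust sketch, not XL code. It measures how many machine words a plain reference, a slice reference and a trait-object reference occupy. The Shape trait, the Square type and the pointer_widths function are hypothetical names made up for this illustration.

```rust
use std::mem::size_of;

// Hypothetical trait and type, used only for illustration.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

/// Sizes, in machine words, of a thin reference, a slice reference
/// and a trait-object reference.
fn pointer_widths() -> (usize, usize, usize) {
    let word = size_of::<usize>();
    (
        size_of::<&Square>() / word,    // address only
        size_of::<&[i32]>() / word,     // address + length
        size_of::<&dyn Shape>() / word, // address + vtable pointer
    )
}

fn main() {
    let (thin, slice, object) = pointer_widths();
    println!("thin: {thin}, slice: {slice}, trait object: {object}");

    // Dynamic dispatch goes through the second word of the fat pointer.
    let sq = Square(3.0);
    let shape: &dyn Shape = &sq;
    println!("area: {}", shape.area());
}
```

The slice and trait-object references are twice the size of the plain one: the extra word carries the length or the dispatch table, which is the kind of representation item 5 alludes to.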

I think the major insight is that at the hardware level, everything is passed by value.
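
As a rough illustration of that insight and of the X : out integer idea in item 9, here is a Rust sketch of an "out" parameter built as a library type rather than a language feature. This is only an analogy under my own assumptions: Out, set and parse_digit are hypothetical names, and nothing here is XL syntax or the planned XL design.

```rust
/// Hypothetical library-defined "out" parameter type: a write-only
/// view of a caller-owned location. The wrapper itself is an ordinary
/// value (one machine address) that is passed by value.
pub struct Out<'a, T> {
    slot: &'a mut T,
}

impl<'a, T> Out<'a, T> {
    pub fn new(slot: &'a mut T) -> Self {
        Out { slot }
    }

    /// The only operation offered: store a result. There is no getter,
    /// so the callee cannot read the previous contents.
    pub fn set(self, value: T) {
        *self.slot = value;
    }
}

/// Roughly the callee side of something like `X : out integer`.
fn parse_digit(c: char, result: Out<'_, i64>) -> bool {
    match c.to_digit(10) {
        Some(d) => {
            result.set(i64::from(d));
            true
        }
        None => false, // nothing written, caller's value is untouched
    }
}

fn main() {
    let mut x = 0;
    let ok = parse_digit('7', Out::new(&mut x));
    println!("ok = {ok}, x = {x}");
}
```

The function receives a small value that happens to contain an address; the "out" behaviour comes entirely from which operations the type chooses to expose.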