Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Purely to Mention Elysia

I built a programming language just to tell people it’s for Elysia.

Technical Decisions

Programming languages are one of the most studied fields in computer science. The field is broad, ranging from low-level assembly to modern languages like Python. To make this language simple and easy to use while remaining dependency-free, the Felys programming language is:

  • Interpreted: It would be too much work to emit code to multiple ISAs without using LLVM or existing software. Building a simple virtual machine that accepts custom bytecode is a pragmatic way.
  • Imperative: The language is a product, so it needs to be easy to use. In contrast, functional languages are more for academic research and most people are not familiar with them, though this is probably something many CS people learned in their first year. That said, Felys still embraces the functional paradigm for simplicity and clarity.
  • Dynamic Typing: The type system is complicated, so I kept it dynamic. This allows me to implement a neural network library framework much more easily later.
  • Strong Typing: Weak typing sometimes provides convenience, but it becomes evil when programmers are not careful or the codebase is large. I personally don’t like it.
  • Immutable: This is a very functional thing, but the real reason I’m introducing it is to simplify memory management. Writing a garbage collector requires unsafe Rust, and I want to avoid it. Immutability means that simple reference counting can provide acceptable performance.

Some of the problems can be easily solved by switching to another language like Go. Historically, the very first version was written in C and migrated to Rust. I stuck with Rust simply because it’s a joy to see the compiler catch all the mistakes I made, and to learn how to code while fighting against the borrow checker and lifetime checker.

Architecture

This entire project contains two major subsystems: the programming language (compiler and runtime) and an auto-differentiation system for neural network training.

Programming Language

  • Front-end: It parses source code strings into an abstract syntax tree. For fast iteration, a parser generator that bootstraps itself is introduced. Parsing expression grammar and deterministic finite automata algorithms power it.
  • Mid-end: It converts the syntax tree to a linear control-flow graph in the form of static single assignment (SSA) intermediate representation (IR). There are several passes such as constant propagation, dead code elimination, and jump threading to optimize the code. Before emitting the bytecode, it also performs register allocation to minimize memory usage during runtime.
  • Back-end: A virtual machine reads the bytecode and runs it. This is more similar to assembly, where data needs to be loaded and stored, and control flow is done via jumps and branches to offsets.

Auto Differentiation

Recall that all variables are immutable; common designs like PyTorch do not work because they store and modify gradients inside the graph. This does not affect dynamic computation graph building, so native flow control is fully usable. The only difference is that gradients must be stored explicitly outside the graph. This is not great for an interface, but it does allow people to see the magic behind it.

Apologies

I started this project in 2023. There is too much I could talk about, and I don’t even know where to start. I will add more algorithm-related topics in the future. There are lots of coding bad practices, and I did not want to fix them until I worked in the industry. I am planning another round of reconstruction, but I’m unsure when I will start it.