You should make a new programming language
Monday, August 12, 2024
Every software engineer uses a programming language, usually multiple. Few of us make programming languages. This makes sense, because the work we need to get done can typically be done just fine in the languages that exist. Those already have people making them better. Let's focus on the task at hand.
But that means that we're missing out on some learning opportunities. I stumbled into those when I made a language based on a silly premise: control flow via exceptions and nothing else. It was done as a joke, but I accidentally learned things along the way.
It's special that we make our own tools
Every serious woodworker makes some of their own equipment. Some will make their workbench, maybe sawhorses, perhaps jigs for myriad tools and work setups. These are things a woodworker can make from wood. But we don't often have access to the machines we'd need for making all the tools we use: You'd need a metalworking shop to make portions of chisels and planes, let alone any power tools we use.
As programmers, we're in a different position. We have near total control over the machine, and we have the capability, in theory, to build everything from scratch[1]. Since the tools we use are all software-based, and we write software, we can create all of our own tools, from the operating system on up.
This is a privilege which few fields enjoy. The closest other one I can think of is that machinists can likely produce a lot of their own tools, too. Where we assume that CPUs and RAM exist, they can assume that motors and control boards exist. Then they can build those into the rest of the tool. And so, like machinists, we're able to get incredibly close to our tools.
What you learn by making a language
One of the tools we interact with the most is the programming language. We use one to get any programming work done, and it shapes how we think through problems as well. You use a programming language as a tool of thought even when you're away from the keyboard. This makes it ripe for learning. You will learn a lot if you make a new programming language.
You'll learn about grammars and language design. Before you can implement a programming language, you'll have to decide what you even want it to be. Is this an imperative language, or functional, or something else? Is it object oriented? Does it have traditional syntax borrowed from another language, or are you doing something new and weird? These, and many others, are the questions you'll grapple with in designing a language.
In the process, you'll learn about why other languages are designed the way they are. If you're lucky, you'll learn some of this in the initial design process. For example, while working on my next language, Lilac, I learned why semicolons are so common because I tried picking something else. Discussing it with a friend uncovered a lot of potential drawbacks in other choices! If you're less lucky, you'll learn those lessons in the implementation phase, and those lessons will really stick.
You'll learn about parsing. This is one of the first things you'll run into when you start to implement your language. You can't do a whole lot else without parsing the language. To start writing the parser, you'll have to pick what kind of parser to write. Don't overthink it when you're just starting out. Although, if you're really interested in parsers, it can be a wonderful topic to dive deep into.
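To make "don't overthink it" concrete, here is a minimal sketch of one common starting point: a hand-written recursive descent parser for arithmetic. It is not Hurl's parser. The grammar, the tuple-shaped tree, and every name in it (`tokenize`, `Parser`, `parse`) are assumptions made up for illustration.

```python
import re

# One token per match: either a run of digits or any single other character.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|(.))")


def tokenize(source):
    """Split source text into ("num", int) and ("op", char) tokens."""
    tokens = []
    for number, op in TOKEN_RE.findall(source):
        if number:
            tokens.append(("num", int(number)))
        elif op.strip():  # skip stray whitespace
            tokens.append(("op", op))
    return tokens


class Parser:
    """Recursive descent: one method per grammar rule."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("eof", None)

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expression(self):
        # expression := term (("+" | "-") term)*
        node = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            _, op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):
        # term := factor (("*" | "/") factor)*
        node = self.factor()
        while self.peek() in (("op", "*"), ("op", "/")):
            _, op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        # factor := NUMBER | "(" expression ")"
        kind, value = self.advance()
        if kind == "num":
            return value
        if (kind, value) == ("op", "("):
            node = self.expression()
            if self.advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token: {value!r}")


def parse(source):
    """Parse a whole expression and reject leftover tokens."""
    parser = Parser(tokenize(source))
    tree = parser.expression()
    if parser.peek() != ("eof", None):
        raise SyntaxError(f"unexpected trailing token: {parser.peek()[1]!r}")
    return tree
```

With this sketch, `parse("1 + 2 * 3")` returns `("+", 1, ("*", 2, 3))`: operator precedence falls out of which rule calls which. Notice the error messages, too. They report raw tokens, which is about the level of polish you should expect from a first parser.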
You'll learn about runtime execution. Running your code means you have to write the runtime (or the compiler) which means thinking deeply about how it will work at run time. When an exception is thrown, how does that actually work? When you reference a variable, how do you know which memory location to find it in? If you run a recursive function, is there a limit to how far you can recurse? Why is that? These are some of the questions you'll answer.
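Here is a rough sketch of how a tiny tree-walking interpreter might answer two of those questions. It is an illustration under made-up assumptions, not how Hurl or any real language works: the tuple-based tree, the `Environment` class, `MAX_DEPTH`, and `ToyRecursionError` are all hypothetical. Variables are found by walking a chain of scopes, and recursion is limited because each call in the toy language consumes resources in the host.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

# Hypothetical limit, kept well below the host's own stack limit.
MAX_DEPTH = 100


class ToyRecursionError(Exception):
    """Raised when the toy language recurses too deeply."""


class Environment:
    """Maps names to values, falling back to a parent scope."""

    def __init__(self, parent=None):
        self.values = {}
        self.parent = parent

    def define(self, name, value):
        self.values[name] = value

    def lookup(self, name):
        env = self
        while env is not None:  # walk outward through enclosing scopes
            if name in env.values:
                return env.values[name]
            env = env.parent
        raise NameError(f"undefined variable: {name}")


def evaluate(node, env, depth=0):
    """Evaluate a tree of ints, names (str), and tuples like (op, a, b)."""
    if isinstance(node, int):
        return node
    if isinstance(node, str):
        return env.lookup(node)
    op = node[0]
    if op == "if":
        _, condition, then_branch, else_branch = node
        branch = then_branch if evaluate(condition, env, depth) else else_branch
        return evaluate(branch, env, depth)
    if op == "call":
        _, name, argument = node
        param, body, closure = env.lookup(name)
        if depth >= MAX_DEPTH:
            raise ToyRecursionError(f"more than {MAX_DEPTH} nested calls")
        scope = Environment(parent=closure)  # fresh scope for each call
        scope.define(param, evaluate(argument, env, depth))
        return evaluate(body, scope, depth + 1)
    if op not in OPS:
        raise ValueError(f"unknown operation: {op!r}")
    left = evaluate(node[1], env, depth)
    right = evaluate(node[2], env, depth)
    return OPS[op](left, right)


# A function is stored as (parameter name, body, defining scope).
# sum_to(n) = if n then n + sum_to(n - 1) else 0
globals_env = Environment()
globals_env.define(
    "sum_to",
    ("n", ("if", "n", ("+", "n", ("call", "sum_to", ("-", "n", 1))), 0), globals_env),
)
```

Evaluating `("call", "sum_to", 10)` in `globals_env` gives 55, while `("call", "sum_to", 1000)` raises `ToyRecursionError`. The sketch leans on the host language's exceptions to report errors. Deciding how your own language's exceptions unwind is exactly the kind of question you get to answer yourself.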
The list really goes on, and on, and on. You can tailor your language to what you want to learn about. My first language, Hurl, taught me about the basics of making an interpreter, designing a language, and writing a grammar. My second language, Lilac, is going to teach me more about type systems, runtimes, and instrumentation.
As you make a language, you'll gain deeper intuitions for and understanding of other languages. When I implemented Hurl and ran into parsing errors, it would spit out raw token names at me. This resembled some of the errors I used to see sometimes in my Neovim Rust LSP integration, and it started to make those errors easier to understand. Each language and implementation decision you make will deepen your understanding of the languages you use, and you'll be a better user for it.
It will be a bad language, and that's okay
The nice thing with writing your own language for learning is that it's likely to be a bad one. It's certainly possible to make new, good languages, and that's wonderful! But in my experience, it's best to separate out learning how to do something from doing it exceptionally well.
When you go into it knowing that it's going to be a bad language, it can be very freeing! Bad doesn't mean that it's not useful to you, because it still can be. Mostly, it means that it will lack the fit and finish of a "real" language and it will be defective in some way that limits widespread use. But you can make something that solves a specific problem for you, lets you do Advent of Code puzzles, or earns you nerd cred with your friends. These are useful things.
Since you aren't going to make the next Python, you can focus on the things that are interesting, compelling, and fruitful for learning. You can slough off all the things that are tedious but necessary for real-world usage. Your learning can be targeted and you can keep it fun, so you're more likely to finish the project. And it's okay to break things arbitrarily, or make wildly ridiculous language choices that just make you smile. Because hey, it's going to be bad anyway, right?
Getting started making languages
It's intimidating to sit down in front of a blank editor and "make a new language." For a long time, I thought, even as a Principal Software Engineer, that it was some dark art beyond my abilities. That's a load of crock, and all of us programmers can do it. It gets easier every year to get started, because there are so many resources out there to learn from.
The first thing I'd recommend is implementing someone else's language in a guided fashion. I followed Crafting Interpreters for this, and it's incredible. I've also heard good things about Writing An Interpreter In Go and Build Your Own Lisp. Any of these will give you a taste of how languages work and let someone experienced guide you through it.
One thing, though: I've found it is a good idea to choose a different implementation language from what the book uses. Crafting Interpreters uses Java and C, so I used Rust. By choosing a different language, you're forced to grapple with the concepts to translate them. You can't simply retype the code, so you will learn it at a deeper level.
After that, the direction you go is really up to you. I got started with Hurl by just kind of designing it and throwing things at the wall to see what sticks. That worked and let me crystallize a lot of the knowledge I got from Crafting Interpreters. For Lilac, I've read one book so far and have a short list of others to read. When I asked friends for recommendations, these are a few of the books they recommended:
- Introduction to Compilers and Language Design, which I've read and really enjoyed
- Engineering a Compiler
- Programming Languages: Application and Interpretation
- Compilers: Principles, Techniques, and Tools aka the Dragon Book
What you read will depend on where you want to go next and what you want to learn.
Go Forth, make something fun
I think we should all go and make a new language. It's a great way to learn, and new ideas have to come from somewhere. At the end of the day, it's a wonderful way to have some fun with your computer.
Oh, and please expand the vocabulary of programming language names. We can say "Go Forth" but it's hard to put together a whole sentence with just programming languages. Let's fix that, shall we? And let's B Swift about it.
[1] There are some firmware blobs which we don't control. But there is fully open hardware, and you have to stop going down the stack somewhere. Well, I guess you could go start a mining operation to extract ore from the earth and go truly from scratch...