Writing Hurl's grammar, twice

Monday, July 17, 2023

Recently I started working on a programming language, Hurl. Writing the initial code samples and developing the concept is all fine and good, but the next step is to actually make it work. The steps I outlined for developing Hurl in the last post were:

  1. Write out code samples to get a feel for Hurl and its semantics
  2. Define Hurl's grammar in a loose BNF-esque fashion
  3. Implement the lexer and parser
  4. Write a formatter as a demoable use of the parser
  5. Write an interpreter!

The last post got us through number 1, getting code samples. Huge thanks to the readers who pointed out typos and bugs in my programs, by the way! Now it's time to move on to steps 2 and 3.

It's 2023, so naturally I decided to do as little of the work myself as possible. The path I took was to first try to use ChatGPT to get me as far as I could, and then use my own human brain to finish the work. My background is having taken just two PL courses in college and having worked through Crafting Interpreters. I was curious to see how much an LLM could help me with something I have seen but am not an expert in.

Spoiler alert: I'd probably not use it again, but I think it helped?

Step 1: Computer, write me a grammar!

To start things off, I turned to my trusty frenemy ChatGPT to see how much it could do for me. Since I'm mostly developing this in my evenings after a full day of work and childcare, reducing friction is very helpful for making progress. So I formed these three hypotheses going in:

  • ChatGPT would generate a valid grammar for the language if I provided code examples and pointed out minor issues to iterate with it
  • ChatGPT would easily generate a standalone lexer for the language, again with some minor iteration required
  • ChatGPT would fail to generate a parser for the language

I wanted to see how far I would get and test these hypotheses, then take the reins myself once ChatGPT couldn't get me further. It was... a mixed bag. I'll show you what I mean: using ChatGPT got me started, but it definitely didn't succeed at any of the steps independently1. And it failed spectacularly at writing the lexer, which I thought it would be great at.

I started off by feeding it my previous blog post as a source of proto-documentation on Hurl. The first task was to break down the work of developing the language itself into discrete tickets. It was pretty successful here!

The tickets it wrote were good, and the effort required from me was low, so it definitely ended up saving me time over writing out the tickets myself. (Yes, I use tickets for my personal projects. I need to project manage myself or I'll chase every squirrel and never accomplish a lick of real work.2) In particular, the breakdown for the grammar definition task was pretty helpful. It reminded me that I need to include things like the comment syntax; while obvious in retrospect, I blanked on that need for a while.

After this, I told it to write the grammar in BNF. I'm not worried about the details; this is just to help me/us develop the lexer and parser, so it doesn't need to be formal. Foreshadowing, though: it would've helped to make it formal.

The grammar it generated was... close? It seemed okay because I wasn't very familiar with writing grammars, but I ran into a number of issues down the road. In particular, it couldn't really handle one request I had, which was to include comments in the grammar itself. I wanted that because (1) I'm writing a formatter, so the parse tree needs to contain comments in order to reproduce them in the output, and (2) I'm the kind of monster who just might give the comments semantics someday.

For the things it couldn't handle itself, explaining the problem didn't help; it would just enter an "oops loop". You know, where it apologizes and then repeats the same exact mistake again. And again. And again. In those instances, I had to just give it the answer, exactly what I wanted in the grammar. Fine.

This is the grammar we ended up with:

<program> ::= <stmt_list>
<stmt_list> ::= <comment> | <stmt> <optional_comment> ";" <stmt_list> | ε

<stmt> ::= <declaration>
         | <func_definition>
         | <exception_handling>
         | <exception>
         | <expr>

<declaration> ::= "let" <identifier> "=" <expr>

<func_definition> ::= "func" "(" <params> ")" "{" <stmt_list> "}"

<params> ::= <identifier> "," <params> | <identifier> | ε

<exception_handling> ::= "try" "{" <stmt_list> "}" <catch_list>

<catch_list> ::= <catch> <catch_list> | <catch>

<catch> ::= "catch" "as" <identifier> "{" <stmt_list> "}"
          | "catch" "(" <expr> ")" "{" <stmt_list> "}"

<exception> ::= "hurl" <expr> | "toss" <expr>

<expr> ::= <term> "+" <expr> | <term> "-" <expr>
         | <term>

<term> ::= <factor> "*" <term> | <factor> "/" <term> | <factor> "%" <term>
         | <factor>

<factor> ::= "(" <expr> ")"
           | "~" <factor>
           | <identifier>
           | <number>
           | <string>
           | "true"
           | "false"

<identifier> ::= /[a-zA-Z_][a-zA-Z_0-9]*/

<number> ::= /[0-9]+(\.[0-9]+)?/

<string> ::= /"([^"\\]|\\.)*"/

<comment> ::= "#" /[^\n]*/

<optional_comment> ::= <comment> | ε

Those familiar with languages will probably notice a few issues with this. Feel free to peruse it and tear it apart before moving on!

Step 2: Computer, write me a lexer! Wait no, another grammar!

The next thing I had it try was to write a lexer. This just... failed. I expected the structure to be at least okay and need revision, but what it came out with was, to my eyes, inscrutable. This could be my own inexperience, but I decided that this wasn't going to work for us.

Instead, I changed tacks: let's use a parser-generator called Pest. If it was able to generate one grammar for us, it could probably convert that to Pest's grammar, and we'd get a parser out of it!

This part went okay. Not great, just okay.

I gave it our grammar again, along with an example of a Pest grammar, and had it convert our grammar to Pest's formal syntax. The converted grammar had a few syntax errors when I tried to use it; I fed those back in, and it was able to correct them successfully!
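For a flavor of what that conversion produces, here's a toy sketch of a few of those BNF rules in Pest's syntax. This is purely illustrative (it isn't the grammar we ended up with), and I've inlined it into Rust with pest_derive instead of using a separate .pest file:

use pest_derive::Parser;

// A hypothetical, heavily trimmed subset of the grammar, inlined for illustration.
// `~` sequences expressions, `@` marks atomic rules (no implicit whitespace inside),
// and the silent (`_`) WHITESPACE rule gets skipped automatically between tokens.
#[derive(Parser)]
#[grammar_inline = r#"
program     = { SOI ~ (statement ~ ";")* ~ EOI }
statement   = { declaration | expression }
declaration = { "let" ~ identifier ~ "=" ~ expression }
expression  = { identifier | number }
identifier  = @{ (ASCII_ALPHA | "_") ~ (ASCII_ALPHANUMERIC | "_")* }
number      = @{ ASCII_DIGIT+ ~ ("." ~ ASCII_DIGIT+)? }
WHITESPACE  = _{ " " | "\t" | NEWLINE }
"#]
struct ToyHurlParser;

Even this toy version hints at why the conversion isn't purely mechanical: Pest makes you decide things, like whitespace handling and atomicity, that the loose BNF never had to.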

Then it was a matter of adding things to the grammar which had been omitted before, like member accesses and comparison operators. This was where I should have called things off, but I stubbornly stuck with ChatGPT. It got pretty unproductive, and instead of reading the docs a little, I kept trying to make ChatGPT do the thing for me.

Eventually we landed... somewhere. I don't have the full grammar here because ChatGPT kept digging us into holes and it just got tiresome. This is where I hit eject, bailed out, and switched to doing things myself.

Interlude: Reading Pest's docs, a rant

Part of the impetus here for using ChatGPT is that I was pretty intimidated by Pest (and other parser-generators). I'd glanced at it and knew it was a powerful tool, but felt like it was some arcane magic that I couldn't learn easily on my own. I wanted a crutch, a safety blanket, someone to tell me how to do it.

This was reinforced by the Pest Book, which is the official guide for learning to use Pest. This is literally called a book. So it's big, right? It's a lot to get through? That notion scared me, kept me from reading the guide. I bet it has scared other people off of it, too.

But... by my count, the "book" is only on the order of 5,000 words for the meat of it about grammars. This is substantial, but it's a far cry from the size of the Rust Book. This is a long tutorial, not a book!

Can we please stop scaring people off of docs by calling them books? It's not likely intended that way, and I won't speculate about reasons for calling it a book, but I think it's a bad thing to do.

Anyway, once I realized that the docs were not, in fact, a book, I read them. Well, no, I'm a parent with limited time and energy: I quickly skimmed them. And that was enough!

Step 3: Revise the grammar by hand, and write a parser

From here, I revised the grammar by hand. I'm sure it has bugs still, but now all my example programs successfully parse!

Getting everything to "just" parse was fairly straightforward, and then the more complicated bit (for me) was how to convert this into a concrete syntax tree. (As opposed to an AST, a CST contains other things like comments! The more you know!) Doing that conversion showed me things that were lacking in my grammar, like precedence of operators or whatnot. The final grammar is in the git repo if you want to see it! It's a little longer than the informal one, but reasonable.
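To make the CST idea concrete, here's a rough sketch of what comment-carrying types could look like. These are simplified, hypothetical stand-ins, not the actual definitions from the repo:

// Hypothetical, simplified CST types. The key difference from a typical AST is
// that comments are first-class nodes, so a formatter can print them back out.
pub enum Stmt {
    Assignment(String, Expr),
    Hurl(Expr),
    Comment(String),
}

pub enum Expr {
    Number(f64),
    String(String),
    Identifier(String),
    Binary(Box<Expr>, BinOp, Box<Expr>),
}

pub enum BinOp {
    Add,
    Sub,
    Mul,
    Div,
    Mod,
}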

During the conversion, I wrote the CST parser. Pest gives you a parse tree as the result of parsing, which is fine but not super helpful for writing tools like a formatter or interpreter. Instead of being able to use CST structs, we just get general-purpose tree node types. Converting these into a CST is fairly mechanical (and coding assistants such as Copilot are quite helpful for reducing this drudgery).

We walk the parse tree and parse each node recursively. Each time, we invoke a function like this:

use pest::iterators::Pair;

pub fn parse_assignment(pair: Pair<Rule>) -> Result<Stmt, ParseError> {
    assert_eq!(pair.as_rule(), Rule::assignment);

    let mut inner = pair.into_inner();
    let ident = parse_ident(inner.next().unwrap())?;
    let expr = parse_expr(inner.next().unwrap())?;

    Ok(Stmt::Assignment(ident, expr))
}

We pass in a Pair (the parse tree node, basically) and get back either an error or a Stmt element of our CST. Inside it, we first validate that we're at a valid point in the parse tree, then we parse the identifier and its expression and return those. It's a pretty straightforward translation from the grammar.
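And for completeness, kicking the whole thing off looks roughly like this. HurlParser, Rule::program, Rule::statement, and parse_statement are stand-ins here for whatever the repo actually names things:

use pest::Parser;

pub fn parse_program(source: &str) -> Result<Vec<Stmt>, ParseError> {
    // Pest's derived parser hands back generic parse-tree nodes (Pairs)...
    let mut pairs = HurlParser::parse(Rule::program, source)
        .map_err(ParseError::from)?; // assumes ParseError can wrap pest's error type

    // ...which we then walk with the recursive parse_* functions to build the CST.
    let program = pairs.next().expect("grammar guarantees a program node");
    program
        .into_inner()
        .filter(|pair| pair.as_rule() == Rule::statement)
        .map(parse_statement)
        .collect()
}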

The full parsing code is also in the repo.

Next steps

For the rest of this project, I'm just going to use my brain and my usual cadre of coding tools (which includes Copilot for tedium-reduction).

My immediate next step is to write a formatter for Hurl! The idea is that it will exercise the CST and parsing code and be a neat little demo, without the full effort of writing an interpreter. And, of course, this very serious language demands very serious tools ;)

After that, it's time to write the interpreter itself. The focus for it will just be to get something running. I might do small benchmarks to make sure that it's "reasonable", because I do want to be able to use this for coding challenges like Advent of Code. But a language like Hurl is clearly not about performance (except in the sense of "performance art").

Feelings about LLMs

LLMs (and ChatGPT in particular) are an emotionally charged topic these days. That's pretty natural for a technology that seems like it could be transformational. I've run the gamut on them. Last fall, I felt like they were overhyped and not useful; just get out of my way and stop distracting me! This spring, I saw the demo for GPT-4 and drank it in deeply; it was an "oh shit" moment where I realized these are here to stay. Ultimately I landed somewhere in the middle.

That GPT-4 demo launched me into a mode of exploring what we can do with LLMs and how I can use them for my work. If LLMs are here to stay, I'd better figure out how to get value out of them. My comfortable middle for now is: LLMs are useful for my work sometimes, they're very powerful, and they have strong limits.

I probably won't use ChatGPT for another language project. It helped me get through some portions of the project, with a lot of steering. I had to lean on the meager PL knowledge I entered with, and wasted some time, but overall had a fun experience. Next time I'll do it the old-fashioned human way, because I've learned about languages from this project. But I might use ChatGPT or similar tools for other projects in other domains.

The future seems bright. These tools have a lot of problems today (ethics around training and copyright loom large), but the potential for improving the world is great. We need to face the problems head-on, and we also need to remember that this technology is worth pursuing ethically. If we get it right, we can build a better world.

A world where this tired parent can write a programming language by herself in the evenings, after work and a trying bedtime with her toddler.

If that isn't worth pursuing, I don't know what is.


1. The full transcript is available if you3 want to peruse it.

2. Often distractions turn into footnotes, so... when I have footnotes on footnotes, you know my focus was particularly poor that evening of writing!

3. If you work at OpenAI or can put me in touch with a human who does, I'd love to talk about OpenAI's names policy. My deadname is on my account and it cannot be edited, which is a source of pain whenever I use ChatGPT. I'd like to provide feedback on this and kinda beg someone to help a girl out here.

