Paper Review: Architecture of a Database System
Saturday, October 1, 2022
Last week, I read "Architecture of a Database System" for a Red Book reading group.
This is a massive paper: 119 pages. What surprised me is how approachable it is. I have relatively little background building database systems and more experience using them. Despite this, the paper was readable, and I took away quite a bit that I've already put into practice in the Redis-compatible KV store I'm building to learn about database systems.
The paper is structured in a way that makes it easy to skip around and focus on the parts that are most interesting or useful to you at the moment. It also gives a lot of pointers into other papers or texts to learn more or build a foundation.
- The first section is under ten pages and gives a map of the rest of the paper as well as of database architecture in general, so you can put the different pieces in context. This is probably the section I would recommend everyone read.
- The second and third sections are also really useful if you use a database system: they put in concrete terms why, for example, PostgreSQL does not handle large numbers of open connections very well. (Hello, PgBouncer!)
- The fourth section gives an overview of the relational query processor and helps you understand how queries are parsed, optimized, and executed.
- The fifth section talks about storage and what considerations go into making it efficient.
- The sixth section talks about transactions, concurrency, and recovery. This section breaks down what ACID is (spoiler: it's not well defined, but it's useful anyway), talks about locking, and most importantly goes through transaction isolation levels. It wraps up with durability. This section was probably the most intense for me!
- The seventh section talks about the junk drawer that exists in database architectures, just like in all architectures: shared components that get shoved into one category, the section of misfit toys. I skimmed this one.
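To make the connection-handling point from the second and third sections a bit more concrete, here is a minimal sketch of the idea behind a connection pooler: many clients take turns on a small, fixed set of server connections instead of each holding its own. PostgreSQL dedicates a server process to every connection, which is why keeping that set small helps. Everything named here (`FakeBackend`, `TinyPool`, `run_clients`) is hypothetical and invented for illustration; it is not how PgBouncer is implemented, and no real database is involved.

```python
import queue
import threading
from contextlib import contextmanager


class FakeBackend:
    """Stand-in for an expensive server-side connection (hypothetical)."""

    def __init__(self, ident):
        self.ident = ident
        self.queries = 0

    def execute(self, sql):
        # Only one client holds a backend at a time, so this counter is safe.
        self.queries += 1
        return f"backend {self.ident} ran: {sql}"


class TinyPool:
    """Lend a small, fixed set of backends to many clients, one at a time."""

    def __init__(self, size):
        self.backends = [FakeBackend(i) for i in range(size)]
        self._idle = queue.Queue()
        for backend in self.backends:
            self._idle.put(backend)

    @contextmanager
    def connection(self):
        backend = self._idle.get()  # blocks while every backend is busy
        try:
            yield backend
        finally:
            self._idle.put(backend)  # hand it back for the next client


def run_clients(pool, n_clients):
    """Simulate n_clients threads, each issuing one query through the pool."""

    def client(i):
        with pool.connection() as conn:
            conn.execute(f"SELECT {i}")

    threads = [threading.Thread(target=client, args=(i,)) for i in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


pool = TinyPool(size=4)
run_clients(pool, n_clients=100)
# 100 client queries were served by only 4 backend connections.
print(len(pool.backends), sum(b.queries for b in pool.backends))
```

The trade-off in this sketch is that a client may wait for a free backend, in exchange for the server only ever seeing four connections.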
I think this paper is an excellent introduction to database architecture for users of databases and for anyone who wants to learn more about the internals. It will give you a good, broad foundation which you can use to drive further exploration and improve your understanding of databases as you use them.