Paper review: C-store

Friday, November 4, 2022

It's that time again: I read another paper, and here's what I took away from it! This week I read "C-store: a column-oriented DBMS" from chapter 4 of the Red Book. This one I picked since I thought it would be helpful for the chess database I'm working on, and it does seem applicable!

This paper was significant for making a strong case for columnar databases in read-heavy workloads. It demonstrated an architecture for a column database which not only beat the row-based databases of the time (on their workload) but also beat the proprietary column databases of that era. One of their key takeaways is that being columnar gives you:

The overall architecture they presented seems straightforward and perhaps deceptively simple. They have three major components:

Each of these components was described only briefly. There was enough detail to get the gist, but not enough to go write an implementation myself. I think this is the nature of publishing: you have limited space to publish in, and it would also be nice to save some details to publish later. They also have a number of things which were planned but not implemented, so the sparse detail may also come from simply not having answers yet.

Some of the things I was left wondering were:

This paper really gave me some inspiration for how to structure the database I'm working on. Hopefully I'll have a post up about that database's structure once it's working (always more to do, always performance traps to fall into before it's done!) and I'll be able to talk more about how its design was informed by C-store.

Next week's paper will be the DynamoDB paper, which I'm excited to read! Later!

