RocksDB development finds a CPU bug | RocksDB
This is the story of how a RocksDB unit test I added four years ago, a mini-stress test you might call it, revealed a novel hardware bug in a newer CPU. It w...
I share interesting articles, videos, papers and more about distributed systems, formal methods and computer science.
Three Sets of Specs (for Staying in Spec)
In the last post I talked about how systems formalisms are like sculpting with a chisel: we remove behaviors we don’t…
I have an early adulthood trauma from struggling to understand consensus amidst a myriad of poor explanations. I am overcompensating for that by adding my own attempts to the fray. Today, I want to draw a series of pictures which could be helpful. You can see this post as a set of missing illustrations for Notes on Paxos, or, alternatively, you can view that post as a more formal narrative counterpart for the present one.
why compression is (almost) always worthwhile
Learn how we used modular protocol specification for verifying both correctness and performance of MongoDB’s distributed transactions protocol.
We can automatically check correctness properties of a TLA+ specification using TLC, a model checker that will exhaustively explore a spec’s reachable states...
For years, PostgreSQL has been one of the most critical, under-the-hood data systems powering core products like ChatGPT and OpenAI’s API. As our user base grows rapidly, the demands on our databases have increased exponentially, too. Over the past year, our PostgreSQL load has grown by more than 10x, and it continues to rise quickly.
In distributed systems, there’s a common understanding that it is not possible to guarantee exactly-once delivery of messages. What is possible though is exactly-once processing. By adding a unique …