Welcome to Distributed Bytes!

I share interesting articles, videos, papers and more about distributed systems, formal methods and computer science.

Made with by Federico Ponzi

Recent Posts

Formal Methods Beyond Correctness: Isolation & Permissiveness of Distributed Transactions in MongoDB | MongoDB

Learn how we used modular protocol specification for verifying both correctness and performance of MongoDB’s distributed transactions protocol.

Formal Methods Beyond Correctness: Isolation & Permissiveness of Distributed Transactions in MongoDB | MongoDB

One-off Verified Transpilation with Claude

We can automatically check correctness properties of a TLA+ specification using TLC, a model checker that will exhaustively explore a spec’s reachable states...

One-off Verified Transpilation with Claude

Scaling PostgreSQL

How OpenAI scales PostgreSQL

Scaling PostgreSQL

Scaling PostgreSQL to power 800 million ChatGPT users

For years, PostgreSQL has been one of the most critical, under-the-hood data systems powering core products like ChatGPT and OpenAI’s API. As our user base grows rapidly, the demands on our databases have increased exponentially, too. Over the past year, our PostgreSQL load has grown by more than 10x, and it continues to rise quickly.

Scaling PostgreSQL to power 800 million ChatGPT users

SpiceDB Documentation - Authzed Docs

Welcome to the SpiceDB and AuthZed docs site.

SpiceDB Documentation - Authzed Docs

On Idempotency Keys - Gunnar Morling

In distributed systems, there’s a common understanding that it is not possible to guarantee exactly-once delivery of messages. What is possible though is exactly-once processing. By adding a unique …

On Idempotency Keys - Gunnar Morling

What Does Write Skew Look Like?

This post is about gaining intuition for Write Skew, and, by extension, Snapshot Isolation. Snapshot Isolation is billed as a transaction isolation level that offers a good mix between performance and correctness.

What Does Write Skew Look Like?

How to do distributed locking — Martin Kleppmann’s blog

Redis has been gradually making inroads into areas of data management where there are stronger consistency and durability expectations – which worries me, because this is not what Redis is designed for. Arguably, distributed locking is one of those areas. Let’s examine it in some more detail.

How to do distributed locking — Martin Kleppmann’s blog

Reproducing the AWS Outage Race Condition with a Model Checker | Waqas Younas' blog

As a small experiment, we’ll use a model checker to see how such a race could happen. Formal verification can’t prevent every failure, but it helps us think more clearly about correctness and reason about subtle concurrency bugs.

Reproducing the AWS Outage Race Condition with a Model Checker | Waqas Younas' blog

TLA+ Modeling of AWS outage DNS race condition

On Oct 19–20, 2025, AWS’s N. Virginia region suffered a major DynamoDB outage triggered by a DNS automation defect that broke endpoint resol...

TLA+ Modeling of AWS outage DNS race condition