Stacktrace#03postgres/postgres

How Postgres Prevents Both Deadlock AND Livelock in Upserts

"The transaction with the higher XID backs out." Seven lines of comment. Two classes of bugs eliminated.

Jan 15, 20268 min read

written byKarn

I was digging through PostgreSQL's INSERT ... ON CONFLICT code when I found this gem: a one-line rule that prevents two of the nastiest concurrency bugs at once. The fix? "The transaction with the higher XID backs out." That's it. Let me show you why it's brilliant.

Every UPSERT Is a Dance

You've probably written this SQL a hundred times:

INSERT INTO users (email, name)
VALUES ('alice@example.com', 'Alice')
ON CONFLICT (email) DO UPDATE SET name = EXCLUDED.name;

Looks innocent, right? Insert if the row doesn't exist, update if it does. Under the hood though, this is surprisingly tricky.

PostgreSQL can't just check-then-insert. Between your check and your insert, another transaction might slip in. Classic race condition. So PostgreSQL does something clever: it inserts first, then checks for conflicts.

This is called speculative insertion. And it creates a fun coordination problem.

The Problem: Two Transactions, Same Key, Same Moment

Picture this: two transactions trying to insert the same email at the exact same time.

Transaction A (XID 100):  INSERT INTO users VALUES ('alice@...')
Transaction B (XID 101):  INSERT INTO users VALUES ('alice@...')

Both transactions speculatively insert their row. Both rows exist (temporarily) in the table. Now both transactions scan for conflicts and - surprise - they find each other.

What now?

Option 1: Both Wait → DEADLOCK 💀

The obvious thing: each transaction waits for the other to finish.

A: "I see B's row. I'll wait for B to finish..."
B: "I see A's row. I'll wait for A to finish..."

   ↓

Both: 💀 DEADLOCK

Neither can proceed. PostgreSQL would have to kill one of them. The deadlock detector will catch this eventually, but it's not great - detection takes time, and killing transactions is expensive.

Option 2: Both Back Out → LIVELOCK 🔄

Okay, new plan: when you detect a conflict, back out your speculative insert, wait for the other transaction to finish, then retry.

Sounds reasonable. Let's see what happens:

A: "Conflict! I'll back out, wait for B, then retry."
B: "Conflict! I'll back out, wait for A, then retry."

   ↓

A: backs out
B: backs out

   ↓

A: "B is done! Let me retry..."
B: "A is done! Let me retry..."

   ↓

Both insert again. Both find each other. Again.

   ↓

🔄 Forever...

This is livelock. Both transactions are technically "making progress" - backing out, waiting, retrying - but neither ever completes. It's like two people in a hallway, both stepping aside to let the other pass, both stepping forward again, forever doing this awkward dance.

Here's the nasty part: livelocks don't trigger the deadlock detector. The system just spins, burning CPU, going nowhere.

The Solution: One Simple Rule

Here's the fix I found in PostgreSQL's source:

/*
 * To avoid the livelock, one of the backends must back out first,
 * and then wait, while the other one waits without backing out.
 * It doesn't matter which one backs out, so we employ an arbitrary
 * rule that the transaction with the higher XID backs out.
 */

That's the whole idea.

Both transactions are doing the same thing. They're symmetric. But they're not identical - each has a unique Transaction ID (XID). PostgreSQL exploits this asymmetry:

Higher XID (the "younger" transaction): Back out, wait, then retry
Lower XID (the "older" transaction): Just wait, keep your tuple in place

No deadlock - only B backs out, so A isn't waiting for someone who's waiting for A.

No livelock - only one transaction backs out at a time.

The Actual Implementation

Let's look at the real code. First, there's an enum that defines the different wait modes:

/* src/backend/executor/execIndexing.c:122-128 */
 
typedef enum
{
    CEOUC_WAIT,                     /* Normal: wait for conflict to resolve */
    CEOUC_NOWAIT,                   /* Deferred: don't wait */
    CEOUC_LIVELOCK_PREVENTING_WAIT, /* Speculative: smart waiting */
} CEOUC_WAIT_MODE;

That CEOUC_LIVELOCK_PREVENTING_WAIT mode is where the magic happens. Here's the actual check:

/* src/backend/executor/execIndexing.c:876-880 */
 
if (TransactionIdIsValid(xwait) &&
    (waitMode == CEOUC_WAIT ||
     (waitMode == CEOUC_LIVELOCK_PREVENTING_WAIT &&
      DirtySnapshot.speculativeToken &&
      TransactionIdPrecedes(GetCurrentTransactionId(), xwait))))
{
    /* Wait for the other transaction */

Let me unpack this:

GetCurrentTransactionId() - that's my XID
xwait - the conflicting transaction's XID
TransactionIdPrecedes(mine, theirs) - returns true if my XID is lower (I started earlier)

So: if my XID is lower (I'm the older transaction), I wait. If my XID is higher (I'm younger), I skip this block - which means I back out and retry instead.

Why XID? Why Not Something Else?

You might wonder: why use Transaction IDs? Why not process IDs, timestamps, or a random coin flip?

Turns out XIDs are kind of perfect for this:

They're unique - every transaction gets exactly one. They're comparable - there's a clear ordering (with wraparound handling for the 32-bit counter). They're free - already available, no extra syscalls needed. And crucially, they're consistent - both transactions can independently look at each other's XID and reach the same conclusion about who should back off.

That last point is key. You need both sides to independently agree on who yields. XIDs make this trivial.

The Full Speculative Insertion Flow

Here's the complete picture of what happens during an INSERT ... ON CONFLICT:

┌─────────────────────────────────────────────────────┐
│ 1. Pre-check: Is there already a conflicting row?   │
│    → If yes and committed: Do ON CONFLICT action    │
└───────────────────────┬─────────────────────────────┘
                        │ no committed conflict
                        ↓
┌─────────────────────────────────────────────────────┐
│ 2. Acquire "speculative insertion lock"             │
│    (So others know we're in the middle of this)     │
└───────────────────────┬─────────────────────────────┘
                        ↓
┌─────────────────────────────────────────────────────┐
│ 3. Insert tuple speculatively                       │
│    (Tuple exists but is marked "speculative")       │
└───────────────────────┬─────────────────────────────┘
                        ↓
┌─────────────────────────────────────────────────────┐
│ 4. Insert index entries, check for conflicts        │
│    → If conflict with committed row: abort, retry   │
│    → If conflict with speculative row: XID check!   │
└───────────────────────┬─────────────────────────────┘
                        ↓
┌─────────────────────────────────────────────────────┐
│ 5. Mark tuple as "really inserted" or "killed"      │
│    Release speculative lock, wake up waiters        │
└─────────────────────────────────────────────────────┘

The XID check happens at step 4. When two speculative insertions collide:

The younger transaction (higher XID) goes back to step 1
The older transaction (lower XID) waits at step 4

This Pattern Is Everywhere

Once you see it, you notice this "deterministic tiebreaker" pattern all over distributed systems:

System	Tiebreaker	Used For
PostgreSQL	Transaction ID	Livelock prevention
Lamport Clocks	(timestamp, node_id)	Total ordering
Raft	(term, log_index)	Leader election
Bitcoin	Block hash	Chain selection
Ethernet	MAC address	Collision resolution

The common thread: when symmetric actors need asymmetric behavior, pick a deterministic property both can observe independently.

The Commit That Added This

This feature landed on May 8, 2015:

commit 168d5805e4c08bed7b95d351bf097cff7c07dd65
Author: Andres Freund
Date:   Fri May 8 05:31:36 2015 +0200

    Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.

    This is implemented using a new infrastructure called "speculative
    insertion". It is an optimistic variant of regular insertion that
    first does a pre-check for existing tuples and then attempts an
    insert. If a violating tuple was inserted concurrently, the
    speculatively inserted tuple is deleted and a new attempt is made.

    Author: Peter Geoghegan, with significant contributions from
            Heikki Linnakangas and Andres Freund.

The speculative insertion approach was chosen specifically because it lets ON CONFLICT work without holding locks during the check phase - critical for high-concurrency workloads.

What I Learned

Livelocks are sneakier than deadlocks. Deadlocks are dramatic - everything stops, the detector fires, someone gets killed. Livelocks are insidious. The system looks busy, progress seems to be happening, but nothing actually completes. When you're designing retry logic, always ask: "What happens if everyone retries at once?"

Symmetry breaking is powerful. When identical actors need different behavior, find a property they can all independently observe. It doesn't have to be meaningful - just unique and comparable. XIDs, process IDs, MAC addresses, timestamps... keep identifiers around. You never know when you'll need a tiebreaker.

The comments tell the real story. That 7-line comment in PostgreSQL doesn't just say what the code does. It explains the problem (two bad options), why they fail (deadlock, livelock), and why the solution works (asymmetric behavior). The comment is worth more than the code it describes.

One More Thing

The PostgreSQL team could have just let the deadlock detector handle this. After all, deadlocks are detected and one transaction gets killed automatically. Why add extra complexity?

Because deadlock detection has a cost. It runs periodically, holds locks while scanning, and killing transactions is expensive (rollback, cleanup, client retry). The XID comparison? It's essentially free - just comparing two integers that you already have.

This is the difference between "it works" and "it works well at scale."

What's the cleverest tiebreaking rule you've seen in code? I'd love to hear about it.

Sources:

Command Palette