
Idempotency keys for exactly-once processing

182 points - 12/01/2025


AdieuToLogic
12/06/2025
From the article:
```
  In distributed systems, there’s a common understanding that 
  it is not possible to guarantee exactly-once delivery of 
  messages.
```
This is not only a common understanding, it is provably true. For a detailed discussion of the concepts involved, see the "Two Generals' Problem"[0].
Guaranteeing exactly-once processing requires a Single Point of Truth (SPoT) that enforces uniqueness and is shared by all consumers, such as a transactional persistent store. Any independently derived or generated "idempotency keys" cannot provide the same guarantee.
The author goes on to discuss using the PostgreSQL transaction log to create "idempotency keys", which is a specialization of the aforementioned SPoT approach. A more performant variation of this approach is the "hi/lo" algorithm[1], which can reduce SPoT allocation of a unique "hi value" to once per 2,147,483,648 keys when both the "hi" and "lo" values are 32-bit signed integers having only positive values.
Still and all, none of the above establishes logical message uniqueness. This is a trait of the problem domain: it determines whether two or more messages having the same content are considered distinct (thus mandating different "idempotency keys") or duplicates (thus mandating identical "idempotency keys").
0 - https://en.wikipedia.org/wiki/Two_Generals'_Problem
1 - https://en.wikipedia.org/wiki/Hi/Lo_algorithm
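As a rough illustration of the hi/lo idea mentioned above, here is a minimal Python sketch. The `HiLoGenerator` class, the `fake_sequence` function, and the block size of 1000 are all assumptions made for the example; in a real system the "hi" value would come from the shared store (for instance a database sequence), not from an in-process counter.

```python
import itertools
import threading


class HiLoGenerator:
    """Hands out unique integers, contacting the shared store once per block."""

    def __init__(self, next_hi, block_size=1000):
        # next_hi is a hypothetical callable standing in for the SPoT;
        # it must return a value no other caller has received.
        self._next_hi = next_hi
        self._block_size = block_size
        self._lock = threading.Lock()
        self._hi = None
        self._lo = block_size  # forces a fetch on first use

    def next_id(self):
        with self._lock:
            if self._lo >= self._block_size:
                # Block exhausted: this is the only call to the shared store.
                self._hi = self._next_hi()
                self._lo = 0
            value = self._hi * self._block_size + self._lo
            self._lo += 1
            return value


# In-memory stand-in for the shared sequence (assumption for the demo).
_counter = itertools.count()
_counter_lock = threading.Lock()
store_calls = []


def fake_sequence():
    with _counter_lock:
        hi = next(_counter)
    store_calls.append(hi)
    return hi


gen_a = HiLoGenerator(fake_sequence)
gen_b = HiLoGenerator(fake_sequence)
ids = [gen_a.next_id() for _ in range(1500)]
ids += [gen_b.next_id() for _ in range(1500)]
```

Two generators produce 3000 distinct IDs here while touching the stand-in store only four times. Note that the IDs are unique but not globally ordered across generators.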
imron
12/05/2025
I like to use uuid5 for this. It produces unique keys in a given namespace (defined by a uuid) but also takes an input key and produces identical output ID for the same input key.
This has a number of nice properties:
1. You don’t need to store keys in any special way. Just make them a unique column of your db and the db will detect duplicates for you (and you can provide logic to handle them as required, e.g. ignoring a duplicate if the other input fields are the same, or raising an error if a message has the same idempotency key but different fields).
2. You can reliably generate new downstream keys from an incoming key without the need for coordination between consumers, getting an identical output key for a given input key regardless of consumer.
3. In the event of a replayed message it’s fine to republish downstream events because the system is now deterministic for a given input, so you’ll get identical output (including generated messages) for identical input, and generating duplicate outputs is not an issue because this will be detected and ignored by downstream consumers.
4. This parallelises well because consumers are deterministic and don’t require any coordination except by db transaction.
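A minimal sketch of the points above, using Python's standard `uuid.uuid5` and an in-memory SQLite table. The namespace name `orders.example.com`, the `events` table, and the helper names are assumptions for illustration only.

```python
import sqlite3
import uuid

# Hypothetical application namespace; any fixed UUID would do.
ORDERS_NS = uuid.uuid5(uuid.NAMESPACE_DNS, "orders.example.com")


def derive_key(input_key):
    """Same input key always yields the same UUID within the namespace."""
    return uuid.uuid5(ORDERS_NS, input_key)


def downstream_key(parent, step):
    """Derive a key for a downstream event, using the parent key as namespace."""
    return uuid.uuid5(parent, step)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events (id TEXT PRIMARY KEY, payload TEXT NOT NULL)")


def record(input_key, payload):
    """Return True if stored, False if an identical duplicate was ignored."""
    key = str(derive_key(input_key))
    try:
        with db:  # commits on success, rolls back on exception
            db.execute(
                "INSERT INTO events (id, payload) VALUES (?, ?)", (key, payload)
            )
        return True
    except sqlite3.IntegrityError:
        row = db.execute(
            "SELECT payload FROM events WHERE id = ?", (key,)
        ).fetchone()
        if row[0] != payload:
            raise ValueError("same idempotency key, different payload")
        return False
```

Calling `record("order-42", "paid")` twice stores one row; the second call is detected by the primary-key constraint. Any consumer computing `downstream_key(derive_key("order-42"), "send-receipt")` gets the same value without coordination.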
bokohut
12/05/2025
This was my exact solution in the late 1990s, which I formulated using a uid algorithm I created when confronted with a growing payment processing load that the centralized hardware of the time could not handle. MsSQL could not process the ever-increasing load, yet the firehose of real-time payment transaction volume could not be turned off, so an interim parallel solution involving microservices to walk everything over to Oracle was devised using this technique. Everything old is new again as the patterns and cycles ebb and flow.
pyrolistical
12/05/2025
This article glosses over the hardest bit and bikesheds too much over keys.
> Critically, these two things must happen atomically, typically by wrapping them in a database transaction. Either the message gets processed and its idempotency key gets persisted. Or, the transaction gets rolled back and no changes are applied at all.
How do you do that when the processing isn’t persisted to the same database? I.e., what if the side effect is outside the transaction?
You can’t atomically roll back the transaction and external side effects.
If you could use a distributed database transaction already, then you don’t need idempotency keys at all. The transaction itself is the guarantee.
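For reference, the case that does work (key and state change in the same database) can be sketched as below, using SQLite as a stand-in. The table names and the `handle` function are assumptions for the example. It also shows where the objection bites: anything done outside `with db:` (an HTTP call, an email) would not be undone by the rollback.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE processed (key TEXT PRIMARY KEY)")
db.execute(
    "CREATE TABLE balances (account TEXT PRIMARY KEY, amount INTEGER NOT NULL)"
)
db.execute("INSERT INTO balances VALUES ('alice', 0)")
db.commit()


def handle(key, account, delta, fail=False):
    """Apply delta once per key. True if applied, False if duplicate."""
    try:
        with db:  # one transaction: commit on success, rollback on exception
            db.execute("INSERT INTO processed (key) VALUES (?)", (key,))
            db.execute(
                "UPDATE balances SET amount = amount + ? WHERE account = ?",
                (delta, account),
            )
            if fail:
                # Simulated crash: both writes above are rolled back together.
                raise RuntimeError("simulated crash before commit")
        return True
    except sqlite3.IntegrityError:
        # Key already recorded, so the message was processed before.
        return False


def balance(account):
    row = db.execute(
        "SELECT amount FROM balances WHERE account = ?", (account,)
    ).fetchone()
    return row[0]
```

A redelivered message with key `m1` is rejected by the primary key, and a failure mid-transaction leaves neither the key nor the balance change behind, so a retry can succeed.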
jackfranklyn
12/05/2025
The messier version of this problem: banks themselves don't give stable unique identifiers. Transaction references get reused, amounts change during settlement, descriptions morph between API calls. In practice you end up building composite keys from fuzzy matching, not clean UUIDs. Real payment data is far noisier than these theoretical discussions assume.
hinkley
12/05/2025
Failure resistant systems end up having a bespoke implementation of a project management workflow built into them and then treating each task like a project to be managed from start to finish, with milestones along the way.
Lethalman
12/06/2025
> To ensure monotonicity, retrieval of the idempotency key and emitting a message with that key must happen atomically, uninterrupted by other worker threads. Otherwise, you may end up in a situation where thread A fetches sequence value 100, thread B fetches sequence value 101, B emits a message with idempotency key 101, and then A emits a message with idempotency key 100. A consumer would then, incorrectly, discard A’s message as a duplicate.
Also check out Lamport vector clocks. They solve this problem if you have a small, fixed number of producers.
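The race in the quoted passage, and the simplest fix (holding one lock across both "fetch sequence value" and "emit"), can be sketched as follows. This is not a Lamport or vector clock implementation; the class names and the in-process `emit` callable are assumptions for the example.

```python
import threading


class SequencedProducer:
    """Fetches the next sequence value and emits under a single lock."""

    def __init__(self, emit):
        self._emit = emit  # hypothetical callable that publishes a message
        self._lock = threading.Lock()
        self._next = 0

    def send(self, payload):
        with self._lock:
            # Fetch and emit are one critical section, so no other thread
            # can emit a higher value before this one goes out.
            self._next += 1
            self._emit((self._next, payload))


class HighWaterMarkConsumer:
    """Discards any message whose key is not above the highest key seen."""

    def __init__(self):
        self.last_seen = 0
        self.accepted = []

    def receive(self, message):
        seq, payload = message
        if seq <= self.last_seen:
            return False  # treated as a duplicate
        self.last_seen = seq
        self.accepted.append(payload)
        return True


consumer = HighWaterMarkConsumer()
producer = SequencedProducer(consumer.receive)


def worker():
    for _ in range(100):
        producer.send("x")


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock, all 800 messages are accepted. Without it, a consumer that sees key 101 before key 100 drops the second one, which is the failure the article describes. The cost is that emitting is serialized across threads.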
Groxx
12/05/2025
Why call this "exactly once" when it's very clearly "at most once"?
zmj
12/05/2025
I like the uuid v7 approach - being able to reject messages that have aged past the idempotency key retention period is a nice safeguard.
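A small sketch of that safeguard: a UUIDv7 carries a 48-bit Unix millisecond timestamp in its top bits, so a consumer can read the age straight from the key. The helper names below are assumptions, and `make_uuid7` builds the value by hand because a built-in generator is only available in recent Python versions. The timestamp is whatever the producer's clock said, so clock skew needs some allowance.

```python
import os
import time
import uuid


def make_uuid7(now_ms=None):
    """Build a UUIDv7-shaped value: 48-bit Unix ms timestamp, then random bits."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    value = ((now_ms & ((1 << 48) - 1)) << 80) | rand
    value = (value & ~(0xF << 76)) | (0x7 << 76)  # version field = 7
    value = (value & ~(0x3 << 62)) | (0x2 << 62)  # variant bits = 10
    return uuid.UUID(int=value)


def timestamp_ms(key):
    """Extract the millisecond timestamp from the top 48 bits."""
    return key.int >> 80


def is_expired(key, retention_ms, now_ms=None):
    """True if the key is older than the idempotency key retention period."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    return now_ms - timestamp_ms(key) > retention_ms
```

A consumer would check `is_expired(key, retention_ms)` before the duplicate lookup and reject the message outright if the key is older than anything still retained.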
amarant
12/06/2025
Huh. Interesting solution! I've always thought the only way to make an API idempotent was to not expose "adding" endpoints. That is, instead of exposing an endpoint "addvalue(n)" you would have "setvalue(n)". Any adding that might be needed is then left as an exercise for the client.
Which obviously has its own set of tradeoffs.
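To make the difference concrete, a toy sketch (the `Counter` class and method names are made up for the example):

```python
class Counter:
    def __init__(self):
        self.value = 0

    def add_value(self, n):
        # Not idempotent: every retry changes the result again.
        self.value += n

    def set_value(self, n):
        # Idempotent: repeating the call leaves the same state.
        self.value = n


added = Counter()
added.add_value(5)
added.add_value(5)  # retried request is applied twice

assigned = Counter()
assigned.set_value(5)
assigned.set_value(5)  # retried request changes nothing
```

Here `added.value` ends up at 10 and `assigned.value` at 5. One of the tradeoffs: with `set_value`, two clients that each read the value, add locally, and write back can overwrite each other's update.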
otterley
12/05/2025
This is some useful reading that's in the same vein: https://docs.aws.amazon.com/wellarchitected/latest/reliabili...
eximius
12/05/2025
These strategies only really work for stream processing. You also want idempotent APIs which won't really work with these. You'd probably go for the strategy they pass over which is having it be an arbitrary string key and just writing it down with some TTL.
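A minimal in-memory sketch of the "arbitrary string key with a TTL" strategy. `TtlIdempotencyStore` and `run_once` are hypothetical names; a real service would keep this in a durable store and would have to handle concurrent requests and crashes between running the operation and recording the key, which this sketch does not.

```python
import time


class TtlIdempotencyStore:
    """Remembers the response for each client-chosen key until it expires."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, response)

    def run_once(self, key, operation):
        now = self._clock()
        self._evict(now)
        hit = self._entries.get(key)
        if hit is not None:
            return hit[1]  # replay the stored response, skip the operation
        response = operation()
        self._entries[key] = (now + self._ttl, response)
        return response

    def _evict(self, now):
        expired = [k for k, (exp, _) in self._entries.items() if exp <= now]
        for k in expired:
            del self._entries[k]
```

A request handler would call `store.run_once(request_key, do_the_work)`: a retry within the TTL gets the original response back, and after the TTL the key is forgotten and the operation would run again.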
ekjhgkejhgk
12/05/2025
Here's what I don't understand about distributed systems: TCP works amazingly well, so why not use the same ideas? Every message increments a counter, so the receiver can tell the ordering and whether some message is missing. Why is this complicated?
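For a single sender and a single receiver, the counter idea is indeed short. A sketch with a hypothetical `SequenceReceiver` class:

```python
class SequenceReceiver:
    """Delivers payloads in sequence order, dropping duplicates."""

    def __init__(self):
        self.expected = 0   # next sequence number to deliver
        self.pending = {}   # out-of-order messages waiting for a gap to fill
        self.delivered = []

    def receive(self, seq, payload):
        if seq < self.expected or seq in self.pending:
            return "duplicate"
        self.pending[seq] = payload
        # Deliver everything that is now contiguous.
        while self.expected in self.pending:
            self.delivered.append(self.pending.pop(self.expected))
            self.expected += 1
        return "delivered" if seq < self.expected else "buffered"
```

Note that this state belongs to one sender/receiver pair and lives in memory, much like a TCP connection's sequence numbers. The sketch says nothing about what happens when either side restarts or when several producers share one counter.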
manoDev
12/05/2025
> The more messages you need to process overall, the more attractive a solution centered around monotonically increasing sequences becomes, as it allows for space-efficient duplicate detection and exclusion, no matter how many messages you have.
It should be the opposite: with more messages you want to scale with independent consumers, and a monotonic counter is a disaster for that.
You also don’t need to worry about dropping old messages if you implement your processing to respect the commutative property.
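One possible reading of "respect the commutative property", sketched under the assumption that each update carries a key and a version: if applying updates is order-independent and repeat-safe, consumers can process them in any order and replays are harmless. The `merge` function and the sample updates are made up for the example.

```python
import itertools


def merge(state, update):
    """Keep the highest-versioned value per key; order of arrival is irrelevant."""
    key, version, value = update
    current = state.get(key)
    if current is None or version > current[0]:
        state[key] = (version, value)
    return state


# The last entry is a replayed duplicate of the second one.
updates = [("a", 1, "x"), ("a", 2, "y"), ("b", 1, "z"), ("a", 2, "y")]

results = []
for ordering in itertools.permutations(updates):
    state = {}
    for update in ordering:
        merge(state, update)
    results.append(state)
```

Every ordering of the four updates, including the replayed one, ends in the same state.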
9dev
12/06/2025
What I would like to learn is how to implement arbitrary client-chosen idempotency keys for public HTTP APIs to avoid duplicate requests. Stripe does this, for example; but other than keeping a record of every single request ever received, I don’t see an elegant solution…
amelius
12/06/2025
Wait, how does this ensure that the processing is not happening zero times?
attila-lendvai
12/05/2025
does OP mean simply the identity of the message?
idempotency means something else to me.
