How to Make AI Agents Idempotent: Prevent Duplicate Actions, Double Charges, and Repeat Emails
If your AI agent retries a step and sends the same email twice, creates the same ticket twice, or charges the same customer twice, you do not have an intelligence problem.
You have a side-effect control problem.
This is one of the fastest ways a promising agent turns into an expensive liability.
A lot of teams discover this the hard way. The agent looks fine in the demo. Then production shows up with timeouts, network flakiness, partial failures, duplicate webhook deliveries, queue redeliveries, and workers that crash right after taking an action but before recording that they took it.
Now the model is not the issue. The issue is that your system cannot answer a simple question:
did this action already happen?
That is what idempotency is for.
If you are building AI agents for real workflows, idempotency is not a nice-to-have. It is how you keep retries from turning into damage.
What idempotent actually means
In plain English:
an idempotent action can be retried without creating extra side effects.
If the same request gets executed twice, the second execution should produce the same effective result as the first one.
That matters because production systems retry all the time.
- queues redeliver jobs
- HTTP clients retry on timeout
- workers crash and get restarted
- humans click buttons twice
- webhook senders deliver the same event again
- agents themselves re-attempt a tool call when the outcome is ambiguous
Without idempotency, a retry becomes a second action.
That is how you get:
- duplicate outbound emails
- duplicate CRM updates
- duplicate support tickets
- duplicate invoices
- duplicate Slack alerts
- duplicate refunds
- duplicate purchases
The more autonomy you add, the more expensive this gets.
Where agent systems usually break
Normal software already has retry problems. AI agents make them worse because they introduce more ambiguity.
A typical agent step does not just call one function. It often does something like this:
- inspect context
- choose a tool
- construct arguments
- call an external API
- wait for a response
- interpret the result
- update memory or state
- move to the next step
There are multiple places where the system can fail in the middle.
Example:
- the agent sends an email successfully
- the email provider accepts it
- your worker times out before writing the success receipt
- the runtime retries the step
- the agent sends the same email again
From the system’s perspective, the first attempt looked ambiguous. From the customer’s perspective, you just looked sloppy.
This is why production reliability is not just “did the API return 200.” It is also: can the system safely recover when the answer is unclear?
The rule: separate decisions from side effects
A useful mental model is this:
- Decision: should the agent do the thing?
- Side effect: the actual external action
- Receipt: proof that the side effect already happened
If you blur those together, retries get dangerous fast.
A safer design is:
- compute the intended action
- generate a stable idempotency key for that action
- check whether that key already has a recorded result
- if yes, return the recorded result instead of acting again
- if no, perform the action and persist the receipt
That receipt is what makes retries safe.
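The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `receipts` dict stands in for a durable store, and `run_once` and `send_welcome` are hypothetical names invented for the example.

```python
from typing import Any, Callable, Dict

# In-memory stand-in for a durable receipt store (assumption: a real system
# would use a database so receipts survive a crash).
receipts: Dict[str, Any] = {}


def run_once(key: str, action: Callable[[], Any]) -> Any:
    """Run `action` at most once per idempotency key."""
    if key in receipts:
        # A receipt already exists: return it instead of acting again.
        return receipts[key]
    result = action()       # the side effect
    receipts[key] = result  # persist the receipt
    return result


sent = []  # records every real "send" so a duplicate would be visible


def send_welcome() -> str:
    sent.append("welcome")
    return f"msg_{len(sent)}"


first = run_once("send_email:user_123:welcome_v2", send_welcome)
retry = run_once("send_email:user_123:welcome_v2", send_welcome)
print(first, retry, len(sent))  # msg_1 msg_1 1
```

The retry returns the stored receipt and the email goes out once. A crash between the action and the receipt write still leaves a gap in this sketch, which is why a pending row and an explicit ambiguous state matter.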
The simplest idempotency pattern that works
For most agent workflows, you do not need something exotic. You need a durable store and a consistent key strategy.
Use a record shaped roughly like this:
- operation_type: for example send_email or create_invoice
- target_entity: customer ID, ticket ID, order ID, thread ID
- intent_version: optional, useful if your logic changes
- idempotency_key: stable key derived from the action intent
- status: pending, succeeded, failed, ambiguous
- external_reference: provider message ID, charge ID, ticket ID
- request_payload_hash: optional sanity check
- created_at / completed_at
Then enforce this rule:
for any side-effecting action, only one successful result may exist for a given idempotency key.
If the same action comes back through the system, you do not execute it again. You return the prior receipt.
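One way to enforce that rule is to let the database do it. The sketch below uses SQLite from the Python standard library; the table name `action_ledger` and the helpers `claim` and `complete` are assumptions made for illustration. Making the key the primary key is slightly stricter than the rule as stated: it allows only one row per key, whatever its status.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path for real durability
conn.execute(
    """
    CREATE TABLE action_ledger (
        idempotency_key      TEXT PRIMARY KEY,
        operation_type       TEXT NOT NULL,
        target_entity        TEXT NOT NULL,
        intent_version       TEXT,
        status               TEXT NOT NULL
            CHECK (status IN ('pending', 'succeeded', 'failed', 'ambiguous')),
        external_reference   TEXT,
        request_payload_hash TEXT,
        created_at           TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        completed_at         TEXT
    )
    """
)


def claim(key: str, operation_type: str, target_entity: str) -> bool:
    """Try to claim a key. Returns True only for the first caller."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO action_ledger "
        "(idempotency_key, operation_type, target_entity, status) "
        "VALUES (?, ?, ?, 'pending')",
        (key, operation_type, target_entity),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 means the key was already present


def complete(key: str, external_reference: str) -> None:
    """Record the receipt once the provider confirms the action."""
    conn.execute(
        "UPDATE action_ledger SET status = 'succeeded', "
        "external_reference = ?, completed_at = CURRENT_TIMESTAMP "
        "WHERE idempotency_key = ?",
        (external_reference, key),
    )
    conn.commit()


key = "create_invoice:order_8472:final"
first_claim = claim(key, "create_invoice", "order_8472")
second_claim = claim(key, "create_invoice", "order_8472")
complete(key, "inv_001")  # "inv_001" is a made-up provider reference
row = conn.execute(
    "SELECT status, external_reference FROM action_ledger "
    "WHERE idempotency_key = ?",
    (key,),
).fetchone()
```

Only the first `claim` succeeds, so only one worker ever gets permission to perform the side effect for that key.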
How to generate a good idempotency key
A bad idempotency key is one that changes every retry.
If you generate a new UUID on every attempt, you have built a duplicate machine.
A good idempotency key is derived from the business intent of the action.
Examples:
- send_email:user_123:welcome_v2
- create_invoice:order_8472:final
- refund_payment:charge_9981:full
- post_slack_alert:incident_442:sev1-opened
The key should answer:
what exact real-world action is this supposed to represent?
Not “which attempt is this?” Not “which worker handled it?” Not “what time did it run?”
Those are retry details. They are not the intent.
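As a rough sketch, a key builder can take only the parts that describe intent and nothing about the attempt. The function name `idempotency_key` and its three-part shape are assumptions modeled on the example keys above, not a standard.

```python
import uuid


def idempotency_key(operation: str, target: str, variant: str) -> str:
    """Build a key from business intent only: what, for whom, which variant."""
    parts = [operation, target, variant]
    for part in parts:
        # Reject empty parts and stray separators so keys stay unambiguous.
        if not part or ":" in part:
            raise ValueError(f"invalid key part: {part!r}")
    return ":".join(parts)


# Two attempts at the same refund produce the same key...
key_attempt_1 = idempotency_key("refund_payment", "charge_9981", "full")
key_attempt_2 = idempotency_key("refund_payment", "charge_9981", "full")

# ...while a per-attempt random ID differs every time.
random_1, random_2 = str(uuid.uuid4()), str(uuid.uuid4())

print(key_attempt_1 == key_attempt_2)  # True
print(random_1 == random_2)            # False
```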
Make ambiguous outcomes a first-class state
One of the dumbest things agent systems do is treat every non-clean outcome as a normal failure.
That is wrong.
There is a huge difference between:
- failed before side effect happened
- failed after side effect happened
- cannot prove whether side effect happened
That third state matters.
If Stripe times out after you submit a charge request, or an email provider drops the connection after accepting the send, you may not know whether the action landed.
Do not blindly retry that as if nothing happened.
Mark it ambiguous. Then do one of these:
- query the downstream system using the idempotency key or external metadata
- look for the expected external object
- escalate to human review if the action is high risk
Ambiguous is not a bug. It is a real production state. Your runtime should model it explicitly.
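Here is one way that state model might look in code. The `RejectedBeforeSend` exception and the `lookup` callable are hypothetical, and the sketch assumes the downstream lookup is authoritative, which is not true of every provider.

```python
from enum import Enum
from typing import Callable, Optional, Tuple


class Status(Enum):
    PENDING = "pending"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    AMBIGUOUS = "ambiguous"


class RejectedBeforeSend(Exception):
    """Hypothetical error: the provider refused the request, nothing happened."""


def attempt(action: Callable[[], str]) -> Tuple[Status, Optional[str]]:
    """Run a side effect and classify the outcome into one of three states."""
    try:
        reference = action()
    except RejectedBeforeSend:
        return Status.FAILED, None      # failed before the side effect
    except (TimeoutError, ConnectionError):
        return Status.AMBIGUOUS, None   # cannot prove what happened
    return Status.SUCCEEDED, reference


def resolve(lookup: Callable[[], Optional[str]]) -> Tuple[Status, Optional[str]]:
    """Resolve an ambiguous attempt by asking the downstream system.

    `lookup` returns the provider reference if the object exists, else None.
    """
    reference = lookup()
    if reference is not None:
        return Status.SUCCEEDED, reference
    return Status.FAILED, None  # nothing landed, so a retry is safe


def timed_out() -> str:
    raise TimeoutError("provider did not answer in time")


outcome = attempt(timed_out)
resolved = resolve(lambda: "ch_123")  # "ch_123" is a made-up charge ID
```

For high-risk actions, the `resolve` step is where a human review queue would plug in instead of an automatic lookup.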
The dangerous operations that always need idempotency
In practice, any operation with a real-world side effect should be treated as unsafe by default.
Especially:
- payments
- refunds
- outbound email
- SMS and voice calls
- support ticket creation
- CRM object creation
- calendar invites
- order creation
- user provisioning
- permission changes
- document e-sign requests
Reads are cheap. Writes are where your reputation gets lit on fire.
A good runtime pattern for agent tools
If you are building tool-enabled agents, do not let the model directly spray irreversible actions.
Wrap side-effecting tools with an execution layer that does four things:
1. Normalize the requested intent
Turn the model output into a structured action.
Example:
- tool: send_email
- recipient: [email protected]
- template: followup_v1
- campaign_id: lead_882
2. Compute the idempotency key before execution
Do not wait until after the tool call. The key is how you decide whether execution is allowed.
3. Check for an existing receipt
If a receipt exists and its status is succeeded, return it.
If its status is ambiguous, resolve it before proceeding.
If no receipt exists, continue.
4. Persist the result immediately
Store:
- success or failure
- provider reference ID
- normalized payload
- timestamps
- retry count
- any evidence needed for later audit
This gives you replay safety and operational receipts.
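Put together, the four steps might look something like the sketch below. Everything here is illustrative: `ledger` is an in-memory stand-in for a durable store, `normalize`, `key_for`, and `execute` are invented names, the recipient address is a placeholder, and any exception is conservatively recorded as ambiguous rather than sorted into clean failures. Retry counts are left out for brevity.

```python
import hashlib
import json
import time

ledger = {}  # stand-in for a durable store


def normalize(raw: dict) -> dict:
    """Step 1: turn loose model output into a canonical action."""
    return {
        "tool": raw["tool"].strip().lower(),
        "recipient": raw["recipient"].strip().lower(),
        "template": raw["template"].strip(),
        "campaign_id": raw["campaign_id"].strip(),
    }


def key_for(action: dict) -> str:
    """Step 2: derive the key from intent, before anything executes."""
    return f'{action["tool"]}:{action["campaign_id"]}:{action["template"]}'


def execute(raw: dict, tool) -> dict:
    action = normalize(raw)
    key = key_for(action)

    # Step 3: an existing receipt decides what happens next.
    previous = ledger.get(key)
    if previous is not None:
        if previous["status"] == "succeeded":
            return previous
        raise RuntimeError(f"{key} is {previous['status']}; resolve it first")

    # Step 4: write the row before the side effect, then update it right after.
    payload = json.dumps(action, sort_keys=True).encode()
    receipt = {
        "key": key,
        "status": "pending",
        "payload_hash": hashlib.sha256(payload).hexdigest(),
        "provider_ref": None,
        "started_at": time.time(),
    }
    ledger[key] = receipt
    try:
        receipt["provider_ref"] = tool(action)
        receipt["status"] = "succeeded"
    except Exception:
        receipt["status"] = "ambiguous"  # conservative: never assume it failed
    receipt["finished_at"] = time.time()
    return receipt


calls = []


def fake_send(action: dict) -> str:
    calls.append(action)
    return "msg_1"


raw = {"tool": "send_email", "recipient": "lead@example.com",
       "template": "followup_v1", "campaign_id": "lead_882"}
r1 = execute(raw, fake_send)
r2 = execute({**raw, "tool": " Send_Email "}, fake_send)  # sloppier model output
```

Because normalization happens before the key is computed, the second call maps to the same key despite the different casing and whitespace, and the tool runs once.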
Example: sending email safely
Unsafe pattern:
- agent decides to send email
- tool sends email
- process crashes before logging success
- job retries
- same email gets sent again
Safer pattern:
- build key like send_email:lead_882:followup_v1
- write execution row as pending
- send email with provider metadata carrying that same key if possible
- record provider message ID on success
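- if retry occurs, check receipt first
- if a succeeded receipt exists, return the previous send result instead of sending again
- if the row is still pending, treat the outcome as ambiguous and check with the provider before sending again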
Now the retry becomes harmless.
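The crash case is worth seeing end to end. The sketch below simulates a worker that dies after the provider accepts the email but before the receipt is written. `FakeProvider` and its `find_by_metadata` method are hypothetical; real providers differ in whether they let you attach a key to a message and look it up afterwards, so check before relying on this.

```python
class FakeProvider:
    """Hypothetical email provider that indexes messages by a metadata key."""

    def __init__(self):
        self.messages = {}   # metadata key -> message ID
        self.send_count = 0

    def send(self, to, template, metadata_key):
        self.send_count += 1
        message_id = f"msg_{self.send_count}"
        self.messages[metadata_key] = message_id
        return message_id

    def find_by_metadata(self, metadata_key):
        return self.messages.get(metadata_key)


rows = {}  # stand-in for the durable execution table


def send_email_once(provider, lead_id, template, to, crash_before_receipt=False):
    key = f"send_email:{lead_id}:{template}"
    row = rows.get(key)
    if row and row["status"] == "succeeded":
        return row["message_id"]
    if row and row["status"] == "pending":
        # An earlier attempt died mid-flight: ask the provider what it saw.
        existing = provider.find_by_metadata(key)
        if existing is not None:
            rows[key] = {"status": "succeeded", "message_id": existing}
            return existing
    rows[key] = {"status": "pending", "message_id": None}
    message_id = provider.send(to, template, metadata_key=key)
    if crash_before_receipt:
        raise RuntimeError("simulated crash after send, before receipt")
    rows[key] = {"status": "succeeded", "message_id": message_id}
    return message_id


provider = FakeProvider()
try:
    send_email_once(provider, "lead_882", "followup_v1", "lead@example.com",
                    crash_before_receipt=True)
except RuntimeError:
    pass  # the worker died; the row is still pending

message_id = send_email_once(provider, "lead_882", "followup_v1",
                             "lead@example.com")
print(message_id, provider.send_count)  # msg_1 1
```

The retry finds the pending row, recovers the message ID from the provider, and never sends a second email. If the provider lookup comes back empty, this sketch sends again, which is only safe when the lookup can be trusted.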
Example: creating tickets or records
This is where people get smoked by duplicates.
If your agent opens a support ticket, creates a CRM deal, or files a task in a project system, idempotency should usually be tied to the underlying event.
Examples:
- one ticket per inbound incident ID
- one CRM deal per qualified lead + offer combination
- one project task per verified exception event
If you do not bind object creation to a stable business event, retries and duplicate inbound events will quietly fill your systems with junk.
Then your team starts blaming the agent when the real problem is weak execution controls.
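A tiny sketch of binding creation to the business event rather than to the delivery. The event shape, the `delivery` field, and the ticket ID format are all made up for illustration.

```python
tickets_by_key = {}  # stand-in for a durable lookup keyed by idempotency key


def open_ticket_for(event: dict) -> str:
    """Create at most one ticket per incident, however often it is delivered."""
    # The key uses the incident ID, not the delivery number or a timestamp.
    key = f"create_ticket:{event['incident_id']}"
    if key not in tickets_by_key:
        tickets_by_key[key] = f"TICKET-{len(tickets_by_key) + 1}"
    return tickets_by_key[key]


events = [
    {"incident_id": "incident_442", "delivery": 1},
    {"incident_id": "incident_442", "delivery": 2},  # webhook redelivery
    {"incident_id": "incident_443", "delivery": 1},
]
ticket_ids = [open_ticket_for(event) for event in events]
print(ticket_ids)  # ['TICKET-1', 'TICKET-1', 'TICKET-2']
```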
What not to do
A few common mistakes:
“We only retry on errors, so we’re fine.”
No. Errors are exactly where idempotency matters.
“The provider probably handles duplicates.”
Sometimes they do. Sometimes they don’t. Sometimes only if you pass the right header. Sometimes only for a short retention window.
Relying on vibes from downstream systems is not a control strategy.
“We log after the action succeeds.”
Too late. If the process dies in between, your retry logic is blind.
“We use a random request ID.”
That tracks attempts, not intent.
“The model won’t repeat itself.”
That is adorable.
The minimum viable stack
If you want the practical version, start here:
- Classify tools as read or write
  - Read tools can usually retry freely.
  - Write tools must be idempotent or guarded.
- Create a durable action ledger
  - SQLite, Postgres, whatever. Just make it durable and queryable.
- Generate stable idempotency keys from business intent
  - Not per attempt.
- Store receipts for every side effect
  - Provider IDs, timestamps, payload hashes, outcome state.
- Add explicit ambiguous-state handling
  - Do not treat uncertainty as permission to act again.
- Require human review for high-risk ambiguous writes
  - Especially money movement, permissions, and customer-facing actions.
That will get you most of the benefit fast.
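The first item, classifying tools, can be enforced at registration time so a write tool cannot exist without a key strategy. A minimal sketch, assuming a hypothetical `tool` decorator and `TOOLS` registry:

```python
TOOLS = {}


def tool(kind, key_fn=None):
    """Register a tool as 'read' or 'write'. Write tools must supply key_fn."""
    if kind not in ("read", "write"):
        raise ValueError(f"unknown tool kind: {kind!r}")
    if kind == "write" and key_fn is None:
        raise ValueError("write tools must declare how their key is derived")

    def register(fn):
        TOOLS[fn.__name__] = {"fn": fn, "kind": kind, "key_fn": key_fn}
        return fn

    return register


@tool("read")
def get_order(order_id):
    return {"order_id": order_id}


@tool("write", key_fn=lambda args: f"create_invoice:{args['order_id']}:final")
def create_invoice(order_id):
    return f"inv_for_{order_id}"


invoice_key = TOOLS["create_invoice"]["key_fn"]({"order_id": "order_8472"})
print(invoice_key)  # create_invoice:order_8472:final
```

An execution layer can then look at `kind` to decide whether a call may retry freely or has to go through the ledger first.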
Why this matters commercially
Idempotency sounds technical, but it is not just an engineering hygiene topic. It is a trust topic.
Buyers do not care that your agent used a fancy planner or a new model if it sends duplicate invoices, spams customers, or creates cleanup work for humans.
The real commercial promise of AI agents is not “look how autonomous this is.”
It is:
- fewer expensive mistakes
- lower human cleanup load
- safer retries
- predictable operations
- auditability when something goes weird
That is what makes agents deployable. And deployable is what gets paid.
Final take
If your agent can retry but cannot prove whether it already acted, it is not production-ready.
You do not fix that with a smarter prompt. You fix it with better execution design.
Make the intent stable. Give the action a key. Persist the receipt. Treat ambiguity like a real state.
That is how you keep “autonomous” from becoming “expensive twice.”
If you want help designing agent systems that are safe enough to ship and boring enough to trust, talk to Erik MacKinnon.