How to Make AI Agents Idempotent: Prevent Duplicate Actions, Double Charges, and Repeat Emails

If your AI agent retries a step and sends the same email twice, creates the same ticket twice, or charges the same customer twice, you do not have an intelligence problem.

You have a side-effect control problem.

This is one of the fastest ways a promising agent turns into an expensive liability.

A lot of teams discover this the hard way. The agent looks fine in the demo. Then production shows up with timeouts, network flakiness, partial failures, duplicate webhook deliveries, queue redeliveries, and workers that crash right after taking an action but before recording that they took it.

Now the model is not the issue. The issue is that your system cannot answer a simple question:

Did this action already happen?

That is what idempotency is for.

If you are building AI agents for real workflows, idempotency is not a nice-to-have. It is how you keep retries from turning into damage.

What idempotent actually means

In plain English:

An idempotent action can be retried without creating extra side effects.

If the same request gets executed twice, the second execution should produce the same effective result as the first one.

That matters because production systems retry all the time.

queues redeliver jobs
HTTP clients retry on timeout
workers crash and get restarted
humans click buttons twice
webhook senders deliver the same event again
agents themselves re-attempt a tool call when the outcome is ambiguous

Without idempotency, a retry becomes a second action.

That is how you get:

duplicate outbound emails
duplicate CRM updates
duplicate support tickets
duplicate invoices
duplicate Slack alerts
duplicate refunds
duplicate purchases

The more autonomy you add, the more expensive this gets.

Where agent systems usually break

Normal software already has retry problems. AI agents make them worse because they introduce more ambiguity.

A typical agent step does not just call one function. It often does something like this:

inspect context
choose a tool
construct arguments
call an external API
wait for a response
interpret the result
update memory or state
move to the next step

There are multiple places where the system can fail in the middle.

Example:

the agent sends an email successfully
the email provider accepts it
your worker times out before writing the success receipt
the runtime retries the step
the agent sends the same email again

From the system’s perspective, the first attempt looked ambiguous. From the customer’s perspective, you just looked sloppy.

This is why production reliability is not just “did the API return 200.” It is also: can the system safely recover when the answer is unclear?

The rule: separate decisions from side effects

A useful mental model is this:

Decision: should the agent do the thing?
Side effect: the actual external action
Receipt: proof that the side effect already happened

If you blur those together, retries get dangerous fast.

A safer design is:

compute the intended action
generate a stable idempotency key for that action
check whether that key already has a recorded result
if yes, return the recorded result instead of acting again
if no, perform the action and persist the receipt

That receipt is what makes retries safe.
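
The steps above can be sketched in a few lines. This is a minimal illustration, not a production design: `run_once`, `receipts`, and `send_welcome` are hypothetical names, and the in-memory dict stands in for the durable store a real system would need.

```python
from typing import Callable, Dict

# Hypothetical in-memory receipt store; a real system needs durable storage.
receipts: Dict[str, str] = {}


def run_once(key: str, action: Callable[[], str]) -> str:
    """Perform `action` at most once per idempotency key."""
    if key in receipts:
        # A receipt exists: return it instead of acting again.
        return receipts[key]
    result = action()       # the side effect happens here
    receipts[key] = result  # persist the receipt
    return result


sent = []  # stands in for the outside world


def send_welcome() -> str:
    sent.append("welcome")
    return f"msg_{len(sent)}"


first = run_once("send_email:user_123:welcome_v2", send_welcome)
retry = run_once("send_email:user_123:welcome_v2", send_welcome)
print(first, retry, len(sent))  # msg_1 msg_1 1
```

The retry returns the first receipt and the email goes out once. Note that this sketch does not cover a crash between acting and recording, which is why ambiguous outcomes get their own state later in this article.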

The simplest idempotency pattern that works

For most agent workflows, you do not need something exotic. You need a durable store and a consistent key strategy.

Use a record shaped roughly like this:

operation_type — for example send_email or create_invoice
target_entity — customer ID, ticket ID, order ID, thread ID
intent_version — optional, useful if your logic changes
idempotency_key — stable key derived from the action intent
status — pending, succeeded, failed, ambiguous
external_reference — provider message ID, charge ID, ticket ID
request_payload_hash — optional sanity check
created_at / completed_at

Then enforce this rule:

For any side-effecting action, only one successful result may exist for a given idempotency key.

If the same action comes back through the system, you do not execute it again. You return the prior receipt.
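
One way to enforce that rule is to let the database do it. The sketch below uses SQLite from the Python standard library; the table name `action_ledger` and the helpers `claim` and `mark_succeeded` are assumptions made for illustration. Making the key the primary key is slightly stricter than the rule requires, since it allows only one row per key in any state.

```python
import sqlite3

# In-memory database for the example; use a file or a server database in practice.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE action_ledger (
        idempotency_key      TEXT PRIMARY KEY,  -- uniqueness enforced by the database
        operation_type       TEXT NOT NULL,
        target_entity        TEXT NOT NULL,
        intent_version       TEXT,
        status               TEXT NOT NULL
            CHECK (status IN ('pending', 'succeeded', 'failed', 'ambiguous')),
        external_reference   TEXT,
        request_payload_hash TEXT,
        created_at           TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        completed_at         TEXT
    )
""")


def claim(key: str, operation_type: str, target_entity: str) -> bool:
    """Reserve the key as pending. Returns False if a row already exists for it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO action_ledger "
                "(idempotency_key, operation_type, target_entity, status) "
                "VALUES (?, ?, ?, 'pending')",
                (key, operation_type, target_entity),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def mark_succeeded(key: str, external_reference: str) -> None:
    """Record the receipt once the side effect is confirmed."""
    with conn:
        conn.execute(
            "UPDATE action_ledger SET status = 'succeeded', "
            "external_reference = ?, completed_at = CURRENT_TIMESTAMP "
            "WHERE idempotency_key = ?",
            (external_reference, key),
        )


key = "create_invoice:order_8472:final"
first_claim = claim(key, "create_invoice", "order_8472")
mark_succeeded(key, "inv_001")  # "inv_001" is a made-up provider ID
second_claim = claim(key, "create_invoice", "order_8472")
row = conn.execute(
    "SELECT status, external_reference FROM action_ledger WHERE idempotency_key = ?",
    (key,),
).fetchone()
print(first_claim, second_claim, row)
```

The second claim is rejected by the constraint, so a retry has to read the existing row rather than act again.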

How to generate a good idempotency key

A bad idempotency key is one that changes on every retry.

If you generate a new UUID on every attempt, you have built a duplicate machine.

A good idempotency key is derived from the business intent of the action.

Examples:

send_email:user_123:welcome_v2
create_invoice:order_8472:final
refund_payment:charge_9981:full
post_slack_alert:incident_442:sev1-opened

The key should answer:

what exact real-world action is this supposed to represent?

Not “which attempt is this?” Not “which worker handled it?” Not “what time did it run?”

Those are retry details. They are not the intent.
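
A small sketch of the difference, assuming keys are built by joining the operation, the target, and a variant with colons as in the examples above. The helper names `idempotency_key` and `payload_hash` are hypothetical, and the scheme assumes the components themselves never contain a colon.

```python
import hashlib
import json
import uuid


def idempotency_key(operation: str, target: str, variant: str) -> str:
    """Derive the key from business intent only: no timestamps, workers, or attempt counters."""
    return f"{operation}:{target}:{variant}"


def payload_hash(payload: dict) -> str:
    """Stable hash of the request payload, usable as an optional sanity check."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


attempt_1 = idempotency_key("refund_payment", "charge_9981", "full")
attempt_2 = idempotency_key("refund_payment", "charge_9981", "full")
print(attempt_1)               # refund_payment:charge_9981:full
print(attempt_1 == attempt_2)  # True: a retry maps to the same key

random_1, random_2 = uuid.uuid4(), uuid.uuid4()
print(random_1 == random_2)    # False: a fresh UUID per attempt never matches

same_hash = payload_hash({"amount": 500, "currency": "usd"}) == payload_hash(
    {"currency": "usd", "amount": 500}
)
print(same_hash)               # True: dict key order does not change the hash
```

The payload hash is useful for catching the case where the same key arrives with a different payload, which usually signals a bug in key derivation.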

Make ambiguous outcomes a first-class state

One of the dumbest things agent systems do is treat every non-clean outcome as a normal failure.

That is wrong.

There is a huge difference between:

failed before side effect happened
failed after side effect happened
cannot prove whether side effect happened

That third state matters.

If Stripe times out after you submit a charge request, or an email provider drops the connection after accepting the send, you may not know whether the action landed.

Do not blindly retry that as if nothing happened.

Mark it ambiguous. Then do one of these:

query the downstream system using the idempotency key or external metadata
look for the expected external object
escalate to human review if the action is high risk

Ambiguous is not a bug. It is a real production state. Your runtime should model it explicitly.
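
Here is one hedged way to model that. Everything in this sketch is an assumption made for illustration: the `Status` enum mirrors the statuses listed earlier, `lookup` stands for whatever query the downstream system offers, and the policy of allowing a retry for low-risk actions when the lookup finds nothing is a choice, not a rule.

```python
from enum import Enum
from typing import Callable, Optional, Tuple


class Status(Enum):
    PENDING = "pending"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    AMBIGUOUS = "ambiguous"


class NeedsHumanReview(Exception):
    """Raised when a high-risk action cannot be proven either way."""


def classify_failure(request_was_sent: bool) -> Status:
    """A failure before the request left is clean; anything after it is unknown."""
    return Status.AMBIGUOUS if request_was_sent else Status.FAILED


def resolve_ambiguous(
    key: str,
    lookup: Callable[[str], Optional[str]],
    high_risk: bool,
) -> Tuple[Status, Optional[str]]:
    """`lookup` returns the external reference if the downstream system has one for `key`."""
    reference = lookup(key)
    if reference is not None:
        return Status.SUCCEEDED, reference  # the action landed; record the receipt
    if high_risk:
        raise NeedsHumanReview(key)         # do not guess with money or permissions
    return Status.FAILED, None              # assumed policy: low-risk actions may retry


# Hypothetical downstream state, keyed by idempotency key.
downstream = {"refund_payment:charge_9981:full": "refund_001"}

found = resolve_ambiguous("refund_payment:charge_9981:full", downstream.get, high_risk=True)
low_risk = resolve_ambiguous(
    "post_slack_alert:incident_442:sev1-opened", downstream.get, high_risk=False
)
try:
    resolve_ambiguous("refund_payment:charge_0000:full", downstream.get, high_risk=True)
    escalated = False
except NeedsHumanReview:
    escalated = True

print(found, low_risk, escalated)
```

The point of the separate exception is that "could not prove it" never silently turns into "try again" for the actions that hurt most.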

The dangerous operations that always need idempotency

In practice, any operation with a real-world side effect should be treated as unsafe to retry by default.

Especially:

payments
refunds
outbound email
SMS and voice calls
support ticket creation
CRM object creation
calendar invites
order creation
user provisioning
permission changes
document e-sign requests

Reads are cheap. Writes are where your reputation gets lit on fire.

A good runtime pattern for agent tools

If you are building tool-enabled agents, do not let the model directly spray irreversible actions.

Wrap side-effecting tools with an execution layer that does four things:

1. Normalize the requested intent

Turn the model output into a structured action.

Example:

tool: send_email
recipient: [email protected]
template: followup_v1
campaign_id: lead_882

2. Compute the idempotency key before execution

Do not wait until after the tool call. The key is how you decide whether execution is allowed.

3. Check for an existing receipt

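If a receipt exists and its status is succeeded, return it. If the status is ambiguous, resolve it before proceeding. A receipt still marked pending from an earlier attempt deserves the same caution, because that attempt may have acted before it died. If no receipt exists, continue.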

4. Persist the result immediately

Store:

success or failure
provider reference ID
normalized payload
timestamps
retry count
any evidence needed for later audit

This gives you replay safety and operational receipts.
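
The first three steps can be sketched as a small gate in front of the tool. The names `SendEmail`, `normalize`, and `gate` are hypothetical, as is the sample address. Treating a failed receipt as safe to re-run is an assumption that only holds if "failed" means the side effect was confirmed not to have happened.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SendEmail:
    """Structured form of the model's requested action."""
    recipient: str
    template: str
    campaign_id: str

    @property
    def key(self) -> str:
        # Step 2: the key is derived from intent before anything is executed.
        return f"send_email:{self.campaign_id}:{self.template}"


def normalize(raw: dict) -> SendEmail:
    """Step 1: turn loosely formatted model output into a canonical action."""
    return SendEmail(
        recipient=raw["recipient"].strip().lower(),
        template=raw["template"].strip(),
        campaign_id=raw["campaign_id"].strip(),
    )


def gate(receipts: Dict[str, dict], key: str) -> str:
    """Step 3: decide what the runtime may do for this key."""
    receipt = receipts.get(key)
    if receipt is None:
        return "execute"
    if receipt["status"] == "succeeded":
        return "return_receipt"
    if receipt["status"] == "failed":
        return "execute"    # assumption: failed means confirmed not to have happened
    return "resolve_first"  # ambiguous, or pending from an earlier attempt


raw = {"recipient": "  Lead@Example.com ", "template": "followup_v1", "campaign_id": "lead_882"}
action = normalize(raw)
print(action.key)                                                   # send_email:lead_882:followup_v1
print(gate({}, action.key))                                         # execute
print(gate({action.key: {"status": "succeeded"}}, action.key))      # return_receipt
print(gate({action.key: {"status": "ambiguous"}}, action.key))      # resolve_first
```

Normalizing first matters because two model outputs that differ only in whitespace or casing should map to the same key.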

Example: sending email safely

Unsafe pattern:

agent decides to send email
tool sends email
process crashes before logging success
job retries
same email gets sent again

Safer pattern:

build key like send_email:lead_882:followup_v1
write execution row as pending
send email with provider metadata carrying that same key if possible
record provider message ID on success
if retry occurs, check receipt first
if receipt exists, return the previous send result instead of sending again

Now the retry becomes harmless.
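
To make the crash case concrete, here is a simulation of the safer pattern. `FakeProvider` is a stand-in invented for this example, not a real email API, and its `find_by_metadata` lookup assumes the provider lets you search by the metadata you attached. The `crash_before_record` flag exists only to simulate a worker dying after the send.

```python
from typing import Dict, List, Optional, Tuple


class FakeProvider:
    """Stand-in for an email API that stores caller metadata with each message."""

    def __init__(self) -> None:
        self.sent: List[Tuple[str, str]] = []  # (message_id, metadata_key)

    def send(self, to: str, template: str, metadata_key: str) -> str:
        message_id = f"msg_{len(self.sent) + 1}"
        self.sent.append((message_id, metadata_key))
        return message_id

    def find_by_metadata(self, metadata_key: str) -> Optional[str]:
        for message_id, key in self.sent:
            if key == metadata_key:
                return message_id
        return None


ledger: Dict[str, dict] = {}  # in-memory for the example; must be durable in practice


def send_email_once(provider: FakeProvider, to: str, template: str, key: str,
                    crash_before_record: bool = False) -> str:
    row = ledger.get(key)
    if row and row["status"] == "succeeded":
        return row["message_id"]
    if row and row["status"] == "pending":
        # An earlier attempt may have sent the email before dying: ask the provider.
        existing = provider.find_by_metadata(key)
        if existing is not None:
            ledger[key] = {"status": "succeeded", "message_id": existing}
            return existing
    ledger[key] = {"status": "pending", "message_id": None}  # written before the send
    message_id = provider.send(to, template, metadata_key=key)
    if crash_before_record:
        raise RuntimeError("worker died after send, before recording success")
    ledger[key] = {"status": "succeeded", "message_id": message_id}
    return message_id


provider = FakeProvider()
key = "send_email:lead_882:followup_v1"
try:
    send_email_once(provider, "lead@example.com", "followup_v1", key, crash_before_record=True)
except RuntimeError:
    pass  # the job runner would now retry the step

retry_result = send_email_once(provider, "lead@example.com", "followup_v1", key)
print(retry_result, len(provider.sent))  # msg_1 1
```

The pending row is what tells the retry that something may already have happened. Without a provider-side lookup, the honest fallback for a pending row is to mark it ambiguous rather than send again.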

Example: creating tickets or records

This is where people get smoked by duplicates.

If your agent opens a support ticket, creates a CRM deal, or files a task in a project system, idempotency should usually be tied to the underlying event.

Examples:

one ticket per inbound incident ID
one CRM deal per qualified lead + offer combination
one project task per verified exception event

If you do not bind object creation to a stable business event, retries and duplicate inbound events will quietly fill your systems with junk.

Then your team starts blaming the agent when the real problem is weak execution controls.
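
As a sketch of binding creation to the underlying event, the example below keeps a local table with a uniqueness constraint on the incident ID, so duplicate inbound events collapse into one ticket. The table and the `ticket_for_incident` helper are assumptions for illustration; in a real integration the row would also hold the ID of the ticket created in the external system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory for the example
conn.execute("""
    CREATE TABLE tickets (
        id          INTEGER PRIMARY KEY,
        incident_id TEXT NOT NULL UNIQUE,  -- one ticket per inbound incident
        summary     TEXT NOT NULL
    )
""")


def ticket_for_incident(incident_id: str, summary: str) -> int:
    """Create the ticket if this incident has none yet, then return its ID either way."""
    with conn:
        # INSERT OR IGNORE skips the insert when the UNIQUE constraint would be violated.
        conn.execute(
            "INSERT OR IGNORE INTO tickets (incident_id, summary) VALUES (?, ?)",
            (incident_id, summary),
        )
    row = conn.execute(
        "SELECT id FROM tickets WHERE incident_id = ?", (incident_id,)
    ).fetchone()
    return row[0]


# The same incident arrives three times, as it would with webhook redelivery.
events = ["incident_442", "incident_442", "incident_443", "incident_442"]
ticket_ids = [ticket_for_incident(e, f"Opened from {e}") for e in events]
ticket_count = conn.execute("SELECT COUNT(*) FROM tickets").fetchone()[0]
print(ticket_ids, ticket_count)
```

Four deliveries produce two tickets, and every duplicate delivery gets back the ID of the ticket that already exists.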

What not to do

A few common mistakes:

“We only retry on errors, so we’re fine.”

No. Errors are exactly where idempotency matters.

“The provider probably handles duplicates.”

Sometimes they do. Sometimes they don’t. Sometimes only if you pass the right header. Sometimes only for a short retention window.

Relying on vibes from downstream systems is not a control strategy.

“We log after the action succeeds.”

Too late. If the process dies between the action and the log write, your retry logic is blind.

“We use a random request ID.”

That tracks attempts, not intent.

“The model won’t repeat itself.”

That is adorable.

The minimum viable stack

If you want the practical version, start here:

1. Classify tools as read or write
   - Read tools can usually retry freely.
   - Write tools must be idempotent or guarded.
2. Create a durable action ledger
   - SQLite, Postgres, whatever. Just make it durable and queryable.
3. Generate stable idempotency keys from business intent
   - Not per attempt.
4. Store receipts for every side effect
   - Provider IDs, timestamps, payload hashes, outcome state.
5. Add explicit ambiguous-state handling
   - Do not treat uncertainty as permission to act again.
6. Require human review for high-risk ambiguous writes
   - Especially money movement, permissions, and customer-facing actions.

That will get you most of the benefit fast.
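
The read/write classification can be enforced in code rather than by convention. The sketch below is one possible shape, with a hypothetical `tool` decorator and `call_tool` dispatcher: read tools run freely, while write tools are refused unless the caller supplies an idempotency key. The in-memory `receipts` dict again stands in for a durable ledger.

```python
from typing import Callable, Dict, Optional

TOOLS: Dict[str, dict] = {}
receipts: Dict[str, object] = {}  # stand-in for the durable action ledger


def tool(name: str, kind: str) -> Callable:
    """Register a tool and force an explicit read/write classification."""
    if kind not in ("read", "write"):
        raise ValueError(f"kind must be 'read' or 'write', got {kind!r}")

    def register(fn: Callable) -> Callable:
        TOOLS[name] = {"fn": fn, "kind": kind}
        return fn

    return register


def call_tool(name: str, args: dict, idempotency_key: Optional[str] = None):
    entry = TOOLS[name]
    if entry["kind"] == "read":
        return entry["fn"](**args)  # reads can retry freely
    if idempotency_key is None:
        raise ValueError(f"write tool {name!r} requires an idempotency key")
    if idempotency_key in receipts:
        return receipts[idempotency_key]  # guarded: return the prior receipt
    result = entry["fn"](**args)
    receipts[idempotency_key] = result
    return result


invoices_created = []


@tool("get_lead", kind="read")
def get_lead(lead_id: str) -> dict:
    return {"lead_id": lead_id}


@tool("create_invoice", kind="write")
def create_invoice(order_id: str) -> str:
    invoices_created.append(order_id)
    return f"inv_{len(invoices_created)}"


lead = call_tool("get_lead", {"lead_id": "lead_882"})
first = call_tool("create_invoice", {"order_id": "order_8472"},
                  idempotency_key="create_invoice:order_8472:final")
again = call_tool("create_invoice", {"order_id": "order_8472"},
                  idempotency_key="create_invoice:order_8472:final")
try:
    call_tool("create_invoice", {"order_id": "order_8472"})
    rejected = False
except ValueError:
    rejected = True

print(first, again, len(invoices_created), rejected)  # inv_1 inv_1 1 True
```

Refusing keyless writes at the dispatcher means a new tool cannot reach production without someone deciding what its business intent key is.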

Why this matters commercially

Idempotency sounds technical, but it is not just an engineering hygiene topic. It is a trust topic.

Buyers do not care that your agent used a fancy planner or a new model if it sends duplicate invoices, spams customers, or creates cleanup work for humans.

The real commercial promise of AI agents is not “look how autonomous this is.”

It is:

fewer expensive mistakes
lower human cleanup load
safer retries
predictable operations
auditability when something goes weird

That is what makes agents deployable. And deployable is what gets paid.

Final take

If your agent can retry but cannot prove whether it already acted, it is not production-ready.

You do not fix that with a smarter prompt. You fix it with better execution design.

Make the intent stable. Give the action a key. Persist the receipt. Treat ambiguity like a real state.

That is how you keep “autonomous” from becoming “expensive twice.”

If you want help designing agent systems that are safe enough to ship and boring enough to trust, talk to Erik MacKinnon.