When an AI agent breaks in production, the worst move is to treat it like a vague model problem.

Usually it isn’t.

Production incidents with agents cut across prompts, retrieval, tools, queues, APIs, permissions, validation, and runtime state. One bad output can be a model issue, a memory issue, a tool issue, or a control-flow issue masquerading as an intelligence failure.

That’s why you need a runbook.

Not “let’s stare at the prompt for an hour.” A real operator runbook: what to do in the first five minutes, what to preserve, when to kill the agent, how to contain damage, and how to make sure the same failure does not return next week wearing a new hat.

If you are deploying AI agents in production in 2026, this is the practical version.

What counts as an incident?#

An agent incident is any production behavior that causes one of these outcomes:

  • wrong external action
  • dangerous external action
  • repeated failed runs
  • quality collapse at scale
  • cost spike
  • data leakage risk
  • broken business-critical workflow
  • silent failure where the agent looks alive but stops doing useful work

That last one matters.

Traditional software often fails loudly. Agents often fail plausibly. The runtime is healthy. The logs are moving. The payload is formatted correctly. But the work quality is dead.

That is still an incident.

The first five minutes: contain first, explain second#

In the first five minutes, your job is not root-cause analysis.

Your job is blast-radius control.

1. Stop new damage#

Ask this immediately:

Can the agent still take external action right now?

If yes, and the incident touches money, customer communication, records, or permissions, hit the kill switch.

That can mean:

  • pause the worker
  • disable the scheduler
  • revoke write tokens
  • turn off outbound delivery
  • force human approval mode

Do not leave a misbehaving agent running because you want better evidence. That is how one bad run becomes fifty.
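One way to make all of those containment moves a single action is to route every side-effecting call through one mode check. A minimal sketch, assuming a shared mode flag (the `AgentMode` names and the in-process global are illustrative; in production the flag would live in Redis or a feature-flag service so every worker sees it):

```python
from enum import Enum

class AgentMode(Enum):
    LIVE = "live"            # full autonomy: may take external actions
    APPROVAL = "approval"    # outputs queue for human sign-off
    KILLED = "killed"        # no external actions at all

# Hypothetical in-process switch; replace with a shared store in production.
_mode = AgentMode.LIVE

def set_mode(mode: AgentMode) -> None:
    global _mode
    _mode = mode

def guard_external_action(action_name: str) -> None:
    """Raise before any side-effecting call unless the agent is LIVE."""
    if _mode is AgentMode.KILLED:
        raise RuntimeError(f"kill switch active, refusing {action_name}")
    if _mode is AgentMode.APPROVAL:
        raise RuntimeError(f"{action_name} requires human approval")
```

The point is that "hit the kill switch" becomes one `set_mode(AgentMode.KILLED)` call rather than five separate toggles hunted down under pressure.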

2. Freeze the current version#

Before anyone starts “fixing” things, capture:

  • current prompt version
  • model and routing settings
  • deployed commit hash
  • active environment flags
  • changed tool/API versions if relevant

If you change the system before capturing this, you just damaged the crime scene.
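Capturing that state can be a single function run before anyone touches anything. A sketch, with field names as assumptions to be wired to your own config source:

```python
import json
import time

def capture_system_snapshot(prompt_version: str, model_routing: str,
                            commit: str, env_flags: dict,
                            tool_versions: dict) -> str:
    """Serialize the exact system state at detection time, before any fix."""
    snapshot = {
        "captured_at": time.time(),
        "prompt_version": prompt_version,
        "model_and_routing": model_routing,
        "commit": commit,
        "env_flags": env_flags,
        "tool_versions": tool_versions,
    }
    # Append-only: write this line somewhere nobody edits in place.
    return json.dumps(snapshot, sort_keys=True)
```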

3. Open an incident record#

Even if you are a team of one, write down:

  • incident ID
  • time detected
  • who detected it
  • affected workflow
  • current impact
  • containment status

This prevents the classic problem where everyone remembers a different story six hours later.
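Even a one-person incident record benefits from a fixed shape. A minimal sketch (field names follow the list above; nothing here is prescribed):

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone

@dataclass
class IncidentRecord:
    incident_id: str
    detected_by: str
    affected_workflow: str
    current_impact: str
    containment: str = "none"   # updated as containment actions land
    detected_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())
```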

The first hour: collect evidence in a fixed order#

Most agent teams start with the prompt because it feels like the “AI part.” That’s backwards.

Collect evidence in this order.

1. Save the failed run receipt#

For the bad run, capture:

  • trigger
  • input payload
  • retrieved context/memory
  • selected model
  • tool calls made
  • tool outputs returned
  • final output
  • validation result
  • latency
  • token usage / estimated cost
  • run ID / trace ID

If your system does not already store this, fix that before the next incident.

Without a run receipt, debugging becomes AI ghost hunting.
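The receipt list above maps naturally to one structured record per run. A sketch of what "fix that" could look like, with field names matching the list (the schema is an assumption, not a standard):

```python
from dataclasses import asdict, dataclass
import json

@dataclass
class RunReceipt:
    run_id: str                 # also serves as trace ID here
    trigger: str
    input_payload: dict
    retrieved_context: list
    model: str
    tool_calls: list            # e.g. [{"tool": ..., "args": ..., "output": ...}]
    final_output: str
    validation_passed: bool
    latency_ms: float
    token_usage: int
    estimated_cost_usd: float

def persist_receipt(receipt: RunReceipt) -> str:
    """One JSON line per run; swap in your own log sink."""
    return json.dumps(asdict(receipt))
```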

2. Scope the blast radius#

Figure out how wide the problem is.

Questions:

  • one run or many?
  • one workflow or all workflows?
  • one customer or many?
  • one model route or all routes?
  • one tool integration or multiple?
  • one deploy version or preexisting?

This tells you whether you are dealing with an isolated bad input, a broken dependency, a bad deploy, or systemic drift.
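If run receipts carry scoping tags, the blast-radius questions reduce to counting failures along each dimension. A sketch, assuming receipts are dicts tagged with `workflow`, `customer`, `model_route`, and `deploy_version` (those keys are illustrative):

```python
from collections import Counter

def blast_radius(failed_runs: list[dict]) -> dict:
    """Count failures along each scoping dimension from the checklist."""
    radius = {}
    for dim in ("workflow", "customer", "model_route", "deploy_version"):
        radius[dim] = Counter(r.get(dim, "unknown") for r in failed_runs)
    return radius
```

A dimension with one dominant value points at an isolated cause; a flat spread across all values points at something systemic.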

3. Check the five failure layers#

You can debug most agent incidents by walking these layers in order:

Layer 1: input#

Was the incoming task malformed, incomplete, contradictory, or unexpectedly shaped?

Layer 2: retrieval/memory#

Did the agent receive stale, irrelevant, missing, or duplicated context?

Layer 3: tools#

Did a tool fail, time out, return partial data, or return success-shaped garbage?

Layer 4: control flow#

Did retries, branching, approvals, or queue state send the run down the wrong path?

Layer 5: output validation#

Did the agent produce a bad output that should have been blocked before delivery?

This is the fastest way to stop blaming “the model” for infrastructure mistakes.
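The five-layer walk can be encoded as an ordered checklist over a run receipt. A sketch where every predicate is a placeholder to be replaced with checks against your own schema:

```python
def walk_failure_layers(run: dict) -> list[str]:
    """Flag suspect layers in the order above. Predicates are examples."""
    checks = [
        ("input",        lambda r: r.get("input_malformed", False)),
        ("retrieval",    lambda r: not r.get("retrieved_context")),
        ("tools",        lambda r: any(c.get("error")
                                       for c in r.get("tool_calls", []))),
        ("control_flow", lambda r: r.get("retries", 0) > 3),
        ("validation",   lambda r: not r.get("validation_passed", True)),
    ]
    return [layer for layer, suspect in checks if suspect(run)]
```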

When to kill the agent versus degrade gracefully#

Not every incident requires a full stop.

Full stop if:#

  • it can send harmful outbound messages
  • it can mutate customer or financial records incorrectly
  • there is any chance of data leakage
  • cost is running away because of loops or retries
  • approvals or guardrails are being bypassed
  • you do not understand the blast radius yet

Degrade gracefully if:#

  • the agent can safely switch to draft-only mode
  • outputs can queue for human review
  • a broken tool can be disabled without breaking safety
  • the workflow can fall back to read-only behavior

A good production system should have a “brain injured but harmless” mode where it can still gather context or create drafts without being allowed to execute.
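One way to express that mode: every side-effecting action passes through a dispatcher that, when degraded, downgrades execution to a queued draft. A sketch (the action shape and status strings are assumptions):

```python
def dispatch_action(action: dict, draft_only: bool) -> dict:
    """In 'brain injured but harmless' mode, side-effecting actions are
    queued as drafts for human review instead of executed."""
    if draft_only:
        return {"status": "queued_for_review", "draft": action}
    return {"status": "executed", "action": action}  # real dispatch goes here
```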

Customer communication: boring, fast, factual#

If customers are affected, do not wait for a perfect explanation.

Send a short update covering:

  • what is affected
  • what is not affected
  • whether you paused the workflow
  • when the next update will come

Do not pretend nothing happened. Do not over-explain an unverified root cause.

The right tone is calm and specific.

Example:

We identified an issue affecting automated agent-driven updates for some workflows. Outbound actions have been paused while we verify scope and restore safe operation. We’ll provide the next update within 60 minutes.

That buys time without lying.

The rollback decision#

If the incident started after a deploy, rollback should be your default unless you have hard evidence that rollback will not help.

Rollback candidates:

  • prompt version
  • orchestration logic
  • retrieval settings
  • tool wrapper changes
  • model routing changes
  • validator changes

Do not combine incident response with opportunistic cleanup. Roll back first. Stabilize. Then patch.

Preserve an evidence pack#

For every meaningful incident, save:

  • incident summary
  • affected run IDs
  • relevant logs
  • input/output examples
  • tool responses
  • screenshots or message links if externally visible
  • deployed commit hash
  • prompt/version identifiers
  • containment actions taken
  • recovery decision

If you ever need to explain the incident to a client, teammate, or future-you, this package matters more than memory.
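The pack is more trustworthy if each artifact carries a content hash, so nobody can quietly edit the evidence later. A sketch of a manifest builder (structure is illustrative):

```python
import hashlib

def build_evidence_pack(summary: str, run_ids: list[str],
                        artifacts: dict[str, str]) -> dict:
    """Bundle incident evidence with per-artifact SHA-256 hashes.
    'artifacts' maps a label to raw text: logs, outputs, tool responses."""
    return {
        "summary": summary,
        "run_ids": run_ids,
        "artifacts": {
            label: {"sha256": hashlib.sha256(body.encode()).hexdigest(),
                    "body": body}
            for label, body in artifacts.items()
        },
    }
```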

Turn the incident into a permanent defense#

If the incident is real enough to hurt, it is real enough to deserve a regression test.

After recovery, create at least one of these:

1. A replay test#

Use the same input/context/tool-response pattern and verify the system now behaves correctly.
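A replay test can stub the recorded tool responses so the run is deterministic. A sketch, where `agent_fn` stands in for whatever entrypoint your system exposes (the name and calling convention are assumptions):

```python
def replay_incident(agent_fn, recorded_input, recorded_tool_outputs,
                    expected_check) -> bool:
    """Re-run the agent with the incident's exact input and canned tool
    responses, then assert the fixed behavior via expected_check."""
    tool_stub = iter(recorded_tool_outputs)
    output = agent_fn(recorded_input, tools=lambda *_: next(tool_stub))
    return expected_check(output)
```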

2. A validation rule#

If the output should never have escaped, write the validator that blocks it next time.
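A validator of this kind is just a list of hard rules run before delivery. A sketch with example rules only; encode whatever actually escaped in your incident:

```python
import re

def validate_output(text: str) -> list[str]:
    """Return the list of rule violations; empty list means deliverable."""
    violations = []
    if not text.strip():
        violations.append("empty_output")
    if re.search(r"\b\d{16}\b", text):       # looks like a raw card number
        violations.append("possible_pan_leak")
    if "as an ai" in text.lower():
        violations.append("meta_disclaimer_in_customer_copy")
    return violations
```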

3. A tool health assertion#

If the tool returned nonsense, add detection for empty, partial, or malformed results before they reach the model.
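That detection can sit in the tool wrapper as a hard assertion. A sketch (the `partial` key is an assumed convention, not a standard):

```python
def assert_tool_result(name: str, result) -> None:
    """Reject success-shaped garbage before it reaches the model."""
    if result is None:
        raise ValueError(f"{name}: returned None")
    if isinstance(result, (list, dict, str)) and len(result) == 0:
        raise ValueError(f"{name}: returned empty result")
    if isinstance(result, dict) and result.get("partial"):
        raise ValueError(f"{name}: returned partial data")
```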

4. A routing or approval rule#

If the task was too risky for autonomy, tighten the gate so similar tasks require human approval.

5. An alert#

If the signal was visible but ignored, encode it:

  • retry spike
  • cost spike
  • abnormal empty retrievals
  • repeated validation failures
  • zero completed tasks in an expected window

Every incident should make the system harder to hurt the same way twice.
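Those five signals can be encoded as hard thresholds over a metrics window. A sketch; the keys and limits are illustrative and should come from your own baselines:

```python
def evaluate_alerts(window: dict) -> list[str]:
    """Return the names of alert rules that fired for this window."""
    rules = [
        ("retry_spike",         window.get("retries", 0) > 50),
        ("cost_spike",          window.get("cost_usd", 0.0) > 100.0),
        ("empty_retrievals",    window.get("empty_retrievals", 0) > 10),
        ("validation_failures", window.get("validation_failures", 0) > 5),
        ("zero_completions",    window.get("completed_tasks", 0) == 0),
    ]
    return [name for name, fired in rules if fired]
```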

The postmortem questions that matter#

A useful postmortem is not “what did the model do?”

Ask:

  • Why was the agent allowed to act in that state?
  • What signal existed before the incident fully surfaced?
  • What should have stopped it automatically?
  • What evidence was missing that slowed diagnosis?
  • What single control would have reduced blast radius the most?

Do not leave with ten vague action items. Leave with one or two hard controls that materially improve safety.

A compact incident checklist#

When an agent incident hits, do this in order:

  • pause external actions if there is any risk
  • capture deploy/prompt/model version
  • open incident record with timestamp and scope
  • save failed run receipts
  • determine blast radius
  • inspect input, retrieval, tools, control flow, validation
  • choose rollback or degraded mode
  • send factual customer update if needed
  • preserve evidence pack
  • convert the failure into a regression test or guardrail

That’s the runbook.

Because in production, the question is not whether your agent will fail.

It will.

The real question is whether the failure becomes a contained incident, or a public self-own with a token bill attached.


If you’re building agent systems that need real production guardrails, incident runbooks, and operator-grade workflow design, check out the services page. Safe systems beat clever demos.