
AI Agents and Tool Calling Workflows: Production Patterns

Design reliable AI agents with tool calling, planning loops, memory boundaries, retries, and human-in-the-loop safeguards.

Asma Hafeez · May 6, 2026 · 3 min read
AI Agents · Tool Calling · Function Calling · LLM Orchestration · Agentic Workflows · Reliability · Safety

Agent systems fail when they are treated as magic. Reliable agents are just deterministic workflows wrapped around probabilistic reasoning.


Agent Architecture That Scales

TEXT
User Task -> Planner -> Tool Selector -> Tool Executor -> Verifier -> Finalizer
                         ^                 |
                         |----- Retry -----|

Design each step as an explicit state transition.


1) Tool Contracts First

Every tool should have:

  • strict input schema
  • bounded side effects
  • idempotency where possible
  • machine-readable error codes

Example tool schema:

JSON
{
  "name": "create_ticket",
  "input": { "title": "string", "priority": "low|med|high" },
  "output": { "ticket_id": "string", "status": "created|failed" }
}
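The same contract can be enforced in code before any side effect runs. A minimal stdlib-only sketch (the class names, the `ALLOWED_PRIORITIES` set, and the stub ticket id are illustrative, not part of any real library):

```python
from dataclasses import dataclass

ALLOWED_PRIORITIES = {"low", "med", "high"}

@dataclass(frozen=True)
class CreateTicketInput:
    """Strict input schema: validation fails fast, before any side effect."""
    title: str
    priority: str

    def __post_init__(self):
        if not self.title:
            raise ValueError("title must be non-empty")
        if self.priority not in ALLOWED_PRIORITIES:
            raise ValueError(f"priority must be one of {sorted(ALLOWED_PRIORITIES)}")

@dataclass(frozen=True)
class CreateTicketOutput:
    """Machine-readable result: a status code, not free-text error prose."""
    ticket_id: str
    status: str  # "created" | "failed"

def create_ticket(raw: dict) -> CreateTicketOutput:
    # Validate first; the (omitted) side effect should be idempotent,
    # e.g. keyed on a client-supplied request id
    args = CreateTicketInput(**raw)
    return CreateTicketOutput(ticket_id="TCK-1", status="created")
```

The point of the frozen dataclass is that a tool call either gets fully validated arguments or never runs at all.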

2) Planning and Execution Loop

Keep planning shallow and bounded:

  • max steps (e.g., 8)
  • max tool calls per step
  • explicit stop criteria

Python
MAX_STEPS = 8  # hard bound on the planning loop

for step in range(MAX_STEPS):
    plan = planner(state)             # re-plan from the current state
    action = choose_action(plan)      # pick exactly one tool call
    result = run_tool(action)         # execute with timeouts and retries
    state = update_state(state, action, result)
    if state.done:                    # explicit stop criterion
        break
else:
    raise StepLimitExceeded(state)    # fail loudly instead of looping forever

3) Reliability Patterns

  • Retry transient errors (timeouts, rate limits)
  • Never retry unsafe writes blindly
  • Verify tool outputs before next step
  • Fallback to deterministic path for critical flows

Use circuit breakers for flaky external tools.
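These patterns are small enough to write by hand. A sketch of bounded retry for transient errors plus a simple circuit breaker (thresholds, the `is_write` guard, and the error taxonomy are assumptions for illustration):

```python
import time

def call_with_retry(tool, *args, attempts=3, base_delay=0.5, is_write=False, **kwargs):
    """Retry only transient errors; never blindly retry a non-idempotent write."""
    for attempt in range(attempts):
        try:
            return tool(*args, **kwargs)
        except (TimeoutError, ConnectionError):
            if is_write or attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # exponential backoff

class CircuitBreaker:
    """Opens after N consecutive failures; rejects calls until a cooldown passes."""

    def __init__(self, max_failures=3, cooldown=30.0):
        self.max_failures = max_failures
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def call(self, tool, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown:
                raise RuntimeError("circuit open: tool temporarily disabled")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = tool(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        return result
```

Wrapping flaky external tools this way stops one bad dependency from consuming the agent's whole step budget.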


4) Memory Strategy

Split memory into:

  • session memory (ephemeral conversation state)
  • task memory (current objective + intermediate artifacts)
  • long-term memory (approved facts only)

Do not mix raw user chat history into long-term memory without curation.
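The split can be made explicit in the agent's state object, with the curation rule enforced at the only write path into long-term memory. A sketch (the `AgentMemory` class and its `promote` gate are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    """Three memory tiers with different lifetimes and write rules."""
    session: list = field(default_factory=list)    # ephemeral chat turns, dropped at session end
    task: dict = field(default_factory=dict)       # current objective + intermediate artifacts
    long_term: dict = field(default_factory=dict)  # approved facts only

    def promote(self, key: str, value, approved: bool = False):
        # Curation gate: nothing enters long-term memory without explicit approval,
        # so raw chat history cannot leak in by accident
        if not approved:
            raise PermissionError("long-term writes require curation/approval")
        self.long_term[key] = value
```

Making `promote` the only way to write long-term memory turns the curation policy from a convention into a type of access control.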


5) Human-in-the-Loop Controls

Require approval for:

  • financial actions
  • data deletion
  • external communication
  • permission changes

Pattern:

TEXT
Agent proposes action -> show summary/risk -> human approve/reject -> execute
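That pattern is a single gate in front of the tool dispatcher. A sketch, where `approve_fn` stands in for a real review UI or ticketing step and the action names are illustrative:

```python
RISKY_ACTIONS = {"transfer_funds", "delete_data", "send_email", "change_permissions"}

def execute_with_approval(action: str, params: dict, approve_fn):
    """Route risky actions through a human decision before execution.

    approve_fn receives a human-readable summary and returns True/False.
    """
    if action in RISKY_ACTIONS:
        summary = f"Agent proposes: {action}({params})"
        if not approve_fn(summary):
            return {"status": "rejected", "action": action}
    # ... dispatch to the actual tool here ...
    return {"status": "executed", "action": action}
```

Because the gate keys on the action name rather than on the agent's reasoning, a confused planner cannot talk its way past it.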

6) Agent Security Guardrails

  • Prompt injection filtering on retrieved content
  • Tool allowlist per task type
  • Context isolation between tenants/projects
  • Output policy checks before final response

Assume hostile input by default.
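A per-task-type tool allowlist is the cheapest of these guardrails to implement. A deny-by-default sketch (the task types and tool names are made up for illustration):

```python
TOOL_ALLOWLIST = {
    "support": {"create_ticket", "search_kb"},
    "billing": {"create_ticket", "issue_refund"},
}

def authorize_tool(task_type: str, tool_name: str) -> None:
    """Deny by default: unknown task types get no tools at all."""
    allowed = TOOL_ALLOWLIST.get(task_type, set())
    if tool_name not in allowed:
        raise PermissionError(
            f"tool {tool_name!r} not allowed for task type {task_type!r}"
        )
```

Calling `authorize_tool` inside the executor, not the planner, means even a prompt-injected plan cannot reach a disallowed tool.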


7) Metrics to Track in Production

  • task success rate
  • average steps per task
  • tool error rate by tool
  • human override rate
  • latency and cost per completed task

If success rate drops while step count rises, your planner is drifting.
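A minimal in-process counter is enough to start tracking these before wiring up a metrics backend. A sketch (the `AgentMetrics` class is illustrative, not a real library):

```python
from collections import defaultdict

class AgentMetrics:
    """Running counters for the production signals listed above."""

    def __init__(self):
        self.tasks = 0
        self.successes = 0
        self.steps = 0
        self.tool_errors = defaultdict(int)  # error count keyed by tool name

    def record_task(self, success: bool, steps: int):
        self.tasks += 1
        self.successes += int(success)
        self.steps += steps

    @property
    def success_rate(self) -> float:
        return self.successes / self.tasks if self.tasks else 0.0

    @property
    def avg_steps(self) -> float:
        return self.steps / self.tasks if self.tasks else 0.0
```

Watching `success_rate` and `avg_steps` together is what surfaces planner drift: more steps per task with fewer completions.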


Minimal FastAPI Orchestrator Pattern

Python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

class AgentTask(BaseModel):
    # Illustrative request schema; adapt fields to your workflow
    objective: str
    task_type: str = "default"

@app.post("/agent/run")
async def run_agent(task: AgentTask):
    if not task.objective.strip():
        raise HTTPException(status_code=422, detail="objective must be non-empty")
    # run bounded planner/executor loop
    # enforce approvals for sensitive actions
    # return audit trail + result
    return {"status": "ok", "steps": [], "task_type": task.task_type}

Production Readiness Checklist

  • Tool contracts versioned
  • Unsafe tools gated behind approvals
  • Replay logs available for failed runs
  • Regression tasks run on every release
  • Cost budget enforced per workflow

Ship agents as workflows, not personalities.

Enjoyed this article?

Explore the AI Systems learning path for more.
