AI Systems · Advanced

Interview: AI Agents, Orchestration & Frameworks (LangChain, LangGraph, CrewAI, AutoGen, Semantic Kernel, MCP)

Senior interview Q&A on agent architecture, orchestration systems, framework tradeoffs, MCP servers, and production patterns for multi-agent workflows.

Asma Hafeez Khan · May 16, 2026 · 6 min read
Tags: AI Agents, Interview, LangChain, LangGraph, CrewAI, AutoGen, Semantic Kernel, MCP, Orchestration

Q1: What is an AI agent vs a chatbot, and what makes "orchestration" necessary?

Answer:

| Chatbot | Agent |
|---------|-------|
| Single LLM turn or short thread | Multi-step plan → act → observe loop |
| May use RAG only | Uses tools (APIs, DB, search, code) |
| Stateless or simple memory | State across steps (workflow position, variables) |
| User drives each turn | Model decides when to stop |

Orchestration is the control layer that answers: Which step runs next? Who calls the LLM? How is state updated? What happens on failure? Without it, you get spaghetti if/else around tool calls.

Production orchestration concerns: max iterations, timeouts, human-in-the-loop gates, idempotent tools, compensating transactions, observability per step.
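Of the concerns above, compensating transactions are the easiest to hand-wave in an interview, so here is a minimal sketch. It assumes each step is paired with an undo function; the step names (`reserve_stock`, `charge_card`) and the simulated gateway failure are hypothetical, not from any framework.

```python
from typing import Callable, List, Tuple

# (name, action, compensation); all three are illustrative assumptions.
Step = Tuple[str, Callable[[dict], None], Callable[[dict], None]]


def run_with_compensation(steps: List[Step], state: dict) -> bool:
    """Run steps in order; if one fails, undo the completed ones in reverse."""
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action(state)
            completed.append(step)
            state["log"].append(f"done:{name}")
        except Exception as exc:
            state["log"].append(f"failed:{name}:{exc}")
            for done_name, _, compensate in reversed(completed):
                compensate(state)
                state["log"].append(f"undone:{done_name}")
            return False
    return True


# Hypothetical tools standing in for real side effects.
def reserve_stock(state: dict) -> None:
    state["reserved"] = True


def release_stock(state: dict) -> None:
    state["reserved"] = False


def charge_card(state: dict) -> None:
    raise RuntimeError("payment gateway timeout")  # simulated failure


def refund_card(state: dict) -> None:
    state["charged"] = False


state = {"log": []}
ok = run_with_compensation(
    [
        ("reserve_stock", reserve_stock, release_stock),
        ("charge_card", charge_card, refund_card),
    ],
    state,
)
print(ok, state["log"])
```

The failed step itself is not compensated here, which assumes a failed action left no partial side effect. In a real system that assumption is exactly why the tools also need to be idempotent.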


Q2: Compare LangChain, LangGraph, CrewAI, AutoGen, and Semantic Kernel — when do you pick each?

Answer:

| Framework | Mental model | Best for |
|-----------|--------------|----------|
| LangChain | Chains, LCEL, retrievers, tools | RAG pipelines, standard agents, Python ecosystem |
| LangGraph | Graph of nodes/edges, cyclic flows | Stateful agents, loops, approval steps, recovery |
| CrewAI | Roles (researcher, writer) + tasks | Readable multi-agent demos, role-play workflows |
| AutoGen | Conversable agents in group chat | Research/coding loops, human-in-the-loop chat |
| Semantic Kernel | Planners + plugins in .NET/C# | Azure enterprises, existing .NET services, MCP plugins |

Decision guide:

  • .NET + Azure team → Semantic Kernel first; LangGraph for complex graphs if Python microservice OK
  • Cyclic agent with checkpoints → LangGraph
  • Quick multi-role prototype → CrewAI
  • Code-gen pair programming → AutoGen patterns
  • Don't need a framework → raw OpenAI SDK + 200 lines when flow is linear

Senior line: "Frameworks buy observability and state — not intelligence. I pick based on team language and whether the workflow has cycles."


Q3: Explain LangGraph-style orchestration: nodes, edges, state, and conditional routing.

Answer: LangGraph models the workflow as a directed graph:

  • State — typed object (messages, retrieved docs, step_count, approved)
  • Nodes — functions (retrieve, generate, tool_call, human_review)
  • Edges — fixed transitions or conditional (if tool_called → tools_node else → end)
  • Cycles — agent can loop until done or max steps

Why graphs beat linear chains: Agents need to retry tools, branch on errors, and pause for human approval — chains can't express loops cleanly.

Pseudocode pattern:

START → classify_intent
classify_intent → [needs_rag] retrieve → generate
classify_intent → [needs_tool] tools → generate
generate → [confident] END
generate → [low_confidence] human_review → END
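To make the routing concrete without depending on any framework API, here is a plain-Python sketch of the same graph. It is not LangGraph code: the node bodies are stubs, the confidence values are made up so both branches run, and `next_node` and `run_graph` are names invented for this illustration.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]
END = "END"


def classify_intent(state: State) -> State:
    # Stub classifier; a real node would call an LLM with structured output.
    question = str(state["question"]).lower()
    state["intent"] = "needs_tool" if "order" in question else "needs_rag"
    return state


def retrieve(state: State) -> State:
    state["context"] = "formulary excerpt (stub)"
    return state


def tools(state: State) -> State:
    state["context"] = "order status: shipped (stub)"
    return state


def generate(state: State) -> State:
    state["answer"] = f"Answer based on: {state['context']}"
    # Made-up confidence so that both outgoing edges are exercised.
    state["confidence"] = 0.9 if state["intent"] == "needs_tool" else 0.4
    return state


def human_review(state: State) -> State:
    state["approved"] = True  # stand-in for a real approval gate
    return state


NODES: Dict[str, Callable[[State], State]] = {
    "classify_intent": classify_intent,
    "retrieve": retrieve,
    "tools": tools,
    "generate": generate,
    "human_review": human_review,
}


def next_node(current: str, state: State) -> str:
    """Edges: fixed transitions plus two conditional branches."""
    if current == "classify_intent":
        return "retrieve" if state["intent"] == "needs_rag" else "tools"
    if current in ("retrieve", "tools"):
        return "generate"
    if current == "generate":
        return "human_review" if state["confidence"] < 0.5 else END
    return END


def run_graph(initial: State, max_steps: int = 10) -> State:
    state, node, path = dict(initial), "classify_intent", []
    while node != END:
        if len(path) >= max_steps:
            raise RuntimeError("max_steps exceeded")
        path.append(node)
        state = NODES[node](state)
        node = next_node(node, state)
    state["path"] = path
    return state


print(run_graph({"question": "Where is my order?"})["path"])
print(run_graph({"question": "Is ibuprofen OTC?"})["path"])
```

The `max_steps` guard matters once the graph has cycles; this example has none, but the runner does not assume that.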

Production add-ons: checkpointing (resume after crash), time-travel debugging, LangSmith traces.


Q4: What is MCP (Model Context Protocol) and how does it differ from traditional function calling?

Answer: MCP standardises how AI applications discover and call tools/resources exposed by external servers — like USB-C for agent tools.

| Function calling (OpenAI tools) | MCP |
|---------------------------------|-----|
| Tools defined in your app code | Tools hosted by MCP servers (filesystem, DB, GitHub, custom) |
| Per-provider schema | Shared protocol across clients (Claude Desktop, IDEs, agents) |
| You implement each integration | Reuse community/enterprise MCP servers |

Components:

  • MCP server — exposes tools (search_formulary, read_file) and resources
  • MCP client — your agent runtime connects and lists capabilities
  • Transport — stdio for local servers, or HTTP for remote ones (Streamable HTTP in current spec revisions; HTTP+SSE in older ones)
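On the wire, MCP messages are JSON-RPC 2.0, and the two tool methods a client uses most are `tools/list` and `tools/call`. The sketch below is a toy in-process dispatcher that mimics those message shapes as I understand them from the spec; it is not the official SDK, it skips the `initialize` handshake and any transport, and the `search_formulary` handler returns stub data.

```python
import json

# Toy tool registry; the handler and its data are hypothetical.
TOOLS = {
    "search_formulary": {
        "description": "Look up a drug in the formulary (stub data).",
        "inputSchema": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
            "required": ["name"],
        },
        "handler": lambda args: f"{args['name']}: OTC (stub)",
    }
}


def handle(request_json: str) -> str:
    """Dispatch one JSON-RPC request the way an MCP server might."""
    req = json.loads(request_json)
    if req["method"] == "tools/list":
        result = {
            "tools": [
                {
                    "name": name,
                    "description": tool["description"],
                    "inputSchema": tool["inputSchema"],
                }
                for name, tool in TOOLS.items()
            ]
        }
    elif req["method"] == "tools/call":
        tool = TOOLS[req["params"]["name"]]
        text = tool["handler"](req["params"].get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        # -32601 is the JSON-RPC 2.0 "Method not found" error code.
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": req.get("id"), "error": error})
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})


# Client side: discover capabilities, then call a tool by name.
listed = json.loads(handle(json.dumps(
    {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
)))
called = json.loads(handle(json.dumps({
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "search_formulary", "arguments": {"name": "ibuprofen"}},
})))
print(listed["result"]["tools"][0]["name"])
print(called["result"]["content"][0]["text"])
```

The point for an interview answer is that the client learns the tool schema at runtime from `tools/list` instead of having it compiled into the application.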

Interview use case: Pharmacy assistant connects to MCP servers for: internal drug DB, order status API, and document store — without hardcoding every schema in the monolith.

Security: Treat MCP servers like microservices — auth, network policy, least-privilege tools, audit every invocation.


Q5: How would you build agent architecture for a pharmacy customer assistant?

Answer:

Layers:

  1. Gateway — auth, rate limit, session, PII redaction
  2. Triage agent — classify: drug_info | order_status | interaction_check | off_topic (temperature 0, JSON output)
  3. Specialists — each with narrow tools and stricter system prompts
  4. RAG — formulary + FAQ chunks with metadata filters (OTC vs Rx)
  5. Safety — output validator, mandatory disclaimer, block dosing advice for named patients
  6. Human escalation — low confidence or high-risk intents

Orchestration choice: LangGraph or Semantic Kernel planner for triage → specialist routing with max 5 tool calls per session.

Anti-pattern: One mega-agent with 40 tools — model picks wrong tool; hard to test.
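A sketch of the triage-to-specialist hand-off, written as deterministic code around the model's JSON output. The JSON shape (`intent`, `confidence`), the 0.7 threshold, the tool names, and the choice of `interaction_check` as the high-risk intent are all assumptions made for illustration.

```python
import json

ALLOWED_INTENTS = {"drug_info", "order_status", "interaction_check", "off_topic"}

# Each specialist sees only its own narrow tool list (hypothetical tool names).
SPECIALIST_TOOLS = {
    "drug_info": ["search_formulary"],
    "order_status": ["get_order_status"],
    "interaction_check": ["check_interactions"],
}

# Illustrative policy: answers for these intents always get a human review.
HIGH_RISK_INTENTS = {"interaction_check"}


def route(triage_output: str, min_confidence: float = 0.7) -> dict:
    """Turn the triage model's JSON into a routing decision."""
    try:
        parsed = json.loads(triage_output)
        intent = parsed["intent"]
        confidence = float(parsed["confidence"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return {"target": "human", "reason": "unparseable triage output"}

    if intent not in ALLOWED_INTENTS:
        return {"target": "human", "reason": f"unknown intent: {intent}"}
    if intent == "off_topic":
        return {"target": "refuse", "reason": "off topic"}
    if confidence < min_confidence:
        return {"target": "human", "reason": "low confidence"}
    return {
        "target": intent,
        "tools": SPECIALIST_TOOLS[intent],
        "human_review": intent in HIGH_RISK_INTENTS,
    }


print(route('{"intent": "order_status", "confidence": 0.93}'))
print(route('{"intent": "drug_info", "confidence": 0.4}'))
print(route("not json at all"))
```

Keeping the router outside the model means a malformed or unexpected triage output fails closed to a human rather than to a guess.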


Q6: What is the ReAct pattern and how do you implement it safely in production?

Answer: ReAct = interleaved Reason (thought) → Act (tool call) → Observe (tool result) → repeat.

Why it works: Grounds reasoning in real API/DB results instead of hallucinated state.

Safety controls:

  • Allowlist tools per agent role
  • Validate tool arguments (schema, SQL parameterisation)
  • Max iterations (e.g. 8)
  • Loop detection — same tool + same args twice → abort
  • Timeout per tool call
  • No destructive tools without human approval
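Three of these controls (allowlist, max iterations, loop detection) fit in a short loop. In this sketch a scripted `policy` function stands in for the LLM, and `run_react`, `AgentAbort`, and the tool are invented names; per-call timeouts, argument schema validation, and approval gates are left out to keep it short.

```python
import json
from typing import Any, Callable, Dict, List, Set, Tuple


class AgentAbort(Exception):
    """Raised when a safety control stops the loop."""


def run_react(
    policy: Callable[[List[Any]], Dict[str, Any]],
    tools: Dict[str, Callable[..., Any]],
    allowlist: Set[str],
    max_iterations: int = 8,
) -> Any:
    """policy(observations) returns {"tool", "args"} or {"final": answer}."""
    observations: List[Any] = []
    seen_calls: Set[Tuple[str, str]] = set()
    for _ in range(max_iterations):
        decision = policy(observations)  # Reason
        if "final" in decision:
            return decision["final"]
        tool, args = decision["tool"], decision["args"]
        if tool not in allowlist:
            raise AgentAbort(f"tool not allowed: {tool}")
        call_key = (tool, json.dumps(args, sort_keys=True))
        if call_key in seen_calls:  # same tool + same args twice
            raise AgentAbort(f"loop detected: {tool}")
        seen_calls.add(call_key)
        observations.append(tools[tool](**args))  # Act, then Observe
    raise AgentAbort("max iterations reached")


# Scripted policies stand in for the model in this sketch.
def scripted_policy(observations: List[Any]) -> Dict[str, Any]:
    if not observations:
        return {"tool": "get_order_status", "args": {"order_id": "A1"}}
    return {"final": f"Your order is {observations[-1]}"}


def stuck_policy(observations: List[Any]) -> Dict[str, Any]:
    return {"tool": "get_order_status", "args": {"order_id": "A1"}}


TOOLS = {"get_order_status": lambda order_id: "shipped"}

answer = run_react(scripted_policy, TOOLS, allowlist={"get_order_status"})
try:
    run_react(stuck_policy, TOOLS, allowlist={"get_order_status"})
    aborted = None
except AgentAbort as exc:
    aborted = str(exc)
print(answer, "|", aborted)
```

Serialising the arguments with `sort_keys=True` makes the loop check insensitive to key order, which models do not keep stable between calls.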

Q7: Design orchestration for AI workflow automation (e.g. intake → classify → enrich → notify).

Answer: Treat this as workflow engine + LLM steps, not a chat session.

Event (form submitted)
  → Step 1: Extract fields (LLM structured output)
  → Step 2: Classify priority (mini model)
  → Step 3: Enrich from CRM API (deterministic code)
  → Step 4: Draft summary for human (LLM)
  → Step 5: Post to Slack (tool)
  → Persist state after each step (Durable Functions / Temporal / LangGraph checkpoint)

Key design:

  • Idempotent steps with step IDs
  • Retry transient failures per step
  • Dead letter queue for poison messages
  • Human task node when confidence below threshold
  • Full audit log — inputs/outputs hashed, not raw PHI in logs
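The first two design points can be sketched with nothing but a JSON file per run. This is a stand-in for what Durable Functions, Temporal, or a LangGraph checkpointer would do, not their APIs; `run_workflow`, the step functions, and the simulated CRM outage are hypothetical, and the file write is not atomic.

```python
import json
import os
import tempfile
from typing import Callable, Dict, List, Tuple

StepFn = Callable[[dict], dict]


def run_workflow(run_id: str, steps: List[Tuple[str, StepFn]],
                 payload: dict, store_dir: str) -> dict:
    """Run steps in order, persisting state after each one.

    A rerun with the same run_id skips step IDs already recorded as done.
    """
    path = os.path.join(store_dir, f"{run_id}.json")
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)
    else:
        state = {"done": [], "data": payload}
    for step_id, fn in steps:
        if step_id in state["done"]:
            continue  # already completed in an earlier attempt
        state["data"] = fn(state["data"])
        state["done"].append(step_id)
        with open(path, "w") as f:
            json.dump(state, f)
    return state


calls: Dict[str, int] = {"extract": 0, "enrich": 0}
crm_up = {"value": False}  # toggled to simulate a transient outage


def extract(data: dict) -> dict:
    calls["extract"] += 1  # a real step would call an LLM here
    return {**data, "name": data["raw"].split(":")[1].strip()}


def classify(data: dict) -> dict:
    return {**data, "priority": "high" if "urgent" in data["raw"] else "normal"}


def enrich(data: dict) -> dict:
    calls["enrich"] += 1
    if not crm_up["value"]:
        raise ConnectionError("CRM unavailable")
    return {**data, "account": "ACME (stub)"}


STEPS = [("extract", extract), ("classify", classify), ("enrich", enrich)]
store = tempfile.mkdtemp()
form = {"raw": "urgent request from: Dana"}

try:
    run_workflow("run-1", STEPS, form, store)  # fails at the enrich step
except ConnectionError:
    pass

crm_up["value"] = True
final = run_workflow("run-1", STEPS, form, store)  # resumes at enrich
print(final["done"], calls)
```

On the retry, `extract` is not called again, which is the property that keeps LLM cost and side effects from doubling when a late step fails.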

Azure fit: Durable Functions orchestrator + Azure OpenAI + Service Bus triggers.


Q8: How do you test and observe multi-agent systems in production?

Answer:

Testing:

  • Unit — tool functions with mocked APIs
  • Contract — JSON schema validation on agent outputs
  • Scenario eval — 50–200 labelled conversations with expected tool/route
  • Regression — run eval on every prompt/model change in CI
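A scenario eval can start as a list of labelled inputs and an accuracy gate. In the sketch below a keyword router stands in for the triage agent, and the four cases and the 0.9 threshold are made up; a real suite would hold the 50–200 labelled conversations mentioned above.

```python
from typing import Callable, Dict, List


def evaluate_routes(router: Callable[[str], str],
                    cases: List[Dict[str, str]]) -> dict:
    """Compare the router's output with the expected route for each case."""
    failures = [c for c in cases if router(c["text"]) != c["expected_route"]]
    return {"accuracy": 1 - len(failures) / len(cases), "failures": failures}


def keyword_router(text: str) -> str:
    # Deliberately naive stand-in for the triage agent under test.
    lowered = text.lower()
    if "order" in lowered:
        return "order_status"
    if "take" in lowered and "with" in lowered:
        return "interaction_check"
    return "drug_info"


CASES = [
    {"text": "Where is my order?", "expected_route": "order_status"},
    {"text": "Can I take ibuprofen with warfarin?",
     "expected_route": "interaction_check"},
    {"text": "Is cetirizine available over the counter?",
     "expected_route": "drug_info"},
    {"text": "What's the weather today?", "expected_route": "off_topic"},
]

THRESHOLD = 0.9  # hypothetical CI gate
report = evaluate_routes(keyword_router, CASES)
gate_passed = report["accuracy"] >= THRESHOLD
print(report["accuracy"], gate_passed)
for failure in report["failures"]:
    print("MISROUTED:", failure["text"])
```

In CI the last step would fail the build when `gate_passed` is false, and the printed failures tell the reviewer which prompt or model change broke which route.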

Observability:

  • Trace ID per session across agent hops
  • Log: intent, tools called, latency, token cost, retrieval scores
  • Metrics: tool success rate, escalation rate, loop abort rate
  • LangSmith / Application Insights / OpenTelemetry

SLIs: P95 end-to-end latency, cost per resolved ticket, hallucination rate on golden set.
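As a rough sketch of how per-step logs roll up into these numbers: each tool call emits one record carrying the session's trace ID, and the SLIs are computed over those records. The `StepRecord` fields, the sample values, and the nearest-rank percentile are choices made for this example, not a tracing library's schema.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StepRecord:
    trace_id: str  # shared by every agent hop in one session
    agent: str
    tool: str
    latency_ms: int
    tokens: int
    ok: bool


def p95(values: List[int]) -> int:
    """Nearest-rank 95th percentile."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]


def summarise(records: List[StepRecord]) -> dict:
    # End-to-end latency is the sum of step latencies per trace.
    per_trace: Dict[str, int] = {}
    for r in records:
        per_trace[r.trace_id] = per_trace.get(r.trace_id, 0) + r.latency_ms
    return {
        "p95_latency_ms": p95(list(per_trace.values())),
        "tool_success_rate": sum(r.ok for r in records) / len(records),
        "total_tokens": sum(r.tokens for r in records),
    }


# Made-up records for two sessions.
RECORDS = [
    StepRecord("s1", "triage", "classify", 120, 150, True),
    StepRecord("s1", "orders", "get_order_status", 300, 400, True),
    StepRecord("s2", "triage", "classify", 100, 140, True),
    StepRecord("s2", "drug_info", "search_formulary", 900, 600, False),
]

summary = summarise(RECORDS)
print(summary)
```

Summing latency per trace assumes the hops run sequentially; with parallel tool calls you would record wall-clock start and end times per session instead.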
