From a single agent thinking step-by-step, to fleets of specialized agents collaborating at scale — explained with live animations.
A regular AI model takes input and returns output. An agent goes further — it can use tools, check results, and keep going until the job is done.
The most common pattern is ReAct (Reason + Act). The agent alternates between thinking about what to do next and acting by calling a tool.
Each cycle is a step. Steps continue until the agent decides it has a final answer or hits a maximum step limit.
Tools might include: web_search, code_interpreter, file_read, api_call.
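The ReAct cycle described above can be sketched in a few lines. This is a minimal illustration, not a real implementation: `fake_model` is a hypothetical stand-in for the LLM that scripts one tool call and then a final answer.

```python
MAX_STEPS = 10  # hard cap so a stuck agent cannot loop forever

def fake_model(context):
    """Hypothetical stand-in for an LLM: returns a tool call or a final answer."""
    tool_calls = sum(1 for m in context if m["role"] == "tool")
    if tool_calls == 0:
        return {"action": "tool", "tool": "web_search", "args": {"q": "agent patterns"}}
    return {"action": "final", "answer": f"Done after {tool_calls} tool call(s)."}

TOOLS = {"web_search": lambda q: f"results for {q!r}"}

def react_loop(task):
    context = [{"role": "user", "content": task}]
    for _ in range(MAX_STEPS):
        decision = fake_model(context)                        # Reason
        if decision["action"] == "final":
            return decision["answer"]
        result = TOOLS[decision["tool"]](**decision["args"])  # Act
        context.append({"role": "tool", "content": result})   # observe, then loop
    return "step limit reached"
```

With a real model in place of `fake_model`, the structure is the same: reason, act, observe, repeat until a final answer or the step cap.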
Every agent, no matter how complex, is built from four components working together:
Model — The LLM that reasons and decides. It reads the context and decides what action to take next.
Memory — What the agent remembers. This ranges from the short-term conversation window to long-term vector databases.
Tools — The agent's hands. Functions it can call to interact with the world: search engines, code runners, APIs.
Instructions — The system prompt. Defines the agent's persona, goals, constraints, and how to use its tools.
When the model decides it needs a tool, it emits a structured tool call — a JSON object naming the tool and providing arguments. The runtime intercepts this, executes the function, and returns the result.
The result gets added back into the context as a tool result message, and the model continues reasoning with this new information.
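The intercept-and-execute step might look like the sketch below. The tool registry and the `file_read` stub are assumptions for illustration; the key idea is that the runtime, not the model, actually runs the function, and unknown tools come back as ordinary error results.

```python
import json

def file_read(path):
    """Hypothetical tool implementation."""
    return f"<contents of {path}>"

TOOLS = {"file_read": file_read}

def execute_tool_call(raw_json):
    """Parse the model's structured tool call, run it, return a tool result message."""
    call = json.loads(raw_json)
    fn = TOOLS.get(call["tool"])
    if fn is None:
        # A hallucinated tool name becomes a recoverable error, not a crash.
        return {"role": "tool", "content": f"error: unknown tool {call['tool']!r}"}
    result = fn(**call["arguments"])
    return {"role": "tool", "content": result}  # appended back into the context
```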
This design means you can give an agent access to powerful tools while controlling exactly what each tool can do.
A single agent has hard limits: its context window fills up, it can only do one thing at a time, and mixing too many responsibilities degrades quality.
Multi-agent systems solve this by splitting work across specialized agents that coordinate with each other.
The benefits: parallelism (multiple agents work at once), specialization (each agent is an expert at one thing), and scale (unlimited context via handoffs).
Agents communicate by passing messages. An orchestrating agent calls a sub-agent just like it would call any other tool — by emitting a structured request.
The sub-agent receives the request, runs its own reasoning loop, and returns a result. From the orchestrator's perspective, this looks exactly like a tool call result.
This agent-as-tool pattern is powerful: you can nest agents arbitrarily deep, composing complex systems from simple building blocks.
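Agent-as-tool can be as simple as registering a sub-agent under the same callable interface as any plain function. Both entries below are hypothetical stubs; the point is that the orchestrator cannot tell them apart.

```python
def summarizer_agent(text):
    """Hypothetical sub-agent: in practice this runs its own reasoning loop."""
    return text[:20] + "..."

def calculator_tool(expr):
    """An ordinary tool (eval is for demo only, never for untrusted input)."""
    return str(eval(expr))

# From the caller's perspective, an agent and a tool are the same thing.
TOOLS = {
    "summarize": summarizer_agent,  # a whole agent behind a tool interface
    "calculate": calculator_tool,   # a plain function
}
```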
A central orchestrator agent plans the work and delegates tasks to specialized worker agents. Workers report back; the orchestrator assembles the final result.
The orchestrator never does the "real work" itself — it focuses on planning, decomposition, and synthesis. Workers focus on execution without worrying about the big picture.
Best for: Research pipelines, content generation, multi-step data tasks where work can be parallelized.
planning delegation synthesis
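The plan-delegate-synthesize split might be sketched as follows, with hypothetical `research` and `write` workers standing in for specialized agents:

```python
# Hypothetical specialized workers (real ones would be full agents).
WORKERS = {
    "research": lambda task: f"notes on {task}",
    "write":    lambda notes: f"draft based on [{notes}]",
}

def orchestrate(goal):
    # Planning: decompose the goal into subtasks.
    subtasks = [f"{goal}: background", f"{goal}: recent news"]
    # Delegation: hand each subtask to a worker.
    notes = [WORKERS["research"](t) for t in subtasks]
    # Synthesis: assemble worker outputs into the final result.
    return WORKERS["write"](" | ".join(notes))
```

The orchestrator only decomposes and assembles; all "real work" happens inside the workers.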
Agents are chained like an assembly line. Agent A's output becomes Agent B's input, which feeds Agent C, and so on. Each agent transforms the data before passing it forward.
There's no central coordinator — each agent just receives input, does its job, and hands off to the next.
Best for: Document processing, content workflows, ETL pipelines where each stage has a clear responsibility.
sequential transformation modular
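A pipeline reduces to function composition: each stage (here three hypothetical stubs standing in for agents) transforms the data and hands it forward.

```python
def extract(doc):
    return doc.strip()          # stage A: clean the input

def transform(text):
    return text.upper()         # stage B: transform it

def load(text):
    return {"stored": text}     # stage C: persist the result

PIPELINE = [extract, transform, load]

def run_pipeline(data):
    for stage in PIPELINE:      # no coordinator: output of one is input to the next
        data = stage(data)
    return data
```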
A dispatcher sends the same task (or parts of it) to multiple agents simultaneously. All agents work at the same time. A reducer waits for all results and merges them.
Instead of processing 10 documents one by one (slow), fan-out processes all 10 at once — then combines the results.
Best for: Batch processing, competitive analysis across many sources, parallel document ingestion.
parallel speed aggregation
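The dispatch/reduce shape maps directly onto a thread pool. `analyze` is a hypothetical worker agent; the reducer here just sums, but any merge logic fits.

```python
from concurrent.futures import ThreadPoolExecutor

def analyze(doc):
    """Hypothetical worker agent: processes one document."""
    return len(doc)

def fan_out(docs):
    # Dispatch: all documents are processed concurrently.
    with ThreadPoolExecutor(max_workers=10) as pool:
        results = list(pool.map(analyze, docs))
    # Reduce: wait for every result, then merge.
    return sum(results)
```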
Like a management tree. A top-level strategic agent breaks work into sub-goals and delegates to mid-level managers, who in turn spin up worker agents for execution.
Each tier only knows about its immediate reports and manager. This keeps the context clean and makes very large tasks tractable.
Best for: Enterprise automation, long-running projects, systems with thousands of sub-tasks.
scale delegation isolation
Two dedicated agents: a Planner that creates a full task plan upfront, and an Executor that carries out the plan step by step.
Unlike ReAct (which interleaves planning and acting), Plan-and-Execute commits to a full plan first. This produces more coherent, consistent multi-step work.
Best for: Complex research tasks, software projects, long documents where consistency across steps matters.
planning execution consistency
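The contrast with ReAct shows up clearly in code: the full plan is fixed before any execution begins. Both agents below are hypothetical stubs.

```python
def planner(goal):
    """Hypothetical planner agent: commits to the full plan upfront."""
    return [f"step {i}: {part}" for i, part in enumerate(["outline", "draft", "polish"], 1)]

def executor(step, state):
    """Hypothetical executor agent: carries out one step of the plan."""
    state.append(f"did {step}")
    return state

def plan_and_execute(goal):
    plan = planner(goal)   # plan first, in full
    state = []
    for step in plan:      # then execute, never re-planning mid-run
        state = executor(step, state)
    return state
```

A production version would usually let the executor report failures back to the planner for a re-plan, but the commit-then-execute core stays the same.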
The agent (or a separate critic agent) reviews its own output and identifies flaws before declaring the task done. The critique feeds back into a revision cycle.
This dramatically improves output quality: the generator focuses on creation, the critic focuses on flaws, and the cycle repeats until quality is acceptable.
Best for: Writing, code generation, analysis — any task where quality matters and first drafts are rarely perfect.
critique revision quality
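The generate-critique-revise cycle can be sketched like this. Both `generate` and `critic` are hypothetical stubs; in practice each is an LLM call (or a separate critic agent), and the loop ends when the critic has no complaints or the round budget runs out.

```python
def generate(draft=None, feedback=None):
    """Hypothetical generator: revises the draft using critic feedback."""
    return (draft or "draft") + (" +fix" if feedback else "")

def critic(draft):
    """Hypothetical critic: returns None once quality is acceptable."""
    return None if draft.count("+fix") >= 2 else "needs work"

def reflect_loop(max_rounds=5):
    draft = generate()
    for _ in range(max_rounds):           # bounded revision cycle
        feedback = critic(draft)
        if feedback is None:              # critic approves: done
            return draft
        draft = generate(draft, feedback) # otherwise revise and re-check
    return draft
```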
Multiple agents independently produce answers, then challenge each other's reasoning in rounds of debate. A judge (or consensus mechanism) picks the winner or synthesizes the best answer.
Debate reduces single-model hallucinations and blind spots — an error that one agent makes is likely to be caught by another.
Best for: High-stakes decisions, factual verification, medical/legal analysis where accuracy is critical.
adversarial consensus accuracy
All agents read from and write to a shared workspace (the "blackboard"). There's no central orchestrator — agents subscribe to changes and activate when relevant new data appears.
Any agent can contribute facts, hypotheses, or partial results. Others build on top. The system converges on a solution organically.
Best for: Complex problem-solving, scientific reasoning, systems where multiple expertise domains need to collaborate without a fixed sequence.
shared memory event-driven emergent
The agent pauses at critical decision points and requests human approval before continuing. This is essential for irreversible, high-risk, or high-cost actions.
HITL doesn't mean humans do the work — they just act as checkpoints for decisions that shouldn't be fully automated.
Common HITL triggers: before sending an email, before executing a financial transaction, before deleting data, or when confidence is below a threshold.
approval oversight safety
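An approval gate is a small piece of code. In this sketch the risky-action list and the `approve` callback are assumptions: in a real system `approve` would block on a human reviewer (a UI prompt, a Slack message, a ticket).

```python
RISKY_ACTIONS = {"send_email", "transfer_funds", "delete_data"}  # assumed policy

def is_risky(action):
    return action["type"] in RISKY_ACTIONS

def run_action(action, approve):
    """`approve` is a callback to a human reviewer; here, any callable works."""
    if is_risky(action) and not approve(action):
        return "blocked: awaiting human approval"
    return f"executed {action['type']}"
```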
Every LLM has a finite context window. In long-running agent tasks, you must proactively manage what's in that window — keeping it relevant, trimming stale content, and summarizing where needed.
Strategies include rolling summaries (compress old turns), RAG (retrieve only what's relevant), and memory tiers (hot/warm/cold based on access frequency).
Context boundaries also define what one agent knows vs. another. In multi-agent systems, each agent should only receive the context it needs for its specific role — this is the principle of least privilege for context.
context window RAG summarization memory tiers
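A rolling summary, the first strategy above, might look like this. The `summarize` stub is a hypothetical stand-in for an LLM summarization call; everything older than the last few turns gets compressed into a single message.

```python
def summarize(messages):
    """Hypothetical summarizer: an LLM call in practice."""
    return f"[summary of {len(messages)} msgs]"

def trim_context(history, keep_last=4):
    """Keep the most recent turns verbatim; compress the rest into one summary."""
    if len(history) <= keep_last:
        return history
    old, recent = history[:-keep_last], history[-keep_last:]
    return [summarize(old)] + recent
```

Run on every turn, this keeps the window bounded while preserving a compressed trace of everything that came before.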
Production agent systems require more than just the agents themselves. They need an entire supporting infrastructure:
Observability — Every agent action, tool call, and LLM response should be traced. You need to know why an agent did what it did.
Guardrails — Input and output validation, content policies, rate limiting, and cost controls prevent runaway agents.
Auth & Permissions — Agents must operate under scoped credentials. An agent should never have more access than required for its task.
Evals — Automated test suites that verify agent behavior doesn't regress as you update models or prompts.
observability guardrails auth evals
With 10 patterns to choose from, the hardest question is: which one fits my problem? Use these three questions to narrow it down quickly.
① Is the task a single loop or many coordinated steps?
Single loop with tool use → ReAct (P01). Multiple coordinated agents needed → continue.
② Does order matter?
Strict order, each step depends on previous → Pipeline (P03) or Plan-Execute (P06). Order doesn't matter, work is independent → Fan-out (P04).
③ How much oversight do you need?
High stakes, irreversible → add HITL (P10). Need quality checking → add Reflection (P07). Need verified facts → Debate (P08).
decision tree selection guide architecture
Most agent failures fall into predictable categories. Knowing them helps you design defenses upfront.
Infinite loops — The agent keeps calling tools without making progress. The LLM is "stuck" reasoning in circles. Fix: max step limits, loop detection, progress checks.
Context overflow — The conversation history fills the context window and earlier instructions are forgotten. Fix: rolling summaries, RAG, re-asserting the system prompt at every turn.
Tool call hallucination — The agent invents tool arguments or calls tools that don't exist. Fix: strict tool schemas, output validation, error handling that re-prompts the agent.
Cascading errors — In multi-agent systems, one bad output propagates and poisons every downstream agent. Fix: validation gates between agents, confidence thresholds, explicit error states.
loops overflow hallucination cascades
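One defense against the infinite-loop failure mode is a simple repeated-call detector alongside the max-step limit. This is one possible heuristic, not a standard algorithm: flag the agent as stuck when the same (tool, arguments) pair repeats several times in a row.

```python
def detect_loop(calls, window=3):
    """Flag when the same (tool, args) call repeats `window` times in a row.

    `calls` is the agent's tool-call history as hashable (tool, args) tuples.
    """
    if len(calls) < window:
        return False
    tail = calls[-window:]
    return all(c == tail[0] for c in tail)
```

When the detector fires, the runtime can inject a corrective message ("you have repeated this call; try a different approach") or abort the run.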
When an agent does something unexpected, you need to know exactly what happened — which LLM call produced which output, which tool was called with what arguments, how long each step took.
Traces — A tree of every LLM call, tool invocation, and sub-agent call with inputs, outputs, latency, and token counts. Think of it as a flight recorder for your agent.
Spans — Each node in the trace tree. A span has a start time, end time, and metadata. Spans nest: an LLM call span may contain tool call spans inside it.
Evals — Automated tests that run your agent against a known dataset and score the outputs. Catch regressions before users do.
traces spans evals OpenTelemetry
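A toy version of nested spans fits in a context manager. This sketch links spans to parents by name for simplicity; real tracing libraries (OpenTelemetry, for example) use span IDs and exporters, but the start/end/nesting shape is the same.

```python
import time
from contextlib import contextmanager

TRACE = []  # flat list of span records; parent links recover the tree

@contextmanager
def span(name, parent=None):
    """Record a span: start time on entry, end time on exit."""
    record = {"name": name, "parent": parent, "start": time.time()}
    TRACE.append(record)
    try:
        yield record
    finally:
        record["end"] = time.time()

# Example: an LLM call span containing a tool call span.
with span("llm_call") as s:
    with span("tool_call", parent=s["name"]):
        pass  # the actual work happens here
```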