ReAct (agent pattern)

The ReAct agent pattern (short for Reasoning and Acting) is an LLM prompting and inference architecture in which the model alternates between generating explicit reasoning traces ("Thought" steps) and executing tool calls ("Action" steps), with the results of those tool calls ("Observation" steps) fed back into the model's context before the next reasoning step. This interleaved loop continues until the model determines it has sufficient information to produce a final answer. First described in a 2022 research paper, ReAct became the foundational pattern for most modern tool-using AI agents.

A concrete trace from a customer support ReAct agent handling "What is the status of order 8842?" looks like: Thought: The user wants order status. I need to look up order 8842. Action: lookup_order(order_id="8842"). Observation: Order 8842, placed 2026-06-10, status: Shipped, carrier: FedEx, tracking: 794687. Thought: I have the status. The order has shipped. Answer: Order 8842 shipped on June 10th via FedEx with tracking number 794687. Each step is visible in the model's context, creating an interpretable reasoning trail.

How the Thought-Action-Observation cycle works

The ReAct loop is powered by function calling capabilities in modern LLMs combined with a prompting structure that elicits explicit reasoning before each action. The system prompt instructs the model to produce a Thought before each Action — articulating what it knows, what it needs, and why it is choosing a particular tool. The Action is a structured tool call with typed parameters. The Observation is the tool's return value, injected into the context by the orchestration layer. The model then generates the next Thought based on the updated context.

The explicit Thought step is the key differentiator from simpler tool-using patterns. A model that calls tools without generating explicit reasoning may select tools correctly on straightforward tasks but struggles on tasks requiring multi-step logic or error recovery. When the Thought is explicit, the model can recognize inconsistencies in prior observations, decide to retry a failed action with different parameters, or determine that it has reached a dead end and should fall back to a different strategy.

ReAct vs other agent architectures

ReAct is not the only agent architecture. Understanding when to use it requires contrast with alternatives:

Plan-and-Execute: A two-phase approach where the model first generates a complete plan and then executes each step in sequence. More structured than ReAct; better suited to tasks where the full action sequence can be determined upfront. Less adaptive when observations change the appropriate course of action mid-execution.
Reflexion: An architecture that adds a retrospective self-evaluation step after task completion or failure, using the reflection to improve performance on subsequent attempts. ReAct focuses on in-context reasoning during a single attempt; Reflexion adds a learning loop across attempts.
Single-step tool use: A single LLM call with access to tools, without explicit reasoning traces. Simpler and lower-latency than ReAct; appropriate for tasks with a single, deterministic tool call rather than multi-step workflows.
Prompt chaining: A developer-specified sequence of LLM calls where each step's output feeds the next. Unlike ReAct, the sequence is determined at design time rather than dynamically at inference time. See prompt chaining for a direct comparison.

Failure modes in ReAct agents

ReAct agents fail in characteristic ways. Tool call loops are a common failure: the model enters a cycle of calling the same tool repeatedly with slightly different parameters, never converging on an answer. This is especially common when the tool returns unexpected data formats or empty results that the model interprets as requiring further investigation. Production ReAct agents require explicit loop detection and maximum-step limits to prevent runaway inference.

Reasoning trace drift is another failure pattern: over long Thought-Action-Observation sequences, the model's Thought steps begin to lose track of the original task goal and focus on intermediate artifacts. Adding periodic goal-restatement steps mitigates this in some architectures.

The transparency of the Thought-Action-Observation trace is a double-edged property. While it makes the agent's behavior more interpretable, it also exposes intermediate reasoning to prompt injection attacks: a malicious tool response could include instructions in the Observation field that redirect the agent's subsequent reasoning. Defense requires sanitizing tool outputs before injection into the agent's context.

ReAct in production systems

Most commercial agent frameworks implement ReAct or a variation of it. In production, Thoughts are suppressed from the user-facing output (they are internal reasoning) and Observation injection is handled by the framework rather than manually formatted in the prompt. AI observability tooling that captures the full Thought-Action-Observation trace for each run is essential for debugging — errors often lie in a specific Observation that produced an unexpected model response. Integrating human-in-the-loop review for agent actions where incorrect tool calls have irreversible consequences is standard in regulated deployments.

ReAct (agent pattern)

How the Thought-Action-Observation cycle works

ReAct vs other agent architectures

Failure modes in ReAct agents

ReAct in production systems

Learn more

Deliver the concierge experiences your customers deserve

Product

Industries

Resources

Company