
Prompt Patterns (ReAct, ToT, Self-Refine)

Advanced Reasoning Frameworks for AI Agents

Learn the advanced prompting patterns that power modern AI agents: ReAct (Reasoning + Acting), Tree-of-Thought (exploring multiple solutions), and Self-Refine (iterative improvement). These patterns transform LLMs from simple responders into sophisticated problem solvers.

What Are Advanced Prompt Patterns?

Structured Thinking Frameworks for AI

Beyond Basic Prompting:

Basic prompting (zero-shot, few-shot, CoT) handles simple tasks well. But real-world problems are messy, multi-step, and require iteration. Advanced prompt patterns give the AI a structured framework for tackling these complex problems.

Think of it this way: basic prompting is like giving someone a task. Advanced patterns are like teaching them a problem-solving methodology - how to break down problems, explore options, take actions, and refine their work.

Real-World Analogy - Cricket Strategy:

A batsman does not just swing at every ball (basic prompting). A great batsman has patterns: read the bowler (observe), decide the shot (reason), play the shot (act), assess if it worked (evaluate), adjust strategy for next ball (refine). Advanced prompt patterns teach AI to think like a strategic cricketer, not just react.

The Three Patterns We Will Cover:

  • ReAct (Reason + Act): AI thinks about what to do, takes an action (like calling a tool), observes the result, and repeats. Powers most AI agents.
  • Tree-of-Thought (ToT): AI explores multiple solution paths simultaneously, evaluates each, and picks the best one. For creative and strategic problems.
  • Self-Refine: AI generates a response, critiques its own work, and iteratively improves it. For quality-critical outputs.

Note: These patterns are not just academic theories - they show up in real AI products. ReAct-style loops (reason, call a tool, observe the result) underlie many tool-using assistants and agents, including products built on Claude and ChatGPT. Understanding them is essential for building AI applications.

ReAct Pattern - Reasoning + Acting

The Foundation of Every AI Agent

What is ReAct?

ReAct (Reasoning + Acting) is a pattern where the AI alternates between thinking (reasoning about what to do next) and acting (taking an action like calling a tool or searching). After each action, it observes the result and reasons again. This loop continues until the task is complete.

The ReAct Loop:

  THOUGHT -> ACTION -> OBSERVATION -> THOUGHT -> ACTION -> ...

Thought: "I need to find the user's order status.
          Let me search the orders database."

Action:  search_orders(user_id="12345")

Observation: "Order #789: Shipped, arriving March 5"

Thought: "Found the order. It is shipped and arriving
          March 5. Let me format this for the user."

Action:  respond("Your order #789 has been shipped...
                  Expected delivery: March 5.")
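The loop above can be sketched in a few lines of Python. This is a minimal illustration, not a production agent: `scripted_model` is a hypothetical stand-in for a real LLM call, and `search_orders` is a fake tool backed by a dictionary. Only the Thought -> Action -> Observation control flow is the point.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool: a real agent would query an orders service here.
def search_orders(user_id: str) -> str:
    orders = {"12345": "Order #789: Shipped, arriving March 5"}
    return orders.get(user_id, "No orders found")

TOOLS: Dict[str, Callable[[str], str]] = {"search_orders": search_orders}

def scripted_model(history: List[str]) -> Tuple[str, str, str]:
    """Stand-in for an LLM call: returns (thought, action, argument)."""
    observations = [h for h in history if h.startswith("Observation: ")]
    if not observations:
        return ("I need to find the user's order status.", "search_orders", "12345")
    status = observations[-1][len("Observation: "):]
    return ("Found the order. Format it for the user.", "respond", status)

def react_loop(model, tools, max_steps: int = 5) -> Tuple[str, List[str]]:
    history: List[str] = []
    for _ in range(max_steps):
        thought, action, arg = model(history)
        history.append(f"Thought: {thought}")
        history.append(f"Action: {action}({arg!r})")
        if action == "respond":  # terminal action ends the loop
            return arg, history
        if action in tools:
            observation = tools[action](arg)
        else:
            observation = f"Error: unknown tool {action}"
        history.append(f"Observation: {observation}")
    return "Stopped: step limit reached", history

answer, trace = react_loop(scripted_model, TOOLS)
print("\n".join(trace))
print("Answer:", answer)
```

The `max_steps` cap and the "unknown tool" observation are design choices of this sketch: they keep a confused model from looping forever and give it an error it can reason about on the next step.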

Why ReAct Works So Well:

  • Grounded reasoning: The AI is less likely to hallucinate because it can base answers on real data from tool calls
  • Transparent process: You can see WHY the AI made each decision (the thought step)
  • Adaptive: If the first action does not give the right info, the AI can reason about what to try next
  • Error recovery: If a tool call fails, the AI can reason about alternatives

Real Example - Travel Agent:

User: "Plan a weekend trip to Goa under Rs 15,000"

Thought 1: I need to find flight prices to Goa.
Action 1:  search_flights("BLR", "GOA", "weekend")
Observe 1: Cheapest: Rs 3,500 round trip

Thought 2: Budget has Rs 11,500 left. Find hotels.
Action 2:  search_hotels("Goa", budget=6000, nights=2)
Observe 2: Beach Resort: Rs 5,000 for 2 nights

Thought 3: Rs 6,500 left for food and activities.
           Let me find top activities.
Action 3:  search_activities("Goa", budget=3000)
Observe 3: Water sports Rs 1,500, Old Goa tour Rs 500

Thought 4: Complete plan within budget. Let me compile.
Final:     "Here is your Goa weekend plan: ..."
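The running budget in the thoughts above is simple arithmetic, and an agent framework can track it outside the model so the numbers never drift. A small sketch, using only the prices from the trace (the final Rs 4,500 left for food is derived here, not stated in the trace):

```python
BUDGET = 15_000  # Rs, from the user's request

# (label, cost in Rs) taken from the three observations in the trace
steps = [
    ("flights (round trip)", 3_500),
    ("hotel (2 nights)", 5_000),
    ("water sports", 1_500),
    ("Old Goa tour", 500),
]

remaining = BUDGET
balances = []  # remaining budget after each purchase
for label, cost in steps:
    if cost > remaining:
        raise ValueError(f"{label} exceeds the remaining budget")
    remaining -= cost
    balances.append(remaining)
    print(f"{label:22} Rs {cost:>6,}  -> Rs {remaining:>6,} left")
```

The first two balances (Rs 11,500 and Rs 6,500) match Thought 2 and Thought 3.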

Note: A ReAct-style loop is the pattern behind most AI agents that use tools, such as Claude Code, ChatGPT plugins, and Cursor AI. If you understand ReAct, you understand the core of how these agents work.

Tree-of-Thought (ToT) - Exploring Multiple Paths

When One Path Is Not Enough

What is Tree-of-Thought?

Normal prompting follows ONE path of reasoning (linear). Tree-of-Thought (ToT) makes the AI explore MULTIPLE possible paths simultaneously, evaluate each, and select the best one. It is like a chess player thinking 3 moves ahead on multiple possible sequences.

Linear (Chain-of-Thought):
  Step 1 -> Step 2 -> Step 3 -> Answer

Tree-of-Thought:
  Step 1 -> Branch A -> Branch A.1 -> Evaluate: Score 7/10
         -> Branch B -> Branch B.1 -> Evaluate: Score 9/10 (Best!)
         -> Branch C -> Branch C.1 -> Evaluate: Score 4/10
  
  Select Branch B as the answer.
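The branching diagram above can be sketched as a small search over "thoughts". This is one possible strategy (a breadth-first search that keeps the best few candidates per level), under the assumption that you have two callables: `expand`, which proposes next thoughts, and `score`, which rates a thought. In a real system both would be LLM calls; here they are hard-coded stubs that reproduce the scores in the diagram.

```python
from typing import Callable, List, Tuple

def tree_of_thought(
    root: str,
    expand: Callable[[str], List[str]],
    score: Callable[[str], float],
    depth: int = 2,
    beam_width: int = 3,
) -> Tuple[str, float]:
    """Breadth-first search over thoughts, keeping the best `beam_width` per level."""
    frontier = [root]
    for _ in range(depth):
        candidates = [child for node in frontier for child in expand(node)]
        if not candidates:
            break
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]
    best = max(frontier, key=score)
    return best, score(best)

# Stubs standing in for LLM calls (hypothetical, for illustration only).
SCORES = {"A": 7, "B": 9, "C": 4}

def expand(thought: str) -> List[str]:
    if thought == "Step 1":
        return ["Branch A", "Branch B", "Branch C"]
    if "." not in thought:
        return [thought + ".1"]
    return []

def score(thought: str) -> float:
    letter = thought.split()[1][0]  # "Branch B.1" -> "B"
    return SCORES.get(letter, 0)

best, best_score = tree_of_thought("Step 1", expand, score)
print(best, best_score)
```

Lowering `beam_width` prunes weak branches earlier, which trades some exploration for fewer model calls.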

When to Use Tree-of-Thought:

  • Creative tasks: Writing, design, brainstorming - where multiple approaches exist
  • Strategy problems: Architecture decisions, business planning, game playing
  • Optimization: Finding the best solution among many possible ones
  • Debugging: When the bug could be in multiple places, explore each hypothesis

Practical ToT Prompt:

"I need to design a notification system for an e-commerce app.

Please generate 3 different architecture approaches.
For each approach:
1. Describe the architecture in 3-4 sentences
2. List pros and cons
3. Rate suitability 1-10 for our scale (10K users)

Then select the best approach and explain why."

AI generates:
  Approach 1: Polling-based (Score: 4/10)
  Approach 2: WebSocket (Score: 8/10)
  Approach 3: SSE + Message Queue (Score: 9/10)
  
  Recommendation: Approach 3 because...
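If the prompt asks for scores in a fixed format, application code can read them back and check the model's recommendation instead of trusting it blindly. A minimal sketch, assuming the model really does follow the "Approach N: name (Score: x/10)" layout shown above (real output may need more forgiving parsing):

```python
import re

ai_output = """\
Approach 1: Polling-based (Score: 4/10)
Approach 2: WebSocket (Score: 8/10)
Approach 3: SSE + Message Queue (Score: 9/10)
"""

PATTERN = re.compile(r"Approach (\d+): (.+?) \(Score: (\d+)/10\)")

# Each match becomes (approach number, name, score out of 10).
approaches = [
    (int(num), name, int(points))
    for num, name, points in PATTERN.findall(ai_output)
]
best = max(approaches, key=lambda a: a[2])
print(f"Highest-scoring: Approach {best[0]} ({best[1]}), {best[2]}/10")
```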

ToT vs CoT:

Feature           Chain-of-Thought   Tree-of-Thought
Paths explored    One                Multiple
Self-evaluation   No                 Yes - scores each path
Best for          Logic, math        Creative, strategic
Token cost        Lower              Higher (multiple paths)

Note: ToT is expensive in tokens because it explores multiple paths. Use it for high-value decisions where quality matters more than speed or cost.

Self-Refine - AI That Improves Its Own Work

Generate, Critique, Improve - The Quality Loop

What is Self-Refine?

Self-Refine is a pattern where the AI generates a response, then critiques its own work, then improves it based on the critique. This loop can repeat 2-3 times, each iteration producing better output. It is like a writer who drafts, edits, and polishes their work.

The Self-Refine Loop:

  GENERATE (Draft 1)
      |
  CRITIQUE ("What is wrong with this?")
      |
  REFINE (Draft 2 - addresses the critique)
      |
  CRITIQUE ("Better, but still missing X")
      |
  REFINE (Draft 3 - final polished version)

Self-Refine Prompt Structure:

Step 1 - Generate:
"Write a product description for a wireless earphone
 targeted at college students in India."

Step 2 - Critique:
"Review this product description. Score it on:
 - Clarity (1-10)
 - Persuasiveness (1-10)
 - Target audience fit (1-10)
 List specific improvements needed."

Step 3 - Refine:
"Based on the critique above, rewrite the product
 description addressing every point. Make it better."

(Repeat Steps 2-3 if needed)
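The three prompts above chain together naturally in code. The sketch below assumes a generic `llm(prompt) -> str` callable, which is a placeholder for whichever client you use; `fake_llm` is a canned stub so the example runs without any API. The prompt wording is shortened from the steps above.

```python
from typing import Callable, List

GENERATE = "Write a product description for {product} targeted at {audience}."
CRITIQUE = (
    "Review this product description. Score it on clarity, persuasiveness, "
    "and target audience fit (1-10 each). List specific improvements needed."
    "\n\n{draft}"
)
REFINE = (
    "Based on the critique below, rewrite the product description addressing "
    "every point.\n\nDraft:\n{draft}\n\nCritique:\n{critique}"
)

def self_refine(llm: Callable[[str], str], product: str, audience: str,
                rounds: int = 2) -> List[str]:
    """Return every draft, from the first attempt to the final version."""
    drafts = [llm(GENERATE.format(product=product, audience=audience))]
    for _ in range(rounds):
        critique = llm(CRITIQUE.format(draft=drafts[-1]))
        drafts.append(llm(REFINE.format(draft=drafts[-1], critique=critique)))
    return drafts

# Canned stub in place of a real model (hypothetical, for illustration).
calls: List[str] = []

def fake_llm(prompt: str) -> str:
    calls.append(prompt)
    if prompt.startswith("Write"):
        return "Draft 1"
    if prompt.startswith("Review"):
        return "Mention price and battery life."
    refinements = sum(p.startswith("Based") for p in calls)
    return f"Draft {refinements + 1}"

drafts = self_refine(fake_llm, "a wireless earphone", "college students in India")
print(drafts, len(calls))
```

Keeping every draft, not only the last one, makes it easy to compare versions and to fall back if a refinement makes things worse.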

When Self-Refine Shines:

  • Writing: Blog posts, marketing copy, documentation - where polish matters
  • Code: Generate code, then review for bugs/performance, then fix
  • Emails: Draft, check tone and clarity, refine
  • Presentations: Create slides, critique structure and flow, improve

Diminishing Returns:

Self-Refine usually shows diminishing returns after two or three iterations. The first critique catches major issues. The second catches minor ones. Further passes rarely add value and waste tokens. Know when to stop.
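One way to "know when to stop" is to track the critique score after each round and quit once the gain falls below a threshold. The helper below is a sketch under that assumption; the `min_gain` threshold and the example scores are made up for illustration.

```python
from typing import List

def rounds_worth_running(scores: List[float], min_gain: float = 0.5) -> int:
    """Return how many refinement rounds were worth keeping.

    scores[0] is the first draft's score; scores[i] is the score after round i.
    Stops at the first round whose gain over the previous draft is below min_gain.
    """
    for i in range(1, len(scores)):
        if scores[i] - scores[i - 1] < min_gain:
            return i - 1
    return len(scores) - 1

# Illustrative (made-up) scores: big jump, smaller jump, then a plateau.
scores = [5.0, 7.5, 8.5, 8.6]
print(rounds_worth_running(scores))  # the third round gained only ~0.1
```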

Note: Self-Refine often produces better output than single-pass generation. Figures of 20-40% are sometimes quoted, but the actual gain depends on the task, the model, and how quality is measured. The key is writing good critique prompts that catch real issues.

Combining Patterns - Building Intelligent Agents

Mix Patterns for Maximum Intelligence

Pattern Selection Guide:

Problem Type               Best Pattern          Why
Need real-time data        ReAct                 Can call tools/APIs
Multiple solutions exist   ToT                   Explores and evaluates
Quality-critical output    Self-Refine           Iterative improvement
Complex agent task         ReAct + Self-Refine   Act on data, polish output
Architecture decisions     ToT + Self-Refine     Explore options, polish best
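The guide above amounts to a small decision rule. As a rough sketch, it could be expressed as a function of three yes/no questions about the task; the parameter names are hypothetical, and the fallback to basic prompting is this sketch's own default for simple tasks.

```python
from typing import List

def choose_patterns(needs_live_data: bool, many_solutions: bool,
                    quality_critical: bool) -> List[str]:
    """Map task properties to the patterns suggested in the guide."""
    patterns = []
    if needs_live_data:
        patterns.append("ReAct")        # can call tools/APIs
    if many_solutions:
        patterns.append("ToT")          # explores and evaluates options
    if quality_critical:
        patterns.append("Self-Refine")  # iterative improvement
    return patterns or ["Basic prompting"]

# "Complex agent task": acts on live data and needs polished output.
print(choose_patterns(True, False, True))
# "Architecture decisions": several options, polished final write-up.
print(choose_patterns(False, True, True))
```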

Combined Example - AI Code Reviewer:

Phase 1: ReAct (Gather Context)
  Thought: Need to understand the code change
  Action:  read_file("auth-middleware.js")
  Observe: [file contents]
  Thought: Need to check related tests
  Action:  read_file("auth-middleware.test.js")
  Observe: [test file contents]

Phase 2: ToT (Explore Issues)
  Branch A: Security review -> Found SQL injection risk
  Branch B: Performance review -> Found N+1 query
  Branch C: Code style review -> Minor naming issues
  Prioritize: Security > Performance > Style

Phase 3: Self-Refine (Polish Review)
  Draft review comment
  Critique: "Too harsh, need to be constructive"
  Refine: Rewrite with suggestions, not just criticism
  
Final: Professional, thorough code review
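The three phases can be wired together as a plain pipeline. Everything below is a toy: the repository is a dictionary, the findings are copied from the example rather than produced by a model, and the "critique" is a one-line check. It is only meant to show how the output of each phase feeds the next.

```python
from typing import Dict, List, Tuple

Finding = Tuple[str, str]  # (review area, issue)

# Phase 1 (ReAct): the tool. A dict stands in for the repository.
REPO: Dict[str, str] = {
    "auth-middleware.js": "[file contents]",
    "auth-middleware.test.js": "[test file contents]",
}

def read_file(path: str) -> str:
    return REPO[path]

# Phase 2 (ToT): one branch per review angle. A real agent would prompt the
# model once per branch; these findings are copied from the example.
def explore(context: Dict[str, str]) -> List[Finding]:
    return [
        ("style", "minor naming issues"),
        ("security", "SQL injection risk"),
        ("performance", "N+1 query"),
    ]

PRIORITY = {"security": 0, "performance": 1, "style": 2}

def prioritize(findings: List[Finding]) -> List[Finding]:
    return sorted(findings, key=lambda f: PRIORITY[f[0]])

# Phase 3 (Self-Refine): draft, critique, rewrite.
def draft(findings: List[Finding]) -> str:
    return "\n".join(f"{area}: {issue}." for area, issue in findings)

def too_harsh(review: str) -> bool:
    # Toy critique: a constructive review offers at least one suggestion.
    return "Suggestion" not in review

def refine(findings: List[Finding]) -> str:
    return "\n".join(
        f"{area}: {issue}. Suggestion: happy to pair on a fix."
        for area, issue in findings
    )

context = {path: read_file(path) for path in REPO}
findings = prioritize(explore(context))
review = draft(findings)
if too_harsh(review):
    review = refine(findings)
print(review)
```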

Other Notable Patterns:

  • Reflexion: Like Self-Refine but with memory across episodes. The AI remembers what went wrong before and avoids the same mistakes.
  • Plan-and-Solve: AI creates a plan first, then executes step by step. Good for complex multi-step tasks.
  • Role-Play Debate: AI takes multiple perspectives (for/against) and debates with itself to reach a balanced conclusion.
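To make the Reflexion idea concrete, here is a minimal sketch of "memory across episodes": each failed attempt adds a lesson, and the next attempt receives all lessons so far. The `attempt` and `evaluate` callables are hypothetical stand-ins for a model call and a checker (tests, a linter, a grader), and the SQL example is invented for illustration.

```python
from typing import Callable, List, Tuple

def run_with_reflection(
    attempt: Callable[[List[str]], str],
    evaluate: Callable[[str], Tuple[bool, str]],
    max_episodes: int = 3,
) -> Tuple[str, List[str]]:
    lessons: List[str] = []  # memory that persists across episodes
    result = ""
    for _ in range(max_episodes):
        result = attempt(lessons)  # the attempt sees every earlier lesson
        ok, feedback = evaluate(result)
        if ok:
            break
        lessons.append(feedback)
    return result, lessons

# Stubs: the "model" only improves once it has been told what went wrong.
def attempt(lessons: List[str]) -> str:
    if lessons:
        return "SELECT id, status FROM orders"
    return "SELECT * FROM orders"

def evaluate(query: str) -> Tuple[bool, str]:
    if "*" in query:
        return False, "Avoid SELECT *; name the columns."
    return True, ""

result, lessons = run_with_reflection(attempt, evaluate)
print(result)
print(lessons)
```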

Note: The best AI systems combine multiple patterns. Use ReAct for data gathering, ToT for decision making, and Self-Refine for output quality. Match the pattern to the problem.

Interview Questions

Q: What is the ReAct pattern and why is it important?

ReAct (Reasoning + Acting) alternates between thinking, acting, and observing. The AI reasons about what to do, takes an action (tool call), observes the result, and reasons again. It is the foundation of most tool-using AI agents (Claude Code, ChatGPT plugins, etc.) because it grounds the AI in real data and enables adaptive multi-step problem solving.

Q: How does Tree-of-Thought differ from Chain-of-Thought?

Chain-of-Thought follows ONE linear reasoning path. Tree-of-Thought explores MULTIPLE paths simultaneously, evaluates each, and selects the best. CoT is cheaper and great for math/logic. ToT is more expensive but better for creative, strategic, and optimization problems where multiple valid approaches exist and you want the best one.

Q: What is Self-Refine and when should you use it?

Self-Refine is generate-critique-improve: the AI creates output, critiques its own work, then improves it based on the critique. Use it for quality-critical outputs like writing, code generation, and email drafting. It often produces better results than a single pass, though the size of the gain depends on the task and model. Limit to 2-3 iterations due to diminishing returns.

Q: How would you combine these patterns in a real AI agent?

Use ReAct for data gathering (call tools, read files, search), ToT for decision making (explore options, evaluate, select best), and Self-Refine for output polish (draft, critique, improve). Example: a code review agent uses ReAct to read code, ToT to explore different issues, and Self-Refine to write a professional review comment.

Q: What are the trade-offs of using advanced prompt patterns?

Trade-offs: (1) Cost - more tokens used (especially ToT and Self-Refine). (2) Latency - more reasoning steps mean slower responses. (3) Complexity - harder to debug and maintain. (4) Diminishing returns - Self-Refine helps for 2-3 iterations, then adds cost without quality gain. Use basic patterns for simple tasks and reserve advanced patterns for complex, high-value problems.
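The cost trade-off is easy to estimate by counting model calls. The formulas below are a rough model with stated assumptions, not measurements: one call per ReAct step, one generation call per step on each ToT path plus one evaluation call per path, and one critique plus one rewrite per Self-Refine round.

```python
def react_calls(steps: int) -> int:
    # Assumption: one model call per Thought/Action step.
    return steps

def tot_calls(paths: int, steps_per_path: int) -> int:
    # Assumption: one generation call per step on each path,
    # plus one evaluation call per path.
    return paths * steps_per_path + paths

def self_refine_calls(rounds: int) -> int:
    # One initial draft, then a critique call and a rewrite call per round.
    return 1 + 2 * rounds

print("Single pass :", 1)
print("ReAct       :", react_calls(steps=4))
print("ToT         :", tot_calls(paths=3, steps_per_path=2))
print("Self-Refine :", self_refine_calls(rounds=2))
```

Under these assumptions, three ToT paths of two steps each cost nine calls against one for a single pass, which is why the patterns are best reserved for high-value problems.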

