AI Agents Intro

From Chatbots to Autonomous Problem Solvers

Learn how AI agents reason, plan, and take actions to solve complex tasks autonomously. Understand the ReAct and Plan-and-Execute paradigms that power modern agentic systems.

What Are AI Agents?

Beyond Simple Chatbots - Agents That Think and Act

Simple Definition:

An AI Agent is an LLM-powered system that can reason about a task, decide what tools to use, execute actions, and observe results to accomplish a goal - all without constant human hand-holding.

Think of it like this: a regular chatbot is like a customer care executive who can only answer questions from a script. An AI agent is like a smart employee who can think on their feet, use different tools (computer, phone, email), and figure out the best way to solve your problem.

Real-World Analogy - Swiggy Order System:

Imagine you tell a Swiggy support agent: "My order is late, find out why and get me a refund."

Regular Chatbot: "Sorry for the inconvenience. Please contact support." (just text response)
AI Agent: Checks order status API -> Sees delivery partner stuck -> Checks refund policy -> Initiates refund -> Sends you confirmation. All autonomously!

The Core Agent Loop:

Perceive: Understand the user request and current context
Reason: Think about what needs to be done (this is where LLM shines)
Plan: Decide which tool/action to use next
Act: Execute the chosen tool or action
Observe: Look at the result and decide if goal is achieved
Repeat: If not done, go back to the Reason step with the new information
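
The loop above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the tools, the order ID, and the decide_next_action function (a hard-coded stand-in for the LLM's Reason and Plan steps) are all hypothetical.

```python
from typing import Callable, Optional

# Hypothetical tools for the late-order example; real ones would call APIs.
def check_order_status(order_id: str) -> str:
    return "delayed: delivery partner stuck in traffic"

def initiate_refund(order_id: str) -> str:
    return f"refund initiated for {order_id}"

TOOLS: dict[str, Callable[[str], str]] = {
    "check_order_status": check_order_status,
    "initiate_refund": initiate_refund,
}

def decide_next_action(goal: str, observations: list[str]) -> Optional[tuple[str, str]]:
    """Stand-in for the LLM (Reason + Plan). Returns (tool, argument) or None when done."""
    if not observations:
        return ("check_order_status", "ORDER-123")
    if observations[-1].startswith("delayed"):
        return ("initiate_refund", "ORDER-123")
    return None  # the last observation shows the refund went through

def run_agent(goal: str, max_steps: int = 5) -> list[str]:
    observations: list[str] = []                          # Perceive: context so far
    for _ in range(max_steps):
        action = decide_next_action(goal, observations)   # Reason + Plan
        if action is None:
            break                                         # goal achieved
        tool_name, argument = action
        result = TOOLS[tool_name](argument)               # Act
        observations.append(result)                       # Observe, then Repeat
    return observations

history = run_agent("My order is late, find out why and get me a refund.")
print(history)
```

In a real agent, decide_next_action would be an LLM call that receives the goal and the observations so far; the rest of the loop stays much the same.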

Agent vs Chatbot vs Workflow - Key Differences:

Feature	Chatbot	Workflow	AI Agent
Decision Making	Rule-based	Pre-defined steps	Dynamic, LLM-driven
Tool Usage	None	Fixed sequence	Chooses dynamically
Error Handling	Falls back to human	Retry logic	Reasons and adapts
Flexibility	Low	Medium	High

Note: AI Agents are the next evolution of LLM applications. Instead of just generating text, they can take real actions in the world - search the web, write code, call APIs, and make decisions.

The ReAct Pattern - Reasoning + Acting

ReAct: The Most Popular Agent Architecture

What is ReAct?

ReAct (Reasoning + Acting) is a paradigm where the LLM alternates between thinking (reasoning about the problem) and acting (using tools). It was introduced in a 2022 paper by Google and Princeton researchers.

The key insight: when you let an LLM "think out loud" before acting, it makes much better decisions about which tool to use and what input to provide.

ReAct Loop - Step by Step:

User: "What is the current stock price of Reliance, and what do analysts think of it?"

--- Iteration 1 ---
Thought: I need to find the current stock price of Reliance Industries.
Action: stock_price("RELIANCE.NSE")
Observation: Current price: Rs 2,847. 52-week high: Rs 3,024.

--- Iteration 2 ---
Thought: I have the price. Now I should check analyst recommendations.
Action: search("Reliance Industries stock analysis 2026")
Observation: Analysts suggest Reliance is fairly valued...

--- Iteration 3 ---
Thought: I now have both price data and analyst opinions.
Action: respond_to_user(final_answer)
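
A rough sketch of how a runtime might drive this trace is shown below. It assumes the model writes each step as plain text in the form Action: tool_name("input"), which is only one possible convention. The scripted replies stand in for real LLM calls, the tool outputs are canned values taken from the trace, and the wording of the final answer is made up for illustration.

```python
import re

# Hypothetical tools returning the canned data from the trace.
def stock_price(symbol: str) -> str:
    return "Current price: Rs 2,847. 52-week high: Rs 3,024."

def search(query: str) -> str:
    return "Analysts suggest Reliance is fairly valued..."

TOOLS = {"stock_price": stock_price, "search": search}

# Scripted replies standing in for a real LLM; one reply per iteration.
SCRIPTED_REPLIES = [
    'Thought: I need to find the current stock price of Reliance Industries.\n'
    'Action: stock_price("RELIANCE.NSE")',
    'Thought: I have the price. Now I should check analyst recommendations.\n'
    'Action: search("Reliance Industries stock analysis 2026")',
    'Thought: I now have both price data and analyst opinions.\n'
    'Action: respond_to_user("Reliance trades at Rs 2,847; analysts see it as fairly valued.")',
]

# Matches a trailing line such as: Action: search("some query")
ACTION_RE = re.compile(r'Action:\s*(\w+)\("(.*)"\)\s*$')

def fake_llm(transcript: str, turn: int) -> str:
    return SCRIPTED_REPLIES[turn]

def react(question: str, max_iterations: int = 5) -> str:
    transcript = f"User: {question}\n"
    for turn in range(max_iterations):
        reply = fake_llm(transcript, turn)           # Thought + Action
        transcript += reply + "\n"
        match = ACTION_RE.search(reply)
        if match is None:
            return "Could not parse an action."
        tool_name, argument = match.groups()
        if tool_name == "respond_to_user":           # final answer ends the loop
            return argument
        observation = TOOLS[tool_name](argument)     # run the chosen tool
        transcript += f"Observation: {observation}\n"
    return "Stopped: iteration limit reached."

answer = react("What is the current stock price of Reliance?")
```

The transcript grows by one thought, action, and observation per iteration, which is why token cost rises with every step.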

Why ReAct Works So Well:

Transparency: You can see reasoning at each step, making debugging easy
Grounding: Actions bring in real data, which reduces (but does not eliminate) hallucination
Flexibility: Agent decides dynamically which tool to use
Error Recovery: If a tool call fails, the agent can reason about alternatives

ReAct Limitations:

Sequential Thinking: One step at a time - cannot plan ahead efficiently
Can Get Stuck: Sometimes loops doing the same action repeatedly
Token Cost: Each thought + action + observation consumes tokens
No Global Plan: Does not create an upfront plan, just reacts step-by-step

Note: ReAct-style loops are a common default in agent frameworks such as LangChain and LlamaIndex. The pattern is simple, effective, and works well for tasks that need 2-5 tool calls.

Plan-and-Execute Pattern

Plan First, Then Execute - For Complex Multi-Step Tasks

What is Plan-and-Execute?

Instead of thinking one step at a time (like ReAct), Plan-and-Execute first creates a complete plan with all the steps needed, then executes each step one by one. If a step fails or new information changes things, it replans.

Think of it like this: ReAct is like navigating without a map, making decisions at each turn. Plan-and-Execute is like using Google Maps - you plan the full route first, then follow it, rerouting if there is a roadblock.

Example - Flipkart Price Comparison Agent:

User: "Compare iPhone 16 prices across Flipkart, Amazon, and Croma"

--- PLANNING PHASE ---
Plan:
  Step 1: Search Flipkart for iPhone 16 price and offers
  Step 2: Search Amazon for iPhone 16 price and offers
  Step 3: Search Croma for iPhone 16 price and offers
  Step 4: Check current bank offers on each platform
  Step 5: Compare all prices and recommend best deal

--- EXECUTION PHASE ---
Step 1: Rs 79,900 (MRP Rs 89,900), 10% HDFC discount
Step 2: Rs 78,499, 5% cashback on Amazon Pay
Step 3: Rs 81,990, exchange bonus Rs 5,000

--- REPLANNING ---
Exchange offers change the math. Updating plan...

Final Answer: Best deal is Amazon at Rs 78,499...
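
The control flow of this example can be sketched as follows. It is a simplified illustration under stated assumptions: make_plan and maybe_replan are hard-coded stand-ins for planner and replanner LLM calls, the step results are canned values copied from the trace, and the extra step added during replanning is hypothetical.

```python
from dataclasses import dataclass, field

# Canned results copied from the trace; a real executor would call tools.
CANNED_RESULTS = {
    "Search Flipkart for iPhone 16 price and offers": "Rs 79,900 (MRP Rs 89,900), 10% HDFC discount",
    "Search Amazon for iPhone 16 price and offers": "Rs 78,499, 5% cashback on Amazon Pay",
    "Search Croma for iPhone 16 price and offers": "Rs 81,990, exchange bonus Rs 5,000",
}
EXTRA_STEP = "Recompute Croma price with the exchange offer"  # hypothetical replan step

@dataclass
class PlanState:
    pending: list[str]
    results: dict[str, str] = field(default_factory=dict)
    replans: int = 0

def make_plan(task: str) -> list[str]:
    """Stand-in for the planner LLM call."""
    return list(CANNED_RESULTS) + [
        "Check current bank offers on each platform",
        "Compare all prices and recommend best deal",
    ]

def execute_step(step: str) -> str:
    return CANNED_RESULTS.get(step, f"done: {step}")

def maybe_replan(state: PlanState, result: str) -> None:
    """Stand-in for the replanner: add a step when new information appears."""
    already_handled = EXTRA_STEP in state.pending or EXTRA_STEP in state.results
    if "exchange bonus" in result and not already_handled:
        state.pending.insert(0, EXTRA_STEP)
        state.replans += 1

def plan_and_execute(task: str) -> PlanState:
    state = PlanState(pending=make_plan(task))   # planning phase
    while state.pending:                         # execution phase
        step = state.pending.pop(0)
        result = execute_step(step)
        state.results[step] = result
        maybe_replan(state, result)              # replanning, if needed
    return state

state = plan_and_execute("Compare iPhone 16 prices across Flipkart, Amazon, and Croma")
```

The key difference from ReAct is that the list of pending steps exists before any tool runs, and replanning edits that list instead of deciding each action from scratch.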

When to Use Which Pattern:

Scenario	Best Pattern	Why
Simple Q&A with 1-2 tools	ReAct	Planning overhead not worth it
Research tasks (5+ steps)	Plan-and-Execute	Needs structured approach
Data analysis pipelines	Plan-and-Execute	Clear sequential dependencies
Real-time customer support	ReAct	Fast response needed

Note: Plan-and-Execute shines for complex tasks with 5+ steps. The upfront planning can reduce errors and wasted tool calls compared to pure ReAct, at the cost of an extra planning call.

Tool Calling - The Hands of an Agent

How Agents Interact With the Real World

What is Tool Calling?

Tool calling (function calling) is the mechanism by which an LLM requests the execution of external functions. The LLM never runs anything itself: instead of hallucinating an answer, the agent says "I need to call this function with these parameters" and the surrounding system executes it.

Modern LLMs like GPT-4, Claude, and Gemini output structured tool calls as part of their response.
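
To make the hand-off concrete, here is a minimal sketch of the runtime side. The JSON shape (name plus arguments) and the returned message format are simplified assumptions and not any specific vendor's API; get_order_status is a hypothetical tool.

```python
import json

def get_order_status(order_id: str) -> dict:
    # Hypothetical tool; a real one would query an orders API.
    return {"order_id": order_id, "status": "out_for_delivery"}

REGISTRY = {"get_order_status": get_order_status}

def handle_tool_call(raw_call: str) -> dict:
    """Parse a model-emitted tool call, run it, and build an observation message."""
    call = json.loads(raw_call)
    name, arguments = call["name"], call["arguments"]
    if name not in REGISTRY:
        # Return an error the model can reason about instead of crashing.
        content = {"error": f"unknown tool '{name}'", "available": sorted(REGISTRY)}
    else:
        content = REGISTRY[name](**arguments)
    return {"role": "tool", "name": name, "content": json.dumps(content)}

model_output = '{"name": "get_order_status", "arguments": {"order_id": "ORDER-123"}}'
message = handle_tool_call(model_output)
```

The returned message is appended to the conversation so the model sees the result on its next turn.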

Types of Tools Agents Use:

Information Retrieval: Web search, database queries, API calls, RAG retrieval
Data Processing: Calculator, code interpreter, data transformation
Communication: Send email, Slack message, push notification
System Actions: Create file, deploy code, update database record
Human-in-the-Loop: Ask user for clarification, get approval

Best Practices for Tool Design:

Clear Names: search_products not sp
Rich Descriptions: Explain when to use the tool and what it returns
Typed Parameters: Use JSON Schema with types and constraints
Helpful Errors: Return error messages the LLM can reason about
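Limit Count: Keep the tool set small (roughly 10-15 tools per agent as a rule of thumb); many similar tools make it harder for the LLM to pick the right one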
Idempotent: Tools should be safe to retry on failure
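
As an illustration of several of these practices together, the sketch below defines one tool with a clear name, a rich description, and JSON Schema style parameters, plus a tiny hand-rolled argument check that returns errors the LLM can act on. The tool itself and the validator are hypothetical; a real system would more likely use a full JSON Schema validation library.

```python
SEARCH_PRODUCTS_TOOL = {
    "name": "search_products",
    "description": (
        "Search the product catalogue by keyword. Use when the user asks about "
        "availability or price. Returns a list of matching product names."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "minLength": 1},
            "max_results": {"type": "integer", "minimum": 1, "maximum": 20},
        },
        "required": ["query"],
    },
}

PYTHON_TYPES = {"string": str, "integer": int}

def validate_arguments(tool: dict, arguments: dict) -> list[str]:
    """Return human-readable problems; an empty list means the call is valid."""
    schema = tool["parameters"]
    problems = []
    for name in schema["required"]:
        if name not in arguments:
            problems.append(f"missing required parameter '{name}'")
    for name, value in arguments.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unknown parameter '{name}'")
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, PYTHON_TYPES[spec["type"]]):
            problems.append(f"'{name}' must be of type {spec['type']}")
            continue
        if "minimum" in spec and value < spec["minimum"]:
            problems.append(f"'{name}' must be >= {spec['minimum']}")
        if "maximum" in spec and value > spec["maximum"]:
            problems.append(f"'{name}' must be <= {spec['maximum']}")
        if "minLength" in spec and len(value) < spec["minLength"]:
            problems.append(f"'{name}' must not be empty")
    return problems

ok = validate_arguments(SEARCH_PRODUCTS_TOOL, {"query": "iPhone 16", "max_results": 5})
bad = validate_arguments(SEARCH_PRODUCTS_TOOL, {"max_results": 50})
```

Feeding the problem list back to the model as the tool result gives it a chance to correct the call on the next iteration.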

Note: The quality of your tool definitions directly impacts agent performance. Invest time in writing clear descriptions and well-typed parameters.

Building a Production Agent - Architecture

From Theory to Practice

Key Implementation Details:

System Prompt: Defines the agent persona, available tools, safety constraints, and output format. A bad system prompt = a bad agent.
Message History: Full conversation (thoughts + actions + observations) passed to LLM each iteration. Token usage grows with each step.
Max Iterations: Always set a limit (e.g., 10) to prevent infinite loops.
Token Budget: Track usage per run. Set cost limits to prevent runaway spending.
Timeout: Set per-tool and total timeouts.
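
The limits above can be enforced with a small guard object that the agent loop consults once per iteration. This is a sketch with made-up default values; per-tool timeouts are left out and would normally be handled where each tool is invoked.

```python
import time
from dataclasses import dataclass

@dataclass
class RunLimits:
    max_iterations: int = 10
    max_tokens: int = 20_000      # illustrative budget, not a recommendation
    max_seconds: float = 60.0

class LimitExceeded(Exception):
    pass

class RunGuard:
    def __init__(self, limits: RunLimits):
        self.limits = limits
        self.iterations = 0
        self.tokens_used = 0
        self.started_at = time.monotonic()

    def check(self, tokens_this_step: int) -> None:
        """Call once per agent iteration; raises when any limit is crossed."""
        self.iterations += 1
        self.tokens_used += tokens_this_step
        if self.iterations > self.limits.max_iterations:
            raise LimitExceeded("max iterations reached")
        if self.tokens_used > self.limits.max_tokens:
            raise LimitExceeded("token budget exhausted")
        if time.monotonic() - self.started_at > self.limits.max_seconds:
            raise LimitExceeded("total timeout reached")

guard = RunGuard(RunLimits(max_iterations=10, max_tokens=1000))
stop_reason = "finished"
try:
    for _ in range(5):                       # pretend each step costs 400 tokens
        guard.check(tokens_this_step=400)
except LimitExceeded as exc:
    stop_reason = str(exc)
```

Here the run stops on the third step because 1,200 tokens exceeds the 1,000-token budget, well before the iteration limit is reached.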

Common Pitfalls:

Infinite Loops: Agent keeps calling same tool. Solution: detect repetition, force termination.
Hallucinated Tools: Agent invents non-existent tools. Solution: strong system prompt + validation.
Context Overflow: Too many iterations fill context. Solution: summarize older turns.
No Guardrails: Agent takes destructive actions. Solution: human-in-the-loop for dangerous actions.
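
Two of these fixes are easy to sketch: repetition detection and trimming old turns. The thresholds are arbitrary, and the placeholder summary line stands in for what would normally be an LLM-written summary.

```python
from collections import Counter

def is_stuck(actions: list[tuple[str, str]], max_repeats: int = 3) -> bool:
    """True if any identical (tool, argument) pair was issued max_repeats times or more."""
    return any(count >= max_repeats for count in Counter(actions).values())

def compact_history(messages: list[str], keep_recent: int = 4) -> list[str]:
    """Keep the most recent turns and replace older ones with a single summary line."""
    if len(messages) <= keep_recent:
        return messages
    older = messages[:-keep_recent]
    summary = f"[summary of {len(older)} earlier messages]"  # placeholder summary
    return [summary] + messages[-keep_recent:]

looping = is_stuck([("search", "refund policy")] * 3)
compacted = compact_history([f"m{i}" for i in range(6)], keep_recent=4)
```

When is_stuck returns True, the runtime can stop the run or inject a message telling the model to try a different approach.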

Advanced Patterns:

Reflexion: Agent reflects on feedback from an earlier attempt and carries that reflection into the next attempt to self-correct
LATS (Tree Search): Explores multiple action paths and picks the best one
ReWOO: Plans all tool calls upfront without waiting for intermediate observations, then executes them and combines the results - fewer LLM calls but less flexible
Human-in-the-Loop: Agent pauses for human approval before high-stakes actions
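
As one example from this list, a ReWOO-style run can be sketched as a plan whose later steps refer to earlier results through placeholders such as #E1. The tools, the product, and the numbers below are hypothetical, and the plan is hard-coded where a planner LLM would normally produce it in a single call.

```python
import re

# Hypothetical tools with canned behaviour.
def lookup_price(item: str) -> str:
    return {"phone": "50000"}.get(item, "0")

def apply_discount(expression: str) -> str:
    price, percent = expression.split(",")
    return str(int(price) * (100 - int(percent)) // 100)

TOOLS = {"lookup_price": lookup_price, "apply_discount": apply_discount}

# Each step: (evidence id, tool name, tool input). "#E1" refers to step 1's output.
PLAN = [
    ("#E1", "lookup_price", "phone"),
    ("#E2", "apply_discount", "#E1,10"),
]

def run_plan(plan: list[tuple[str, str, str]]) -> dict[str, str]:
    evidence: dict[str, str] = {}
    for evidence_id, tool_name, tool_input in plan:
        # Substitute earlier evidence into the input before calling the tool.
        resolved = re.sub(r"#E\d+", lambda m: evidence[m.group(0)], tool_input)
        evidence[evidence_id] = TOOLS[tool_name](resolved)
    return evidence

evidence = run_plan(PLAN)
```

No LLM call happens between the steps; a final solver call would read the collected evidence and write the answer.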

Note: Never deploy an agent without proper guardrails. Always have human approval for irreversible actions (sending emails, modifying databases, deploying code).
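
A minimal approval gate might look like the sketch below. The set of irreversible tool names and the approve callback are assumptions; in practice the callback could open a ticket, post to a chat channel, or prompt an operator.

```python
from typing import Callable

# Hypothetical names for tools whose effects cannot be undone.
IRREVERSIBLE_TOOLS = {"send_email", "update_database", "deploy_code"}

def execute_with_approval(
    tool_name: str,
    arguments: dict,
    run_tool: Callable[[str, dict], str],
    approve: Callable[[str, dict], bool],
) -> str:
    """Run the tool, but ask a human first if the action is irreversible."""
    if tool_name in IRREVERSIBLE_TOOLS and not approve(tool_name, arguments):
        return f"blocked: '{tool_name}' was not approved by a human reviewer"
    return run_tool(tool_name, arguments)

def fake_run_tool(tool_name: str, arguments: dict) -> str:
    return f"{tool_name} executed"

denied = execute_with_approval("send_email", {"to": "user@example.com"}, fake_run_tool, lambda n, a: False)
allowed = execute_with_approval("search", {"query": "refund policy"}, fake_run_tool, lambda n, a: False)
```

Read-only tools pass straight through, so the approval step only adds latency where the risk justifies it.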

Interview Questions - AI Agents

Q: What is the difference between ReAct and Plan-and-Execute?

ReAct interleaves thinking and acting one step at a time - it is reactive. Plan-and-Execute creates a full plan upfront, then executes each step, replanning if needed. ReAct is simpler for easy tasks; Plan-and-Execute is better for complex multi-step tasks needing a coherent strategy.

Q: How do you prevent an AI agent from going into infinite loops?

(1) Set a max iteration limit (e.g., 10 steps). (2) Detect repeated actions. (3) Set a token budget per run. (4) Implement timeouts at per-tool and total-run levels. (5) Use a supervisor LLM that can intervene.

Q: What guardrails would you put on a production AI agent?

(1) Human approval for irreversible actions. (2) Rate limiting on tool calls. (3) Input/output validation. (4) Sandboxed execution for code tools. (5) Audit logging. (6) Cost monitoring with kill switch. (7) Content filtering on outputs.

Q: How does tool calling work in modern LLMs?

Modern LLMs are fine-tuned to output structured tool call objects. You pass tool definitions (name, description, JSON Schema parameters) with the request, typically in a dedicated tools field of the API call rather than as free-form prompt text. The LLM responds with structured JSON containing the tool name and arguments. The runtime executes the call and feeds the result back as an observation message.
