AI & AutomationFree to read

OpenAI Agents SDK

Build Production AI Agents the OpenAI Way

Learn OpenAI's official Agents SDK for building multi-agent systems with tool use, handoffs, and guardrails. The evolution from Swarm to a production-ready agent framework.

What is OpenAI Agents SDK?

OpenAI's Official Framework for Building AI Agents

Analogy - The Office Team

Think of an office where you have a receptionist, an accountant, and a manager. When a customer calls, the receptionist handles simple queries. For billing questions, the receptionist hands off to the accountant. For escalations, the accountant hands off to the manager. Each person (agent) has their own skills (tools) and knows when to transfer. That is exactly how OpenAI Agents SDK works - multiple specialized agents coordinating through handoffs.

The SDK in Simple Terms:

OpenAI Agents SDK (released March 2025) is OpenAI's official Python framework for building AI agents. It evolved from Swarm (an experimental multi-agent framework) into a production-ready SDK. It gives you three core primitives:

Agents: LLMs configured with instructions, tools, and handoff targets
Handoffs: The ability for one agent to transfer a conversation to another specialized agent
Guardrails: Input/output validation to ensure agents behave safely

Why Not Just Use the API Directly?

You CAN build agents using raw OpenAI API calls + your own loop. But the SDK handles the boring stuff: tool call parsing, conversation state management, multi-agent routing, tracing, and error handling. Like using React instead of raw DOM manipulation - same result, much less code.

Note: The Agents SDK is OpenAI's opinionated way to build agents. It is lightweight, minimal abstraction, and designed to be easy to learn in minutes but powerful enough for production.

Core Concepts - Agents, Handoffs, Guardrails

The Three Building Blocks

1. Agent - The Worker:

An Agent wraps an LLM with specific instructions, tools, and handoff targets. Think of it as creating a specialized employee with a job description and a toolkit.

Instructions: System prompt telling the agent who it is and what it should do
Tools: Python functions the agent can call (decorated with @function_tool)
Handoffs: Other agents this agent can transfer conversations to
Model: Which OpenAI model to use (gpt-4o, gpt-4o-mini, etc.)

2. Handoffs - The Transfer Mechanism:

Handoffs let one agent transfer control to another. The first agent recognizes "this is not my area" and passes the conversation to a more specialized agent. Like a hospital receptionist sending you to the right specialist.

Agent A can hand off to Agent B, C, or D based on the conversation
The conversation history travels with the handoff
The receiving agent picks up seamlessly
Handoffs can be conditional or always available

3. Guardrails - The Safety Net:

Guardrails run in parallel with the agent to validate inputs and outputs. They can block harmful requests before the agent processes them, or catch unsafe outputs before they reach the user.

Input guardrails: Check if the user's message is safe/appropriate before processing
Output guardrails: Validate the agent's response before sending to user
Run asynchronously - do not slow down the agent
Can use a separate (cheaper/faster) LLM for validation

Note: These three primitives - Agents, Handoffs, and Guardrails - are intentionally simple. Complex behaviors emerge from composing these simple building blocks together.

The Agent Loop & Runner

How the SDK Orchestrates Everything

The Runner:

The Runner is the engine that drives your agents. You give it a starting agent and user input, and it manages the entire agentic loop:

Step 1: Send user message + tools to the LLM
Step 2: LLM responds with text or tool calls
Step 3: If tool call - execute the tool, append result, go back to Step 1
Step 4: If handoff - switch to new agent, go back to Step 1
Step 5: If text response - return final result to user

Three Ways to Run:

Runner.run(): Async, returns final result. Best for simple use cases.
Runner.run_sync(): Synchronous version. Good for scripts and notebooks.
Runner.run_streamed(): Streams events as they happen. Best for real-time UIs where you want to show progress.

Built-in Tracing:

Every agent run is automatically traced. You can see each LLM call, tool execution, handoff, and guardrail check in the OpenAI dashboard. This is invaluable for debugging. You can also export traces to external tools like Langfuse or Arize Phoenix.

Note: The Runner handles all the complexity of the agent loop - tool execution, handoffs, error recovery, and tracing. You just define agents and let the Runner orchestrate.

Building a Multi-Agent System

Real-World Architecture Example

Customer Support System:

Imagine building a customer support system for an Indian e-commerce company like Flipkart:

Triage Agent (gpt-4o-mini - cheap, fast)
  Instructions: "Route customer queries to the right team"
  Handoffs: [OrderAgent, PaymentAgent, ReturnAgent]
  |
  +-- Order Agent (gpt-4o)
  |   Instructions: "Handle order tracking and delivery queries"
  |   Tools: [track_order, get_delivery_status, contact_courier]
  |
  +-- Payment Agent (gpt-4o)
  |   Instructions: "Handle payment, refund, and billing issues"
  |   Tools: [check_payment_status, initiate_refund, verify_upi]
  |   Guardrails: [verify_customer_identity]
  |
  +-- Return Agent (gpt-4o)
      Instructions: "Handle product returns and exchanges"
      Tools: [check_return_eligibility, create_return_request]
      Handoffs: [PaymentAgent]  // for refund after return

How It Flows:

Customer: "Mera order 3 din se nahi aaya, refund chahiye"
Triage Agent recognizes: order + refund issue, hands off to Order Agent
Order Agent checks order status using track_order tool
Finds order is delayed. Since refund is needed, hands off to Payment Agent
Payment Agent verifies customer identity (guardrail), then initiates refund
Final response: "Aapka refund Rs 1,299 initiate ho gaya hai. 3-5 din mein UPI mein aa jayega."

Cost Optimization Tip:

Use gpt-4o-mini for the triage agent (cheap, fast, good at routing). Use gpt-4o for specialized agents that need deep reasoning. This way, simple queries cost almost nothing, and only complex ones use the expensive model.

Note: Start with 2-3 agents. Adding more agents increases complexity and debugging difficulty. Each agent should have a clear, focused responsibility.

Limitations & When Not to Use

Know the Boundaries

OpenAI Lock-in:

The Agents SDK only works with OpenAI models. If you want to use Claude, Gemini, Llama, or open-source models, you need a different framework. This is a deliberate choice by OpenAI.

Python Only:

Currently the SDK is Python-only. If your backend is Node.js, Java, Go, or Rust, you cannot use this SDK directly. You would need a Python microservice or use a different approach.

Handoff Limitations:

Handoffs are powerful but have constraints. The conversation history can grow large with multiple handoffs (increasing cost). Handoff loops (A hands to B, B hands back to A) can occur if not carefully designed. Always set a maximum number of turns.

When to Use Something Else:

Multi-model: Need Claude + GPT + Gemini? Use LangChain or custom orchestration
Complex workflows: Need DAGs, parallel execution, human-in-the-loop approvals? Consider LangGraph
Enterprise features: Need advanced auth, audit logs, compliance? Look at Semantic Kernel or enterprise platforms

Note: The Agents SDK is great for OpenAI-centric projects with straightforward multi-agent needs. For complex orchestration or multi-provider setups, consider alternatives.

Interview Questions - OpenAI Agents SDK

Q: What are the three core primitives of the OpenAI Agents SDK?

(1) Agents - LLMs configured with instructions, tools, and handoff targets. Each agent is a specialized worker. (2) Handoffs - mechanism for one agent to transfer conversation control to another agent. Enables multi-agent routing. (3) Guardrails - input/output validation that runs in parallel to ensure safety. These three simple primitives compose together to build complex agentic systems.

Q: How do handoffs work and why are they useful?

Handoffs let one agent transfer a conversation to another specialized agent. The first agent recognizes that a query is outside its expertise and routes it to the right specialist - like a hospital receptionist sending you to the correct department. The full conversation history travels with the handoff so the receiving agent has context. This enables building systems with specialized agents rather than one monolithic agent.

Q: What is the main limitation of the OpenAI Agents SDK?

The biggest limitation is vendor lock-in - it only works with OpenAI models. You cannot use Claude, Gemini, or open-source models. It is also Python-only. For multi-model setups or non-Python backends, you need alternatives like LangChain, LangGraph, or custom orchestration. However, if you are committed to OpenAI, the SDK is the simplest and most well-integrated option.

Frequently Asked Questions

What is OpenAI Agents SDK?

Learn OpenAI's official Agents SDK for building multi-agent systems with tool use, handoffs, and guardrails. The evolution from Swarm to a production-ready agent framework.

How does OpenAI Agents SDK work?

OpenAI's Official Framework for Building AI Agents Analogy - The Office Team Think of an office where you have a receptionist, an accountant, and a manager. When a customer calls, the receptionist handles simple queries.

Browse all AI & Automation topics →

Practice this on DevInterviewMaster

Read the full OpenAI Agents SDK breakdown with interactive demos, quizzes, and Hinglish notes.

Open the interactive topic →

800+ system-design, LLD, coding, and design-pattern topics. Unlock everything with Pro (₹499, one-time) or Ultimate (₹999, one-time) — lifetime access, no subscription.

OpenAI Agents SDK

What is OpenAI Agents SDK?

Core Concepts - Agents, Handoffs, Guardrails

The Agent Loop & Runner

Building a Multi-Agent System

Limitations & When Not to Use

Interview Questions - OpenAI Agents SDK

Frequently Asked Questions

Related topics

Practice this on DevInterviewMaster