| name | openai-agents-sdk |
| description | Expert guidance for building multi-agent AI applications using the OpenAI Agents SDK for Python. Use when (1) creating agents with handoffs, tools, guardrails, or sessions, (2) implementing structured outputs with Pydantic models, (3) building agentic workflows, (4) debugging and tracing agent execution, (5) working with provider-agnostic LLM applications (OpenAI, Anthropic, LiteLLM), or (6) implementing customer support, legal research, financial analysis, or autonomous task completion systems. |
OpenAI Agents SDK
A lightweight, powerful framework for building multi-agent AI workflows in Python.
Installation
pip install openai-agents
# With LiteLLM for multi-provider support
pip install "openai-agents[litellm]"
Set your API key:
export OPENAI_API_KEY=your-key
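Alternatively, the key can be set in code with the SDK's set_default_openai_key helper. A minimal sketch (the key string is a placeholder; load real keys from your secrets store):
from agents import set_default_openai_key
set_default_openai_key("sk-...")  # placeholder value, not a real key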
Quick Start
from agents import Agent, Runner
agent = Agent(
name="Assistant",
instructions="You are a helpful assistant",
)
result = Runner.run_sync(agent, "Hello!")
print(result.final_output)
Core Primitives
| Primitive | Purpose |
|---|---|
| Agent | LLM with instructions and tools |
| Tools | Python functions agents can call |
| Handoffs | Delegate to specialized agents |
| Guardrails | Validate inputs/outputs |
| Sessions | Maintain conversation history |
| Runner | Execute agent workflows |
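These primitives compose within a single run. A hedged sketch (the agent names and the lookup_order tool are illustrative) showing a tool-using agent, a handoff target, and a session driven by the Runner:
from agents import Agent, Runner, SQLiteSession, function_tool
@function_tool
def lookup_order(order_id: str) -> str:
    """Illustrative tool; replace with a real lookup."""
    return f"Order {order_id}: shipped"
faq_agent = Agent(name="FAQ", instructions="Answer product questions")
support = Agent(
    name="Support",
    instructions="Help with orders; hand off product questions to FAQ",
    tools=[lookup_order],
    handoffs=[faq_agent],
)
session = SQLiteSession("customer_42")  # in-memory SQLite by default
result = Runner.run_sync(support, "Where is order 1234?", session=session)
print(result.final_output)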
Agent with Tools
from agents import Agent, Runner, function_tool
@function_tool
def get_weather(city: str) -> str:
"""Get weather for a city."""
return f"Weather in {city}: Sunny, 72F"
agent = Agent(
name="Weather Bot",
instructions="Help with weather questions",
tools=[get_weather],
)
result = Runner.run_sync(agent, "What's the weather in NYC?")
Structured Output
from pydantic import BaseModel
from agents import Agent, Runner
class WeatherReport(BaseModel):
city: str
temperature: float
conditions: str
agent = Agent(
name="Weather Reporter",
instructions="Extract weather data",
output_type=WeatherReport,
)
result = Runner.run_sync(agent, "NYC is sunny and 72 degrees")
report: WeatherReport = result.final_output
Multi-Agent Handoffs
from agents import Agent
billing = Agent(name="Billing", instructions="Handle billing issues")
technical = Agent(name="Technical", instructions="Handle tech issues")
triage = Agent(
name="Triage",
instructions="Route to billing or technical support",
handoffs=[billing, technical],
)
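For finer control over how a handoff is exposed to the model (tool name, description, a callback when it fires), wrap the target agent with the handoff() helper. A brief sketch reusing the billing and technical agents above:
from agents import Agent, handoff
def on_billing_handoff(ctx):
    print("Escalating to billing")
triage = Agent(
    name="Triage",
    instructions="Route to billing or technical support",
    handoffs=[
        handoff(agent=billing, tool_name_override="transfer_to_billing", on_handoff=on_billing_handoff),
        technical,
    ],
)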
Sessions (Conversation Memory)
from agents import Agent, Runner, SQLiteSession
session = SQLiteSession("user_123", "chats.db")
agent = Agent(name="Assistant", instructions="Be helpful")
# Remembers across calls
result = await Runner.run(agent, "My name is Alice", session=session)
result = await Runner.run(agent, "What's my name?", session=session)
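Runner.run is a coroutine, so the calls above belong inside an async function (Runner.run_sync also accepts session=). A minimal wrapper, assuming the agent and session defined above:
import asyncio
async def main():
    result = await Runner.run(agent, "My name is Alice", session=session)
    result = await Runner.run(agent, "What's my name?", session=session)
    print(result.final_output)  # should recall "Alice" from the session
asyncio.run(main())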
Custom OpenAI-Compatible Providers
Use OpenAIChatCompletionsModel for custom OpenAI-compatible APIs (Google Gemini, local models, etc.):
import os
from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel, set_tracing_disabled
# Disable tracing (the trace exporter uploads to OpenAI and needs an OpenAI API key)
set_tracing_disabled(True)
# Configure client for Google's Generative Language API
client = AsyncOpenAI(
api_key=os.getenv("GEMINI_API_KEY"),
base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)
# Create model with custom client
llm_model = OpenAIChatCompletionsModel(
model="gemini-2.5-flash",
openai_client=client,
)
# Use in agent
agent = Agent(
name="Gemini Agent",
instructions="You are a helpful assistant",
model=llm_model,
tools=[...],
)
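If every agent in the process should talk to the same OpenAI-compatible endpoint, the per-agent model object can be replaced with global configuration. A sketch using the SDK's set_default_openai_client and set_default_openai_api helpers (most compatible providers implement the Chat Completions API rather than the Responses API):
from agents import set_default_openai_client, set_default_openai_api
set_default_openai_client(client)           # the AsyncOpenAI client from above
set_default_openai_api("chat_completions")  # the default is the Responses API
agent = Agent(
    name="Gemini Agent",
    instructions="You are a helpful assistant",
    model="gemini-2.5-flash",  # a plain model name now routes through the default client
)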
Multi-Provider (LiteLLM)
Alternative approach using LiteLLM for multi-provider support:
from agents import Agent
# Anthropic Claude
claude = Agent(
name="Claude",
instructions="Be helpful",
model="litellm/anthropic/claude-sonnet-4-20250514",
)
# Google Gemini via LiteLLM
gemini = Agent(
name="Gemini",
instructions="Be helpful",
model="litellm/gemini/gemini-2.5-flash",
)
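The litellm/ model-string prefix requires the litellm extra from the Installation section. The extension module also exposes an explicit LitellmModel class if you prefer to pass the provider API key directly; a brief sketch:
import os
from agents import Agent
from agents.extensions.models.litellm_model import LitellmModel
claude = Agent(
    name="Claude",
    instructions="Be helpful",
    model=LitellmModel(
        model="anthropic/claude-sonnet-4-20250514",
        api_key=os.getenv("ANTHROPIC_API_KEY"),
    ),
)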
Fallback Model Pattern
Create a custom Model implementation that switches to a secondary provider once the primary provider's quota or rate limit is exhausted:
import os
from collections.abc import AsyncIterator
from openai import AsyncOpenAI, RateLimitError, APIStatusError
from agents import Agent, OpenAIChatCompletionsModel, ModelSettings
from agents.models.interface import Model, ModelTracing
from agents.agent_output import AgentOutputSchemaBase
from agents.handoffs import Handoff
from agents.items import ModelResponse, TResponseInputItem, TResponseStreamEvent
from agents.tool import Tool
class FallbackModel(Model):
"""Model wrapper that falls back to secondary provider on rate limit errors."""
def __init__(self, primary_model: Model, fallback_model: Model):
self.primary_model = primary_model
self.fallback_model = fallback_model
self._use_fallback = False
async def get_response(
self,
system_instructions: str | None,
input: str | list[TResponseInputItem],
model_settings: ModelSettings,
tools: list[Tool],
output_schema: AgentOutputSchemaBase | None,
handoffs: list[Handoff],
tracing: ModelTracing,
*,
previous_response_id: str | None,
conversation_id: str | None,
prompt=None,
) -> ModelResponse:
if self._use_fallback and self.fallback_model:
return await self.fallback_model.get_response(
system_instructions, input, model_settings, tools,
output_schema, handoffs, tracing,
previous_response_id=previous_response_id,
conversation_id=conversation_id,
prompt=prompt,
)
try:
return await self.primary_model.get_response(
system_instructions, input, model_settings, tools,
output_schema, handoffs, tracing,
previous_response_id=previous_response_id,
conversation_id=conversation_id,
prompt=prompt,
)
except (RateLimitError, APIStatusError) as e:
is_quota_error = (
isinstance(e, RateLimitError) or
(isinstance(e, APIStatusError) and e.status_code in (429, 503)) or
"quota" in str(e).lower()
)
if is_quota_error and self.fallback_model:
print(f"[Fallback] Primary API exhausted, switching to fallback")
self._use_fallback = True
return await self.fallback_model.get_response(
system_instructions, input, model_settings, tools,
output_schema, handoffs, tracing,
previous_response_id=previous_response_id,
conversation_id=conversation_id,
prompt=prompt,
)
raise
    async def stream_response(self, *args, **kwargs) -> AsyncIterator[TResponseStreamEvent]:
        # Delegate streaming to whichever model is currently active; add
        # fallback/retry handling here mirroring get_response if needed.
        model = self.fallback_model if self._use_fallback else self.primary_model
        async for event in model.stream_response(*args, **kwargs):
            yield event
# Usage: Gemini primary, DeepSeek fallback
gemini_client = AsyncOpenAI(
api_key=os.getenv("GEMINI_API_KEY"),
base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)
deepseek_client = AsyncOpenAI(
api_key=os.getenv("DEEPSEEK_API_KEY"),
base_url="https://api.deepseek.com",
)
gemini_model = OpenAIChatCompletionsModel(model="gemini-2.5-flash", openai_client=gemini_client)
deepseek_model = OpenAIChatCompletionsModel(model="deepseek-chat", openai_client=deepseek_client)
fallback_model = FallbackModel(gemini_model, deepseek_model)
agent = Agent(
name="MyAgent",
instructions="You are a helpful assistant",
model=fallback_model,
tools=[...],
)
DeepSeek Integration
DeepSeek offers two main models via an OpenAI-compatible API:
| Model | Use Case | Tool Support |
|---|---|---|
| deepseek-chat | General chat, tool calling | Yes |
| deepseek-reasoner | Complex reasoning (requires handling the reasoning_content field) | Limited |
Important: Use deepseek-chat for agents with tools. The deepseek-reasoner model requires special handling for tool calls.
from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel
deepseek_client = AsyncOpenAI(
api_key=os.getenv("DEEPSEEK_API_KEY"),
base_url="https://api.deepseek.com",
)
# Use deepseek-chat for tool-calling agents
model = OpenAIChatCompletionsModel(
model="deepseek-chat", # NOT deepseek-reasoner for tool calls
openai_client=deepseek_client,
)
agent = Agent(
name="DeepSeek Agent",
instructions="You are helpful",
model=model,
tools=[my_tool],
)
Reference Documentation
For detailed guidance on specific topics:
- agents.md - Agent creation, dynamic instructions, context injection, lifecycle hooks
- tools.md - Function tools, hosted tools, Pydantic validation, error handling
- handoffs.md - Multi-agent delegation, input filters, conditional handoffs
- guardrails.md - Input/output validation, tripwires, execution modes
- sessions.md - SQLite, SQLAlchemy, encrypted sessions, memory operations
- running.md - Runner class, streaming, configuration, exception handling
- models.md - Model settings, LiteLLM integration, multi-provider workflows
- patterns.md - Customer support, research assistant, RAG, human-in-the-loop
Common Patterns
Customer Support
from agents import Agent
from agents.extensions.handoff_prompt import RECOMMENDED_PROMPT_PREFIX
triage = Agent(
name="Triage",
instructions=f"""{RECOMMENDED_PROMPT_PREFIX}
Route customers to billing, technical, or sales.""",
handoffs=[billing_agent, tech_agent, sales_agent],
)
Input Validation
from agents import Agent, input_guardrail, GuardrailFunctionOutput
@input_guardrail
async def block_harmful(ctx, agent, input_text):
    is_harmful = check_content(input_text)  # check_content: your own moderation/classifier call
return GuardrailFunctionOutput(
output_info={"harmful": is_harmful},
tripwire_triggered=is_harmful,
)
agent = Agent(
name="Safe Agent",
instructions="Be helpful",
input_guardrails=[block_harmful],
)
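Output guardrails have the same shape, and a triggered tripwire surfaces as an exception around the run. A short sketch (the PII check is a placeholder):
from agents import (
    Agent, Runner, output_guardrail, GuardrailFunctionOutput,
    InputGuardrailTripwireTriggered, OutputGuardrailTripwireTriggered,
)
@output_guardrail
async def block_pii(ctx, agent, output):
    contains_pii = "ssn" in str(output).lower()  # placeholder check
    return GuardrailFunctionOutput(
        output_info={"pii": contains_pii},
        tripwire_triggered=contains_pii,
    )
agent = Agent(
    name="Safe Agent",
    instructions="Be helpful",
    input_guardrails=[block_harmful],
    output_guardrails=[block_pii],
)
try:
    result = Runner.run_sync(agent, "Hello")
except (InputGuardrailTripwireTriggered, OutputGuardrailTripwireTriggered):
    print("A guardrail blocked this request")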
Streaming Responses
from agents import Agent, Runner
from openai.types.responses import ResponseTextDeltaEvent
async def stream():
    # run_streamed returns a RunResultStreaming immediately (no await)
    result = Runner.run_streamed(agent, "Write a story")
    async for event in result.stream_events():
        if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
            print(event.data.delta, end="", flush=True)
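Beyond raw token deltas, stream_events() also yields higher-level run items (tool calls, tool outputs, completed messages). A brief sketch:
from agents import Agent, ItemHelpers, Runner
async def stream_items():
    result = Runner.run_streamed(agent, "What's the weather in NYC?")
    async for event in result.stream_events():
        if event.type == "run_item_stream_event":
            if event.item.type == "tool_call_item":
                print("-- Tool called")
            elif event.item.type == "tool_call_output_item":
                print(f"-- Tool output: {event.item.output}")
            elif event.item.type == "message_output_item":
                print(f"-- Message: {ItemHelpers.text_message_output(event.item)}")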
Key Classes
| Class | Import | Purpose |
|---|---|---|
| Agent | from agents import Agent | Create agents |
| Runner | from agents import Runner | Execute agents |
| function_tool | from agents import function_tool | Decorator for tools |
| handoff | from agents import handoff | Custom handoff config |
| input_guardrail | from agents import input_guardrail | Input validation |
| output_guardrail | from agents import output_guardrail | Output validation |
| SQLiteSession | from agents import SQLiteSession | Conversation memory |
| RunConfig | from agents import RunConfig | Execution settings |
| OpenAIChatCompletionsModel | from agents import OpenAIChatCompletionsModel | Custom OpenAI-compatible models |
| set_tracing_disabled | from agents import set_tracing_disabled | Disable tracing |
| Model | from agents.models.interface import Model | Base class for custom models |
| ModelTracing | from agents.models.interface import ModelTracing | Tracing configuration enum |
| ModelResponse | from agents.items import ModelResponse | Model response type |
| TResponseInputItem | from agents.items import TResponseInputItem | Input item type |
| TResponseStreamEvent | from agents.items import TResponseStreamEvent | Stream event type |