openai-responses

@jezweb/claude-skills

SKILL.md

name openai-responses
description This skill provides comprehensive knowledge for working with OpenAI's Responses API, the unified stateful API for building agentic applications. It should be used when building AI agents that preserve reasoning across turns, integrating MCP servers for external tools, using built-in tools (Code Interpreter, File Search, Web Search, Image Generation), managing stateful conversations, implementing background processing, or migrating from Chat Completions API. Use when building agentic workflows, conversational AI with memory, tools-based applications, RAG systems, data analysis agents, or any application requiring OpenAI's reasoning models with persistent state. Covers both Node.js SDK and Cloudflare Workers implementations. Keywords: responses api, openai responses, stateful openai, openai mcp, code interpreter openai, file search openai, web search openai, image generation openai, reasoning preservation, agentic workflows, conversation state, background mode, chat completions migration, gpt-5, polymorphic outputs
license MIT

OpenAI Responses API

Status: Production Ready
Last Updated: 2025-10-25
API Launch: March 2025
Dependencies: openai@5.19.1+ (Node.js) or fetch API (Cloudflare Workers)


What Is the Responses API?

The Responses API (/v1/responses) is OpenAI's unified interface for building agentic applications, launched in March 2025. It fundamentally changes how you interact with OpenAI models by providing stateful conversations and a structured loop for reasoning and acting.

Key Innovation: Preserved Reasoning State

Unlike Chat Completions where reasoning is discarded between turns, Responses keeps the notebook open. The model's step-by-step thought processes survive into the next turn, improving performance by approximately 5% on TAUBench and enabling better multi-turn interactions.

Why Use Responses Over Chat Completions?

| Feature | Chat Completions | Responses API | Benefit |
|---|---|---|---|
| State Management | Manual (you track history) | Automatic (conversation IDs) | Simpler code, less error-prone |
| Reasoning | Dropped between turns | Preserved across turns | Better multi-turn performance |
| Tools | Client-side round trips | Server-side hosted | Lower latency, simpler code |
| Output Format | Single message | Polymorphic (messages, reasoning, tool calls) | Richer debugging, better UX |
| Cache Utilization | Baseline | 40-80% better | Lower costs, faster responses |
| MCP Support | Manual integration | Built-in | Easy external tool connections |

Quick Start (5 Minutes)

1. Get API Key

# Sign up at https://platform.openai.com/
# Navigate to API Keys section
# Create new key and save securely
export OPENAI_API_KEY="sk-proj-..."

Why this matters:

  • API key required for all requests
  • Keep secure (never commit to git)
  • Use environment variables

2. Install SDK (Node.js)

npm install openai

import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'What are the 5 Ds of dodgeball?',
});

console.log(response.output_text);

CRITICAL:

  • Always use server-side (never expose API key in client code)
  • The examples here use gpt-5 (gpt-5-mini, gpt-4o, etc. can be substituted)
  • input can be string or array of messages

3. Or Use Direct API (Cloudflare Workers)

// No SDK needed - use fetch()
const response = await fetch('https://api.openai.com/v1/responses', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${env.OPENAI_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gpt-5',
    input: 'Hello, world!',
  }),
});

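const data = await response.json();
// output_text is an SDK convenience; with raw JSON, read the text from the output array
const message = data.output.find(item => item.type === 'message');
console.log(message?.content?.[0]?.text);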

Why fetch?

  • No dependencies in edge environments
  • Full control over request/response
  • Works in Cloudflare Workers, Deno, Bun
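With the raw JSON returned by fetch(), the generated text sits inside message items in the output array. The helper below is a minimal sketch of collecting it. The RawResponse, OutputItem, and OutputTextPart types and the extractOutputText name are illustrative assumptions that model only the fields read here, and the sample payload is hand-written rather than real API output.

```typescript
// Minimal shapes for the parts of the raw JSON this helper reads.
// Only the fields used here are modelled; real items carry more.
interface OutputTextPart {
  type: string;
  text?: string;
}

interface OutputItem {
  type: string;
  content?: OutputTextPart[];
}

interface RawResponse {
  output?: OutputItem[];
}

// Concatenate every output_text part found in message items, in order.
function extractOutputText(data: RawResponse): string {
  const parts: string[] = [];
  for (const item of data.output ?? []) {
    if (item.type !== 'message') continue; // skip reasoning items, tool calls, etc.
    for (const part of item.content ?? []) {
      if (part.type === 'output_text' && typeof part.text === 'string') {
        parts.push(part.text);
      }
    }
  }
  return parts.join('');
}

// Hand-written payload for illustration only.
const sample: RawResponse = {
  output: [
    { type: 'reasoning' },
    { type: 'message', content: [{ type: 'output_text', text: 'Hello!' }] },
  ],
};
console.log(extractOutputText(sample)); // prints "Hello!"
```

Returning an empty string when no message item is present keeps callers from crashing on responses that contain only tool calls.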

Responses vs Chat Completions: Complete Comparison

When to Use Each

Use Responses API when:

  • ✅ Building agentic applications (reasoning + actions)
  • ✅ Need preserved reasoning state across turns
  • ✅ Want built-in tools (Code Interpreter, File Search, Web Search)
  • ✅ Using MCP servers for external integrations
  • ✅ Implementing conversational AI with automatic state management
  • ✅ Background processing for long-running tasks
  • ✅ Need polymorphic outputs (messages, reasoning, tool calls)

Use Chat Completions when:

  • ✅ Simple one-off text generation
  • ✅ Fully stateless interactions (no conversation continuity needed)
  • ✅ Legacy integrations (existing Chat Completions code)
  • ✅ Very simple use cases without tools

Architecture Differences

Chat Completions Flow:

User Input → Model → Single Message → Done
(Reasoning discarded, state lost)

Responses API Flow:

User Input → Model (preserved reasoning) → Polymorphic Outputs
            ↓ (server-side tools)
    Tool Call → Tool Result → Model → Final Response
(Reasoning preserved, state maintained)

Performance Benefits

Cache Utilization:

  • Chat Completions: Baseline performance
  • Responses API: 40-80% better cache utilization
  • Result: Lower latency + reduced costs

Reasoning Performance:

  • Chat Completions: Reasoning dropped between turns
  • Responses API: Reasoning preserved across turns
  • Result: 5% better on TAUBench (GPT-5 with Responses vs Chat Completions)

Stateful Conversations

Automatic State Management

The Responses API can automatically manage conversation state using conversation IDs.

Creating a Conversation

// Create conversation with initial message
const conversation = await openai.conversations.create({
  metadata: { user_id: 'user_123' },
  items: [
    {
      type: 'message',
      role: 'user',
      content: 'Hello!',
    },
  ],
});

console.log(conversation.id); // "conv_abc123..."

Using Conversation ID

// First turn
const response1 = await openai.responses.create({
  model: 'gpt-5',
  conversation: 'conv_abc123',
  input: 'What are the 5 Ds of dodgeball?',
});

console.log(response1.output_text);

// Second turn - model remembers previous context
const response2 = await openai.responses.create({
  model: 'gpt-5',
  conversation: 'conv_abc123',
  input: 'Tell me more about the first one',
});

console.log(response2.output_text);
// Model automatically knows "first one" refers to first D from previous turn

Why this matters:

  • No manual history tracking required
  • Reasoning state preserved between turns
  • Automatic context management
  • Lower risk of context errors
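One way to keep conversation IDs separate per user is a small lookup that creates a conversation the first time a user appears and reuses it afterwards. This is a sketch only: ConversationRegistry and the injected createConversation callback are hypothetical names, and the in-memory Map would need to be replaced with persistent storage in a real deployment.

```typescript
// Injected so the sketch stays SDK-agnostic; it should resolve to a conversation ID.
type CreateConversation = () => Promise<string>;

class ConversationRegistry {
  // The in-flight promise is stored so concurrent calls for one user share a conversation.
  private readonly byUser = new Map<string, Promise<string>>();
  private readonly createConversation: CreateConversation;

  constructor(createConversation: CreateConversation) {
    this.createConversation = createConversation;
  }

  getOrCreate(userId: string): Promise<string> {
    const existing = this.byUser.get(userId);
    if (existing) return existing;

    const pending = this.createConversation();
    this.byUser.set(userId, pending);
    // Forget failed creations so the next call can try again.
    pending.catch(() => {
      this.byUser.delete(userId);
    });
    return pending;
  }
}
```

In an application the callback might be something like `async () => (await openai.conversations.create()).id`, with the returned ID passed as the conversation parameter on each turn.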

Manual State Management (Alternative)

If you need full control, you can manually manage history:

let history = [
  { role: 'user', content: 'Tell me a joke' },
];

const response = await openai.responses.create({
  model: 'gpt-5',
  input: history,
  store: true, // Optional: store for retrieval later
});

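// Add the assistant's messages to history
// (reasoning and tool-call items have no role, so skip them here)
history = [
  ...history,
  ...response.output
    .filter(el => el.type === 'message')
    .map(el => ({
      role: el.role,
      content: el.content,
    })),
];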

// Next turn
history.push({ role: 'user', content: 'Tell me another' });

const secondResponse = await openai.responses.create({
  model: 'gpt-5',
  input: history,
});

When to use manual management:

  • Need custom history pruning logic
  • Want to modify conversation history programmatically
  • Implementing custom caching strategies

Built-in Tools (Server-Side)

The Responses API includes server-side hosted tools that eliminate costly backend round trips.

Available Tools

| Tool | Purpose | Use Case |
|---|---|---|
| Code Interpreter | Execute Python code | Data analysis, calculations, charts |
| File Search | Hosted retrieval (RAG) over uploaded files | Search uploaded files for answers |
| Web Search | Real-time web information | Current events, fact-checking |
| Image Generation | Generate images with OpenAI's image model | Create images from descriptions |
| MCP | Connect external tools | Stripe, databases, custom APIs |

Code Interpreter

Execute Python code server-side for data analysis, calculations, and visualizations.

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Calculate the mean, median, and mode of: 10, 20, 30, 40, 50',
  tools: [{ type: 'code_interpreter', container: { type: 'auto' } }], // container is required
});

console.log(response.output_text);
// Model writes and executes Python code, returns results

Advanced Example: Data Analysis

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Analyze this sales data and create a bar chart showing monthly revenue: [data here]',
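  tools: [{ type: 'code_interpreter', container: { type: 'auto' } }],
});

// Check output for code execution results
response.output.forEach(item => {
  if (item.type === 'code_interpreter_call') {
    console.log('Code executed:', item.code);
    // Execution logs are only populated when requested via
    // include: ['code_interpreter_call.outputs']
    console.log('Result:', item.outputs);
  }
});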

Why this matters:

  • No need to run Python locally
  • Sandboxed execution environment
  • Automatic chart generation
  • Can process uploaded files

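File Search (Hosted RAG)

Search through uploaded files without building your own RAG pipeline. Files are indexed in an OpenAI-hosted vector store, which the tool queries for you.

// 1. Upload the file and add it to a vector store (one-time setup)
const file = await openai.files.create({
  file: fs.createReadStream('knowledge-base.pdf'), // requires: import fs from 'fs';
  purpose: 'assistants',
});

const vectorStore = await openai.vectorStores.create({
  name: 'knowledge-base',
  file_ids: [file.id],
});
// Indexing is asynchronous; wait until the file has been processed before searching

// 2. Use file search
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'What does the document say about pricing?',
  tools: [
    {
      type: 'file_search',
      vector_store_ids: [vectorStore.id],
    },
  ],
});

console.log(response.output_text);
// Model searches the vector store and provides an answer with citations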

Supported File Types:

  • PDFs, Word docs, text files
  • Markdown, HTML
  • Code files (Python, JavaScript, etc.)
  • Max: 512MB per file

Web Search

Get real-time information from the web.

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'What are the latest updates on GPT-5?',
  tools: [{ type: 'web_search' }],
});

console.log(response.output_text);
// Model searches web and provides current information with sources

Why this matters:

  • No cutoff date limitations
  • Automatic source citations
  • Real-time data access
  • No need for external search APIs

Image Generation

Generate images directly in the Responses API.

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Create an image of a futuristic cityscape at sunset',
  tools: [{ type: 'image_generation' }],
});

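// Find image in output
response.output.forEach(item => {
  if (item.type === 'image_generation_call') {
    // result holds the image as a base64-encoded string, not a URL
    const imageBuffer = Buffer.from(item.result, 'base64');
    console.log('Image bytes:', imageBuffer.length);
  }
});

Models Available:

  • The tool uses OpenAI's GPT Image model (gpt-image-1) rather than DALL-E
  • Various sizes and quality options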

MCP Server Integration

The Responses API has built-in support for Model Context Protocol (MCP) servers, allowing you to connect external tools.

What Is MCP?

MCP is an open protocol that standardizes how applications provide context to LLMs. It allows you to:

  • Connect to external APIs (Stripe, databases, CRMs)
  • Use hosted MCP servers
  • Build custom tool integrations

Basic MCP Integration

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Roll 2d6 dice',
  tools: [
    {
      type: 'mcp',
      server_label: 'dice',
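      server_url: 'https://example.com/mcp',
      require_approval: 'never', // Without this, tool calls wait for approval; only skip it for servers you trust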
    },
  ],
});

// Model discovers available tools on MCP server and uses them
console.log(response.output_text);

MCP with Authentication (OAuth)

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Create a $20 payment link',
  tools: [
    {
      type: 'mcp',
      server_label: 'stripe',
      server_url: 'https://mcp.stripe.com',
      authorization: process.env.STRIPE_OAUTH_TOKEN,
    },
  ],
});

console.log(response.output_text);
// Model uses Stripe MCP server to create payment link

CRITICAL:

  • API does NOT store authorization tokens
  • Must provide token with each request
  • Use environment variables for security

Polymorphic Output: MCP Tool Calls

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Roll 2d4+1',
  tools: [
    {
      type: 'mcp',
      server_label: 'dice',
      server_url: 'https://dmcp.example.com',
    },
  ],
});

// Inspect tool calls
response.output.forEach(item => {
  if (item.type === 'mcp_call') {
    console.log('Tool:', item.name);
    console.log('Arguments:', item.arguments);
    console.log('Output:', item.output);
  }
  if (item.type === 'mcp_list_tools') {
    console.log('Available tools:', item.tools);
  }
});

Output Types:

  • mcp_list_tools - Tools discovered on server
  • mcp_call - Tool invocation and result
  • message - Final response to user

Reasoning Preservation

How It Works

The Responses API preserves the model's internal reasoning state across turns, unlike Chat Completions which discards it.

Visual Analogy:

  • Chat Completions: Model has a scratchpad, writes reasoning, then tears out the page before responding
  • Responses API: Model keeps the scratchpad open, previous reasoning visible for next turn

Performance Impact

TAUBench Results (GPT-5):

  • Chat Completions: Baseline score
  • Responses API: +5% better (purely from preserved reasoning)

Why This Matters:

  • Better multi-turn problem solving
  • More coherent long conversations
  • Improved step-by-step reasoning
  • Fewer context errors

Reasoning Summaries (Free!)

The Responses API provides reasoning summaries at no additional cost.

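const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Solve this complex math problem: [problem]',
  reasoning: { summary: 'auto' }, // Summaries are only returned when requested
});

// Inspect reasoning
response.output.forEach(item => {
  if (item.type === 'reasoning') {
    // summary can be empty, so avoid indexing into it directly
    console.log('Model reasoning:', item.summary.map(part => part.text).join('\n'));
  }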
  if (item.type === 'message') {
    console.log('Final answer:', item.content[0].text);
  }
});

Use Cases:

  • Debugging model decisions
  • Audit trails for compliance
  • Understanding model thought process
  • Building transparent AI systems
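For audit trails, the summaries can be pulled out of the output array into plain strings before logging. The following is a small sketch assuming reasoning items carry a summary array of parts with a text field, as in the example above; the type and function names are illustrative.

```typescript
interface SummaryPart {
  text: string;
}

interface ReasoningLikeItem {
  type: string;
  summary?: SummaryPart[];
}

// Collect every reasoning summary text, tolerating items with no summary at all.
function collectReasoningSummaries(output: ReasoningLikeItem[]): string[] {
  const texts: string[] = [];
  for (const item of output) {
    if (item.type !== 'reasoning') continue;
    for (const part of item.summary ?? []) {
      texts.push(part.text);
    }
  }
  return texts;
}

// Hand-written items for illustration only.
const auditLines = collectReasoningSummaries([
  { type: 'reasoning', summary: [{ text: 'Compared both options.' }] },
  { type: 'reasoning', summary: [] },
  { type: 'message' },
]);
console.log(auditLines.length); // prints 1
```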

Background Mode (Long-Running Tasks)

For tasks that take longer than standard timeout limits, use background mode.

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Analyze this 500-page document and summarize key findings',
  background: true,
  tools: [{ type: 'file_search', vector_store_ids: [vectorStoreId] }], // ID of a vector store containing the document
});

// Returns immediately with status
console.log(response.status); // "in_progress"
console.log(response.id); // Use to check status later

// Poll for completion
const checkStatus = async (responseId) => {
  const result = await openai.responses.retrieve(responseId);
  if (result.status === 'completed') {
    console.log(result.output_text);
  } else if (result.status === 'failed') {
    console.error('Task failed:', result.error);
  } else {
    // Still running, check again later
    setTimeout(() => checkStatus(responseId), 5000);
  }
};

checkStatus(response.id);

When to Use:

  • Large file processing
  • Complex calculations
  • Multi-step research tasks
  • Data analysis on large datasets

Timeout Limits:

  • Standard mode: 60 seconds
  • Background mode: Up to 10 minutes
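The polling loop above retries indefinitely. A bounded variant is sketched below: it stops after a fixed number of checks and takes the retrieve and sleep functions as parameters, which also makes it testable without network access. The names pollUntilDone, PollOptions, and BackgroundResult are hypothetical, and treating queued as a non-terminal status alongside in_progress is an assumption of this sketch.

```typescript
interface BackgroundResult {
  status: string;
}

interface PollOptions {
  intervalMs: number; // delay between checks
  maxAttempts: number; // give up after this many checks
  sleep?: (ms: number) => Promise<void>; // injectable so tests can skip real waiting
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>(resolve => setTimeout(resolve, ms));

// Call retrieve() until the task leaves the queued/in_progress states or attempts run out.
async function pollUntilDone<T extends BackgroundResult>(
  retrieve: () => Promise<T>,
  options: PollOptions,
): Promise<T> {
  const sleep = options.sleep ?? defaultSleep;
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    const result = await retrieve();
    if (result.status !== 'queued' && result.status !== 'in_progress') {
      return result; // completed, failed, or any other terminal status
    }
    if (attempt < options.maxAttempts) {
      await sleep(options.intervalMs);
    }
  }
  throw new Error(`Task still running after ${options.maxAttempts} checks`);
}
```

With the SDK, retrieve could be `() => openai.responses.retrieve(response.id)`; the caller then inspects the returned status to distinguish completed from failed.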

Polymorphic Outputs

The Responses API returns multiple output types instead of a single message.

Output Types

| Type | Description | Example |
|---|---|---|
| message | Text response to user | Final answer, explanation |
| reasoning | Model's internal thought process | Step-by-step reasoning summary |
| code_interpreter_call | Code execution | Python code + results |
| mcp_call | Tool invocation | Tool name, args, output |
| mcp_list_tools | Available tools | Tool definitions from MCP server |
| file_search_call | File search results | Matched chunks, citations |
| web_search_call | Web search activity | Search status; sources are cited in the message |
| image_generation_call | Image generation | Base64-encoded image data |

Processing Polymorphic Outputs

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Search the web for the latest AI news and summarize',
  tools: [{ type: 'web_search' }],
});

// Process different output types
response.output.forEach(item => {
  switch (item.type) {
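    case 'reasoning':
      // summary is empty unless reasoning summaries were requested
      console.log('Reasoning:', item.summary.map(part => part.text).join('\n'));
      break;
    case 'web_search_call':
      console.log('Search status:', item.status);
      // Source citations arrive as annotations on the message item below
      break;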
    case 'message':
      console.log('Response:', item.content[0].text);
      break;
  }
});

// Or use helper for text-only
console.log(response.output_text);

Why This Matters:

  • Better debugging (see all steps)
  • Audit trails (track all tool calls)
  • Richer UX (show progress to users)
  • Compliance (log all actions)
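A lightweight way to get the debugging and audit benefits listed above is to record how many items of each type a response produced. The countByType helper below is a hypothetical sketch that relies only on each output item having a type field.

```typescript
// Tally output items by their type, e.g. for a structured log line.
function countByType(output: { type: string }[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const item of output) {
    counts[item.type] = (counts[item.type] ?? 0) + 1;
  }
  return counts;
}

// Hand-written items for illustration only.
const tally = countByType([
  { type: 'reasoning' },
  { type: 'web_search_call' },
  { type: 'web_search_call' },
  { type: 'message' },
]);
console.log(tally['web_search_call']); // prints 2
```

Logging the tally next to the response ID makes it easy to spot turns where an expected tool call never happened.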

Migration from Chat Completions

Breaking Changes

| Feature | Chat Completions | Responses API | Migration |
|---|---|---|---|
| Endpoint | /v1/chat/completions | /v1/responses | Update URL |
| Parameter | messages | input | Rename parameter |
| State | Manual (messages array) | Automatic (conversation ID) | Use conversation IDs |
| Tools | tools array with functions | Built-in types + MCP | Update tool definitions |
| Output | choices[0].message.content | output_text or output array | Update response parsing |
| Streaming | data: {"choices":[...]} | SSE with multiple item types | Update stream parser |

Migration Example

Before (Chat Completions):

const response = await openai.chat.completions.create({
  model: 'gpt-5',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' },
  ],
});

console.log(response.choices[0].message.content);

After (Responses):

const response = await openai.responses.create({
  model: 'gpt-5',
  input: [
    { role: 'developer', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' },
  ],
});

console.log(response.output_text);

Key Differences:

  1. chat.completions.create → responses.create
  2. messages → input
  3. system role → developer role
  4. choices[0].message.content → output_text
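The role rename in step 3 can be applied mechanically when converting an existing messages array. The sketch below assumes plain string content only; the type names and toResponsesInput are illustrative, and tool or function messages would need separate handling that is not shown here.

```typescript
interface ChatCompletionsMessage {
  role: 'system' | 'developer' | 'user' | 'assistant';
  content: string;
}

interface ResponsesInputMessage {
  role: 'developer' | 'user' | 'assistant';
  content: string;
}

// Convert a Chat Completions style messages array into Responses style input,
// renaming the system role to developer and leaving other roles unchanged.
function toResponsesInput(messages: ChatCompletionsMessage[]): ResponsesInputMessage[] {
  return messages.map((m): ResponsesInputMessage => ({
    role: m.role === 'system' ? 'developer' : m.role,
    content: m.content,
  }));
}

const converted = toResponsesInput([
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'Hello!' },
]);
console.log(converted.map(m => m.role).join(', ')); // prints "developer, user"
```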

When to Migrate

Migrate now if:

  • ✅ Building new applications
  • ✅ Need stateful conversations
  • ✅ Using agentic patterns (reasoning + tools)
  • ✅ Want better performance (preserved reasoning)

Stay on Chat Completions if:

  • ✅ Simple one-off generations
  • ✅ Legacy integrations
  • ✅ No need for state management

Error Handling

Common Errors and Solutions

1. Session State Not Persisting

Error:

Conversation state not maintained between turns

Cause:

  • Not using conversation IDs
  • Using different conversation IDs per turn

Solution:

// Create conversation once
const conv = await openai.conversations.create();

// Reuse conversation ID for all turns
const response1 = await openai.responses.create({
  model: 'gpt-5',
  conversation: conv.id, // ✅ Same ID
  input: 'First message',
});

const response2 = await openai.responses.create({
  model: 'gpt-5',
  conversation: conv.id, // ✅ Same ID
  input: 'Follow-up message',
});

2. MCP Server Connection Failed

Error:

{
  "error": {
    "type": "mcp_connection_error",
    "message": "Failed to connect to MCP server"
  }
}

Causes:

  • Invalid server URL
  • Missing or expired authorization token
  • Server not responding

Solutions:

// 1. Verify URL is correct
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Test MCP',
  tools: [
    {
      type: 'mcp',
      server_label: 'test',
      server_url: 'https://api.example.com/mcp', // ✅ Full URL
      authorization: process.env.AUTH_TOKEN, // ✅ Valid token
    },
  ],
});

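// 2. Check that the server is reachable (some MCP servers reject a plain GET,
//    so any HTTP response at least shows the host is up)
const testResponse = await fetch('https://api.example.com/mcp');
console.log(testResponse.status);

// 3. Check token expiration (parseJWT is a placeholder for your own JWT decoder)
console.log('Token expires:', parseJWT(process.env.AUTH_TOKEN).exp);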

3. Code Interpreter Timeout

Error:

{
  "error": {
    "type": "code_interpreter_timeout",
    "message": "Code execution exceeded time limit"
  }
}

Cause:

  • Code runs longer than 30 seconds

Solution:

// Use background mode for long-running code
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Process this large dataset',
  background: true, // ✅ Extended timeout
  tools: [{ type: 'code_interpreter', container: { type: 'auto' } }],
});

// Poll for results
const result = await openai.responses.retrieve(response.id);

4. Image Generation Rate Limit

Error:

{
  "error": {
    "type": "rate_limit_error",
    "message": "DALL-E rate limit exceeded"
  }
}

Cause:

  • Too many image generation requests

Solution:

// Implement retry with exponential backoff
const generateImage = async (prompt, retries = 3) => {
  try {
    return await openai.responses.create({
      model: 'gpt-5',
      input: prompt,
      tools: [{ type: 'image_generation' }],
    });
  } catch (error) {
    if (error.status === 429 && retries > 0) { // HTTP 429 = rate limited
      const delay = 2 ** (3 - retries) * 1000; // 1s, 2s, 4s
      await new Promise(resolve => setTimeout(resolve, delay));
      return generateImage(prompt, retries - 1);
    }
    throw error;
  }
};

5. File Search Relevance Issues

Problem:

  • File search returns irrelevant results

Solution:

// Use more specific queries
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Find sections about pricing in Q4 2024 specifically', // ✅ Specific
  // NOT: 'Find pricing' (too vague)
  tools: [{ type: 'file_search', vector_store_ids: [vectorStoreId] }],
  include: ['file_search_call.results'], // Needed for item.results to be populated
});

// Or filter results manually
response.output.forEach(item => {
  if (item.type === 'file_search_call') {
    const relevantChunks = item.results.filter(
      chunk => chunk.score > 0.7 // ✅ Only high-confidence matches
    );
  }
});

6. Cost Tracking Confusion

Problem:

  • Billing different than expected

Explanation:

  • Responses API bills for: input tokens + output tokens + tool usage + stored conversations
  • Chat Completions bills only: input tokens + output tokens

Solution:

// Monitor usage
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Hello',
  store: false, // ✅ Don't store if not needed
});

console.log('Usage:', response.usage);
// {
//   input_tokens: 10,
//   output_tokens: 20,
//   total_tokens: 30
// }
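To keep an eye on spend across many requests, usage objects can be accumulated and converted to an estimated cost. This sketch assumes each usage object exposes input_tokens and output_tokens; the UsageTracker class is hypothetical, the prices are supplied by the caller (the figures in the example are placeholders, not real pricing), and tool usage charges are not covered.

```typescript
interface UsageLike {
  input_tokens: number;
  output_tokens: number;
}

// Caller-supplied prices in USD per one million tokens.
interface PricePerMillion {
  input: number;
  output: number;
}

class UsageTracker {
  private inputTokens = 0;
  private outputTokens = 0;

  // Add one response's usage to the running totals.
  record(usage: UsageLike): void {
    this.inputTokens += usage.input_tokens;
    this.outputTokens += usage.output_tokens;
  }

  totalTokens(): number {
    return this.inputTokens + this.outputTokens;
  }

  // Token cost only; tool calls and any other charges are not included.
  estimateCostUsd(price: PricePerMillion): number {
    return (this.inputTokens * price.input + this.outputTokens * price.output) / 1e6;
  }
}

const tracker = new UsageTracker();
tracker.record({ input_tokens: 10, output_tokens: 20 });
console.log(tracker.totalTokens()); // prints 30
```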

7. Conversation Not Found

Error:

{
  "error": {
    "type": "invalid_request_error",
    "message": "Conversation conv_xyz not found"
  }
}

Causes:

  • Conversation ID typo
  • Conversation deleted
  • Conversation expired (90 days)

Solution:

// Verify the conversation exists before using it
let conversationId = 'conv_xyz';
try {
  await openai.conversations.retrieve(conversationId);
} catch (error) {
  if (error.status !== 404) throw error;
  // Not found: create a new conversation and use its ID instead
  const newConv = await openai.conversations.create();
  conversationId = newConv.id;
}

8. Tool Output Parsing Failed

Problem:

  • Can't access tool outputs correctly

Solution:

// Use helper methods
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Search for AI news',
  tools: [{ type: 'web_search' }],
});

// Helper: Get text-only output
console.log(response.output_text);

// Manual: Inspect all outputs
response.output.forEach(item => {
  console.log('Type:', item.type);
  console.log('Content:', item);
});

Production Patterns

Cost Optimization

1. Use Conversation IDs (Cache Benefits)

// ✅ GOOD: Reuse conversation ID
const conv = await openai.conversations.create();
const response1 = await openai.responses.create({
  model: 'gpt-5',
  conversation: conv.id,
  input: 'Question 1',
});
// 40-80% better cache utilization

// ❌ BAD: New manual history each time
const response2 = await openai.responses.create({
  model: 'gpt-5',
  input: [...previousHistory, newMessage],
});
// Forgoes the server-side state behind the improved cache utilization

2. Disable Storage When Not Needed

// For one-off requests
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Quick question',
  store: false, // ✅ Don't store conversation
});

3. Use Smaller Models When Possible

// For simple tasks
const response = await openai.responses.create({
  model: 'gpt-5-mini', // ✅ Smaller, lower-cost model
  input: 'Summarize this paragraph',
});

Rate Limit Handling

const createResponseWithRetry = async (params, maxRetries = 3) => {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await openai.responses.create(params);
    } catch (error) {
      if (error.status === 429 && i < maxRetries - 1) { // HTTP 429 = rate limited
        const delay = Math.pow(2, i) * 1000; // Exponential backoff
        console.log(`Rate limited, retrying in ${delay}ms`);
        await new Promise(resolve => setTimeout(resolve, delay));
      } else {
        throw error;
      }
    }
  }
};
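The delay schedule used by the retry loop can be factored into a small pure function, which makes it easy to cap the wait and to add jitter so that many clients do not retry in lockstep. Both helpers below are illustrative sketches; the default base and cap values are arbitrary assumptions.

```typescript
// Exponential backoff: baseMs * 2^attempt, never more than capMs.
function backoffDelayMs(attempt: number, baseMs: number = 1000, capMs: number = 30000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// "Full jitter": pick a random delay between 0 and the computed backoff.
// The random source is injectable so the function can be tested deterministically.
function withFullJitter(delayMs: number, random: () => number = Math.random): number {
  return Math.floor(random() * delayMs);
}

// Delays for attempts 0..3 with the defaults: 1000, 2000, 4000, 8000
console.log([0, 1, 2, 3].map(attempt => backoffDelayMs(attempt)));
```

In the retry loop, `withFullJitter(backoffDelayMs(i))` could replace the inline `Math.pow(2, i) * 1000` calculation.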

Monitoring and Logging

const monitoredResponse = async (input) => {
  const startTime = Date.now();

  try {
    const response = await openai.responses.create({
      model: 'gpt-5',
      input,
    });

    // Log success metrics
    console.log({
      status: 'success',
      latency: Date.now() - startTime,
      tokens: response.usage.total_tokens,
      model: response.model,
      conversation: response.conversation?.id, // undefined when no conversation was used
    });

    return response;
  } catch (error) {
    // Log error metrics
    console.error({
      status: 'error',
      latency: Date.now() - startTime,
      error: error.message,
      type: error.type,
    });
    throw error;
  }
};

Node.js vs Cloudflare Workers

Node.js Implementation

import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

export async function handleRequest(input: string) {
  const response = await openai.responses.create({
    model: 'gpt-5',
    input,
    tools: [{ type: 'web_search' }],
  });

  return response.output_text;
}

Pros:

  • Full SDK support
  • Type safety
  • Streaming helpers

Cons:

  • Requires Node.js runtime
  • Larger bundle size

Cloudflare Workers Implementation

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { input } = await request.json();

    const response = await fetch('https://api.openai.com/v1/responses', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${env.OPENAI_API_KEY}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        model: 'gpt-5',
        input,
        tools: [{ type: 'web_search' }],
      }),
    });

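    const data: any = await response.json();

    // Raw JSON has no output_text helper (the SDK adds it), so collect the text parts manually
    const text = data.output
      .filter(item => item.type === 'message')
      .flatMap(item => item.content)
      .filter(part => part.type === 'output_text')
      .map(part => part.text)
      .join('');

    return new Response(text, {
      headers: { 'Content-Type': 'text/plain' },
    });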
  },
};

Pros:

  • No dependencies
  • Edge deployment
  • Faster cold starts

Cons:

  • Manual request building
  • No type safety without custom types
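Without SDK types, the request body reaching a Worker is effectively unknown, so a small runtime check before forwarding it to the API can stand in for the missing type safety. The parseInput function below is a hypothetical sketch that accepts only a JSON object with a non-empty string input field.

```typescript
// Validate an untrusted JSON body and return its input string.
function parseInput(body: unknown): string {
  if (typeof body !== 'object' || body === null) {
    throw new Error('Request body must be a JSON object');
  }
  const input = (body as Record<string, unknown>).input;
  if (typeof input !== 'string' || input.trim() === '') {
    throw new Error('"input" must be a non-empty string');
  }
  return input;
}

console.log(parseInput({ input: 'Hello, world!' })); // prints "Hello, world!"
```

Inside the fetch handler this could be used as `const input = parseInput(await request.json());`, returning a 400 response when it throws.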

Always Do / Never Do

✅ Always Do

  1. Use conversation IDs for multi-turn interactions

    const conv = await openai.conversations.create();
    // Reuse conv.id for all related turns
    
  2. Handle all output types in polymorphic responses

    response.output.forEach(item => {
      if (item.type === 'reasoning') { /* log */ }
      if (item.type === 'message') { /* display */ }
    });
    
  3. Use background mode for long-running tasks

    const response = await openai.responses.create({
      background: true, // ✅ For tasks >30s
      ...
    });
    
  4. Provide authorization tokens for MCP servers

    tools: [{
      type: 'mcp',
      authorization: process.env.TOKEN, // ✅ Required
    }]
    
  5. Monitor token usage for cost control

    console.log(response.usage.total_tokens);
    

❌ Never Do

  1. Never expose API keys in client-side code

    // ❌ DANGER: API key in browser
    const response = await fetch('https://api.openai.com/v1/responses', {
      headers: { 'Authorization': 'Bearer sk-proj-...' }
    });
    
  2. Never assume single message output

    // ❌ BAD: Ignores reasoning, tool calls
    console.log(response.output[0].content);
    
    // ✅ GOOD: Use helper or check all types
    console.log(response.output_text);
    
  3. Never reuse conversation IDs across users

    // ❌ DANGER: User A sees User B's conversation
    const sharedConv = 'conv_123';
    
  4. Never ignore error types

    // ❌ BAD: Generic error handling
    try { ... } catch (e) { console.log('error'); }
    
    // ✅ GOOD: Type-specific handling
    catch (e) {
      if (e.status === 429) { /* rate limited: retry with backoff */ }
      if (e.type === 'mcp_connection_error') { /* alert */ }
    }
    
  5. Never poll faster than 1 second for background tasks

    // ❌ BAD: Too frequent
    setInterval(() => checkStatus(), 100);
    
    // ✅ GOOD: Reasonable interval
    setInterval(() => checkStatus(), 5000);
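The type-specific handling in item 4 can be centralised in one function that maps an error to the action the caller should take. This is a sketch under the assumption that thrown errors carry a numeric HTTP status property; the classifyError name, the ErrorAction values, and the mapping itself are illustrative choices rather than anything defined by the API.

```typescript
type ErrorAction = 'retry' | 'reauthenticate' | 'fix_request' | 'fail';

// Decide what to do with an error based on its HTTP status, if it has one.
function classifyError(err: unknown): ErrorAction {
  const httpStatus =
    typeof err === 'object' && err !== null
      ? (err as { status?: unknown }).status
      : undefined;

  if (typeof httpStatus !== 'number') return 'fail'; // network failure or programming error
  if (httpStatus === 429 || httpStatus >= 500) return 'retry'; // rate limit or server-side problem
  if (httpStatus === 401 || httpStatus === 403) return 'reauthenticate'; // bad or missing credentials
  if (httpStatus >= 400) return 'fix_request'; // other client errors will not succeed on retry
  return 'fail';
}

console.log(classifyError({ status: 429 })); // prints "retry"
```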
    

References

Official Documentation

Skill Resources

  • templates/ - Working code examples
  • references/responses-vs-chat-completions.md - Feature comparison
  • references/mcp-integration-guide.md - MCP server setup
  • references/built-in-tools-guide.md - Tool usage patterns
  • references/stateful-conversations.md - Conversation management
  • references/migration-guide.md - Chat Completions → Responses
  • references/top-errors.md - Common errors and solutions

Next Steps

  1. ✅ Read templates/basic-response.ts - Simple example
  2. ✅ Try templates/stateful-conversation.ts - Multi-turn chat
  3. ✅ Explore templates/mcp-integration.ts - External tools
  4. ✅ Review references/top-errors.md - Avoid common pitfalls
  5. ✅ Check references/migration-guide.md - If migrating from Chat Completions

Happy building with the Responses API! 🚀