| name | exa-websets-search |
| description | Use for creating websets, running searches, importing CSV data, managing items, and adding enrichments to extract structured data. |
Exa Websets Search
Comprehensive webset management including creation, search, imports, items, and enrichments.
Use --help to see available commands and verify usage before running:
exa-ai <command> --help
Working with Complex Shell Commands
When using the Bash tool with complex shell syntax, follow these best practices for reliability:
- Run commands directly: Capture JSON output directly rather than nesting command substitutions
- Parse in subsequent steps: Use jq to parse output in a follow-up command if needed
- Avoid nested substitutions: Complex nested $(...) can be fragile; break into sequential steps
Example:
# Less reliable: nested command substitution
webset_id=$(exa-ai webset-create --search '{"query":"tech startups","count":1}' | jq -r '.webset_id')
# More reliable: run directly, then parse
exa-ai webset-create --search '{"query":"tech startups","count":1}' > output.json
# Then in a follow-up command if needed:
webset_id=$(jq -r '.webset_id' output.json)
Critical Requirements
Universal rules across all operations:
- Start with minimal counts (1-5 results): Initial searches are test spikes to validate quality. ALWAYS default to count:1 unless user explicitly requests more.
- Three-step workflow - Validate, Expand, Enrich: (1) Create with count:1 to test search quality, (2) Expand search count if results are good, (3) Add enrichments only after validated, expanded results.
- No enrichments during validation: Never add enrichments when testing with count:1. Validate search quality first, expand count second, add enrichments last.
- Avoid --wait flag: Do NOT use the --wait flag in commands. It's designed for human interactive use, not automated workflows.
- Maintain query AND criteria consistency: When scaling up or appending searches, use the EXACT same query AND criteria that you validated. Omitting criteria causes Exa to regenerate them on-the-fly, producing inconsistent results.
Credit Costs
Pricing: $50/month = 8,000 credits ($0.00625 per credit)
Cost per operation:
- Each webset item: 10 credits ($0.0625)
- Standard enrichment: 2 credits ($0.0125)
- Email enrichment: 5 credits ($0.03125)
Why start with count:1: Testing with 1 result costs 10 credits ($0.0625). A failed search with count:100 wastes 1,000 credits ($6.25) - 100x more expensive.
Why enrich last: Enriching bad results wastes credits. Always validate first, expand second, enrich last.
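The arithmetic above can be sketched as two small shell helpers. This is an illustration, not part of exa-ai: the function names credits_for and credits_to_usd are hypothetical, and the sketch assumes enrichment credits are charged once per item per enrichment, which the cost list above does not state explicitly.

```shell
# Hypothetical helpers for estimating spend before running a search.
# ASSUMPTION: each enrichment is billed once per item.

# credits_for ITEMS [STANDARD_ENRICHMENTS] [EMAIL_ENRICHMENTS]
credits_for() {
  local items="$1" standard="${2:-0}" email="${3:-0}"
  # 10 credits per item, 2 per standard enrichment, 5 per email enrichment
  echo $(( items * (10 + 2 * standard + 5 * email) ))
}

# credits_to_usd CREDITS
# Uses integer math only: 1 credit = $0.00625 = 625 hundred-thousandths of a dollar.
credits_to_usd() {
  local units="$(( $1 * 625 ))"
  printf '%d.%05d\n' $(( units / 100000 )) $(( units % 100000 ))
}
```

For example, credits_for 100 prints 1000 and credits_to_usd 1000 prints 6.25000, matching the failed count:100 search described above.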
Quick Command Reference
exa-ai --help
Output Formats
All exa-ai webset commands support output formats:
- JSON (default): Pipe to jq to extract specific fields (e.g., | jq -r '.webset_id')
- toon: Compact, readable format for direct viewing
- pretty: Human-friendly formatted output
- text: Plain text output
Webset Management
Core operations for managing webset collections.
Entity Types
- company: Companies and organizations
- person: Individual people
- article: News articles and blog posts
- research_paper: Academic papers
- custom: Custom entity types (define with --entity-description)
Create Webset from Search
webset_id=$(exa-ai webset-create \
--search '{"query":"AI startups in San Francisco","count":1}' | jq -r '.webset_id')
Create with Detailed Search Criteria
exa-ai webset-create \
--search '{
"query": "Technology companies focused on developer tools",
"count": 1,
"entity": {
"type": "company"
},
"criteria": [
{
"description": "Companies with 50-500 employees indicating growth stage"
},
{
"description": "Primary product is developer tools, APIs, or infrastructure"
}
]
}'
Create with Custom Entity
exa-ai webset-create \
--search '{
"query": "Nonprofits focused on economic justice",
"count": 1,
"entity": {
"type": "custom",
"description": "nonprofit"
},
"criteria": [
{
"description": "Primary focus on economic justice"
},
{
"description": "Annual operating budget between $1M and $10M"
}
]
}'
Create from CSV Import
import_id=$(exa-ai import-create companies.csv \
--count 100 \
--title "Companies" \
--format csv \
--entity-type company | jq -r '.import_id')
exa-ai webset-create --import $import_id
Three-Step Workflow: Validate → Expand → Enrich
Step 1: VALIDATE - Create with count:1 (NO enrichments)
webset_id=$(exa-ai webset-create \
--search '{"query":"tech startups","count":1}' | jq -r '.webset_id')
exa-ai webset-item-list $webset_id
⚠️ REQUIRED: Manually verify the result is relevant before continuing. If not, adjust the query and start over.
Step 2: EXPAND - Gradually increase count with verification at each stage
# Expand to 2 results (use same query and criteria from validation)
exa-ai webset-search-create $webset_id \
--query "tech startups" \
--behavior override \
--count 2
exa-ai webset-item-list $webset_id
⚠️ REQUIRED: Check quality at this scale. Repeat with larger counts (5, 10, 25, 50, 100) until you reach your target.
Loop this step: Keep expanding gradually (2 → 5 → 10 → 25 → 50 → 100) with verification between each expansion.
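One way to keep the expansion ladder honest is to run exactly one step per command and verify before the next. The sketch below is hypothetical: next_count and expand_once are illustrative names, and EXA_BIN is an invented variable that exists only so the helper can be dry-run with echo. The exa-ai subcommands and flags are the ones shown in Step 2; criteria are omitted for brevity and would need to be passed the same way as in the validated search.

```shell
# Hypothetical helpers; not part of exa-ai.

# next_count CURRENT -> prints the next rung of the 1, 2, 5, 10, 25, 50, 100 ladder
next_count() {
  case "$1" in
    1)  echo 2 ;;
    2)  echo 5 ;;
    5)  echo 10 ;;
    10) echo 25 ;;
    25) echo 50 ;;
    50) echo 100 ;;
    *)  return 1 ;;   # unknown rung or already at the top
  esac
}

# expand_once WEBSET_ID QUERY CURRENT_COUNT
# Runs a single expansion step, then lists items so they can be checked by hand.
expand_once() {
  local webset_id="$1" query="$2" current="$3" target
  target=$(next_count "$current") || { echo "no step after $current" >&2; return 1; }
  "${EXA_BIN:-exa-ai}" webset-search-create "$webset_id" \
    --query "$query" --behavior override --count "$target" || return 1
  "${EXA_BIN:-exa-ai}" webset-item-list "$webset_id"
}
```

Setting EXA_BIN=echo prints the commands instead of running them, which is a cheap way to confirm the query and count before spending credits.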
Step 3: ENRICH - Add enrichments only after confirming quality
exa-ai enrichment-create $webset_id \
--description "Company website" --format url --title "Website"
exa-ai enrichment-create $webset_id \
--description "Employee count" --format text --title "Team Size"
Interpreting Criterion Success Rates
CRITICAL: Criteria are evaluated conditionally: when one criterion fails, the others may not run. A low success rate for one criterion therefore does not by itself show that the criterion is restrictive; OTHER criteria may be filtering results before it is evaluated. Only interpret a low success rate as "restrictive" when the OTHER criteria have high success rates (>80%).
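The rule above can be written as a small check. This is a sketch under assumptions: is_restrictive is a hypothetical name, rates are passed as whole-number percentages, and the 50% cutoff for what counts as "low" is an invented default (overridable through the hypothetical LOW_CUTOFF variable), since only the >80% threshold for the other criteria is given above.

```shell
# is_restrictive TARGET_RATE OTHER_RATE...
# Succeeds only when TARGET_RATE is low AND every other criterion is above 80%.
# ASSUMPTION: "low" means below LOW_CUTOFF (default 50); that number is not from Exa.
is_restrictive() {
  local target="$1" low_cutoff="${LOW_CUTOFF:-50}" rate
  shift
  [ "$#" -gt 0 ] || return 1                  # nothing to compare against
  [ "$target" -lt "$low_cutoff" ] || return 1 # target rate is not low
  for rate in "$@"; do
    [ "$rate" -gt 80 ] || return 1            # another criterion may be filtering first
  done
  return 0
}
```

With these assumptions, is_restrictive 20 90 95 succeeds, while is_restrictive 20 90 40 fails because the 40% criterion may be the one filtering results.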
Manage Websets
exa-ai webset-list
exa-ai webset-get ws_abc123
exa-ai webset-update ws_abc123 --metadata '{"status":"active","owner":"team"}'
exa-ai webset-delete ws_abc123
Search Operations
Run searches within a webset to add new items.
Search Behavior
Control how new search results are combined with existing items:
append (default): Add new items to existing collection
- Requires previous search results to exist
- Error if webset has no previous search: "No previous search found"
- Default behavior when --behavior is omitted
override: Replace entire collection with search results
- REQUIRED for first search on a webset
- Use when starting fresh or completely replacing results
CRITICAL - First search requirement: The first webset-search-create on a webset MUST explicitly use --behavior override. Since the default is append, omitting --behavior will fail with "No previous search found" error. Subsequent searches can omit the flag (defaults to append).
Query and Criteria Consistency
CRITICAL: When appending or scaling up searches, maintain IDENTICAL query and criteria from your validated search.
Why This Matters
Using different criteria causes Exa to generate new search parameters on-the-fly, which:
- Violates consistency and produces mismatched results
- Reduces result quality compared to validated criteria
- Makes it impossible to reproduce or debug issues
Complete Example
# Step 1: Test search with criteria (MUST use override for first search)
exa-ai webset-search-create ws_abc123 \
--query "Progressive nonprofits in California" \
--behavior override \
--count 1 \
--criteria '[
{"description": "Annual budget between $1M and $10M"},
{"description": "Primary focus on economic justice, affordability, living wages, or worker power"},
{"description": "Established communications, narrative strategy, or messaging function"}
]'
# Verify quality, then append MORE results with IDENTICAL query and criteria
exa-ai webset-search-create ws_abc123 \
--query "Progressive nonprofits in California" \
--behavior append \
--count 5 \
--criteria '[
{"description": "Annual budget between $1M and $10M"},
{"description": "Primary focus on economic justice, affordability, living wages, or worker power"},
{"description": "Established communications, narrative strategy, or messaging function"}
]'
Best Practice: Save Criteria to File
# Create criteria file once
cat > criteria.json <<'EOF'
[
{"description": "Annual budget between $1M and $10M"},
{"description": "Primary focus on economic justice, affordability, living wages, or worker power"},
{"description": "Established communications, narrative strategy, or messaging function"}
]
EOF
# Use consistently across all searches (first search needs override)
exa-ai webset-search-create ws_abc123 \
--query "Progressive nonprofits in California" \
--behavior override \
--count 1 \
--criteria @criteria.json
exa-ai webset-search-create ws_abc123 \
--query "Progressive nonprofits in California" \
--behavior append \
--count 5 \
--criteria @criteria.json
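Going one step further, the query and the criteria file can be pinned in a single wrapper so that only the behavior and count vary between runs. This is a minimal sketch: validated_search, QUERY, CRITERIA_FILE, and EXA_BIN are hypothetical names introduced here, and EXA_BIN exists only to allow a dry run. The flags are the ones used in the examples above.

```shell
# Pin the validated query and criteria once (values taken from the example above).
QUERY="Progressive nonprofits in California"
CRITERIA_FILE="criteria.json"

# validated_search WEBSET_ID BEHAVIOR COUNT
# Hypothetical wrapper: only behavior and count can change between calls.
validated_search() {
  local webset_id="$1" behavior="$2" count="$3"
  case "$behavior" in
    append|override) ;;
    *) echo "behavior must be append or override" >&2; return 2 ;;
  esac
  "${EXA_BIN:-exa-ai}" webset-search-create "$webset_id" \
    --query "$QUERY" \
    --behavior "$behavior" \
    --count "$count" \
    --criteria "@$CRITERIA_FILE"
}
```

A first run would then be validated_search ws_abc123 override 1, followed by validated_search ws_abc123 append 5 once the result has been checked.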
Basic Search Operations
# First search on webset (must use override)
exa-ai webset-search-create ws_abc123 \
--query "AI startups in San Francisco" \
--behavior override \
--count 1
# Append to collection
exa-ai webset-search-create ws_abc123 \
--query "SaaS companies Series B" \
--behavior append \
--count 1
# Override collection
exa-ai webset-search-create ws_abc123 \
--query "top tech companies" \
--behavior override \
--count 1
Monitor Search Progress
webset_id="ws_abc123"
search_id=$(exa-ai webset-search-create $webset_id \
--query "fintech startups" \
--behavior override \
--count 1 | jq -r '.search_id')
exa-ai webset-search-get $webset_id $search_id
exa-ai webset-search-cancel $webset_id $search_id
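Because --wait is discouraged, a progress check such as the webset-search-get call above may need to be repeated. The following is a generic polling sketch, not an exa-ai feature: poll_until is a hypothetical helper, and the actual completion check is left to the caller because the fields returned by webset-search-get are not shown in this document.

```shell
# poll_until MAX_ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
poll_until() {
  local max_attempts="$1" interval="$2" attempt=1
  shift 2
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only between attempts, not after the last one.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$interval"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Usage (commented out; search_finished is a check you would write yourself
# after inspecting what webset-search-get returns for your search):
# poll_until 20 15 search_finished "$webset_id" "$search_id"
```

Bounding the number of attempts keeps an automated workflow from hanging indefinitely if a search never completes.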
CSV Imports
Upload CSV files to create websets from existing datasets.
CSV Format Requirements
- First row contains column headers
- Each row represents one entity
- Include at minimum a name or identifier column
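A quick local check against the requirements above can catch an empty or header-only file before an import is created. This is a rough sketch: csv_precheck is a hypothetical helper, it counts physical lines (so quoted fields containing newlines would be miscounted), and it cannot tell whether a column really holds names or identifiers.

```shell
# csv_precheck FILE
# Fails if FILE is unreadable, has no header row, or has no data rows.
# On success, prints the number of non-empty data rows.
csv_precheck() {
  local file="$1" header rows
  [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
  header=$(head -n 1 "$file")
  [ -n "$header" ] || { echo "missing header row" >&2; return 1; }
  # Count non-empty lines after the header.
  rows=$(tail -n +2 "$file" | awk 'NF { n++ } END { print n + 0 }')
  [ "$rows" -gt 0 ] || { echo "no data rows" >&2; return 1; }
  echo "$rows"
}
```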
Basic Import Workflow
# Create import
import_id=$(exa-ai import-create companies.csv \
--count 100 \
--title "Tech Companies" \
--format csv \
--entity-type company | jq -r '.import_id')
# Create webset from import
webset_id=$(exa-ai webset-create --import $import_id | jq -r '.webset_id')
Custom Entity Type
exa-ai import-create products.csv \
--count 5 \
--title "Product List" \
--format csv \
--entity-type custom \
--entity-description "Consumer electronics products"
Manage Imports
exa-ai import-list
exa-ai import-get imp_abc123
Import vs Search Scope
--import loads data for enrichment. search.scope filters searches to specific sources.
⚠️ NEVER use the same ID in both --import and search.scope; the request returns a 400 error:
# ❌ INVALID
exa-ai webset-create --import import_abc \
--search '{"scope":[{"source":"import","id":"import_abc"}]}'
# ✅ Scoped search only
exa-ai webset-create \
--search '{"query":"CEOs","scope":[{"source":"import","id":"import_abc"}]}'
# ✅ Relationship traversal
exa-ai webset-search-create ws_abc --query "investors" --behavior override \
--scope '[{"source":"webset","id":"webset_abc","relationship":{"definition":"investors of","limit":5}}]'
Item Management
Manage individual items in websets.
Basic Operations
# List items
exa-ai webset-item-list ws_abc123
exa-ai webset-item-list ws_abc123 --output-format pretty
# Get item details
exa-ai webset-item-get item_xyz789
# Delete item
exa-ai webset-item-delete item_xyz789
Extract Item Data
# Get all item IDs
exa-ai webset-item-list ws_abc123 --output-format json | jq -r '.[].id'
# Count items
exa-ai webset-item-list ws_abc123 --output-format json | jq 'length'
Enrichments
Add structured data fields to all items in a webset using AI extraction.
Enrichment Formats
- text: Free-form text extraction (employee count, description, technology stack)
- url: Extract URLs only (website, LinkedIn, GitHub)
- options: Categorical data with predefined options (industry, funding stage, size range)
Key Concepts
- description: The primary AI prompt that drives extraction. This tells the enrichment WHAT to extract. (Can be updated)
- instructions: Optional additional guidance on HOW to extract or format. (Creation-only, cannot be updated)
- Use exa-ai enrichment-create --help and exa-ai enrichment-update --help to see all available parameters
Create Enrichments
# Text enrichment
exa-ai enrichment-create ws_abc123 \
--description "Number of employees as of latest data" \
--format text \
--title "Team Size"
# URL enrichment
exa-ai enrichment-create ws_abc123 \
--description "Primary company website URL" \
--format url \
--title "Website"
# Options enrichment
exa-ai enrichment-create ws_abc123 \
--description "Current funding stage" \
--format options \
--options '[
{"label":"Pre-seed"},
{"label":"Seed"},
{"label":"Series A"},
{"label":"Series B"},
{"label":"Series C+"},
{"label":"Public"}
]' \
--title "Funding Stage"
Use Options from File
cat > industries.json <<'EOF'
[
{"label": "SaaS"},
{"label": "Developer Tools"},
{"label": "AI/ML"},
{"label": "Fintech"},
{"label": "Healthcare"},
{"label": "Other"}
]
EOF
exa-ai enrichment-create ws_abc123 \
--description "Primary industry or sector" \
--format options \
--options @industries.json \
--title "Industry"
Add Instructions for Precision
exa-ai enrichment-create ws_abc123 \
--description "Technology stack" \
--format text \
--instructions "Focus only on backend technologies and databases. Ignore frontend frameworks." \
--title "Backend Tech"
Manage Enrichments
# List enrichments
exa-ai enrichment-list ws_abc123
exa-ai enrichment-list ws_abc123 --output-format pretty
# Get details
exa-ai enrichment-get ws_abc123 enr_xyz789
# Update extraction prompt (description)
exa-ai enrichment-update ws_abc123 enr_xyz789 \
--description "Exact employee count from most recent source"
# Update format and options
exa-ai enrichment-update ws_abc123 enr_xyz789 \
--format options \
--options '[{"label":"Small"},{"label":"Medium"},{"label":"Large"}]'
# Update metadata
exa-ai enrichment-update ws_abc123 enr_xyz789 \
--metadata '{"source":"manual","updated":"2024-01-15"}'
# Note: Cannot update --instructions or --title (creation-only parameters)
# To change instructions, delete and recreate the enrichment
# Delete
exa-ai enrichment-delete ws_abc123 enr_xyz789
# Cancel running enrichment
exa-ai enrichment-cancel ws_abc123 enr_xyz789
Common Enrichment Patterns
Company websets: Website (url), Team Size (text), Funding Stage (options), Industry (options)
Person websets: LinkedIn (url), Job Title (text), Company (text), Location (text)
Research papers: Publication Year (text), Authors (text), Venue (text), Research Area (options)
Best Practices
- Start small, validate, then scale: Always use count:1 for initial searches
- Follow three-step workflow: Validate → Expand → Enrich
- Never enrich during validation: Only enrich after validated, expanded results
- Avoid --wait flag: Do NOT use --wait in commands. It's designed for human interactive use, not automated workflows.
- Maintain query AND criteria consistency: When appending or scaling up, use IDENTICAL query and criteria from validated search. Save criteria to file for consistency.
- CRITICAL - First search must use override: The library defaults to --behavior append. First search on a webset MUST explicitly use --behavior override or it will fail with "No previous search found" error.
- Use correct parameter names:
  - Use --behavior append or --behavior override (NOT --mode)
  - Commands like webset-search-get require both webset_id and search_id
- Choose specific entity types: Use company, person, etc. for better results
- Save IDs: Use jq to extract and save IDs for subsequent commands
Detailed Reference
For complete command references, syntax, and all options, consult REFERENCE.md and component-specific reference files.
Shared Requirements
Schema Design
MUST: Use object wrapper for schemas
Applies to: answer, search, find-similar, get-contents
When using schema parameters (--output-schema or --summary-schema), always wrap properties in an object:
{"type":"object","properties":{"field_name":{"type":"string"}}}
DO NOT use bare properties without the object wrapper:
{"properties":{"field_name":{"type":"string"}}} // ❌ Missing "type":"object"
Why: The Exa API requires a valid JSON Schema with an object type at the root level. Omitting this causes validation errors.
Examples:
# ✅ CORRECT - object wrapper included
exa-ai search "AI news" \
--summary-schema '{"type":"object","properties":{"headline":{"type":"string"}}}'
# ❌ WRONG - missing object wrapper
exa-ai search "AI news" \
--summary-schema '{"properties":{"headline":{"type":"string"}}}'
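A schema string can be checked locally before it is sent. The sketch below assumes jq is installed, as the rest of this document does; schema_has_object_root is a hypothetical helper name, and it tests only the root-level "type":"object" rule described above, not full JSON Schema validity.

```shell
# schema_has_object_root SCHEMA_JSON
# Succeeds only if SCHEMA_JSON parses as JSON and has "type":"object" at the root.
schema_has_object_root() {
  printf '%s' "$1" | jq -e '.type == "object"' >/dev/null 2>&1
}

# Usage (commented out):
# schema='{"type":"object","properties":{"headline":{"type":"string"}}}'
# schema_has_object_root "$schema" && exa-ai search "AI news" --summary-schema "$schema"
```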
Output Format Selection
MUST NOT: Mix toon format with jq
Applies to: answer, context, search, find-similar, get-contents
toon format produces YAML-like output, not JSON. DO NOT pipe toon output to jq for parsing:
# ❌ WRONG - toon is not JSON
exa-ai search "query" --output-format toon | jq -r '.results'
# ✅ CORRECT - use JSON (default) with jq
exa-ai search "query" | jq -r '.results[].title'
# ✅ CORRECT - use toon for direct reading only
exa-ai search "query" --output-format toon
Why: jq expects valid JSON input. toon format is designed for human readability and produces YAML-like output that jq cannot parse.
SHOULD: Choose one output approach
Applies to: answer, context, search, find-similar, get-contents
Pick one strategy and stick with it throughout your workflow:
Approach 1: toon only - Compact YAML-like output for direct reading
- Use when: Reading output directly, no further processing needed
- Token savings: ~40% reduction vs JSON
- Example:
exa-ai search "query" --output-format toon
Approach 2: JSON + jq - Extract specific fields programmatically
- Use when: Need to extract specific fields or pipe to other commands
- Token savings: ~80-90% reduction (extracts only needed fields)
- Example:
exa-ai search "query" | jq -r '.results[].title'
Approach 3: Schemas + jq - Structured data extraction with validation
- Use when: Need consistent structured output across multiple queries
- Token savings: ~85% reduction + consistent schema
- Example:
exa-ai search "query" --summary-schema '{...}' | jq -r '.results[].summary | fromjson'
Why: Mixing approaches increases complexity and token usage. Choosing one approach optimizes for your use case.
Shell Command Best Practices
MUST: Run commands directly, parse separately
Applies to: monitor, search (websets), research, and all skills using complex commands
When using the Bash tool with complex shell syntax, run commands directly and parse output in separate steps:
# ❌ WRONG - nested command substitution
webset_id=$(exa-ai webset-create --search '{"query":"..."}' | jq -r '.webset_id')
# ✅ CORRECT - run directly, then parse
exa-ai webset-create --search '{"query":"..."}' > output.json
# Then in a follow-up command:
webset_id=$(jq -r '.webset_id' output.json)
Why: Complex nested $(...) command substitutions can fail unpredictably in shell environments. Running commands directly and parsing separately improves reliability and makes debugging easier.
MUST NOT: Use nested command substitutions
Applies to: All skills when using complex multi-step operations
Avoid nesting multiple levels of command substitution:
# ❌ WRONG - deeply nested
result=$(exa-ai search "$(cat query.txt | tr '\n' ' ')" --num-results $(cat config.json | jq -r '.count'))
# ✅ CORRECT - sequential steps
query=$(tr '\n' ' ' < query.txt)
count=$(jq -r '.count' config.json)
exa-ai search "$query" --num-results "$count"
Why: Nested command substitutions are fragile and hard to debug when they fail. Sequential steps make each operation explicit and easier to troubleshoot.
SHOULD: Break complex commands into sequential steps
Applies to: All skills when working with multi-step workflows
For readability and reliability, break complex operations into clear sequential steps:
# ❌ Less maintainable - everything in one line
exa-ai webset-create --search '{"query":"startups","count":1}' | jq -r '.webset_id' | xargs -I {} exa-ai webset-search-create {} --query "AI" --behavior override
# ✅ More maintainable - clear steps
exa-ai webset-create --search '{"query":"startups","count":1}' > output.json
webset_id=$(jq -r '.webset_id' < output.json)
exa-ai webset-search-create "$webset_id" --query "AI" --behavior override
Why: Sequential steps are easier to understand, debug, and modify. Each step can be verified independently.