| name | cohere-v2-python |
| description | Master Cohere v2 Chat API with Python, specializing in entity extraction using JSON Schema mode for structured outputs. Use when extracting entities from text, building data extraction pipelines, implementing NER systems, or requiring validated JSON responses from LLMs. |
Cohere v2 Python
Overview
Cohere's v2 Chat API provides powerful conversational AI capabilities with a specialized focus on structured outputs through JSON Schema mode. This skill covers entity extraction, data validation, and integration patterns for building production-ready systems that require consistent, validated responses from LLMs.
When to Use This Skill
Apply this skill when:
- Extracting structured entities from unstructured text (names, dates, locations, organizations)
- Building Named Entity Recognition (NER) systems
- Implementing data extraction pipelines with validated outputs
- Requiring JSON responses that conform to specific schemas
- Processing documents for information extraction
- Building classification systems with constrained outputs
- Integrating LLM responses with downstream databases or APIs
Core Capabilities
1. Basic Chat API
Initialize and use the Cohere Client for conversational tasks:
import cohere

co = cohere.ClientV2(api_key="<YOUR API KEY>")

response = co.chat(
    model="command-a-03-2025",
    messages=[
        {"role": "user", "content": "Summarize the key features of quantum computing."}
    ],
)

print(response.message.content[0].text)
Available models:
- command-a-03-2025 - Latest generation model
For comprehensive API parameters, streaming, RAG, and tool use, refer to references/chat_api.md.
2. Entity Extraction with JSON Schema Mode
The primary strength of Cohere v2 is structured outputs using JSON Schema mode, which guarantees responses conform to your specified schema.
Simple Entity Extraction:
text = "Dr. Sarah Johnson from Stanford University will speak at the AI Conference in Seattle on March 15th."
response = co.chat(
model="command-a-03-2025",
messages=[
{"role": "user", "content": f"Extract all entities: {text}"}
],
response_format={
"type": "json_object",
"schema": {
"type": "object",
"properties": {
"person": {"type": "string"},
"title": {"type": "string"},
"organization": {"type": "string"},
"event": {"type": "string"},
"location": {"type": "string"},
"date": {"type": "string", "format": "date"}
},
"required": ["person"]
}
}
)
import json
entities = json.loads(response.message.content[0].text)
Key Principles:
- Top-level type must be "object"
- At least one field must be listed in the "required" array
- The schema is strictly enforced; invalid responses are regenerated
- The first request with a new schema adds latency; subsequent requests reuse the cached schema
3. Multiple Entity Extraction
Extract arrays of entities for batch processing:
text = """
John Smith works at Google as a Software Engineer in San Francisco.
Jane Doe is a Data Scientist at Meta in New York.
Bob Wilson leads the AI team at OpenAI in Seattle.
"""
response = co.chat(
model="command-a-03-2025",
messages=[
{"role": "user", "content": f"Extract all people and their details: {text}"}
],
response_format={
"type": "json_object",
"schema": {
"type": "object",
"properties": {
"people": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {"type": "string"},
"company": {"type": "string"},
"role": {"type": "string"},
"location": {"type": "string"}
},
"required": ["name", "company"]
}
}
},
"required": ["people"]
}
}
)
result = json.loads(response.message.content[0].text)
for person in result["people"]:
print(f"{person['name']} works at {person['company']}")
4. Classification with Enums
Use enums to constrain outputs to specific categories:
text = "I absolutely love this product! The quality is amazing and customer service was helpful."
response = co.chat(
model="command-a-03-2025",
messages=[
{"role": "user", "content": f"Analyze sentiment and aspects: {text}"}
],
response_format={
"type": "json_object",
"schema": {
"type": "object",
"properties": {
"overall_sentiment": {
"type": "string",
"enum": ["positive", "negative", "neutral", "mixed"]
},
"aspects": {
"type": "array",
"items": {
"type": "object",
"properties": {
"aspect": {"type": "string"},
"sentiment": {
"type": "string",
"enum": ["positive", "negative", "neutral"]
}
},
"required": ["aspect", "sentiment"]
}
}
},
"required": ["overall_sentiment", "aspects"]
}
}
)
Benefits of Enums:
- Guarantees valid category values
- Eliminates post-processing validation
- Enables direct database insertion
- Supports downstream logic without extra error handling (see the sketch after this list)
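For instance, because overall_sentiment is constrained to the enum above, the parsed value can drive branching directly. A minimal sketch, continuing from the sentiment example (the routing table is illustrative, not part of the API):
result = json.loads(response.message.content[0].text)

# The schema constrains overall_sentiment to four known values,
# so it can be used directly as a lookup key with no validation step.
routing = {
    "positive": "testimonials",
    "mixed": "review_queue",
    "neutral": "review_queue",
    "negative": "support_followup",
}
print(routing[result["overall_sentiment"]])

for aspect in result["aspects"]:
    print(f"{aspect['aspect']}: {aspect['sentiment']}")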
Common Entity Extraction Patterns
Named Entity Recognition (NER)
schema = {
    "type": "object",
    "properties": {
        "entities": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "text": {"type": "string"},
                    "type": {
                        "type": "string",
                        "enum": ["PERSON", "ORGANIZATION", "LOCATION", "DATE", "EVENT", "PRODUCT"]
                    },
                    "context": {"type": "string"}
                },
                "required": ["text", "type"]
            }
        }
    },
    "required": ["entities"]
}
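Each pattern schema in this section drops into the same call shape shown earlier. A minimal sketch using the NER schema above, assuming the co client and json import from the previous examples (the article text is a placeholder):
article = "Placeholder article text to run NER over."

response = co.chat(
    model="command-a-03-2025",
    messages=[{"role": "user", "content": f"Extract all named entities: {article}"}],
    response_format={"type": "json_object", "schema": schema},
)

for entity in json.loads(response.message.content[0].text)["entities"]:
    print(entity["type"], "-", entity["text"])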
Resume/CV Parsing
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "email": {
            "type": "string",
            "pattern": "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$"
        },
        "phone": {"type": "string"},
        "experience": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "company": {"type": "string"},
                    "role": {"type": "string"},
                    "start_date": {"type": "string", "format": "date"},
                    "end_date": {"type": "string", "format": "date"},
                    "description": {"type": "string"}
                },
                "required": ["company", "role"]
            }
        },
        "education": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "institution": {"type": "string"},
                    "degree": {"type": "string"},
                    "field": {"type": "string"},
                    "graduation_year": {"type": "integer"}
                },
                "required": ["institution"]
            }
        },
        "skills": {
            "type": "array",
            "items": {"type": "string"}
        }
    },
    "required": ["name"]
}
Invoice/Receipt Extraction
schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "invoice_date": {"type": "string", "format": "date"},
        "vendor": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "address": {"type": "string"},
                "tax_id": {"type": "string"}
            },
            "required": ["name"]
        },
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "quantity": {"type": "number"},
                    "unit_price": {"type": "number"},
                    "total": {"type": "number"}
                },
                "required": ["description", "total"]
            }
        },
        "subtotal": {"type": "number"},
        "tax": {"type": "number"},
        "total": {"type": "number"}
    },
    "required": ["invoice_number", "vendor", "total"]
}
Medical Report Extraction
schema = {
    "type": "object",
    "properties": {
        "patient": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
                "gender": {
                    "type": "string",
                    "enum": ["male", "female", "other", "unknown"]
                }
            },
            "required": ["name"]
        },
        "diagnosis": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "condition": {"type": "string"},
                    "severity": {
                        "type": "string",
                        "enum": ["mild", "moderate", "severe"]
                    },
                    "notes": {"type": "string"}
                },
                "required": ["condition"]
            }
        },
        "medications": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "dosage": {"type": "string"},
                    "frequency": {"type": "string"}
                },
                "required": ["name"]
            }
        },
        "visit_date": {"type": "string", "format": "date"}
    },
    "required": ["patient", "visit_date"]
}
Advanced Schema Features
Nested Objects with Validation
schema = {
    "type": "object",
    "properties": {
        "company": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "headquarters": {
                    "type": "object",
                    "properties": {
                        "street": {"type": "string"},
                        "city": {"type": "string"},
                        "country": {"type": "string"}
                    },
                    "required": ["city", "country"]
                }
            },
            "required": ["name"]
        }
    },
    "required": ["company"]
}
Schema Reuse with $ref
schema = {
    "type": "object",
    "$defs": {
        "person": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "email": {"type": "string"},
                "phone": {"type": "string"}
            },
            "required": ["name"]
        }
    },
    "properties": {
        "primary_contact": {"$ref": "#/$defs/person"},
        "secondary_contact": {"$ref": "#/$defs/person"}
    },
    "required": ["primary_contact"]
}
Format Validation
schema = {
    "type": "object",
    "properties": {
        "created_at": {
            "type": "string",
            "format": "date-time"  # ISO 8601: 2024-01-01T12:00:00Z
        },
        "birth_date": {
            "type": "string",
            "format": "date"  # YYYY-MM-DD
        },
        "user_id": {
            "type": "string",
            "format": "uuid"
        },
        "email": {
            "type": "string",
            "pattern": "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$"
        }
    },
    "required": ["user_id"]
}
Workflow: Building an Entity Extraction Pipeline
Step 1: Define Your Schema
# Identify the entities you need to extract
entity_schema = {
    "type": "object",
    "properties": {
        "entities": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "text": {"type": "string"},
                    "type": {"type": "string", "enum": ["PERSON", "ORG", "LOCATION"]},
                    "confidence": {"type": "string", "enum": ["high", "medium", "low"]}
                },
                "required": ["text", "type"]
            }
        }
    },
    "required": ["entities"]
}
Step 2: Create Extraction Function
def extract_entities(text, schema):
    response = co.chat(
        model="command-a-03-2025",
        messages=[
            {
                "role": "system",
                "content": "Extract entities accurately with appropriate confidence levels."
            },
            {
                "role": "user",
                "content": f"Extract all entities: {text}"
            }
        ],
        response_format={
            "type": "json_object",
            "schema": schema
        }
    )
    return json.loads(response.message.content[0].text)
Step 3: Batch Processing
documents = [
    "Text 1...",
    "Text 2...",
    "Text 3..."
]

results = []
for doc in documents:
    entities = extract_entities(doc, entity_schema)
    results.append({
        "document": doc,
        "entities": entities["entities"]
    })
Step 4: Store in Database
from surrealdb import Surreal  # Example with SurrealDB; class and method names vary by SDK version

async def store_entities(entities):
    async with Surreal("ws://localhost:8000/rpc") as db:
        await db.signin({"user": "root", "pass": "root"})
        await db.use("entities", "database")
        for entity in entities["entities"]:
            await db.create("entity", entity)
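To connect Steps 3 and 4, the batch results can be written out with asyncio. A minimal sketch, assuming a local SurrealDB instance and the results list built in Step 3:
import asyncio

async def store_all(results):
    # Wrap each document's entities in the shape store_entities expects.
    for item in results:
        await store_entities({"entities": item["entities"]})

asyncio.run(store_all(results))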
Best Practices
Schema Design
- Start with required fields only, add optional fields iteratively
- Use enums for classification to guarantee valid outputs
- Leverage format validation (date, uuid, email) for data quality
- Use $ref for repeated structures to keep schemas DRY
Prompting
- System messages override user instructions - use for extraction guidelines
- Be explicit about what to extract in user messages
- When using JSON mode without a schema, explicitly instruct the model to generate JSON
- Provide examples in the system message for complex extractions (see the sketch after this list)
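As an illustration of the last point, a short sketch that embeds a worked example in the system message while still requesting schema-constrained JSON (the guideline text, example, and field names are placeholders, not a prescribed prompt):
system_prompt = """You extract product mentions from support tickets.
Example:
Input: "My Acme X200 stopped charging."
Output: {"products": [{"name": "Acme X200", "issue": "charging"}]}"""

response = co.chat(
    model="command-a-03-2025",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Extract products: The Acme Y10 screen flickers."},
    ],
    response_format={
        "type": "json_object",
        "schema": {
            "type": "object",
            "properties": {
                "products": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": {"type": "string"},
                            "issue": {"type": "string"}
                        },
                        "required": ["name"]
                    }
                }
            },
            "required": ["products"]
        }
    },
)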
Performance
- Schemas are cached after first request - reuse schemas across calls
- Simple schemas have minimal latency overhead
- Complex nested schemas add moderate processing time
- Consider batching extractions when processing multiple documents
Error Handling
- Always wrap JSON parsing in try-except blocks
- Validate required fields exist even with schema enforcement
- Handle API errors gracefully with exponential backoff (see the sketch after this list)
- Log failed extractions for debugging and reprocessing
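A minimal sketch of the parsing and backoff points, wrapping extract_entities from the workflow above (retry counts and delays are illustrative):
import time

def safe_extract(text, schema, max_retries=3):
    for attempt in range(max_retries):
        try:
            return extract_entities(text, schema)
        # Catching Exception keeps the sketch SDK-agnostic and also covers
        # json.JSONDecodeError; narrow this to the client's specific error
        # types (rate limit, server error) in production.
        except Exception as exc:
            if attempt == max_retries - 1:
                print(f"Extraction failed, logging for reprocessing: {exc}")
                return None
            time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s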
Production Considerations
- Monitor token usage via response.meta.tokens
- Implement rate limiting and request queuing
- Cache common extractions to reduce API calls (see the sketch after this list)
- Use appropriate model for task complexity vs. cost
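A minimal in-memory caching sketch for repeated extractions (the cache key and storage are illustrative; a persistent store would be needed across processes):
_extraction_cache = {}

def cached_extract(text, schema):
    # Identical text with the same schema returns the cached result
    # instead of making a new API call.
    key = (text, json.dumps(schema, sort_keys=True))
    if key not in _extraction_cache:
        _extraction_cache[key] = extract_entities(text, schema)
    return _extraction_cache[key]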
Limitations
Unsupported Schema Features
- Numeric ranges (minimum/maximum)
- Array length constraints (minItems/maxItems)
- String length constraints (minLength/maxLength)
- Some complex regex patterns
Current Restrictions
- RAG not supported in JSON mode
- Maximum 200 fields in tools mode
- Schema mode adds latency overhead
Reference Documentation
This skill includes comprehensive reference documentation:
- references/chat_api.md - Complete Chat API reference including parameters, streaming, tool use, RAG, and conversation management
- references/structured_outputs.md - In-depth structured outputs guide with JSON Schema mode, validation, entity extraction patterns, and advanced features
Load these references when implementing specific features or troubleshooting issues.
Additional Resources
- API Documentation: https://docs.cohere.com/v2/docs/chat-api
- Structured Outputs: https://docs.cohere.com/v2/docs/structured-outputs
- Python SDK: https://github.com/cohere-ai/cohere-python
- PyPI Package: https://pypi.org/project/cohere/
- JSON Schema Specification: https://json-schema.org/