Claude Code Plugins

Community-maintained marketplace


AI model integration with Local-First strategy. Use FREE local models (Ollama) before API calls.

Install Skill

1. Download the skill.
2. Enable skills in Claude: open claude.ai/settings/capabilities and find the "Skills" section.
3. Upload to Claude: click "Upload skill" and select the downloaded ZIP file.

Note: Please review the skill's instructions before using it.

SKILL.md

---
name: ai-integration
description: AI model integration with Local-First strategy. Use FREE local models (Ollama) before API calls.
allowed-tools: Read, Glob, Grep, Edit, Write, Bash(python:*), Bash(pytest:*)
---

AI Integration - Local-First Strategy (Dec 2025)

Core Principle: FREE Before PAID

Use local models first, API only when necessary!

| Tier | Model | Cost | When to Use |
|------|-------|------|-------------|
| 0 | Local Ollama | FREE | Try FIRST for everything |
| 1 | Gemini Flash | $0.10/1M | When local fails |
| 2 | DeepSeek API | $0.14/1M | Code fallback |
| 3 | Sonnet 4.5 | $3/1M | Quality when needed |
| 4 | Opus 4.5 | $15/1M | NEVER unless critical |

Local Models (via Ollama)

```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull recommended models
ollama pull llama3.2:3b       # 2GB - Fast, simple
ollama pull qwen2.5-coder:7b  # 4GB - Code tasks
ollama pull deepseek-r1:8b    # 5GB - Reasoning
```
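To confirm the pulled models are actually visible to the local server, you can query Ollama's REST API (`GET /api/tags` on the default port 11434). The helper below is a sketch, not part of the skill; `list_local_models` is a hypothetical name, and the `payload` parameter exists only so the parsing can be exercised without a running server:

```python
import json
import urllib.request


def list_local_models(payload=None, base_url="http://localhost:11434"):
    """Return the model names known to a local Ollama server.

    If `payload` is supplied (e.g. in tests), it is parsed directly;
    otherwise GET /api/tags is called on the server.
    """
    if payload is None:
        with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
            payload = json.load(resp)
    # /api/tags responds with {"models": [{"name": "llama3.2:3b", ...}, ...]}
    return [m["name"] for m in payload.get("models", [])]
```

After the `ollama pull` commands above, a check like `"qwen2.5-coder:7b" in list_local_models()` verifies the model is ready before LiteLLM tries to route to it.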

LiteLLM Model IDs

```python
MODELS = {
    # Tier 0: LOCAL FREE
    "local-llama": "ollama/llama3.2:3b",
    "local-qwen": "ollama/qwen2.5-coder:7b",
    "local-deepseek": "ollama/deepseek-r1:8b",

    # Tier 1-2: CHEAP API
    "gemini-flash": "vertex_ai/gemini-3-flash-preview",
    "deepseek-api": "deepseek/deepseek-chat",

    # Tier 3-4: QUALITY API (use sparingly!)
    "claude-sonnet": "vertex_ai/claude-sonnet-4@20250514",
    "claude-opus": "vertex_ai/claude-opus-4-5@20250514",
}
```

Task Routing (Local-First)

```python
# TaskType is the project's task-category enum; each chain is tried
# left to right, so the FREE local model always runs first.
TASK_ROUTING = {
    # Simple → Local FREE
    TaskType.SIMPLE_TASK: ["local-llama", "gemini-flash"],

    # Validation → Local coder FREE
    TaskType.VALIDATION: ["local-qwen", "deepseek-api"],

    # Understanding → Local reasoning FREE
    TaskType.WORKFLOW_UNDERSTANDING: ["local-deepseek", "gemini-flash", "claude-sonnet"],

    # Code gen → Local coder FREE
    TaskType.CODE_GENERATION: ["local-qwen", "gemini-flash", "claude-sonnet"],

    # Pipeline → Quality matters
    TaskType.PIPELINE_GENERATION: ["local-qwen", "claude-sonnet", "gemini-pro"],
}
```
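The routing table can be walked with a simple fallback loop: try each alias in order and escalate only when the cheaper model raises. The sketch below is not the skill's actual implementation; it uses trimmed string-keyed copies of `MODELS` and `TASK_ROUTING` to stay self-contained, and the hypothetical `call_fn` parameter stands in for `litellm.completion` so the escalation logic can be tested without network access:

```python
MODELS = {
    "local-llama": "ollama/llama3.2:3b",
    "gemini-flash": "vertex_ai/gemini-3-flash-preview",
}

TASK_ROUTING = {
    "simple_task": ["local-llama", "gemini-flash"],
}


def complete_with_fallback(task_type, prompt, call_fn):
    """Try each model in the task's chain, cheapest (local) first.

    Returns (alias, response) from the first model that succeeds;
    raises RuntimeError with all collected errors if every tier fails.
    """
    errors = {}
    for alias in TASK_ROUTING[task_type]:
        try:
            return alias, call_fn(MODELS[alias], prompt)
        except Exception as exc:  # Ollama down, quota exhausted, etc.
            errors[alias] = exc
    raise RuntimeError(f"all models failed: {errors}")
```

In production, `call_fn` would wrap the real API, e.g. `lambda model, prompt: litellm.completion(model=model, messages=[{"role": "user", "content": prompt}])`.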

Regional Configuration

```python
# Regional settings apply to API models only
if "vertex_ai" in model_id:
    if "claude" in model_id:
        litellm.vertex_location = "us-east5"
    else:
        litellm.vertex_location = "us-central1"
elif model_id.startswith("ollama/"):
    litellm.api_base = "http://localhost:11434"
```
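The same regional logic can also be expressed as a pure helper that returns per-call kwargs instead of mutating litellm's module globals, which keeps concurrent calls to different providers from stepping on each other. This is a sketch under that assumption, and `route_kwargs` is a hypothetical name:

```python
def route_kwargs(model_id):
    """Return extra completion kwargs for a given LiteLLM model id."""
    if model_id.startswith("ollama/"):
        # Local Ollama server on its default port
        return {"api_base": "http://localhost:11434"}
    if "vertex_ai" in model_id:
        # Claude models on Vertex AI are served from us-east5;
        # other Vertex models use us-central1
        region = "us-east5" if "claude" in model_id else "us-central1"
        return {"vertex_location": region}
    return {}
```

A call would then look like `litellm.completion(model=model_id, messages=msgs, **route_kwargs(model_id))`.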

Cost Savings

| Scenario | API-Only | Local-First | Savings |
|----------|----------|-------------|---------|
| 100 simple tasks | $3.00 | $0 | 100% |
| 100 validations | $2.80 | $0 | 100% |
| 100 code gen | $30.00 | $0.30 | 99% |
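The savings figures follow from simple token arithmetic. The sketch below assumes roughly 100K tokens per code-generation task and a 1% escalation rate to Sonnet ($3/1M from the tier table); both figures are illustrative assumptions, not numbers stated by the skill:

```python
def cost_usd(n_tasks, tokens_per_task, price_per_million):
    """Total cost for n_tasks at a given per-million-token price."""
    return n_tasks * tokens_per_task * price_per_million / 1_000_000


# 100 code-gen tasks, ~100K tokens each, all on Sonnet ($3/1M):
api_only = cost_usd(100, 100_000, 3.00)   # $30.00

# Local-first: ~99 tasks handled free by Ollama, ~1 escalates to Sonnet:
local_first = cost_usd(1, 100_000, 3.00)  # $0.30
```

Under those assumptions this reproduces the code-gen row above: $30.00 API-only versus $0.30 local-first.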

Environment

```bash
GOOGLE_CLOUD_PROJECT=gen-lang-client-0497834162
VERTEX_AI_LOCATION=us-central1
VERTEX_AI_CLAUDE_LOCATION=us-east5
CUSTOM_API_URL=http://localhost:11434
```
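These variables can be read once at startup with defaults matching the values above. A minimal sketch; `load_ai_env` is a hypothetical helper name, and the injectable `env` parameter exists only for testing:

```python
import os


def load_ai_env(env=None):
    """Collect the skill's environment settings with sensible defaults."""
    env = os.environ if env is None else env
    return {
        "project": env.get("GOOGLE_CLOUD_PROJECT"),
        "vertex_location": env.get("VERTEX_AI_LOCATION", "us-central1"),
        "vertex_claude_location": env.get("VERTEX_AI_CLAUDE_LOCATION", "us-east5"),
        "ollama_url": env.get("CUSTOM_API_URL", "http://localhost:11434"),
    }
```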