SKILL.md

---
name: using-braintrust
description: Enables AI agents to use Braintrust for LLM evaluation, logging, and observability. Includes scripts for querying logs with SQL, running evals, and logging data.
version: 1.0.0
---

Using Braintrust

Braintrust is a platform for evaluating, logging, and monitoring LLM applications.

Querying logs with SQL

Use the query_logs.py script to run SQL queries against Braintrust logs.

Always share the SQL query you used when reporting results, so the user understands what was executed.

Script location: scripts/query_logs.py (relative to this file)

Run from the user's project directory (where .env with BRAINTRUST_API_KEY exists):

uv run /path/to/scripts/query_logs.py --project "Project Name" --query "SQL_QUERY"
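For example (the project name "Support Bot" here is a hypothetical placeholder):

uv run /path/to/scripts/query_logs.py --project "Support Bot" --query "SELECT count(*) as count FROM logs"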

Common queries

Count logs from last 24 hours:

SELECT count(*) as count FROM logs WHERE created > now() - interval 1 day

Get recent logs:

SELECT input, output, created FROM logs ORDER BY created DESC LIMIT 10

Filter by metadata:

SELECT input, output FROM logs WHERE metadata.user_id = 'user123' LIMIT 20

Filter by time range:

SELECT * FROM logs WHERE created > now() - interval 7 day LIMIT 50

Aggregate by field:

SELECT metadata.model, count(*) as count FROM logs GROUP BY metadata.model

Group by hour:

SELECT hour(created) as hr, count(*) as count FROM logs GROUP BY hour(created)

SQL quirks in Braintrust

  • Time functions: Use hour(), day(), month(), year() instead of date_trunc()
    • Correct: hour(created)
    • Wrong: date_trunc('hour', created)
  • Intervals: Use interval 1 day, interval 7 day, interval 1 hour (no quotes, singular unit)
  • Nested fields: Use dot notation: metadata.user_id, scores.Factuality, metrics.duration
  • Table name: Always use FROM logs (the script handles project scoping)
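
Putting these quirks together, a sketch of an hourly breakdown for one model over the last week (the metadata.model value 'gpt-4o' is a hypothetical example):

SELECT hour(created) as hr, count(*) as count FROM logs WHERE created > now() - interval 7 day AND metadata.model = 'gpt-4o' GROUP BY hour(created)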

SQL reference

Operators:

  • =, !=, >, <, >=, <=
  • IS NULL, IS NOT NULL
  • LIKE 'pattern%'
  • AND, OR, NOT

Aggregations:

  • count(*), count(field)
  • avg(field), sum(field)
  • min(field), max(field)

Time filters:

  • created > now() - interval 1 day
  • created > now() - interval 7 day
  • created > now() - interval 1 hour
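
These pieces can be combined; for instance, a sketch averaging a score over recent matching logs (scores.Factuality follows the dot-notation convention above; the LIKE pattern is illustrative):

SELECT avg(scores.Factuality) as avg_factuality, count(*) as count FROM logs WHERE created > now() - interval 1 day AND input LIKE '%refund%'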

Logging data

Use scripts/log_data.py to log data to a project:

uv run /path/to/scripts/log_data.py --project "Project Name" --input "query" --output "response"

With metadata:

--input "query" --output "response" --metadata '{"user_id": "123"}'

Batch from JSON:

--data '[{"input": "a", "output": "b"}, {"input": "c", "output": "d"}]'
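
Assembled into a full command, a batch invocation looks like:

uv run /path/to/scripts/log_data.py --project "Project Name" --data '[{"input": "a", "output": "b"}, {"input": "c", "output": "d"}]'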

Running evaluations

Use scripts/run_eval.py to run evaluations:

uv run /path/to/scripts/run_eval.py --project "Project Name" --data '[{"input": "test", "expected": "test"}]'

From file:

--data-file test_cases.json --scorer factuality
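
Here test_cases.json is assumed to hold the same shape as the inline --data format, e.g.:

[
  {"input": "What is 2+2?", "expected": "4"},
  {"input": "Capital of France?", "expected": "Paris"}
]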

Setup

Create a .env file in your project directory:

BRAINTRUST_API_KEY=your-api-key-here
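
To sanity-check the setup before running anything, a minimal sketch (assumes python-dotenv is installed; the Braintrust SDK can also read BRAINTRUST_API_KEY straight from the environment):

import os
from dotenv import load_dotenv  # python-dotenv; an assumption, not something the skill requires

load_dotenv()  # reads .env from the current directory
assert os.environ.get("BRAINTRUST_API_KEY"), "BRAINTRUST_API_KEY is missing"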

Writing evaluation code (SDK)

For custom evaluation logic, use the SDK directly.

IMPORTANT: The first argument to Eval() is the project name, passed positionally.

import braintrust
from autoevals import Factuality

braintrust.Eval(
    "My Project",  # Project name (required, positional)
    data=lambda: [{"input": "What is 2+2?", "expected": "4"}],
    task=lambda input: my_llm_call(input),
    scores=[Factuality],
)

Common mistakes:

  • Eval(project_name="My Project", ...) - Wrong!
  • Eval(name="My Project", ...) - Wrong!
  • Eval("My Project", data=..., task=..., scores=...) - Correct!

Writing logging code (SDK)

import braintrust

logger = braintrust.init_logger(project="My Project")
logger.log(input="query", output="response", metadata={"user_id": "123"})
logger.flush()  # Always flush!
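
If you already have quality signals at logging time, log() should also accept a scores field (hedged sketch; the "quality" key is an arbitrary example):

logger.log(
    input="query",
    output="response",
    scores={"quality": 0.9},  # assumed: numeric scores keyed by name
    metadata={"user_id": "123"},
)
logger.flush()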

Common issues

  • "Eval() got an unexpected keyword argument 'project_name'": Use positional argument
  • Logs not appearing: Call logger.flush() after logging
  • Authentication errors: Create .env file with BRAINTRUST_API_KEY=your-key