| name | local-clickhouse |
| description | Install, configure, and validate local ClickHouse for gapless-crypto-clickhouse development and backtesting. Use when setting up local development environment, enabling offline mode, improving query performance for backtesting, or running E2E validation. Includes mise/Homebrew/apt installation, mode detection, connection validation, and E2E workflow scripts. |
Local ClickHouse
Install, configure, and validate local ClickHouse as an alternative to ClickHouse Cloud for development and backtesting (ADR-0044, ADR-0045).
Purpose
Enable local ClickHouse deployment for:
- Backtesting: 50-100x faster queries (no network round-trip)
- Development: Offline work without Cloud credentials
- Evaluation: Try package before Cloud commitment
- Cost savings: Avoid Cloud costs during experimentation
When to Use
Use this skill when:
- Local development: Setting up development environment without Cloud
- Backtesting optimization: Need faster query performance
- Offline mode: Working without network access
- Package evaluation: Trying gapless-crypto-clickhouse before Cloud setup
Triggers: User mentions "local ClickHouse", "install clickhouse", "backtesting setup", "offline mode", "GCCH_MODE=local"
Prerequisites
System Requirements:
- macOS (Homebrew) or Linux (apt/installer)
- 2-4GB RAM minimum
- 10GB+ disk space for data
Network: Internet required for installation only (offline after setup)
Workflow
Step 1: Install ClickHouse
macOS (Homebrew):
# Install ClickHouse
brew install clickhouse
# Verify installation
clickhouse --version
Linux (Ubuntu/Debian):
# Quick installer (recommended)
curl https://clickhouse.com/ | sh
./clickhouse --version
# Or via apt
sudo apt-get install -y apt-transport-https ca-certificates
curl -fsSL https://packages.clickhouse.com/deb/lts/clickhouse.gpg | sudo gpg --dearmor -o /usr/share/keyrings/clickhouse-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/clickhouse-keyring.gpg] https://packages.clickhouse.com/deb stable main" | sudo tee /etc/apt/sources.list.d/clickhouse.list
sudo apt-get update
sudo apt-get install -y clickhouse-server clickhouse-client
Step 2: Start ClickHouse Server
macOS (Homebrew):
# Start server in background
clickhouse server --daemon
# Verify server is running
clickhouse client --query "SELECT 1"
Linux (systemd):
# Start and enable service
sudo service clickhouse-server start
sudo systemctl enable clickhouse-server
# Verify server is running
clickhouse-client --query "SELECT 1"
Step 3: Configure Environment
Set Local Mode:
# Explicit local mode (recommended)
export GCCH_MODE=local
# Or rely on auto-detection (localhost triggers local mode)
export CLICKHOUSE_HOST=localhost
Environment Variables:
# Full local configuration
export GCCH_MODE=local
export CLICKHOUSE_HOST=localhost
export CLICKHOUSE_HTTP_PORT=8123
export CLICKHOUSE_DATABASE=default
export CLICKHOUSE_USER=default
export CLICKHOUSE_PASSWORD=
Step 4: Verify Connection
Using Python:
from gapless_crypto_clickhouse import probe
# Check local ClickHouse status
status = probe.check_local_clickhouse()
print(f"Installed: {status['installed']}")
print(f"Running: {status['running']}")
print(f"Version: {status['version']}")
# Get current mode
mode = probe.get_current_mode()
print(f"Mode: {mode}") # Should be "local"
Using CLI:
# Quick connection test
GCCH_MODE=local python -c "
from gapless_crypto_clickhouse.clickhouse.config import ClickHouseConfig
config = ClickHouseConfig.from_env()
print(f'Host: {config.host}')
print(f'Port: {config.http_port}')
print(f'Secure: {config.secure}')
"
Step 5: Test with Real Data
Query with Auto-Ingestion:
import os
os.environ["GCCH_MODE"] = "local"
from gapless_crypto_clickhouse import query_ohlcv
# First query downloads and ingests data automatically
df = query_ohlcv(
"BTCUSDT",
"1h",
"2024-01-01",
"2024-01-07"
)
print(f"Rows: {len(df)}")
print(df.head())
Mode Detection Logic
The package auto-detects deployment mode based on environment:
GCCH_MODE=local → Local mode (explicit)
GCCH_MODE=cloud → Cloud mode (requires CLICKHOUSE_HOST)
GCCH_MODE=auto → Auto-detect:
CLICKHOUSE_HOST="" → Local mode
CLICKHOUSE_HOST="localhost" → Local mode
CLICKHOUSE_HOST="127.0.0.1" → Local mode
CLICKHOUSE_HOST="*.cloud" → Cloud mode
Introspection:
from gapless_crypto_clickhouse import probe
# Get available modes
modes = probe.get_deployment_modes()
print(modes["available_modes"]) # ["local", "cloud"]
# Get current mode
current = probe.get_current_mode() # "local" or "cloud"
# Get installation guide
guide = probe.get_local_installation_guide()
print(guide["macos"]["commands"])
Success Criteria
- ClickHouse server running on localhost:8123
probe.check_local_clickhouse()returns{"installed": True, "running": True}probe.get_current_mode()returns"local"query_ohlcv()executes without Cloud credentials
Troubleshooting
Issue: "Connection refused" on port 8123
Check: Is ClickHouse server running?
# macOS ps aux | grep clickhouse # Linux sudo service clickhouse-server statusAction: Start server with
clickhouse server --daemon
Issue: "Command not found: clickhouse"
- Check: Was ClickHouse installed correctly?
- macOS: Run
brew install clickhouse - Linux: Run
curl https://clickhouse.com/ | sh
Issue: Mode detected as "cloud" instead of "local"
- Check: Is
CLICKHOUSE_HOSTset to a remote hostname? - Action: Set
export GCCH_MODE=localexplicitly
Issue: Server crashes with memory error
- Check: System has 2-4GB RAM available
- Action: Increase available memory or reduce concurrent queries
Issue: Permission denied errors
- Check: ClickHouse data directory permissions
- Action: Ensure user has write access to
/var/lib/clickhouse(Linux)
Port Reference
| Mode | HTTP Port | Native Port | Secure |
|---|---|---|---|
| Local | 8123 | 9000 | False |
| Cloud | 8443 | 9440 | True |
Performance Comparison
| Metric | Local | Cloud |
|---|---|---|
| Query latency | 50ms | 100-500ms |
| First query (cold) | 50ms | 5-10s (idle resume) |
| Network dependency | None | Required |
| Credentials | None | Required |
| Best for | Backtesting | Production |
Server Management
Start Server:
# macOS
clickhouse server --daemon
# Linux
sudo service clickhouse-server start
Stop Server:
# macOS
pkill -f clickhouse-server
# Linux
sudo service clickhouse-server stop
Check Logs:
# macOS (Homebrew)
tail -f /opt/homebrew/var/log/clickhouse-server/clickhouse-server.log
# Linux
sudo tail -f /var/log/clickhouse-server/clickhouse-server.log
E2E Validation Workflow (ADR-0045)
This skill includes executable scripts for end-to-end validation of local ClickHouse.
Scripts Directory
| Script | Purpose |
|---|---|
scripts/start-clickhouse.sh |
Start mise-installed ClickHouse server |
scripts/deploy-schema.sh |
Deploy production schema (calls existing script) |
scripts/ingest-sample-data.py |
Ingest real Binance data via query_ohlcv() |
scripts/take-screenshot.py |
Capture Play UI screenshot via Playwright |
scripts/validate-data.py |
Validate data integrity, output JSON evidence |
Running E2E Validation
Full workflow (manual):
# 1. Start ClickHouse
./skills/local-clickhouse/scripts/start-clickhouse.sh
# 2. Deploy schema
./skills/local-clickhouse/scripts/deploy-schema.sh
# 3. Ingest sample data
uv run python skills/local-clickhouse/scripts/ingest-sample-data.py
# 4. Capture screenshot
uv run python skills/local-clickhouse/scripts/take-screenshot.py
# 5. Validate data
uv run python skills/local-clickhouse/scripts/validate-data.py
Via pytest (automated):
# Run E2E tests (requires mise ClickHouse installed)
uv run pytest tests/test_local_clickhouse_e2e.py -v
# Tests will FAIL (not skip) if ClickHouse not available
Evidence Output
Scripts output evidence to tests/screenshots/ (gitignored):
play-ui-{timestamp}.png- Playwright screenshotsvalidation-{timestamp}.json- Structured validation results
References
- ADR-0044: Local ClickHouse Option
- ADR-0045: Local ClickHouse E2E Validation
- Plan 0044: Implementation Plan
- Plan 0045: E2E Validation Plan
- llms.txt: AI agent documentation with deployment modes
- probe.py:
get_deployment_modes(),check_local_clickhouse()
Next Steps
After successful local setup:
- Schema Creation: Tables created automatically on first query
- Data Ingestion: Use
query_ohlcv()withauto_ingest=True - E2E Validation: Run
uv run pytest tests/test_local_clickhouse_e2e.py -v - Backtesting: Run analysis with 50-100x faster queries
- Production Migration: Switch to Cloud mode with
export GCCH_MODE=cloud