name

templar-miner-claude-skill

description

This skill should be used when setting up, optimizing, or managing Templar AI miners on Bittensor Subnet 3 (netuid 3). Use it for tasks involving miner configuration, performance optimization, troubleshooting gradient scoring issues, managing Bittensor wallets with btcli, monitoring miner metrics, renting GPUs via Basilica for mining operations, or strategizing to achieve top miner ranking in the Templar decentralized training network. Integrates seamlessly with the basilica-cli-helper skill for GPU rentals.

Templar Miner Claude Skill

Overview

Set up, optimize, and manage Templar AI miners to achieve top performance in the Templar decentralized training network (Bittensor Subnet 3, netuid 3). This skill provides expert guidance on miner operations, from initial wallet setup through advanced optimization strategies.

Core Capabilities

1. MINER SETUP & CONFIGURATION

Initial Setup Workflow:

Create Bittensor Wallet (see references/bittensor_btcli.md for complete commands):

btcli wallet new-coldkey --wallet.name default --n-words 12
btcli wallet new-hotkey --wallet.name default --wallet.hotkey miner --n-words 12

Register to Templar Subnet:

btcli subnet register --wallet.name default --wallet.hotkey miner \
  --netuid 3 --subtensor.network finney

Configure Environment (use scripts/setup_miner_env.sh):

./scripts/setup_miner_env.sh
# Then edit .env file with credentials

Verify Setup (use scripts/check_miner_health.py):
```
python scripts/check_miner_health.py
```

Install Templar:

git clone https://github.com/one-covenant/templar
cd templar
pip install -e .

Launch Miner:

python neurons/miner.py \
  --wallet.name default \
  --wallet.hotkey miner \
  --netuid 3 \
  --subtensor.network finney \
  --device cuda

Hardware Requirements:

Minimum for Competitive Mining: 8x H200 GPUs (141GB VRAM each)
CPU: 64+ cores
RAM: 512+ GB
Storage: 1TB+ NVMe SSD
Network: 10Gbps+ bandwidth

Note: Smaller configurations (4x H100, etc.) are unlikely to be competitive for rewards. For GPU rentals, see the GPU Rental via Basilica section below.

GPU Rental via Basilica:

For those without access to 8x H200 GPUs, rent them through Basilica's decentralized compute marketplace. The basilica-cli-helper Claude skill (https://github.com/synapz-org/basilica-cli-claude-skill) integrates seamlessly with this Templar skill.

Quick Basilica Workflow:

Check Available GPUs:

basilica ls
basilica price --gpu h200  # Check H200 pricing

Rent GPUs:
```
basilica up h200 --gpu-count 8
```
Check Active Rentals:
```
basilica ps  # Note the rental UID
```

Setup Miner on Rental:

# Copy setup script to rental
basilica cp scripts/setup_miner_env.sh [rental-id]:/root/

# Execute setup on rental
basilica exec --target [rental-id] "bash /root/setup_miner_env.sh"

# Copy environment file after editing
basilica cp .env [rental-id]:/root/templar/

# Install Templar
basilica exec --target [rental-id] "git clone https://github.com/one-covenant/templar && cd templar && pip install -e ."

Launch Miner on Rental:

basilica exec --target [rental-id] "cd /root/templar && python neurons/miner.py --wallet.name default --wallet.hotkey miner --netuid 3 --device cuda"

Monitor and Manage:

# Copy logs back
basilica cp [rental-id]:/root/templar/logs/ ./local-logs/

# Terminate when done
basilica down [rental-id]

Cost Optimization Tips:

Monitor Basilica pricing fluctuations with basilica price
Use basilica ps to track runtime and costs
Set up automated cost alerts
Test on smaller GPU counts first before scaling to 8x H200

Integration Note: When using both skills together, Claude can automatically coordinate between Templar mining operations and Basilica GPU rentals. Simply ask: "Help me set up a Templar miner on rented Basilica GPUs."

2. PERFORMANCE OPTIMIZATION

Memory Optimization Tactics:

Parameter Offloading (neurons/miner.py:493-496):
- Offload model parameters to CPU before inner_steps
- Restore to GPU after training
- Reduces peak GPU memory by ~20%
Error Feedback Management:
- Keep error_feedback on GPU during training
- Move to pinned CPU memory after gradient computation
- Prefetch back before next window
Mixed Precision:
- Use bfloat16 for better stability than fp16
- Configure in hparams.json: "mixed_precision_param": "bfloat16"

Gradient Quality Optimization:

Error Feedback Momentum (Critical for 64x compression):
- Preserves gradient information across compressions
- Configured via momentum_decay (default: 0.95)
- Reference: src/tplr/neurons.py:47-222

Compression Hyperparameters:

{
  "topk_compression": 64,      // Top 64 coefficients
  "target_chunk": 64,          // DCT chunk size
  "momentum_decay": 0.95       // Momentum factor
}

Data Assignment Synchronization:
- Ensure deterministic UID-based seeding matches validators
- Verify sample_digest in gradient metadata
- Rotates every 455 outer steps

Communication Optimization:

Peer Selection Strategy:

{
  "gather_peer_count": 20,     // Peers to gather from
  "gather_share": 0.75,        // 75% success target
  "gather_top_ratio": 2.0      // Bias toward high-stake peers
}

Async Transfers:
- Use non_blocking=True for GPU↔CPU transfers
- Pinned memory for host buffers
- Batch synchronization at end

3. MONITORING & METRICS

Key Metrics to Track:

WandB Dashboard (https://wandb.ai/tplr/templar):

miner/window_entry_loss: Loss at window start
miner/tokens_per_sec: Training throughput
miner/gather_success_rate: % of peers responded
miner/gpu_memory_allocated: VRAM usage
miner/global_grad_norm: Gradient magnitude

Grafana Dashboard (https://grafana.tplr.ai):

View all miner scores and weights
Compare performance vs other miners
Track synchronization status

Critical Alert Thresholds:

Gather Success Rate < 50%: Poor peer connectivity
Sync Score < 0.5: Falling behind validators
Gradient Score Negative: Producing harmful gradients
GPU Memory > 95%: Risk of OOM crashes
Window Time > 30 min: Too slow, missing windows

4. UNDERSTANDING THE INCENTIVE MECHANISM

For complete details, see references/incentive_mechanism.md.

Core Scoring Formula:

s_i = L_before - L_after

Where validators:

Compute loss BEFORE applying miner's gradient
Apply miner's gradient to model
Compute loss AFTER application
Score = improvement (positive is good)

Moving Average:

s̄_i = α * s_i + (1 - α) * s̄_i  (α = 0.05)

Penalties:

Missing Gradient: -99.0 score, 25% slash per window
Consecutive Negatives: >3 → exclusion from peer selection
Desynchronization: Exponential penalty if >5 steps behind
Index Overlap: >40% with others → plagiarism flag

Target Metrics for Top Miners:

Gather Success Rate: >75% (elite: >85%)
Sync Score: >0.9 (stay within 1 step)
Gradient Score: Consistently positive (>0.01 MA)
Tokens/sec: >1000 for 70B on 4x H100
Uptime: >99%

5. TROUBLESHOOTING COMMON ISSUES

Issue: Missing Gradients

Symptoms:

Penalty score of -99.0
Validators report gradient not found

Solutions:

Check R2 bucket write access:

aws s3 ls s3://$R2_GRADIENTS_BUCKET_NAME \
  --endpoint-url https://$R2_GRADIENTS_ACCOUNT_ID.r2.cloudflarestorage.com

Verify miner logs show "Successfully uploaded gradient"
Check network connectivity
Verify R2 credentials in environment

Issue: Low Gather Success Rate

Symptoms:

gather_success_rate < 50%
Timeouts in gather_with_reserve logs

Solutions:

Increase gather_peer_count in hparams.json
Check network bandwidth (need 1Gbps+)
Verify R2 read access
Consider adjusting peer_replacement_frequency

Issue: Negative Gradient Scores

Symptoms:

gradient_score consistently negative
Binary indicator shows -1
Moving average declining

Solutions:

Reload checkpoint from highest-stake validator
Verify hparams.json matches current network version
Check data assignment synchronization (sample_digest)
Review error_feedback buffer state
Ensure no bugs in local modifications

Issue: Desynchronization (Low Sync Score)

Symptoms:

sync_score < 0.5
Validator reports "steps behind"
Gradient fingerprint mismatches

Solutions:

Force checkpoint reload from top validator
Verify window synchronization with blockchain
Check for missed windows (uptime issues)
Ensure scheduler replay is correct

Issue: OOM (Out of Memory) Errors

Symptoms:

CUDA out of memory errors
Miner crashes during training
GPU memory at 100%

Solutions:

Enable parameter offloading (already default)
Enable optimizer state offloading
Reduce micro_batch_size in hparams
Enable activation checkpointing
Use gradient checkpointing for large models

Issue: Slow Training (Missing Windows)

Symptoms:

Window time > 25 minutes
Missing window boundaries
Low tokens/sec

Solutions:

Reduce batch_size or inner_steps
Enable torch.compile if not already enabled
Optimize DataLoader (increase num_workers)
Check for CPU bottlenecks (use profilers)
Verify network not rate-limiting R2 access

6. ADVANCED OPTIMIZATION STRATEGIES

Strategy 1: Maximize Uptime

Use systemd service for auto-restart
Implement health checks and alerting
Monitor for blockchain forks/reorgs
Ensure redundant network connectivity

Strategy 2: Optimize Gradient Quality

Fine-tune error feedback momentum (0.9-0.99 range)
Monitor validator evaluation patterns
Adjust compression ratio if bandwidth allows
Perfect data assignment synchronization

Strategy 3: Maximize Peer Connectivity

High-bandwidth network connection (10Gbps ideal)
Optimize R2 bucket configuration (Cloudflare zones)
Monitor peer success rates per UID
Maintain quality reserve peers

Strategy 4: Stay Synchronized

Check sync_score every window
Maintain <1 step lag from validators
Quick checkpoint reloads when needed
Monitor gradient fingerprint matching

Strategy 5: Continuous Monitoring

Set up Grafana alerts for critical thresholds
Monitor WandB dashboard in real-time
Track InfluxDB metrics for trends
Compare performance vs top 10 miners

Strategy 6: Network Participation

Monitor Discord/GitHub for updates
Track hparams.json version changes
Quickly adopt new optimizations
Report issues and contribute fixes

Strategy 7: Hardware Investment

More GPUs → higher throughput → more windows completed
Better network → higher gather success
More RAM → larger batch sizes possible
Faster storage → quicker checkpoint loads

7. HYPERPARAMETER TUNING

Learning Rate Optimization:

{
  "adamw": {
    "learning_rate": 1.17e-4,    // Default starting point
    "warmup_steps": 1500,        // Gradual ramp-up
    "scheduler": {
      "t_max": 140000,           // Cosine annealing period
      "eta_min_factor": 0.1      // Min LR = 10% of max
    }
  }
}

Momentum Decay Tuning:

Higher (0.95-0.99): Better gradient preservation across compression
Lower (0.85-0.9): Faster adaptation to changes
Default 0.95 works well for most cases

Compression Ratio Tuning:

topk_compression: 64 (standard)
Options: 32 (higher quality), 128 (more compression)
Trade-off: Gradient quality vs bandwidth/storage

Resources

Scripts

check_miner_health.py: Comprehensive health check for miner setup
- Verifies environment variables
- Checks GPU availability
- Tests wallet registration
- Validates R2 connectivity
- Monitors disk space and memory
setup_miner_env.sh: Generate template .env file with all required variables
- Creates properly formatted environment file
- Includes all Templar-specific variables
- Documents required credentials

References

templar_architecture.md: Complete codebase structure and file locations
- Repository organization
- Core implementations (miner.py, validator.py, trainer.py)
- Library components (neurons.py, compress.py, comms.py)
- Configuration parameters
incentive_mechanism.md: Detailed scoring and reward system
- Loss-based gradient evaluation
- Moving average smoothing
- Penalty mechanisms
- Synchronization scoring
- OpenSkill rating system
bittensor_btcli.md: Complete btcli command reference
- Wallet creation and recovery
- Hotkey management
- Subnet registration
- Balance checking
- Identity management

Task Execution Guidelines

When helping users with Templar mining:

Always verify wallet setup first - Use btcli commands from references
Check environment variables - Run check_miner_health.py script
Review hardware specifications - Ensure meets minimum requirements
Monitor key metrics - Track gather success, sync score, gradient score
Optimize systematically - Memory → gradient quality → communication
Track performance - Compare against top miners using Grafana
Stay updated - Monitor for codebase changes and hparam updates
Iterate continuously - Adjust based on validator feedback

Code References

When referencing Templar code, use the pattern file_path:line_number:

Miner main loop: neurons/miner.py:354-942
Gradient preparation: src/tplr/neurons.py:47-222
Compression pipeline: src/tplr/compress.py:51-142
Validator evaluation: neurons/validator.py:1644-1940
Error feedback: src/tplr/neurons.py:152
Peer gathering: neurons/miner.py:705-726

See references/templar_architecture.md for complete file structure and line references.

External Resources

GitHub: https://github.com/one-covenant/templar
Miner Docs: https://github.com/one-covenant/templar/blob/main/docs/miner.md
Validator Docs: https://github.com/one-covenant/templar/blob/main/docs/validator.md
WandB Dashboard: https://wandb.ai/tplr/templar
Grafana Metrics: https://grafana.tplr.ai/d/ceia6bwlwn8qof/eval-metrics
DeepWiki: https://deepwiki.com/one-covenant/templar
Bittensor Docs: https://docs.learnbittensor.org

The goal is to help users become top-performing miners in the Templar network through systematic setup, optimization, monitoring, and iterative improvement.

templar-miner-claude-skill

Install Skill

SKILL.md

Templar Miner Claude Skill

Overview

Core Capabilities

1. MINER SETUP & CONFIGURATION

2. PERFORMANCE OPTIMIZATION

3. MONITORING & METRICS

4. UNDERSTANDING THE INCENTIVE MECHANISM

5. TROUBLESHOOTING COMMON ISSUES

6. ADVANCED OPTIMIZATION STRATEGIES

7. HYPERPARAMETER TUNING

Resources

Scripts

References

Task Execution Guidelines

Code References

External Resources