Alpaca Algo Trader Plus Data Access
Experiment Overview
| Item |
Details |
| Date |
2024-12-29 |
| Goal |
Leverage Algo Trader Plus subscription for extended historical data |
| Environment |
training notebook, DataFetcher, Alpaca API |
| Status |
Success |
Context
Default Alpaca free tier limits historical data to ~2 years. The Algo Trader Plus subscription ($99/month) provides:
- 5+ years of historical bars for equities
- Extended crypto history
- Higher API rate limits
This enables training on 4 years of data (1460 days) for more robust models.
Verified Workflow
1. Configure Lookback Period (notebook)
# Data configuration
LOOKBACK_DAYS = 1460 # 4 years of data (Algo Trader Plus subscription)
# LOOKBACK_DAYS = 730 # 2 years (free tier limit)
2. Data Fetcher Configuration
from alpaca_trading.data.fetcher import DataFetcher
# Initialize with API keys
fetcher = DataFetcher(keys_file='API_key.txt')
# Fetch extended history
df = fetcher.get_bars(
symbol='AAPL',
timeframe='1Hour',
lookback_days=1460, # 4 years
use_cache=True
)
# Expected: ~10,000 1Hour bars (6.5 hours/day * ~252 days/year * 4 years)
print(f'Fetched {len(df):,} bars')
3. API Key Environment Variables
# Set in environment or notebook
import os
os.environ['APCA_API_KEY_ID'] = 'your_key_id'
os.environ['APCA_API_SECRET_KEY'] = 'your_secret_key'
# Or use keys file
os.environ['ALPACA_KEYS_FILE'] = 'API_key.txt'
4. Cache Configuration for Large Data
# Google Drive cache for Colab
DRIVE_DATA_DIR = '/content/drive/MyDrive/Colab_Projects/training_data'
SELECTION_CACHE_EXPIRY_DAYS = 3 # Shorter for fresh rankings
TRAINING_CACHE_EXPIRY_DAYS = 7 # Longer for stability
# Save 4 years of data to cache
save_to_cache(symbol, df, timeframe, lookback_days=1460)
5. Validate Data Availability
# Check date range
print(f'Date range: {df.index[0]} to {df.index[-1]}')
# Expected for 4 years:
# Start: ~2021-01-XX
# End: 2024-12-XX
# Verify bar count
expected_bars = 6.5 * 252 * 4 # ~6,552 for 1Hour
print(f'Expected ~{expected_bars:,.0f} bars, got {len(df):,}')
Data Source Comparison
| Feature |
yfinance (Free) |
Alpaca Free |
Algo Trader Plus |
| Equity History |
5+ years |
~2 years |
5+ years |
| Crypto History |
Limited |
~2 years |
Extended |
| Intraday Bars |
Limited |
1Hour+ |
1Min+ |
| API Rate Limit |
Low |
200/min |
Higher |
| Data Quality |
Inconsistent |
Clean |
Clean |
| Zero-Volume Bars |
Common |
Rare |
Rare |
| Cost |
Free |
Free |
$99/month |
Failed Attempts (Critical)
| Attempt |
Why it Failed |
Lesson Learned |
| yfinance for crypto training |
Zero-volume bars, gaps |
Alpaca required for crypto |
| 5 years lookback |
Some symbols have <5 years |
4 years is safe for most |
| No cache for 4-year data |
API timeouts, rate limits |
Always cache large fetches |
| Assuming all symbols available |
Some delisted/missing |
Check bar count after fetch |
Final Parameters
# Recommended settings for Algo Trader Plus
lookback_days: 1460 # 4 years
min_bars_required: 500 # For selection
min_bars_training: 2000 # For training
cache_expiry_days: 7 # Weekly refresh
timeframe: '1Hour' # Base timeframe
# Expected bar counts (1Hour, equities)
1_year: ~1,638 bars
2_years: ~3,276 bars
4_years: ~6,552 bars
5_years: ~8,190 bars
Key Insights
- 4 years is the sweet spot: Enough data for robust training, available for most symbols
- yfinance is unreliable for crypto: Zero-volume bars break training
- Cache aggressively: 4-year fetches are slow without caching
- Not all symbols have full history: Check bar count after fetch
- Cost-benefit: $99/month for quality data saves debugging time
Subscription Tiers
| Tier |
Historical Data |
Best For |
| Free |
~2 years |
Testing, paper trading |
| Algo Trader Plus |
5+ years |
Production training |
| Market Data Pro |
Real-time |
Live trading optimization |
References
notebooks/training.ipynb: LOOKBACK_DAYS configuration
alpaca_trading/data/fetcher.py: DataFetcher class
- Alpaca Subscription Plans