| name | tzurot-observability |
| description | Use when adding logging, debugging production issues, checking service health, or performing routine operations (adding personalities, database queries). Covers structured logging, correlation IDs, Railway log analysis, and common operations. |
| lastUpdated | 2025-12-20 |
Tzurot v3 Observability & Operations
Use this skill when: Adding logging, debugging production issues, checking health, performing database/Redis operations, or adding new personalities.
Quick Reference
# Check service health
curl https://api-gateway-development-83e8.up.railway.app/health
# View logs
railway logs --service api-gateway
railway logs --service ai-worker | grep "ERROR"
# Find request across services
railway logs | grep "requestId\":\"abc-123"
// Structured logging pattern
logger.info({ requestId, userId, personalityId }, 'Processing request');
logger.error({ err: error, context: 'additional data' }, 'Operation failed');
Structured Logging
The Golden Pattern
logger.info(
{ contextObject }, // First param: structured data
'Human readable message' // Second param: message string
);
// ✅ GOOD
logger.info({ personalityId, model }, 'Loaded personality');
logger.error({ err: error, requestId }, 'Failed to process');
// ❌ BAD - String interpolation loses structure
logger.info(`Loaded personality ${personalityId}`);
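To see why the object-first form matters, the sketch below uses a stand-in `formatLine` function (hypothetical, not Pino's implementation) that merges the context object and the message into one JSON line, roughly as a structured logger does. Fields stay individually queryable, while an interpolated string leaves only free text to search.

```typescript
// Stand-in for a structured logger's output step (hypothetical; not Pino itself).
type Fields = Record<string, unknown>;

function formatLine(level: string, fields: Fields, msg: string): string {
  // Context fields become top-level JSON keys next to the message.
  return JSON.stringify({ level, ...fields, msg });
}

const personalityId = 'p-42';

const structured = formatLine('info', { personalityId, model: 'example-model' }, 'Loaded personality');
const interpolated = formatLine('info', {}, `Loaded personality ${personalityId}`);

// The structured line can be filtered on an exact field value...
const parsedStructured = JSON.parse(structured) as Fields;
// ...while the interpolated one only carries the ID inside the message text.
const parsedInterpolated = JSON.parse(interpolated) as Fields;
```

With the structured form, a query such as `grep '"personalityId":"p-42"'` matches exactly one field instead of any message that happens to contain the ID.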
Log Levels
| Level | When to Use |
|---|---|
| error | Errors needing attention |
| warn | Potential issues, retries |
| info | Normal operations |
| debug | Debugging details |
| trace | Very detailed (disabled in prod) |
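Level filtering can be sketched as a numeric comparison. The numbers below are Pino's default level values; `isEnabled` is a hypothetical helper for illustration, and the minimum level actually configured for each Tzurot service is not stated here.

```typescript
// Pino's default numeric values for the levels listed above.
const LEVELS = { trace: 10, debug: 20, info: 30, warn: 40, error: 50 } as const;
type Level = keyof typeof LEVELS;

// Hypothetical helper: a message is emitted when its level is at or above the configured minimum.
function isEnabled(messageLevel: Level, configuredLevel: Level): boolean {
  return LEVELS[messageLevel] >= LEVELS[configuredLevel];
}

// With the minimum set to 'info', debug and trace output is dropped.
const emitted = (Object.keys(LEVELS) as Level[]).filter((l) => isEnabled(l, 'info'));
```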
Error Logging
// Use 'err' key for Pino's special error serialization
logger.error({ err: error, requestId, userId }, 'Request failed');
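The `err` key matters because an `Error`'s `message` and `stack` are non-enumerable, so plain JSON serialization drops them. The sketch below shows the problem and a simplified serializer in the spirit of what Pino applies to `err`; `serializeError` is a hypothetical stand-in, not Pino's actual code.

```typescript
// Simplified stand-in for an error serializer (hypothetical helper).
interface SerializedError {
  type: string;
  message: string;
  stack?: string;
}

function serializeError(err: Error): SerializedError {
  return { type: err.name, message: err.message, stack: err.stack };
}

const failure = new TypeError('Connection refused');

// Naive serialization loses the message and stack.
const naive = JSON.stringify({ error: failure });

// Explicit serialization keeps the type, message, and stack.
const serialized = JSON.stringify({ err: serializeError(failure) });
```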
Privacy Rules
NEVER log: API keys, tokens, passwords, email, IP, message content, DM content
SAFE to log: User IDs, channel IDs, counts, durations, error types
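One way to enforce these rules is to strip forbidden keys before a context object reaches the logger. This is a minimal sketch: the key list and the `redactFields` helper are assumptions for illustration, not part of the Tzurot codebase, and the key names should be adjusted to the fields the services actually use. Pino also offers a built-in `redact` option that may be a better fit.

```typescript
// Hypothetical deny-list derived from the "NEVER log" rule; key names are assumptions.
const FORBIDDEN_KEYS = new Set(['apikey', 'token', 'password', 'email', 'ip', 'content']);

// Shallow redaction only: nested objects are passed through unchanged.
function redactFields(fields: Record<string, unknown>): Record<string, unknown> {
  const safe: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(fields)) {
    // Compare case-insensitively so `apiKey` and `APIKEY` are both caught.
    safe[key] = FORBIDDEN_KEYS.has(key.toLowerCase()) ? '[REDACTED]' : value;
  }
  return safe;
}

const safeContext = redactFields({
  userId: 'u-1',
  channelId: 'c-9',
  token: 'secret-value',
  content: 'private message text',
});
```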
Correlation IDs
Track requests across services with requestId:
// bot-client: Generate and include in request
import { randomUUID } from 'node:crypto';
const requestId = randomUUID();
await fetch(url, { headers: { 'X-Request-ID': requestId } });
// All services: Include in logs
logger.info({ requestId, jobId }, 'Processing');
// Search across services
railway logs | grep "requestId\":\"abc-123"
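On the receiving side, a service needs to reuse the incoming ID rather than mint a new one, or the trail breaks. A minimal sketch, assuming headers arrive as a plain lowercase-keyed object (as in Node's `http` module); `resolveRequestId` and the injected `generate` function are hypothetical names.

```typescript
type HeaderMap = Record<string, string | string[] | undefined>;

// Hypothetical helper: reuse the caller's X-Request-ID when present, otherwise create one.
function resolveRequestId(headers: HeaderMap, generate: () => string): string {
  const incoming = headers['x-request-id'];
  const value = Array.isArray(incoming) ? incoming[0] : incoming;
  return value !== undefined && value.trim() !== '' ? value.trim() : generate();
}

// `generate` would typically be randomUUID from node:crypto; a fixed value keeps the sketch deterministic.
const fallbackId = (): string => 'generated-id';

const propagated = resolveRequestId({ 'x-request-id': 'abc-123' }, fallbackId);
const minted = resolveRequestId({}, fallbackId);
```

Injecting the generator keeps the helper easy to test and leaves the choice of ID format to the caller.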
Health & Metrics
Health endpoint: GET /health
curl https://api-gateway-development-83e8.up.railway.app/health
Check all services:
railway status
railway logs --service api-gateway --tail 50
Common Operations
Adding a Personality
- Create personalities/name.json:
{
"name": "PersonalityName",
"systemPrompt": "Your description...",
"model": "anthropic/claude-sonnet-4.5",
"temperature": 0.8
}
- Commit and push (Railway auto-deploys)
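Because a push triggers a deploy, it can help to validate the file before committing. The sketch below checks only the four fields shown above; the `validatePersonality` helper and the 0 to 2 temperature range are assumptions, and the real loader may require or allow other fields.

```typescript
interface PersonalityConfig {
  name: string;
  systemPrompt: string;
  model: string;
  temperature: number;
}

// Hypothetical validator: returns a list of problems, empty when the config looks usable.
function validatePersonality(raw: string): string[] {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return ['File is not valid JSON'];
  }
  if (typeof data !== 'object' || data === null || Array.isArray(data)) {
    return ['Top-level value must be an object'];
  }
  const config = data as Partial<Record<keyof PersonalityConfig, unknown>>;
  const problems: string[] = [];
  for (const key of ['name', 'systemPrompt', 'model'] as const) {
    const value = config[key];
    if (typeof value !== 'string' || value.trim() === '') {
      problems.push(`"${key}" must be a non-empty string`);
    }
  }
  const temperature = config.temperature;
  // Assumed range; adjust to what the target model accepts.
  if (typeof temperature !== 'number' || temperature < 0 || temperature > 2) {
    problems.push('"temperature" must be a number between 0 and 2');
  }
  return problems;
}

const okProblems = validatePersonality(
  '{"name":"PersonalityName","systemPrompt":"Your description...","model":"anthropic/claude-sonnet-4.5","temperature":0.8}'
);
const badProblems = validatePersonality('{"name":"","temperature":"hot"}');
```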
Database Quick Commands
# Check migrations
railway run npx prisma migrate status
# Apply migrations
railway run npx prisma migrate deploy
# Direct database access
railway run psql
# Open Prisma Studio
npx prisma studio
Redis Queue Status
railway run redis-cli
LLEN bull:ai-generation:wait
LLEN bull:ai-generation:active
ZCARD bull:ai-generation:failed
Debugging Checklist
- Check logs: railway logs --service <name>
- Check health: curl .../health
- Check env vars: railway variables --service <name>
- Find errors: railway logs | grep '"level":"error"'
- Trace request: railway logs | grep "requestId\":\"..."
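Tracing one request can also be done programmatically once log lines have been captured. A minimal sketch, assuming each line is a JSON object with a `requestId` field and Pino's default epoch-millisecond `time` field; `traceRequest` is a hypothetical helper, and non-JSON lines are skipped.

```typescript
interface LogEntry {
  requestId?: string;
  time?: number;
  msg?: string;
  [key: string]: unknown;
}

function traceRequest(lines: string[], requestId: string): LogEntry[] {
  const matches: LogEntry[] = [];
  for (const line of lines) {
    try {
      const entry = JSON.parse(line) as LogEntry | null;
      if (entry !== null && typeof entry === 'object' && entry.requestId === requestId) {
        matches.push(entry);
      }
    } catch {
      // Skip plain-text lines such as build or startup output.
    }
  }
  // Order by timestamp so the hops read in sequence across services.
  return matches.sort((a, b) => (a.time ?? 0) - (b.time ?? 0));
}

const sampleLines = [
  '{"level":"info","time":2000,"requestId":"abc-123","msg":"Job completed"}',
  'Starting container',
  '{"level":"info","time":1000,"requestId":"abc-123","msg":"Processing request"}',
  '{"level":"info","time":1500,"requestId":"other","msg":"Processing request"}',
];
const trail = traceRequest(sampleLines, 'abc-123').map((e) => e.msg);
```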
Common Issues
| Symptom | Likely Cause | Check |
|---|---|---|
| Bot not responding | bot-client crashed | Logs, DISCORD_TOKEN |
| Slow responses | AI worker overload | ai-worker logs, queue depth |
| 500 errors | Database connection | DATABASE_URL, migrations |
| Jobs stuck | Redis connection | REDIS_URL, ai-worker status |
Anti-Patterns
// ❌ BAD - console.log
console.log('Processing');
// ✅ GOOD - structured logger
logger.info({ messageId }, 'Processing');
// ❌ BAD - logging in loops
for (const item of items) {
logger.info({ item }, 'Processing'); // Spams logs!
}
// ✅ GOOD - log summary
logger.info({ count: items.length }, 'Processing items');
// ❌ BAD - swallowing errors
try {
doSomething();
} catch (e) {
/* nothing */
}
// ✅ GOOD - log and handle
try {
doSomething();
} catch (e) {
logger.error({ err: e }, 'Failed');
throw e;
}
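The loop and error-handling patterns combine naturally: collect outcomes inside the loop and emit one summary afterwards. A minimal sketch, with a hypothetical `processAll` helper and an injected `log` callback standing in for `logger.info`.

```typescript
type LogFn = (fields: Record<string, unknown>, msg: string) => void;

function processAll<T>(items: T[], handle: (item: T) => void, log: LogFn): void {
  let succeeded = 0;
  const errorTypes: Record<string, number> = {};
  for (const item of items) {
    try {
      handle(item);
      succeeded += 1;
    } catch (e) {
      // Tally by error type instead of logging every failure individually.
      const type = e instanceof Error ? e.name : 'Unknown';
      errorTypes[type] = (errorTypes[type] ?? 0) + 1;
    }
  }
  // One log line regardless of how many items were processed.
  log({ count: items.length, succeeded, errorTypes }, 'Processed items');
}

const captured: Array<{ fields: Record<string, unknown>; msg: string }> = [];
processAll(
  [1, 2, 3, 4],
  (n) => {
    if (n % 2 === 0) throw new RangeError('even input');
  },
  (fields, msg) => {
    captured.push({ fields, msg });
  }
);
```

Failures are counted and reported rather than silently dropped; whether to rethrow after the summary depends on whether the caller can tolerate partial success.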
Railway Log Commands
# Recent logs
railway logs --service api-gateway
# Follow in real-time
railway logs --service ai-worker --tail
# Since time
railway logs --since 1h
# Find errors
railway logs | grep '"level":"error"'
# Count error types
railway logs --since 24h | grep '"level":"error"' | jq -r '.msg' | sort | uniq -c
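The same tally can be produced in code when `jq` is unavailable. A minimal sketch, assuming JSON log lines with string `level` and `msg` fields, as the grep patterns above imply; `countErrorMessages` is a hypothetical helper.

```typescript
function countErrorMessages(lines: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of lines) {
    let entry: unknown;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // Ignore non-JSON output.
    }
    if (typeof entry !== 'object' || entry === null) continue;
    const { level, msg } = entry as { level?: unknown; msg?: unknown };
    if (level === 'error' && typeof msg === 'string') {
      counts.set(msg, (counts.get(msg) ?? 0) + 1);
    }
  }
  return counts;
}

const errorCounts = countErrorMessages([
  '{"level":"error","msg":"Request failed"}',
  '{"level":"info","msg":"Processing request"}',
  '{"level":"error","msg":"Request failed"}',
  '{"level":"error","msg":"Operation failed"}',
  'not json',
]);
```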
Related Skills
- tzurot-deployment - Railway service management, rollbacks
- tzurot-db-vector - Database migrations, pgvector
- tzurot-security - Privacy in logging
- tzurot-async-flow - Job correlation, BullMQ
References
- Pino docs: https://getpino.io/
- Railway CLI: docs/reference/RAILWAY_CLI_REFERENCE.md
- Logger utility: packages/common-types/src/utils/logger.ts