name	backend-dev-guidelines
description	Comprehensive backend development guide for the Cheerful codebase. Covers Temporal.io workflows, SQLAlchemy 2.0 patterns, Gmail/Sheets APIs, FastAPI, Supabase, testing with pytest, and Fly.io deployment. Use when working with backend code, debugging issues, implementing features, writing database queries, building API endpoints or durable workflows, processing email, or deploying.
version	1.0.0
license	MIT
allowed-tools	Read, Write, Edit, Bash, Grep, Glob

Backend Development Guide for Cheerful

This skill provides critical gotchas, pre-flight checklists, and quick fixes for backend development in the Cheerful codebase. The system implements an email campaign management platform using FastAPI, Temporal.io for durable workflows, SQLAlchemy 2.0 Core for database access, Gmail API for email processing, and Supabase for PostgreSQL, auth, and storage.

When to Use This Skill

Working with Temporal.io workflows or activities
Writing SQLAlchemy database queries or models
Implementing Gmail API integrations
Working with Google Sheets data
Writing or debugging tests
Deploying to Fly.io (staging/production)
Building FastAPI endpoints
Configuring Supabase (PostgreSQL, Storage, Auth with RLS)

Critical Gotchas (Top 10)

1. Temporal Activities Retry INDEFINITELY by Default

Activities retry forever without explicit limits. ALWAYS set retry_policy=RetryPolicy(maximum_attempts=N).

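from datetime import timedelta
from temporalio.common import RetryPolicy

# BAD - No retry_policy, so the default (unlimited attempts) applies
await workflow.execute_activity(
    my_activity,
    params,
    start_to_close_timeout=timedelta(seconds=30),
)

# GOOD - Explicit retry limit
await workflow.execute_activity(
    my_activity,
    params,
    start_to_close_timeout=timedelta(seconds=30),
    retry_policy=RetryPolicy(maximum_attempts=3),
)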

See {baseDir}/references/temporal.md for retry patterns.
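
To make the effect of a bounded policy concrete, the sketch below computes the wait before each retry from the four RetryPolicy fields (initial_interval, backoff_coefficient, maximum_interval, maximum_attempts). It is plain Python that imitates the backoff schedule, not SDK code, and retry_delays is a hypothetical helper name. The defaults are meant to mirror Temporal's documented defaults: 1 second initial interval, coefficient 2.0, and a cap of 100 times the initial interval.

```python
from datetime import timedelta
from typing import List, Optional


def retry_delays(
    maximum_attempts: int,
    initial_interval: timedelta = timedelta(seconds=1),
    backoff_coefficient: float = 2.0,
    maximum_interval: Optional[timedelta] = None,
) -> List[timedelta]:
    """Return the wait before each retry under a bounded retry policy.

    Hypothetical helper: it only imitates capped exponential backoff and
    never talks to Temporal.
    """
    if maximum_attempts < 1:
        raise ValueError("maximum_attempts must be >= 1 for a bounded policy")
    cap = maximum_interval or initial_interval * 100
    delays = []
    # Attempt 1 runs immediately; every later attempt waits first.
    for retry_index in range(maximum_attempts - 1):
        delay = initial_interval * (backoff_coefficient ** retry_index)
        delays.append(min(delay, cap))
    return delays


delays = retry_delays(maximum_attempts=3)
print([d.total_seconds() for d in delays])  # [1.0, 2.0]
```

With maximum_attempts=3 the activity runs at most three times, waiting about 1 s and then 2 s between attempts. With maximum_attempts=1 the list is empty, which means no retries at all.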

2. Database Sessions Must Be Short-Lived

NEVER hold a database session open during LLM calls, API requests, or other long operations. Extract the scalars you need before the session closes.

# BAD - Session held during 30+ second LLM call
with get_db_session_context() as db:
    campaign = CampaignRepository(db).get_by_id(id)
    llm_result = llm_service.generate(campaign.goal)  # Session stays open!

# GOOD - Extract scalar, close session, then call LLM
with get_db_session_context() as db:
    campaign = CampaignRepository(db).get_by_id(id)
    campaign__goal = campaign.goal  # Extract scalar
# Session closed

llm_result = llm_service.generate(campaign__goal)  # No session held

See {baseDir}/references/sqlalchemy.md for session patterns.

3. gmail_thread_id is NOT Globally Unique

gmail_thread_id is unique per Gmail account, NOT globally. Always scope by gmail_account_id.

# BAD - Can return wrong thread from different account
stmt = select(GmailThreadState).where(
    GmailThreadState.gmail_thread_id == thread_id
)

# GOOD - Scoped to account
stmt = select(GmailThreadState).where(
    GmailThreadState.gmail_account_id == account_id,
    GmailThreadState.gmail_thread_id == thread_id
)

See {baseDir}/references/gmail-api.md for Gmail patterns.

4. Gmail API Requires format='raw'

Fetch messages with format='raw' so the response carries the complete RFC 2822 message, base64url-encoded in its raw field, which the stdlib email package can parse.

# GOOD - Returns raw RFC 2822 email
response = (
    self.service.users()
    .messages()
    .get(userId="me", id=message_id, format="raw")
    .execute()
)

A missing or wrong format parameter causes a ValueError in create_gmail_message_from_raw().

See {baseDir}/references/gmail-api.md for details.
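
As a rough illustration of what happens to a format='raw' response, the sketch below decodes the base64url raw field and parses it with the stdlib email package. parse_raw_gmail_message is a hypothetical stand-in, not the codebase's create_gmail_message_from_raw(), and the sample response is built locally instead of coming from the Gmail API.

```python
import base64
from email import message_from_bytes, policy
from email.message import EmailMessage


def parse_raw_gmail_message(response: dict) -> EmailMessage:
    """Decode the base64url 'raw' field and parse it as an RFC 2822 message."""
    raw = response.get("raw")
    if raw is None:
        # Other formats ("full", "metadata", "minimal") carry no 'raw' field.
        raise ValueError("response has no 'raw' field; was format='raw' used?")
    # Pad defensively in case the encoded string arrives without '=' padding.
    padded = raw + "=" * (-len(raw) % 4)
    return message_from_bytes(base64.urlsafe_b64decode(padded), policy=policy.default)


# Locally built stand-in for an API response (assumption, not real API output).
sample = (
    b"From: a@example.com\r\n"
    b"To: b@example.com\r\n"
    b"Subject: Hello\r\n"
    b"\r\n"
    b"Body text\r\n"
)
response = {"id": "msg-1", "raw": base64.urlsafe_b64encode(sample).decode().rstrip("=")}

message = parse_raw_gmail_message(response)
print(message["Subject"])  # Hello
```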

5. Tests Cannot Run in Parallel

Tests share one database and one storage bucket, and each test drops all tables, so pytest-xdist is not supported.

# GOOD - Sequential execution
(cd apps/backend && uv run pytest)

# BAD - Will fail with conflicts
(cd apps/backend && uv run pytest -n auto)

See {baseDir}/references/testing.md for test setup.

6. Graceful Shutdown Timeout Must Be Less Than Kill Timeout

Worker graceful_shutdown_timeout (4min 30sec) must be less than Fly.io kill_timeout (5min) or workers will be SIGKILL'ed mid-operation.

Current configuration: Correctly set at 4.5min < 5min.

See {baseDir}/references/fly-deployment.md for deployment config.
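
One way to keep the two values from drifting apart is a small guard that could run at worker startup or in CI. This is a sketch under assumptions: the constants repeat the values quoted above, and assert_shutdown_fits is a hypothetical name that does not exist in the codebase.

```python
from datetime import timedelta

GRACEFUL_SHUTDOWN_TIMEOUT = timedelta(minutes=4, seconds=30)  # worker setting
FLY_KILL_TIMEOUT = timedelta(minutes=5)  # kill_timeout from fly.toml


def assert_shutdown_fits(graceful: timedelta, kill: timedelta) -> timedelta:
    """Return the headroom; raise if graceful shutdown could outlast the kill."""
    if graceful >= kill:
        raise ValueError(
            f"graceful_shutdown_timeout ({graceful}) must be shorter than "
            f"kill_timeout ({kill}), otherwise workers get SIGKILLed mid-operation"
        )
    return kill - graceful


headroom = assert_shutdown_fits(GRACEFUL_SHUTDOWN_TIMEOUT, FLY_KILL_TIMEOUT)
print(headroom.total_seconds())  # 30.0
```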

7. USE_MOCK_WORKFLOW_TOOLS Must Be False in Production

Production MUST set USE_MOCK_WORKFLOW_TOOLS=false in .production.env or real API calls won't happen. Default is true (mock mode).

# In .production.env
USE_MOCK_WORKFLOW_TOOLS=false

See {baseDir}/references/fly-deployment.md for secrets management.
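
Because the default is mock mode, a fail-fast check at startup can catch a missing or mistyped value. The sketch below is only one possible shape: the ENVIRONMENT variable name, both function names, and the accepted spellings of true/false are assumptions, and the real settings loader may parse the flag differently.

```python
from typing import Mapping

_TRUE_VALUES = {"1", "true", "yes", "on"}
_FALSE_VALUES = {"0", "false", "no", "off"}


def use_mock_workflow_tools(env: Mapping[str, str]) -> bool:
    """Parse USE_MOCK_WORKFLOW_TOOLS; unset or empty falls back to mock mode."""
    raw = (env.get("USE_MOCK_WORKFLOW_TOOLS") or "true").strip().lower()
    if raw in _TRUE_VALUES:
        return True
    if raw in _FALSE_VALUES:
        return False
    raise ValueError(f"unrecognized USE_MOCK_WORKFLOW_TOOLS value: {raw!r}")


def check_production_env(env: Mapping[str, str]) -> None:
    """Refuse to start in production while mock tools are enabled."""
    # ENVIRONMENT is a hypothetical variable name used for illustration.
    if env.get("ENVIRONMENT") == "production" and use_mock_workflow_tools(env):
        raise RuntimeError("USE_MOCK_WORKFLOW_TOOLS must be false in production")


# Passes silently: production with mocks explicitly disabled.
check_production_env({"ENVIRONMENT": "production", "USE_MOCK_WORKFLOW_TOOLS": "false"})
```

In a real process the mapping would be os.environ; passing it in explicitly keeps the check easy to test.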

8. Use workflow.logger in Workflows, structlog in Activities

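CRITICAL: Use workflow.logger in workflows, NOT structlog.get_logger(). workflow.logger is replay-aware, so log lines are not emitted again when the workflow history is replayed.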

# In workflows - GOOD
workflow.logger.info(f"Processing: {status}")

# In activities - GOOD
log = structlog.get_logger()
log.info("Activity started", state_id=params.state__id)

See {baseDir}/references/temporal.md for logging patterns.

9. Database Sessions via Context Manager, NOT Depends()

DB sessions are opened with the get_db_session_context() context manager, not FastAPI's Depends(). The context manager commits on success and rolls back on exception.

# GOOD - Context manager pattern
with get_db_session_context() as db:
    repo = CampaignRepository(db)
    campaign = repo.get_by_id(id)
    # Auto-commits here

# BAD - Don't use Depends() for DB sessions
async def endpoint(db: Session = Depends(get_db)):
    ...  # Not the pattern used in this codebase

See {baseDir}/references/fastapi.md for API patterns.

10. SQLAlchemy Models Cannot Be Passed Through Temporal

Never pass SQLAlchemy models as Temporal workflow/activity parameters. Use Pydantic models or extract scalars.

# Timeouts and retry_policy omitted for brevity

# BAD - SQLAlchemy model through Temporal
await workflow.execute_activity(process_campaign, campaign)  # campaign is SQLAlchemy model

# GOOD - Pydantic model (CampaignDto needs from_attributes=True to read an ORM object)
campaign_dto = CampaignDto.model_validate(campaign)
await workflow.execute_activity(process_campaign, campaign_dto)

# ALSO GOOD - Extract scalars; multiple arguments must be passed via args=[...]
campaign__id = campaign.id
campaign__goal = campaign.goal
await workflow.execute_activity(process_campaign, args=[campaign__id, campaign__goal])

See {baseDir}/references/temporal.md and {baseDir}/references/sqlalchemy.md.
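
The DTO conversion can be sketched as follows, assuming Pydantic v2 (which model_validate implies). The CampaignDto fields and the FakeCampaignRow stand-in are hypothetical; the real Campaign model and DTO will have different columns.

```python
import uuid
from dataclasses import dataclass

from pydantic import BaseModel, ConfigDict


class CampaignDto(BaseModel):
    # from_attributes lets model_validate() read attributes off an arbitrary
    # object, such as an ORM row, instead of requiring a dict.
    model_config = ConfigDict(from_attributes=True)

    id: uuid.UUID
    goal: str


@dataclass
class FakeCampaignRow:
    """Stand-in for the SQLAlchemy Campaign model (hypothetical fields)."""

    id: uuid.UUID
    goal: str
    internal_notes: str


row = FakeCampaignRow(id=uuid.uuid4(), goal="Book 10 demos", internal_notes="private")

dto = CampaignDto.model_validate(row)  # copies only the declared fields
payload = dto.model_dump_json()  # plain JSON, safe to hand to a serializer
restored = CampaignDto.model_validate_json(payload)

print(restored == dto)  # True
```

Only the fields declared on the DTO cross the boundary, so nothing tied to a database session ends up in the serialized payload.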

Pre-Flight Checklists

Before Writing Temporal Workflow

Workflow must be deterministic (no random, no datetime.now(), no direct I/O; use workflow.random() and workflow.now() instead)
All I/O operations in activities, not workflows
Set retry_policy on ALL activity executions
Use Pydantic models for inputs/outputs (never SQLAlchemy models)
Use workflow.logger for logging (not structlog)
Set start_to_close_timeout or schedule_to_close_timeout on every activity execution (the SDK requires one of them)
Activities must be idempotent (safe to retry)

Before Database Operations

Use SQLAlchemy 2.0-style select() statements (not legacy ORM methods like .query())
Extract scalars before long operations: campaign__goal = campaign.goal
Use with get_db_session_context() for auto-commit/rollback
Never pass SQLAlchemy models through Temporal
Use on_conflict_do_nothing() for idempotent inserts
Never hold session during LLM/API calls
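
The idempotent-insert item can be illustrated with the underlying SQL. To stay self-contained, this sketch uses the stdlib sqlite3 module instead of PostgreSQL and SQLAlchemy's on_conflict_do_nothing(); the gmail_message table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE gmail_message (
        gmail_account_id TEXT NOT NULL,
        gmail_message_id TEXT NOT NULL,
        subject TEXT,
        PRIMARY KEY (gmail_account_id, gmail_message_id)
    )
    """
)


def insert_message(conn: sqlite3.Connection, account_id: str, message_id: str, subject: str) -> None:
    """Insert a row; a repeat of the same key is silently skipped."""
    conn.execute(
        """
        INSERT INTO gmail_message (gmail_account_id, gmail_message_id, subject)
        VALUES (?, ?, ?)
        ON CONFLICT DO NOTHING
        """,
        (account_id, message_id, subject),
    )
    conn.commit()


def count_messages(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM gmail_message").fetchone()[0]


# A retried activity runs the same insert twice without raising or duplicating.
insert_message(conn, "account-a", "msg-1", "Hello")
insert_message(conn, "account-a", "msg-1", "Hello")
print(count_messages(conn))  # 1

# The same message id under another account is a different row.
insert_message(conn, "account-b", "msg-1", "Hello")
print(count_messages(conn))  # 2
```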

Before Deployment

Verify USE_MOCK_WORKFLOW_TOOLS=false in .production.env
Check graceful shutdown timeout < kill timeout (4.5min < 5min)
Verify all required secrets in .production.env or .staging.env
Confirm encryption keys are exact hex length (64 and 32 chars)
Test health check endpoint returns correct environment
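
The key-length item is easy to automate. The sketch below checks that a value is exactly the expected number of hex characters (64 characters decode to 32 bytes, 32 characters to 16 bytes). validate_hex_key and the variable names passed to it are hypothetical; the guide does not name the actual settings.

```python
import secrets
import string


def validate_hex_key(name: str, value: str, expected_chars: int) -> bytes:
    """Return the decoded key, or raise if the length or alphabet is wrong."""
    if len(value) != expected_chars:
        raise ValueError(f"{name}: expected {expected_chars} hex chars, got {len(value)}")
    if any(ch not in string.hexdigits for ch in value):
        # Catches stray whitespace such as a trailing newline from a copy-paste.
        raise ValueError(f"{name}: contains non-hex characters")
    return bytes.fromhex(value)


# Hypothetical variable names, generated here only for the demonstration.
long_key = validate_hex_key("EXAMPLE_KEY_64", secrets.token_hex(32), 64)
short_key = validate_hex_key("EXAMPLE_KEY_32", secrets.token_hex(16), 32)
print(len(long_key), len(short_key))  # 32 16
```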

Before Gmail API Integration

Always scope gmail_thread_id queries by gmail_account_id
Use format='raw' for message fetching
Implement Gmail email normalization (dots, plus addressing, googlemail.com)
Handle Optional returns (drafts return None from processor)
Use idempotent operations: ON CONFLICT DO NOTHING
Test with actual API call (not just credentials.refresh())
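
The normalization item can be sketched as below. Gmail ignores dots in the local part, ignores everything after a plus sign, and treats googlemail.com as an alias of gmail.com. normalize_gmail_address is a hypothetical name, and leaving non-Gmail domains untouched apart from lowercasing is an assumption about the desired behavior.

```python
GMAIL_DOMAINS = {"gmail.com", "googlemail.com"}


def normalize_gmail_address(address: str) -> str:
    """Map equivalent Gmail spellings of an address to one canonical form."""
    local, sep, domain = address.strip().lower().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    if domain in GMAIL_DOMAINS:
        # Drop the "+tag" suffix first, then the dots Gmail ignores.
        local = local.split("+", 1)[0].replace(".", "")
        if not local:
            raise ValueError(f"empty local part after normalization: {address!r}")
        domain = "gmail.com"
    return f"{local}@{domain}"


print(normalize_gmail_address("Jane.Doe+news@GoogleMail.com"))  # janedoe@gmail.com
print(normalize_gmail_address("Jane.Doe@example.com"))  # jane.doe@example.com
```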

Quick Fixes for Common Issues

"Session is already closed" Error

# WRONG - Session closed after context exit
with get_db_session_context() as db:
    campaign = CampaignRepository(db).get_by_id(id)
# Session closed!
llm_result = llm_service.generate(campaign.goal)  # Error: Session closed

# CORRECT - Extract scalar before session closes
with get_db_session_context() as db:
    campaign = CampaignRepository(db).get_by_id(id)
    campaign__goal = campaign.goal  # Extract to local variable
# Session closed
llm_result = llm_service.generate(campaign__goal)  # Works!

Temporal Activity Retries Forever

# WRONG - No retry limit, will retry forever
await workflow.execute_activity(
    check_is_latest_activity,
    state_id,
    start_to_close_timeout=timedelta(seconds=30),
)

# CORRECT - Explicit retry limit
await workflow.execute_activity(
    check_is_latest_activity,
    state_id,
    start_to_close_timeout=timedelta(seconds=30),
    retry_policy=RetryPolicy(maximum_attempts=1),
)

Wrong Gmail Thread Returned

# WRONG - gmail_thread_id not scoped to account
def get_latest_by_gmail_thread_id(self, gmail_thread_id: str):
    stmt = select(GmailThreadState).where(
        GmailThreadState.gmail_thread_id == gmail_thread_id
    )
    return self.db.execute(stmt).scalar_one_or_none()

# CORRECT - Scope by gmail_account_id
def get_latest_by_gmail_thread_id(
    self, gmail_account_id: uuid.UUID, gmail_thread_id: str
):
    stmt = select(GmailThreadState).where(
        GmailThreadState.gmail_account_id == gmail_account_id,
        GmailThreadState.gmail_thread_id == gmail_thread_id
    )
    return self.db.execute(stmt).scalar_one_or_none()

Tests Failing Due to Missing auth.users

# WRONG - Insert UserGmailAccount without auth.users entry
with get_db_session_context() as db:
    account = UserGmailAccount(user_id=user_id, ...)
    db.add(account)
    # Fails: Foreign key constraint on auth.users

# CORRECT - Insert into auth.users first
with engine.connect() as conn:
    conn.execute(text("""
        INSERT INTO auth.users (id, email)
        VALUES (:user_id, :email)
        ON CONFLICT DO NOTHING
    """), {"user_id": str(user_id), "email": test_email})
    conn.commit()

with get_db_session_context() as db:
    account = UserGmailAccount(user_id=user_id, ...)
    db.add(account)

Mock Tools Enabled in Production

# Check if mock tools are enabled
fly ssh console -a prd-cheerful
echo $USE_MOCK_WORKFLOW_TOOLS  # Should output: false

# If it outputs "true" or empty, update secrets:
# In .production.env:
USE_MOCK_WORKFLOW_TOOLS=false

# Apply secrets (will trigger deployment):
flyctl secrets import --app prd-cheerful < ./infra/prd/.production.env

Core Patterns by Technology

Temporal.io

Durable workflow orchestration for long-running operations (email processing, LLM calls).

Key Files:

Worker setup: apps/backend/src/temporal/worker.py
Workflows: apps/backend/src/temporal/workflow/ (all files end with _workflow.py)
Activities: apps/backend/src/temporal/activity/ (all files end with _activity.py)
Models: apps/backend/src/models/temporal/ (Pydantic only)

Full guide: {baseDir}/references/temporal.md

SQLAlchemy 2.0

Database access using SQLAlchemy 2.0 Core (not ORM). Short-lived sessions with auto-commit/rollback.

Key Files:

Database config: apps/backend/src/core/database.py
Models: apps/backend/src/models/database/
Repositories: apps/backend/src/repositories/

Full guide: {baseDir}/references/sqlalchemy.md

Gmail API

Email fetching, thread reconstruction, and state management with event-sourced versioning.

Key Files:

Gmail service: apps/backend/src/services/external/gmail.py
Email processor: apps/backend/src/services/email/processor.py
Thread state repository: apps/backend/src/repositories/gmail_thread_state.py

Full guide: {baseDir}/references/gmail-api.md

Google Sheets

Recipient data import and metrics export with rate-limit protection via separate Temporal task queue.

Key Files:

Sheets service: apps/backend/src/services/external/gsheet.py
Metrics activity: apps/backend/src/temporal/activity/thread_metrics_activity.py

Full guide: {baseDir}/references/google-sheets.md

Testing

Pytest with shared database (no parallel execution). Integration tests use real services, unit tests use mocks.

Key Files:

Test config: apps/backend/tests/conftest.py
Test directory: apps/backend/tests/

Full guide: {baseDir}/references/testing.md

Fly.io Deployment

Process groups (web + worker) with shared secrets. Staging and production environments.

Key Files:

Staging config: infra/stg/fly.toml
Production config: infra/prd/fly.toml
Deploy scripts: infra/stg/deploy.sh, infra/prd/deploy.sh

Full guide: {baseDir}/references/fly-deployment.md

FastAPI

API routes with JWT authentication, context-managed DB sessions, and Pydantic request/response models.

Key Files:

Main app: apps/backend/main.py
Routes: apps/backend/src/api/route/
Auth dependencies: apps/backend/src/api/dependencies/auth.py

Full guide: {baseDir}/references/fastapi.md

Supabase

PostgreSQL with Row Level Security (RLS), Storage for email content, and Auth with JWT verification.

Key Files:

Database config: apps/backend/src/core/database.py
Storage service: apps/backend/src/services/storage/storage.py
Migrations: supabase/migrations/

Full guide: {baseDir}/references/supabase.md

Architecture Insights

For comprehensive architectural patterns see:

Dual High Water Mark Pattern: {baseDir}/references/architecture-patterns.md
Event-sourced state management: {baseDir}/references/architecture-patterns.md
Idempotent operations: {baseDir}/references/architecture-patterns.md
AI Features as Pure Functions: {baseDir}/references/architecture-patterns.md
DTO Pattern: {baseDir}/references/sqlalchemy.md
Service Organization: {baseDir}/references/architecture-patterns.md

Testing Commands

# Prerequisites: Start local dev cluster
./infra/dev.sh start --fresh

# Check services are running
docker ps

# Run all tests (from repo root)
(cd apps/backend && uv run pytest)

# Run specific test file
(cd apps/backend && uv run pytest tests/repositories/test_campaign.py)

# Run specific test function
(cd apps/backend && uv run pytest tests/repositories/test_campaign.py::TestCampaignRepository::test_empty_database)

# Skip integration tests (faster)
(cd apps/backend && uv run pytest -m "not integration")

# Only integration tests (requires API keys)
(cd apps/backend && uv run pytest -m integration)

# Verbose output
(cd apps/backend && uv run pytest -v)

Deployment Commands

# Check deployment status
fly status -a prd-cheerful
fly status -a stg-cheerful

# View logs
fly logs -a prd-cheerful                # All processes
fly logs -a prd-cheerful --process web  # Web only
fly logs -a prd-cheerful --process worker  # Worker only

# Deploy staging
./infra/stg/deploy.sh

# Deploy production
./infra/prd/deploy.sh

# Stage secrets without redeploying (they take effect on the next deploy)
flyctl secrets import --stage --app prd-cheerful < ./infra/prd/.production.env

# Apply secrets (triggers deployment!)
flyctl secrets import --app prd-cheerful < ./infra/prd/.production.env

# List current secrets
fly secrets list -a prd-cheerful

# Scale processes
fly scale count web=2 worker=3 -a prd-cheerful

# SSH into machine
fly ssh console -a prd-cheerful

# Health check
curl https://prd-cheerful.fly.dev/health
