| name | tzurot-db-vector |
| description | PostgreSQL and pgvector patterns for Tzurot v3 - Connection management, vector operations, migrations, and Railway-specific considerations. Use when working with database or memory retrieval. |
| lastUpdated | 2025-12-08 |
Tzurot v3 Database & Vector Memory
Use this skill when: Working with database queries, pgvector similarity search, migrations, or connection pool management.
Database Stack
- PostgreSQL 14+ (Railway managed)
- Prisma ORM (type-safe database access)
- pgvector extension (vector similarity search for personality memories)
Core Principles
- Connection pooling - Essential for Railway/containerized environments
- Typed queries - Use Prisma, avoid raw SQL where possible
- Migration-first - Schema changes via Prisma migrations
- Vector indexing - Use ivfflat for fast similarity search
- Cleanup policies - TTLs on conversation history, memory limits
Connection Management
Railway/Serverless Considerations
Problem: Railway containers restart frequently. Connection pools must handle:
- Cold starts
- Connection limits (Railway shared DB: 100 connections)
- Reconnection after network issues
Prisma Client Singleton
// packages/common-types/src/utils/prisma.ts
import { PrismaClient } from '@prisma/client';
let prisma: PrismaClient | undefined;
/**
* Get Prisma client singleton
* Reuses connection across requests
*/
export function getPrismaClient(): PrismaClient {
if (!prisma) {
prisma = new PrismaClient({
log: ['error', 'warn'],
// Connection pool configuration
datasources: {
db: {
url: process.env.DATABASE_URL,
},
},
});
// Graceful disconnect on process exit
process.on('beforeExit', async () => {
await prisma?.$disconnect();
});
}
return prisma;
}
Connection Pool Size
// prisma/schema.prisma
datasource db {
provider = "postgresql"
url = env("DATABASE_URL")
}
// NOTE: Prisma does NOT accept a pool-size option inside the datasource block.
// Set pool size via the connection_limit query parameter on DATABASE_URL:
// DATABASE_URL="postgresql://user:pass@host:5432/db?connection_limit=20"
// Railway shared DB limit: 100 connections
// Allocate ~20 per service (3 services = 60 total, leave headroom)
Usage in Services
// ✅ GOOD - Reuse singleton
import { getPrismaClient } from '@tzurot/common-types';
class PersonalityService {
constructor(private prisma: PrismaClient = getPrismaClient()) {}
async getPersonality(id: string): Promise<Personality | null> {
return this.prisma.personality.findUnique({ where: { id } });
}
}
// ❌ BAD - Creates new connection every time
class PersonalityService {
async getPersonality(id: string): Promise<Personality | null> {
const prisma = new PrismaClient(); // New connection!
const result = await prisma.personality.findUnique({ where: { id } });
await prisma.$disconnect();
return result;
}
}
pgvector Extension
Schema Setup
// prisma/schema.prisma
generator client {
provider = "prisma-client-js"
previewFeatures = ["postgresqlExtensions"]
}
datasource db {
provider = "postgresql"
url = env("DATABASE_URL")
extensions = [vector]
}
model Memory {
id String @id @default(uuid())
personalityId String
channelId String
content String
/// Vector embedding - managed outside Prisma (index created manually)
/// IMPORTANT: Prisma cannot manage vector indexes - see "pgvector Index Protection" section
embedding Unsupported("vector(1536)")? // OpenAI embedding dimension
metadata Json?
createdAt DateTime @default(now())
updatedAt DateTime @updatedAt
personality Personality @relation(fields: [personalityId], references: [id], onDelete: Cascade)
@@index([personalityId, channelId])
// NOTE: Vector index CANNOT be declared here - Prisma doesn't support it
// See "pgvector Index Protection" section for the required workflow
@@map("memories")
}
Vector Index (Managed Outside Prisma)
-- This index is created via manual migration and MUST NOT BE DROPPED
-- Prisma will try to drop it as "drift" - see "pgvector Index Protection" section
CREATE INDEX IF NOT EXISTS idx_memories_embedding
ON memories
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 50); -- MUST be 50 for consistency between dev/prod
pgvector Index Protection
The Problem
Prisma 7.0.0 does NOT natively support pgvector indexes. When you create a vector index manually:
- Prisma sees it in the database
- Prisma doesn't find it declared in the schema (because it CAN'T be declared)
- Prisma generates `DROP INDEX "idx_memories_embedding"` as "drift correction"
- Your similarity search becomes 100x slower
The Solution: --create-only Workflow
NEVER run prisma migrate dev directly. Always use this workflow:
# 1. Generate migration without applying
npx prisma migrate dev --create-only --name your_migration_name
# 2. MANUALLY REVIEW the generated SQL file
# Look for: DROP INDEX "idx_memories_embedding"
# If found, DELETE those lines from the migration file
# 3. Only then apply the migration
npx prisma migrate deploy
Safety Mechanisms
Pre-commit hook (hooks/pre-commit):
- Automatically checks staged migration files for `DROP INDEX.*idx_memories_embedding`
- Blocks commits that would drop the vector index
Manual check script:
pnpm check:migrations # Scans all migrations for dangerous operations
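The core of that check can be sketched in TypeScript, following the scripts/src/ conventions described later in this doc. This is a hypothetical sketch — the function names and exact regex are assumptions, not the actual script contents:

```typescript
// Hypothetical sketch of the check:migrations logic.
// The regex mirrors the pattern the pre-commit hook greps for.
import { readFileSync } from 'node:fs';

const DANGEROUS = /DROP INDEX.*idx_memories_embedding/;

/** True when the migration SQL does NOT drop the protected vector index. */
function migrationIsSafe(sql: string): boolean {
  return !DANGEROUS.test(sql);
}

/** Read a migration file and check it (path handling is illustrative). */
function checkMigrationFile(path: string): boolean {
  return migrationIsSafe(readFileSync(path, 'utf8'));
}
```

Note the check is deliberately string-based: a `CREATE INDEX ... idx_memories_embedding` statement passes, only a `DROP INDEX` mentioning the protected name is flagged.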
What If Prisma Already Dropped the Index?
If you accidentally ran prisma migrate dev and the index was dropped:
-- Recreate the index (run via psql or migration)
CREATE INDEX IF NOT EXISTS idx_memories_embedding
ON memories USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 50);
Why This Happens (Technical Details)
Prisma's migration engine:
- Diffs your schema against the database
- Considers anything in DB not in schema as "drift"
- Vector indexes CAN'T be declared in Prisma schema (Unsupported field limitation)
- Result: Every migration tries to "fix" the drift by dropping the index
There is no way to make Prisma ignore the index:
- `@ignore` cannot be used on `Unsupported` fields
- `@@index` doesn't support IVFFlat/HNSW index types
- `extensions = [vector]` doesn't help with index management
The only solution is disciplined use of --create-only and manual review.
Vector Operations
Storing Embeddings
// ai-worker/services/MemoryService.ts
import { getPrismaClient } from '@tzurot/common-types';
class MemoryService {
constructor(private prisma: PrismaClient = getPrismaClient()) {}
async storeMemory(data: {
personalityId: string;
channelId: string;
content: string;
embedding: number[];
}): Promise<void> {
// pgvector expects embedding as string "[0.1,0.2,...]"
const embeddingStr = `[${data.embedding.join(',')}]`;
await this.prisma.$executeRaw`
INSERT INTO memories (id, "personalityId", "channelId", content, embedding, "createdAt", "updatedAt")
VALUES (
gen_random_uuid(),
${data.personalityId}::uuid,
${data.channelId},
${data.content},
${embeddingStr}::vector,
NOW(),
NOW()
)
`;
}
}
Similarity Search
// ai-worker/services/MemoryService.ts
interface SimilarMemory {
id: string;
content: string;
similarity: number; // 1.0 = identical, 0.0 = orthogonal, -1.0 = opposite
createdAt: Date;
}
class MemoryService {
async findSimilarMemories(
personalityId: string,
queryEmbedding: number[],
limit: number = 5
): Promise<SimilarMemory[]> {
const embeddingStr = `[${queryEmbedding.join(',')}]`;
// Cosine similarity = 1 - cosine distance (pgvector's `<=>` operator)
// `<=>` matches the vector_cosine_ops ivfflat index; `<->` is L2 distance and would bypass it
// Returns most similar memories first
const results = await this.prisma.$queryRaw<SimilarMemory[]>`
SELECT
id,
content,
1 - (embedding <=> ${embeddingStr}::vector) as similarity,
"createdAt"
FROM memories
WHERE "personalityId" = ${personalityId}::uuid
ORDER BY embedding <=> ${embeddingStr}::vector
LIMIT ${limit}
`;
return results;
}
}
Distance Functions
pgvector supports three distance metrics:
-- Cosine distance (most common for embeddings)
-- Range: 0 (identical) to 2 (opposite)
embedding <=> '[0.1,0.2,...]'::vector
-- L2 (Euclidean) distance
embedding <-> '[0.1,0.2,...]'::vector
-- Inner product (dot product, negated)
embedding <#> '[0.1,0.2,...]'::vector
For Tzurot: Use cosine distance (<=>) - standard for text embeddings, and the operator served by the vector_cosine_ops index.
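The similarity value returned by the queries above is plain cosine similarity. A dependency-free sketch of the same math, for intuition only — in production this is computed by pgvector, never in application code:

```typescript
// Cosine similarity in plain TypeScript -- what `1 - cosine_distance` yields.
// 1.0 = identical direction, 0.0 = orthogonal, -1.0 = opposite.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

This also makes the distance range concrete: cosine distance = 1 - similarity, so it runs from 0 (identical) to 2 (opposite).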
Query Patterns
Type-Safe Queries with Prisma
// ✅ GOOD - Type-safe Prisma query
const personalities = await prisma.personality.findMany({
where: {
isActive: true,
},
include: {
llmConfig: true,
},
orderBy: {
name: 'asc',
},
});
// Prisma infers return type: Personality[]
Raw SQL for Vector Operations
// When vector operations are needed, use $queryRaw
const memories = await prisma.$queryRaw<Memory[]>`
SELECT *
FROM memories
WHERE "personalityId" = ${personalityId}::uuid
AND embedding <=> ${embeddingStr}::vector < 0.5 -- Cosine-distance threshold
ORDER BY embedding <=> ${embeddingStr}::vector
LIMIT 10
`;
Transactions
// Use transactions for multi-step operations
await prisma.$transaction(async tx => {
// Create personality
const personality = await tx.personality.create({
data: {
name: 'NewPersonality',
systemPrompt: 'Prompt',
},
});
// Create LLM config
await tx.llmConfig.create({
data: {
personalityId: personality.id,
model: 'anthropic/claude-sonnet-4.5',
temperature: 0.8,
},
});
});
Migrations
The One True Migration Workflow
This is the ONLY correct workflow for Railway development. Deviating causes checksum issues.
# 1. Create migration file (don't apply yet)
npx prisma migrate dev --create-only --name descriptive_name
# 2. Review/edit the generated SQL if needed (BEFORE applying)
# Edit: prisma/migrations/<timestamp>_<name>/migration.sql
# 3. Apply to Railway database
npx prisma migrate deploy
# That's it. Prisma handles checksums automatically.
Why this works:
- `--create-only` generates SQL but doesn't apply it
- You can review/edit before applying
- `migrate deploy` applies and updates checksums atomically
- No manual checksum management needed
Common Anti-Patterns (DON'T DO THESE)
❌ Running SQL manually then marking as applied
# This causes checksum mismatches!
npx prisma db execute --file migration.sql
npx prisma migrate resolve --applied <migration_name>
❌ Editing migrations after applying
# Applied migration checksum won't match file!
# Edit: prisma/migrations/20250101_initial/migration.sql
❌ Using railway run prisma migrate dev
# We're not running Prisma inside Railway containers
# Always use local Prisma with DATABASE_URL from .env
✅ Instead, create a new migration
npx prisma migrate dev --create-only --name fix_previous_migration
# Edit the new migration to correct the issue
npx prisma migrate deploy
Migration Best Practices
1. Never modify applied migrations - create new ones
If you need to fix a migration that's already applied:
- Create a new migration that corrects the issue
- Use DROP/CREATE pattern for triggers/functions
- Document why the fix was needed
2. Migrations must be idempotent
-- ✅ GOOD - Idempotent (safe to run multiple times)
DROP TRIGGER IF EXISTS my_trigger ON my_table;
CREATE TRIGGER my_trigger...
DROP FUNCTION IF EXISTS my_function() CASCADE;
CREATE OR REPLACE FUNCTION my_function()...
CREATE INDEX IF NOT EXISTS idx_name ON table(column);
-- ❌ BAD - Fails if already exists
CREATE TRIGGER my_trigger...
CREATE FUNCTION my_function()...
CREATE INDEX idx_name ON table(column);
3. PostgreSQL triggers and functions
When creating triggers:
- Always `DROP ... IF EXISTS` first
- Use `CREATE OR REPLACE` for functions
- Use `CASCADE` when dropping functions (drops dependent triggers)
-- Correct pattern for triggers
DROP FUNCTION IF EXISTS notify_cache_invalidation() CASCADE;
CREATE OR REPLACE FUNCTION notify_cache_invalidation()
RETURNS TRIGGER AS $$
BEGIN
PERFORM pg_notify('my_channel', row_to_json(NEW)::text);
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
DROP TRIGGER IF EXISTS cache_trigger ON my_table;
CREATE TRIGGER cache_trigger
AFTER UPDATE ON my_table
FOR EACH ROW
EXECUTE FUNCTION notify_cache_invalidation();
4. Test migrations before applying
# Check migration status
npx prisma migrate status
# Create migration
npx prisma migrate dev --create-only --name test_feature
# Review the SQL - does it look correct?
cat prisma/migrations/<timestamp>_test_feature/migration.sql
# Apply to Railway
npx prisma migrate deploy
5. Handle schema drift carefully
Prisma may generate DROP statements for indexes it can't represent (e.g., pgvector ivfflat/HNSW indexes). Always review generated migrations!
# Check the generated SQL for unwanted DROPs
grep -i "DROP INDEX" prisma/migrations/<timestamp>/migration.sql
Known drift issues are documented in: docs/database/PRISMA_DRIFT_ISSUES.md
Currently documented:
- `idx_memories_embedding` - pgvector ivfflat index (`Unsupported` types can't use vector operators)
- GIN indexes are now supported natively in Prisma 6+ (fixed in schema)
Database Scripts (@tzurot/scripts)
The scripts/ folder is a pnpm workspace package. Use TypeScript scripts in scripts/src/ for database operations:
# Check for migration drift (checksums don't match)
pnpm --filter @tzurot/scripts run db:check-drift
# Fix drifted migrations
pnpm --filter @tzurot/scripts run db:fix-drift -- <migration_name> [<migration_name> ...]
Writing new scripts: Use scripts/src/db/check-migration-drift.ts as a template:
- Import from `@tzurot/common-types` (e.g., `getPrismaClient`, `disconnectPrisma`)
- Use async/await with proper error handling
- Always call `disconnectPrisma()` in a `finally` block
Why TypeScript:
- `tsx` handles ESM/CJS interop automatically
- Workspace dependency on `@tzurot/common-types` for Prisma access
- Type safety and better AI assistance
Checksum Issues
What causes them:
- Editing migration files after applying
- Running SQL manually via `db execute`, then marking as applied
- File corruption or encoding issues
How to avoid them:
- Always use the workflow above
- Never edit applied migrations
- Never manually mark migrations as applied (except emergencies)
If you get checksum errors:
- Check `prisma migrate status` to see which migrations are affected
- If the migration was applied correctly, use `migrate resolve`:
  # EMERGENCY ONLY - when migration is correct but checksum is wrong
  npx prisma migrate resolve --applied <migration_name>
- If the migration has wrong logic, create a NEW migration to fix it
Running Migrations in Microservices
Only one service should run migrations - Avoid race conditions
// ✅ GOOD - api-gateway runs migrations on startup
// services/api-gateway/src/index.ts
async function main(): Promise<void> {
logger.info('[Gateway] Running database migrations...');
try {
await prisma.$executeRaw`SELECT 1`; // Connection check (the migration step itself is elided in this snippet)
logger.info('[Gateway] Database connected');
} catch (error) {
logger.error({ err: error }, '[Gateway] Database connection failed');
process.exit(1);
}
// Start server...
}
// ❌ BAD - All services run migrations
// Causes race conditions and duplicate migrations
Railway-Specific Notes
We work directly against Railway database:
- `.env` contains `DATABASE_URL` for the Railway dev database
- No local PostgreSQL needed for development
- All migrations run against Railway directly
- `npx prisma migrate deploy` reads `DATABASE_URL` from `.env`
Deployment:
- Push migration files to git
- Railway auto-deploys code
- Migrations run on api-gateway startup (via Prisma in application code)
- No manual migration commands needed in Railway CLI
Data Cleanup Patterns
TTL-Based Cleanup
// Clean up old conversation history (30 days)
class ConversationHistoryService {
async cleanupOldMessages(daysOld: number = 30): Promise<number> {
const cutoffDate = new Date();
cutoffDate.setDate(cutoffDate.getDate() - daysOld);
const result = await this.prisma.conversationHistory.deleteMany({
where: {
createdAt: {
lt: cutoffDate,
},
},
});
logger.info(`Cleaned up ${result.count} old messages (older than ${daysOld} days)`);
return result.count;
}
}
Memory Limits Per Personality
// Keep only N most recent memories per personality/channel
class MemoryService {
async enforceMemoryLimit(
personalityId: string,
channelId: string,
maxMemories: number = 1000
): Promise<void> {
// Get count of memories
const count = await this.prisma.memory.count({
where: { personalityId, channelId },
});
if (count <= maxMemories) {
return; // Within limit
}
// Delete oldest memories beyond limit
const toDelete = count - maxMemories;
await this.prisma.$executeRaw`
DELETE FROM memories
WHERE id IN (
SELECT id
FROM memories
WHERE "personalityId" = ${personalityId}::uuid
AND "channelId" = ${channelId}
ORDER BY "createdAt" ASC
LIMIT ${toDelete}
)
`;
logger.info(`Deleted ${toDelete} old memories for ${personalityId}`);
}
}
Indexing Strategy
What to Index
✅ Always index:
- Foreign keys
- Frequently filtered columns (WHERE clauses)
- Columns used in ORDER BY
- Vector columns (ivfflat index)
❌ Don't index:
- Text/blob columns
- Low cardinality columns (few unique values)
- Rarely queried columns
Index Examples
model ConversationHistory {
id String @id @default(uuid())
channelId String
personalityId String
userId String
role String // 'user' | 'assistant'
content String
createdAt DateTime @default(now())
// Composite index for common query pattern
@@index([channelId, personalityId, createdAt])
// Individual indexes for filtering
@@index([userId])
@@index([personalityId])
}
Error Handling
Connection Errors
class DatabaseService {
async executeWithRetry<T>(operation: () => Promise<T>, maxRetries: number = 3): Promise<T> {
for (let attempt = 1; attempt <= maxRetries; attempt++) {
try {
return await operation();
} catch (error) {
if (attempt === maxRetries) throw error;
// Check if error is connection-related
if (
error instanceof Error &&
(error.message.includes('ECONNREFUSED') ||
error.message.includes('ETIMEDOUT') ||
error.message.includes('Connection terminated'))
) {
logger.warn(`Database connection error (attempt ${attempt}/${maxRetries}), retrying...`);
await new Promise(resolve => setTimeout(resolve, 1000 * attempt));
continue;
}
// Non-connection error, don't retry
throw error;
}
}
throw new Error('Should not reach here');
}
}
Unique Constraint Violations
async function createPersonality(data: PersonalityData): Promise<Personality> {
try {
return await prisma.personality.create({ data });
} catch (error) {
if (error instanceof Prisma.PrismaClientKnownRequestError) {
if (error.code === 'P2002') {
// Unique constraint violation
throw new Error(`Personality with name '${data.name}' already exists`);
}
}
throw error;
}
}
Performance Optimization
Query Optimization
// ❌ BAD - N+1 query problem
const personalities = await prisma.personality.findMany();
for (const personality of personalities) {
const config = await prisma.llmConfig.findUnique({
where: { personalityId: personality.id },
});
}
// ✅ GOOD - Use include to fetch relations
const personalities = await prisma.personality.findMany({
include: {
llmConfig: true,
},
});
Pagination
// Use cursor-based pagination for large datasets
async function getConversationHistory(
channelId: string,
cursor?: string,
limit: number = 50
): Promise<{ messages: Message[]; nextCursor?: string }> {
const messages = await prisma.conversationHistory.findMany({
where: { channelId },
take: limit + 1, // Fetch one extra to check if more exist
...(cursor && {
cursor: { id: cursor },
skip: 1, // Skip the cursor itself
}),
orderBy: { createdAt: 'desc' },
});
let nextCursor: string | undefined;
if (messages.length > limit) {
messages.pop(); // discard the look-ahead row
// Cursor must be the last RETURNED row: the next call's `skip: 1` skips the
// cursor row itself, so pointing at the look-ahead row would drop one message per page.
nextCursor = messages[messages.length - 1].id;
}
return { messages, nextCursor };
}
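The take-one-extra pattern can be verified without a database. Here is an in-memory sketch — `Item` and `paginate` are hypothetical names, and the array stands in for a table already sorted by the ORDER BY column:

```typescript
// In-memory model of cursor pagination with the "fetch limit + 1" look-ahead.
interface Item {
  id: string;
}

function paginate(
  items: Item[], // already sorted, like ORDER BY "createdAt"
  limit: number,
  cursor?: string
): { page: Item[]; nextCursor?: string } {
  // Start just after the cursor row (mirrors Prisma's cursor + skip: 1)
  const start = cursor ? items.findIndex(i => i.id === cursor) + 1 : 0;
  const page = items.slice(start, start + limit + 1); // one extra as look-ahead
  let nextCursor: string | undefined;
  if (page.length > limit) {
    page.pop(); // drop the look-ahead row
    nextCursor = page[page.length - 1].id; // cursor = last row actually returned
  }
  return { page, nextCursor };
}
```

Walking all pages with `do { ... } while (cursor)` visits every item exactly once, with no gaps or duplicates at page boundaries.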
Testing Database Code
Use Test Database
// vitest.config.ts
export default defineConfig({
test: {
env: {
DATABASE_URL: 'postgresql://test:test@localhost:5432/tzurot_test',
},
},
});
Mock Prisma in Unit Tests
// Create mock Prisma client
function createMockPrisma(): PrismaClient {
return {
personality: {
findUnique: vi.fn(),
findMany: vi.fn(),
create: vi.fn(),
update: vi.fn(),
},
$disconnect: vi.fn(),
} as unknown as PrismaClient;
}
describe('PersonalityService', () => {
it('should get personality by ID', async () => {
const mockPrisma = createMockPrisma();
vi.mocked(mockPrisma.personality.findUnique).mockResolvedValue({
id: 'test-id',
name: 'Test',
// ...
});
const service = new PersonalityService(mockPrisma);
const result = await service.getPersonality('test-id');
expect(result).toBeDefined();
expect(mockPrisma.personality.findUnique).toHaveBeenCalledWith({
where: { id: 'test-id' },
});
});
});
Related Skills
- tzurot-shared-types - Prisma schema and type definitions
- tzurot-observability - Query logging and performance monitoring
- tzurot-architecture - Database service placement
- tzurot-constants - Connection pool limits and timeouts
References
- Prisma docs: https://www.prisma.io/docs
- pgvector docs: https://github.com/pgvector/pgvector
- Railway PostgreSQL: https://docs.railway.app/databases/postgresql
- Schema: `prisma/schema.prisma`