
SKILL.md

name: RAG & Vector Search
description: Expert knowledge of RAG (Retrieval-Augmented Generation) and vector search implementation for SEPilot Desktop. Use when implementing document search, semantic retrieval, or knowledge base features. Ensures efficient embeddings, vector storage, and retrieval patterns.

RAG & Vector Search Skill

Overview

SEPilot Desktop uses RAG to enhance LLM responses with relevant context from documents and knowledge bases:

  • Embeddings: convert text into vectors
  • Vector Store: efficient similarity search over those vectors
  • Retrieval: find the documents relevant to the user's query
  • Augmentation: add the retrieved documents to the LLM prompt

Architecture

┌──────────┐     Embed      ┌────────────┐    Search    ┌─────────────┐
│ Document │ ────────────→  │   Vector   │ ←──────────  │ User Query  │
│          │                │   Store    │              │             │
└──────────┘                └────────────┘              └─────────────┘
                                  │                            │
                                  │        Retrieved Docs      │
                                  └────────────────────────────┘
                                                │
                                                ↓
                                         ┌─────────────┐
                                         │  LLM with   │
                                         │  Context    │
                                         └─────────────┘
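Wired together, the four stages above amount to only a few calls. A minimal end-to-end sketch using LangChain's in-memory store (the document text, model names, and prompt wording are illustrative, not SEPilot Desktop specifics):

import { OpenAIEmbeddings } from '@langchain/openai';
import { ChatAnthropic } from '@langchain/anthropic';
import { MemoryVectorStore } from 'langchain/vectorstores/memory';
import { Document } from 'langchain/document';

// 1. Embed + store: index documents into an in-memory vector store
const embeddings = new OpenAIEmbeddings({ modelName: 'text-embedding-3-small' });
const store = await MemoryVectorStore.fromDocuments(
  [new Document({ pageContent: 'SEPilot Desktop is an Electron app.' })],
  embeddings
);

// 2. Search: embed the user query and find the most similar documents
const hits = await store.similaritySearch('What is SEPilot Desktop?', 3);

// 3. Augment + generate: hand the retrieved text to the LLM as context
const llm = new ChatAnthropic({ modelName: 'claude-3-5-sonnet-latest' });
const context = hits.map((d) => d.pageContent).join('\n\n');
const answer = await llm.invoke(`Context:\n${context}\n\nQuestion: What is SEPilot Desktop?`);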

Embedding Models

OpenAI Embeddings

import { OpenAIEmbeddings } from '@langchain/openai';

const embeddings = new OpenAIEmbeddings({
  modelName: 'text-embedding-3-small', // fast and inexpensive
  // modelName: 'text-embedding-3-large', // more accurate, but more expensive
  apiKey: process.env.OPENAI_API_KEY,
});

const vector = await embeddings.embedQuery('user query');

Voyage AI Embeddings

Anthropic does not currently offer an embeddings endpoint, and @langchain/anthropic has no embeddings class; Anthropic's documentation instead points to a third-party provider such as Voyage AI:

import { VoyageEmbeddings } from '@langchain/community/embeddings/voyage';

const embeddings = new VoyageEmbeddings({
  apiKey: process.env.VOYAGEAI_API_KEY,
  modelName: 'voyage-3', // check Voyage's model list for current options
});

Local Embeddings (free, offline)

import { HuggingFaceTransformersEmbeddings } from '@langchain/community/embeddings/hf_transformers';

const embeddings = new HuggingFaceTransformersEmbeddings({
  modelName: 'Xenova/all-MiniLM-L6-v2', // lightweight model
});
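Whichever provider is chosen, the LangChain Embeddings interface is the same: embedQuery for a single string, and embedDocuments for a batch, which is what indexing uses:

// One vector for a single query string
const queryVector = await embeddings.embedQuery('how do I index documents?');

// One vector per text when embedding many chunks at once
const docVectors = await embeddings.embedDocuments([
  'first chunk of text',
  'second chunk of text',
]);

console.log(queryVector.length); // embedding dimension, e.g. 384 for all-MiniLM-L6-v2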

Vector Stores

In-Memory (for development/testing)

import { MemoryVectorStore } from 'langchain/vectorstores/memory';
import { Document } from 'langchain/document';

const vectorStore = await MemoryVectorStore.fromDocuments(
  [
    new Document({
      pageContent: 'document content',
      metadata: { source: 'doc1.txt' },
    }),
  ],
  embeddings
);

File-Based (well suited to Electron apps)

import { FaissStore } from '@langchain/community/vectorstores/faiss';
import { app } from 'electron';
import * as path from 'path';

const vectorStorePath = path.join(app.getPath('userData'), 'vectorstores', 'main');

// Save vector store
const vectorStore = await FaissStore.fromDocuments(documents, embeddings);
await vectorStore.save(vectorStorePath);

// Load vector store
const loadedStore = await FaissStore.load(vectorStorePath, embeddings);
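FaissStore.load fails if no index has been saved yet, so apps usually guard the load. A minimal load-or-create sketch, reusing the vectorStorePath and embeddings from above:

import * as fs from 'fs';

// Load the persisted index if it exists; otherwise build one and save it
async function loadOrCreateStore(documents: Document[]): Promise<FaissStore> {
  if (fs.existsSync(vectorStorePath)) {
    return FaissStore.load(vectorStorePath, embeddings);
  }
  const store = await FaissStore.fromDocuments(documents, embeddings);
  await store.save(vectorStorePath);
  return store;
}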

ChromaDB (advanced features)

import { Chroma } from '@langchain/community/vectorstores/chroma';

const vectorStore = await Chroma.fromDocuments(documents, embeddings, {
  collectionName: 'sepilot-docs',
  url: 'http://localhost:8000', // ChromaDB server
});

Document Processing

Text Splitting

Split long documents into smaller chunks:

import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';

const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000, // chunk size in characters (not tokens)
  chunkOverlap: 200, // overlap between chunks, to preserve context across boundaries
});

const docs = await splitter.createDocuments([longText]);

Document Loaders

Load various file formats:

// PDF
import { PDFLoader } from 'langchain/document_loaders/fs/pdf';
const pdfLoader = new PDFLoader('file.pdf');

// Markdown
import { TextLoader } from 'langchain/document_loaders/fs/text';
const mdLoader = new TextLoader('README.md');

// JSON
import { JSONLoader } from 'langchain/document_loaders/fs/json';
const jsonLoader = new JSONLoader('data.json');

// CSV
import { CSVLoader } from 'langchain/document_loaders/fs/csv';
const csvLoader = new CSVLoader('data.csv');

const docs = await pdfLoader.load(); // every loader exposes the same load() API
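In practice, loading, splitting, and indexing compose into a single ingestion step (file name illustrative):

// Ingest a PDF: load its pages, split them into chunks, then index the chunks
const pages = await new PDFLoader('manual.pdf').load();
const chunks = await splitter.splitDocuments(pages);
await vectorStore.addDocuments(chunks);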

RAG Implementation

Basic RAG Pattern

// lib/rag/basic-rag.ts
import { ChatAnthropic } from '@langchain/anthropic';
import { MemoryVectorStore } from 'langchain/vectorstores/memory';

export class BasicRAG {
  constructor(
    protected vectorStore: MemoryVectorStore,
    protected llm: ChatAnthropic
  ) {}

  async query(question: string, k: number = 3): Promise<string> {
    // 1. Retrieve relevant documents
    const relevantDocs = await this.vectorStore.similaritySearch(question, k);

    // 2. Build context from retrieved docs
    const context = relevantDocs
      .map((doc) => `Source: ${doc.metadata.source}\n${doc.pageContent}`)
      .join('\n\n---\n\n');

    // 3. Augment prompt with context
    const prompt = `Answer the question using the following documents as reference:

${context}

Question: ${question}

Answer:`;

    // 4. Generate response
    const response = await this.llm.invoke(prompt);
    return response.content as string;
  }
}
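Usage, assuming a populated store (the model name is illustrative):

const llm = new ChatAnthropic({ modelName: 'claude-3-5-sonnet-latest' });
const rag = new BasicRAG(vectorStore, llm);

const answer = await rag.query('How does SEPilot Desktop store vectors?');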

Advanced RAG with Reranking

import { CohereRerank } from '@langchain/cohere';

export class AdvancedRAG extends BasicRAG {
  private reranker = new CohereRerank({
    apiKey: process.env.COHERE_API_KEY,
    topN: 3,
  });

  async query(question: string): Promise<string> {
    // 1. Initial retrieval (get more docs)
    const candidateDocs = await this.vectorStore.similaritySearch(question, 10);

    // 2. Rerank for better relevance
    const rerankedDocs = await this.reranker.rerank(
      candidateDocs.map((doc) => doc.pageContent),
      question
    );

    // 3. Use top reranked docs
    const relevantDocs = rerankedDocs.slice(0, 3).map((item) => candidateDocs[item.index]);

    // 4. Generate response
    const context = relevantDocs.map((doc) => doc.pageContent).join('\n\n---\n\n');

    const prompt = `${context}\n\nQuestion: ${question}\n\nAnswer:`;
    const response = await this.llm.invoke(prompt);
    return response.content as string;
  }
}

IPC Integration

Backend Handler

// electron/ipc/handlers/rag.ts
import { ipcMain, IpcMainInvokeEvent } from 'electron';
import { RAGService } from '../services/rag-service';

const ragService = new RAGService();

export function setupRAGHandlers() {
  // Index documents
  ipcMain.handle(
    'rag:index',
    async (
      event: IpcMainInvokeEvent,
      request: {
        documents: Array<{ content: string; metadata: Record<string, unknown> }>;
        collectionName: string;
      }
    ) => {
      try {
        await ragService.indexDocuments(request.documents, request.collectionName);
        return { success: true };
      } catch (error) {
        return {
          success: false,
          error: error instanceof Error ? error.message : 'Unknown error',
        };
      }
    }
  );

  // Query documents
  ipcMain.handle(
    'rag:query',
    async (
      event: IpcMainInvokeEvent,
      request: {
        question: string;
        collectionName: string;
        k?: number;
      }
    ) => {
      try {
        const response = await ragService.query(
          request.question,
          request.collectionName,
          request.k
        );
        return { success: true, response };
      } catch (error) {
        return {
          success: false,
          error: error instanceof Error ? error.message : 'Unknown error',
        };
      }
    }
  );

  // Search similar documents
  ipcMain.handle(
    'rag:search',
    async (
      event: IpcMainInvokeEvent,
      request: {
        query: string;
        collectionName: string;
        k?: number;
      }
    ) => {
      try {
        const documents = await ragService.search(request.query, request.collectionName, request.k);
        return { success: true, documents };
      } catch (error) {
        return {
          success: false,
          error: error instanceof Error ? error.message : 'Unknown error',
        };
      }
    }
  );
}
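The window.electron.invoke calls used below assume a preload script that bridges the renderer to ipcRenderer. A minimal sketch of that bridge (the electron key name is a project convention, not an Electron built-in):

// electron/preload.ts
import { contextBridge, ipcRenderer } from 'electron';

contextBridge.exposeInMainWorld('electron', {
  // Forward renderer calls such as 'rag:query' to the main-process handlers
  invoke: (channel: string, ...args: unknown[]) => ipcRenderer.invoke(channel, ...args),
});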

Frontend Usage

// lib/hooks/useRAG.ts
import { Document } from 'langchain/document';

export function useRAG() {
  const indexDocuments = async (
    documents: Array<{ content: string; metadata: Record<string, unknown> }>,
    collectionName: string
  ): Promise<void> => {
    const result = await window.electron.invoke('rag:index', {
      documents,
      collectionName,
    });

    if (!result.success) {
      throw new Error(result.error);
    }
  };

  const query = async (question: string, collectionName: string, k = 3): Promise<string> => {
    const result = await window.electron.invoke('rag:query', {
      question,
      collectionName,
      k,
    });

    if (!result.success) {
      throw new Error(result.error);
    }

    return result.response;
  };

  const search = async (query: string, collectionName: string, k = 5): Promise<Document[]> => {
    const result = await window.electron.invoke('rag:search', {
      query,
      collectionName,
      k,
    });

    if (!result.success) {
      throw new Error(result.error);
    }

    return result.documents;
  };

  return { indexDocuments, query, search };
}

Metadata Filtering

// Search with metadata filters (the filter argument's shape varies by vector store)
const results = await vectorStore.similaritySearch('TypeScript patterns', 5, {
  fileType: 'typescript', // Only .ts files
  category: 'backend', // Only backend code
});
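MemoryVectorStore, for example, takes a predicate function rather than a plain object, so the equivalent filter there would look like:

// MemoryVectorStore filters with a predicate evaluated per document
const memResults = await vectorStore.similaritySearch(
  'TypeScript patterns',
  5,
  (doc) => doc.metadata.fileType === 'typescript' && doc.metadata.category === 'backend'
);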

Hybrid Search

Vector search + Keyword search:

export class HybridSearch {
  // Assumes a vectorStore member plus keywordSearch/combineResults helpers (see below)
  async search(query: string, k: number = 5): Promise<Array<Document & { score: number }>> {
    // 1. Vector search
    const vectorResults = await this.vectorStore.similaritySearchWithScore(query, k);

    // 2. Keyword search (BM25)
    const keywordResults = await this.keywordSearch(query, k);

    // 3. Combine and rerank
    const combined = this.combineResults(vectorResults, keywordResults);

    return combined.slice(0, k);
  }
}
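keywordSearch and combineResults are left to the implementation; a common choice for combineResults is reciprocal rank fusion (RRF). A sketch, assuming both result lists are ordered by relevance and that pageContent uniquely identifies a chunk:

// Reciprocal rank fusion: each appearance contributes 1 / (rankConstant + rank)
function combineResults(
  vectorResults: Array<[Document, number]>,
  keywordResults: Document[],
  rankConstant = 60
): Array<Document & { score: number }> {
  const scores = new Map<string, { doc: Document; score: number }>();

  const addRanked = (docs: Document[]) => {
    docs.forEach((doc, rank) => {
      const key = doc.pageContent; // assumes pageContent uniquely identifies a chunk
      const entry = scores.get(key) ?? { doc, score: 0 };
      entry.score += 1 / (rankConstant + rank + 1);
      scores.set(key, entry);
    });
  };

  addRanked(vectorResults.map(([doc]) => doc));
  addRanked(keywordResults);

  return [...scores.values()]
    .sort((a, b) => b.score - a.score)
    .map(({ doc, score }) => Object.assign(doc, { score }));
}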

Performance Optimization

Caching

class RAGCache {
  private cache = new Map<string, { response: string; timestamp: number }>();
  private ttl = 1000 * 60 * 5; // 5 minutes

  get(query: string): string | null {
    const cached = this.cache.get(query);
    if (!cached) return null;

    if (Date.now() - cached.timestamp > this.ttl) {
      this.cache.delete(query);
      return null;
    }

    return cached.response;
  }

  set(query: string, response: string): void {
    this.cache.set(query, { response, timestamp: Date.now() });
  }
}

Batch Processing

// Index documents in batches
async function indexInBatches(documents: Document[], batchSize: number = 100): Promise<void> {
  for (let i = 0; i < documents.length; i += batchSize) {
    const batch = documents.slice(i, i + batchSize);
    await vectorStore.addDocuments(batch);
  }
}

Best Practices

  1. Chunk Size: 1000-1500 characters is usually optimal
  2. Overlap: keep 10-20% overlap between chunks to preserve context
  3. Top K: 3-5 documents is usually enough (more adds noise)
  4. Metadata: include source, date, category, and similar metadata
  5. Updates: re-index the vector store whenever documents change
  6. Caching: cache frequently repeated queries
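One way to keep these defaults consistent across indexing, retrieval, and caching code is a single shared config object (a sketch; the file name and values are illustrative):

// lib/rag/config.ts
// Shared retrieval defaults reflecting the practices above
export const RAG_DEFAULTS = {
  chunkSize: 1200, // characters, within the 1000-1500 range
  chunkOverlap: 180, // ~15% of chunkSize
  topK: 4, // retrieve 3-5 documents
  cacheTtlMs: 5 * 60 * 1000, // matches the RAGCache TTL above
} as const;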

Real-World Example

// components/DocumentSearch.tsx
export function DocumentSearch() {
  const { query, indexDocuments } = useRAG();
  const [result, setResult] = useState<string>('');

  const handleSearch = async (question: string): Promise<void> => {
    const response = await query(question, 'main-docs', 3);
    setResult(response);
  };

  const handleIndexFiles = async (files: File[]): Promise<void> => {
    const documents = await Promise.all(
      files.map(async (file) => ({
        content: await file.text(),
        metadata: { source: file.name, type: file.type },
      }))
    );

    await indexDocuments(documents, 'main-docs');
  };

  return <div>{/* UI implementation */}</div>;
}

Testing

// tests/rag/basic-rag.test.ts
import { Document } from 'langchain/document';
import { MemoryVectorStore } from 'langchain/vectorstores/memory';
import { BasicRAG } from '../../lib/rag/basic-rag';

describe('BasicRAG', () => {
  let rag: BasicRAG;

  beforeAll(async () => {
    const docs = [
      new Document({
        pageContent: 'TypeScript is a typed superset of JavaScript',
        metadata: { source: 'ts-intro.md' },
      }),
    ];

    // embeddings and llm are assumed to be provided by shared test setup
    const vectorStore = await MemoryVectorStore.fromDocuments(docs, embeddings);
    rag = new BasicRAG(vectorStore, llm);
  });

  it('should answer questions based on indexed documents', async () => {
    const response = await rag.query('What is TypeScript?');
    expect(response).toContain('typed');
    expect(response).toContain('JavaScript');
  });
});