| name | doc-testing |
| description | Comprehensive guide for writing tests in magenta.nvim, including test environment setup, mock providers, driver interactions, and best practices |
Testing in magenta.nvim
To run the full test suite, use `npx vitest run` from the project root. To run a specific test file, use `npx vitest run <file>`. In both cases there is no need to `cd` first.
Tests should make use of the helpers in `node/test/preamble.ts`.
When doing integration-level testing, such as user flows, use the `withDriver` helper and the interactions in `node/test/driver.ts`. When a test performs generic user actions that may be reusable across tests, add them to the `NvimDriver` class as helpers.
As of July 2025, tests run in parallel for improved performance; the test infrastructure supports concurrent execution.
Test Environment Setup
Fixture Files & Directory Structure:
- Each test gets a fresh temporary directory in `/tmp/magenta-test/{testId}/`
- Files from `node/test/fixtures/` are copied into this temp directory for each test
- Available fixture files include `poem.txt`, `test.jpg`, `sample2.pdf`, `test.bin`, and others
- Nvim runs in this temporary directory, so files can be safely mutated during tests
- The temp directory is automatically cleaned up after each test - no manual cleanup needed
- Use `await getcwd(driver.nvim)` to get the current working directory for file path operations
- The temporary directory is completely isolated between tests
Test Pattern:
import { withDriver } from "../test/preamble";
test("my test", async () => {
await withDriver({}, async (driver) => {
// Test code here - nvim runs in temp dir with fixture files
// Access cwd with: const cwd = await getcwd(driver.nvim)
});
});
Custom File Setup:
test("test with custom files", async () => {
await withDriver(
{
setupFiles: async (tmpDir) => {
const fs = await import("fs/promises");
const path = await import("path");
await fs.writeFile(path.join(tmpDir, "custom.txt"), "content");
await fs.mkdir(path.join(tmpDir, "subfolder"));
},
},
async (driver) => {
// Custom files are now available in the test environment
},
);
});
Available Mocks & Test Interactions
Configuring Magenta Options:
Tests can override magenta options by passing them to withDriver:
test("test with custom options", async () => {
await withDriver(
{
options: {
getFileAutoAllowGlobs: ["*.log", "config/*"],
changeDebounceMs: 100,
// Any other MagentaOptions can be overridden here
},
},
async (driver) => {
// Magenta will use the custom options
},
);
});
Available options include:
- `getFileAutoAllowGlobs` - Array of glob patterns for auto-allowing file reads
- `changeDebounceMs` - Override the default change tracking debounce
- Any other options from the `MagentaOptions` type
Mock Provider Interactions:
The mock provider (driver.mockAnthropic) captures all requests and allows controlled responses:
Awaiting Requests:
// Wait for any pending request
const request = await driver.mockAnthropic.awaitPendingRequest();
// Wait for request with specific text in message content
const request =
await driver.mockAnthropic.awaitPendingRequestWithText("specific text");
// Wait for user message (tool results, etc.)
const request = await driver.mockAnthropic.awaitPendingUserRequest();
// Wait for forced tool use requests
const forceRequest =
await driver.mockAnthropic.awaitPendingForceToolUseRequest();
// Check if there's a pending request with specific text (non-blocking)
const hasPending = driver.mockAnthropic.hasPendingRequestWithText("text");
Responding to Regular Requests:
// Simple text response
request.respond({
stopReason: "end_turn",
text: "Response text",
toolRequests: [],
});
// Response with tool use
request.respond({
stopReason: "tool_use",
text: "I'll use a tool",
toolRequests: [
{
status: "ok",
value: {
id: "tool_id" as ToolRequestId,
toolName: "get_file" as ToolName,
input: { filePath: "./file.txt" as UnresolvedFilePath },
},
},
],
});
// Response with error tool request
request.respond({
stopReason: "tool_use",
text: "Tool failed",
toolRequests: [
{
status: "error",
rawRequest: { invalid: "request" },
},
],
});
Responding to Force Tool Use Requests:
const forceRequest =
await driver.mockAnthropic.awaitPendingForceToolUseRequest();
// Successful tool response
await driver.mockAnthropic.respondToForceToolUse({
toolRequest: {
status: "ok",
value: {
id: "tool_id" as ToolRequestId,
toolName: "get_file" as ToolName,
input: { filePath: "./file.txt" as UnresolvedFilePath },
},
},
stopReason: "tool_use",
});
// Error tool response
await driver.mockAnthropic.respondToForceToolUse({
toolRequest: {
status: "error",
rawRequest: { invalid: "data" },
},
stopReason: "tool_use",
});
Request Inspection:
// Access request properties
console.log(request.messages); // Message history
console.log(request.model); // Model used
console.log(request.tools); // Available tools
console.log(request.systemPrompt); // System prompt (if any)
// For force tool use requests
console.log(forceRequest.spec); // Tool specification
console.log(forceRequest.model); // Model used
console.log(forceRequest.messages); // Message history
// Check if request was aborted
if (request.aborted) {
// Handle aborted request
}
Advanced Response Patterns:
// Stream individual parts of response
request.streamText("First part of response");
request.streamToolUse(toolRequest);
request.finishResponse("end_turn");
// Respond with errors
request.respondWithError(new Error("Something went wrong"));
// Access tool responses from previous messages
const toolResponses = request.getToolResponses();
Mock Provider:
- `driver.mockAnthropic` - Pre-configured mock provider that captures all requests
- `await driver.mockAnthropic.awaitPendingForceToolUseRequest()` - Wait for and capture forced tool use requests
- `await driver.mockAnthropic.awaitPendingRequest()` - Wait for regular message requests
- `await driver.mockAnthropic.respondToForceToolUse({...})` - Send mock responses
- No need to manually mock providers - they're already set up in the test infrastructure
Driver Interactions (prefer these over internal API access):
- `await driver.editFile("poem.txt")` - Open fixture files
- `await driver.command("normal! gg")` - Execute vim commands
- `await driver.magenta.command("predict-edit")` - Execute magenta commands
- Use real nvim interactions to trigger change tracking naturally (see the sketch below)
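For instance, a realistic flow composed entirely from these helpers might look like the following sketch. It assumes the `predict-edit` command ends in a forced tool use request (an assumption for illustration; swap in `awaitPendingRequest()` if the flow under test sends a regular request):

```typescript
import { withDriver } from "../test/preamble";

it("drives a realistic user flow", async () => {
  await withDriver({}, async (driver) => {
    await driver.editFile("poem.txt"); // open a fixture file
    await driver.command("normal! ggdd"); // make a real edit via vim commands
    await driver.magenta.command("predict-edit"); // run a magenta command

    // Assumption: predict-edit issues a forced tool use request.
    const forceRequest =
      await driver.mockAnthropic.awaitPendingForceToolUseRequest();
    expect(forceRequest.spec).toBeDefined(); // the tool specification that was sent
  });
});
```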
Testing Best Practices:
- DO: Use realistic nvim interactions (`driver.editFile()`, `driver.command()`)
- DON'T: Reach into internal APIs (`driver.magenta.changeTracker.onTextDocumentDidChange()`)
- DO: Let the system work naturally - make real edits and let change tracking happen
- DO: Write integration tests that exercise the full user flow
- DON'T: Mock internal components - use the provided driver and mock provider
Change Tracker Testing:
- DO: Use `driver.assertChangeTrackerHasEdits(count)` and `driver.assertChangeTrackerContains(changes)` instead of arbitrary timeouts (see the sketch below)
- DO: Be aware that rapid edits may be batched into single changes by the tracker
- DO: Use explicit assertions about what changes should be tracked rather than waiting fixed amounts of time
- DON'T: Use `setTimeout()` or fixed delays when waiting for change tracking - use the assertion methods instead
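A minimal sketch of this approach (the file name and expected count are illustrative; remember that rapid edits may be batched):

```typescript
await withDriver({}, async (driver) => {
  await driver.editFile("poem.txt");
  await driver.command("normal! ggdd"); // real edit, so change tracking fires naturally

  // Assert on tracked changes instead of sleeping.
  await driver.assertChangeTrackerHasEdits(1);
  await driver.assertChangeTrackerContains([
    { type: "edit", filePath: "poem.txt" },
  ]);
});
```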
Mock Provider Request Objects:
Force tool use requests captured by `awaitPendingForceToolUseRequest()` contain:
- `request.spec` - The tool specification used
- `request.model` - Which model was requested
- `request.messages` - The messages array containing the user/assistant conversation
- `request.systemPrompt` - The system prompt used (if any)
- `request.defer` - Promise resolution control
System Prompt vs User Messages: When implementing AI features, maintain proper separation:
- System prompt: General instructions about the agent's role and behavior ("You have to do your best to predict...")
- User messages: Specific contextual data (buffer content, cursor position, recent changes)

This separation keeps the system prompt focused on behavior while allowing dynamic context in messages.
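As a sketch, a test can check this separation on a captured request (the asserted strings below are hypothetical; the point is where each kind of content lives):

```typescript
const forceRequest =
  await driver.mockAnthropic.awaitPendingForceToolUseRequest();

// Behavioral instructions live in the system prompt...
expect(forceRequest.systemPrompt).toContain("predict"); // hypothetical wording

// ...while dynamic context (buffer content, cursor position, recent changes)
// lives in the user messages.
const lastMessage = forceRequest.messages[forceRequest.messages.length - 1];
expect(lastMessage.role).toBe("user");
```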
Test Writing Best Practices
Avoid Conditional Expect Statements
DON'T write tests with conditional expects like this:
if (toolResult && toolResult.type === "tool_result") {
expect(toolResult.result.status).toBe("ok");
if (toolResult.result.status === "ok") {
const textContent = toolResult.result.value.find(
(item) => item.type === "text",
);
if (textContent && textContent.type === "text") {
expect(textContent.text).toContain("expected content");
}
}
}
DO use TypeScript type assertions and direct expects:
const toolResult = toolResultMessage.content[0] as Extract<
(typeof toolResultMessage.content)[0],
{ type: "tool_result" }
>;
expect(toolResult.type).toBe("tool_result");
expect(toolResult.result.status).toBe("ok");
const result = toolResult.result as Extract<
typeof toolResult.result,
{ status: "ok" }
>;
const textContent = result.value.find(
(item) => item.type === "text",
) as Extract<(typeof result.value)[0], { type: "text" }>;
expect(textContent).toBeDefined();
expect(textContent.text).toContain("expected content");
TypeScript Type Narrowing in Tests
Use TypeScript's Extract utility type to narrow union types safely:
// For narrowing message content
const toolResult = content[0] as Extract<
(typeof content)[0],
{ type: "tool_result" }
>;
// For narrowing result status
const okResult = toolResult.result as Extract<
typeof toolResult.result,
{ status: "ok" }
>;
const errorResult = toolResult.result as Extract<
typeof toolResult.result,
{ status: "error" }
>;
Test Structure Patterns
Basic Test Structure
it("should do something", async () => {
await withDriver({}, async (driver) => {
await driver.showSidebar();
// Trigger the action
await driver.inputMagentaText(`Some command`);
await driver.send();
// Mock the response
const request = await driver.mockAnthropic.awaitPendingRequest();
request.respond({
stopReason: "tool_use",
text: "response text",
toolRequests: [
/* tool requests */
],
});
// Assert the UI state
await driver.assertDisplayBufferContains("Expected UI text");
// Handle tool result and verify
const toolResultRequest = await driver.mockAnthropic.awaitPendingRequest();
const toolResultMessage =
toolResultRequest.messages[toolResultRequest.messages.length - 1];
// Type-safe assertions
expect(toolResultMessage.role).toBe("user");
expect(Array.isArray(toolResultMessage.content)).toBe(true);
const toolResult = toolResultMessage.content[0] as Extract<
(typeof toolResultMessage.content)[0],
{ type: "tool_result" }
>;
expect(toolResult.type).toBe("tool_result");
expect(toolResult.result.status).toBe("ok");
});
});
Tests with File Setup
it("should handle custom files", async () => {
await withDriver(
{
setupFiles: async (tmpDir) => {
const fs = await import("fs/promises");
const path = await import("path");
await fs.writeFile(path.join(tmpDir, "test.txt"), "content");
},
},
async (driver) => {
// Test implementation
},
);
});
Tests with Custom Options
it("should respect configuration", async () => {
await withDriver(
{
options: {
someOption: ["value1", "value2"],
},
},
async (driver) => {
// Test implementation
},
);
});
Mock Provider Patterns
Awaiting Requests
// Wait for regular requests
const request = await driver.mockAnthropic.awaitPendingRequest();
// Wait for forced tool use requests
const forceRequest =
await driver.mockAnthropic.awaitPendingForceToolUseRequest();
Responding to Requests
// Simple response
request.respond({
stopReason: "end_turn",
text: "Response text",
toolRequests: [],
});
// Response with tool use
request.respond({
stopReason: "tool_use",
text: "I'll use a tool",
toolRequests: [
{
status: "ok",
value: {
id: "tool_id" as ToolRequestId,
toolName: "tool_name" as ToolName,
input: { param: "value" },
},
},
],
});
Common Assertion Patterns
UI Assertions
// Check for presence
await driver.assertDisplayBufferContains("Expected text");
// Check for absence
await driver.assertDisplayBufferDoesNotContain("Unwanted text");
// Get position for interactions
const buttonPos = await driver.assertDisplayBufferContains("[ YES ]");
await driver.triggerDisplayBufferKey(buttonPos, "<CR>");
Tool Result Assertions
// Use helper functions when available
assertToolResultContainsText(toolResult, "expected text");
assertToolResultHasImageSource(toolResult, "image/jpeg");
// Manual assertions for specific cases
const result = toolResult.result as Extract<
typeof toolResult.result,
{ status: "ok" }
>;
const textContent = result.value.find(
(item) => item.type === "text",
) as Extract<(typeof result.value)[0], { type: "text" }>;
expect(textContent.text).toContain("expected content");
Change Tracker Assertions
// Use specific assertions instead of timeouts
await driver.assertChangeTrackerHasEdits(2);
await driver.assertChangeTrackerContains([
{ type: "edit", filePath: "file.txt" },
]);
// DON'T use arbitrary timeouts
// await new Promise(resolve => setTimeout(resolve, 1000)); // ❌
Testing Best Practices
Integration Over Unit
- Prefer testing complete user flows over isolated units
- Use realistic nvim interactions rather than reaching into internal APIs
- Let the system work naturally (e.g., let change tracking happen through real edits)
Mock Boundaries
- Mock external services (Anthropic API) but not internal components
- Use the provided driver and mock infrastructure
- Don't manually mock internal classes or methods
Realistic Interactions
// DO: Use realistic interactions
await driver.editFile("poem.txt");
await driver.command("normal! gg");
// DON'T: Reach into internals
// driver.magenta.changeTracker.onTextDocumentDidChange(...); // ❌
File Handling
- Each test gets a fresh temporary directory
- Fixture files are automatically copied for each test
- Files can be safely mutated during tests
- Use the `setupFiles` callback for custom file creation
Error Testing
- Test both success and error paths
- Verify error messages are meaningful (see the error-path sketch below)
- Test edge cases like invalid input, missing files, etc.
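For example, the error path can be exercised by responding with an error from the mock provider and asserting on what the user would actually see (the asserted text is hypothetical; match your flow's real rendering):

```typescript
const request = await driver.mockAnthropic.awaitPendingRequest();
request.respondWithError(new Error("Something went wrong"));

// Verify the error surfaces meaningfully in the UI.
await driver.assertDisplayBufferContains("Something went wrong");
```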
Async Patterns
- Always await async operations
- Use the driver's assertion methods that handle timing
- Don't use fixed delays unless absolutely necessary