| name | web-ai-prompt-api |
| description | Chrome's built-in Prompt API implementation guide. Use Gemini Nano locally in browser for AI features - session management, multimodal input, structured output, streaming, Chrome Extensions. Reference for all Prompt API development. |
Chrome Prompt API Reference
Complete implementation guide for Chrome's built-in Prompt API using Gemini Nano.
Hardware & OS Requirements
- OS: Windows 10/11, macOS 13+ (Ventura+), Linux, ChromeOS (platform 16389.0.0+) on Chromebook Plus
- Storage: 22 GB free (model downloaded separately)
- GPU: more than 4 GB VRAM, or CPU: 16 GB RAM + 4 cores
- Network: unmetered connection for the model download
- Chrome: 138+ (Extensions in stable, Web in origin trial)
Not supported: Mobile (Android, iOS), non-Chromebook Plus ChromeOS
Check model size: chrome://on-device-internals
Model removed if storage <10 GB after download.
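On unsupported platforms the `LanguageModel` global is simply absent, so a feature check should come before any availability call. A minimal sketch (`promptApiSupported` is a hypothetical helper name):

```javascript
// Feature-detect the Prompt API: on unsupported browsers the
// LanguageModel global does not exist at all.
function promptApiSupported() {
  return typeof LanguageModel !== 'undefined';
}

if (promptApiSupported()) {
  // Safe to call LanguageModel.availability(), create(), etc.
} else {
  // Hide the AI feature or fall back to a server-side model.
}
```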
Language Support
From Chrome 140: English, Spanish, Japanese (input/output)
Availability Check
const availability = await LanguageModel.availability();
// Returns: "unavailable" | "downloadable" | "downloading" | "available"
CRITICAL: Always pass same options to availability() as you use in create(). Some models don't support certain modalities/languages.
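One way to enforce this symmetry is to route a single shared options object through both calls. A sketch, with `createSessionWithOptions` as a hypothetical helper name:

```javascript
// Forward one shared options object to both availability() and create(),
// so the capability you checked is the capability you actually request.
async function createSessionWithOptions(options) {
  const availability = await LanguageModel.availability(options);
  if (availability === 'unavailable') {
    throw new Error('Model unavailable for these options');
  }
  // "downloadable" / "downloading" fall through: create() triggers
  // or waits for the download as needed.
  return LanguageModel.create(options);
}
```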
Model Parameters
await LanguageModel.params();
// { defaultTopK: 3, maxTopK: 128, defaultTemperature: 1, maxTemperature: 2 }
Session Creation
Basic Session
const session = await LanguageModel.create();
With Custom Parameters
const params = await LanguageModel.params();
const session = await LanguageModel.create({
temperature: Math.min(params.defaultTemperature * 1.2, 2.0),
topK: params.defaultTopK
});
With Download Monitoring
const session = await LanguageModel.create({
monitor(m) {
m.addEventListener('downloadprogress', (e) => {
console.log(`Downloaded ${e.loaded * 100}%`);
});
}
});
With AbortSignal
const controller = new AbortController();
stopButton.onclick = () => controller.abort();
const session = await LanguageModel.create({
signal: controller.signal
});
Initial Prompts (Context Setting)
const session = await LanguageModel.create({
initialPrompts: [
{ role: 'system', content: 'You are a helpful assistant.' },
{ role: 'user', content: 'What is the capital of Italy?' },
{ role: 'assistant', content: 'The capital of Italy is Rome.' },
{ role: 'user', content: 'What language is spoken there?' },
{ role: 'assistant', content: 'The official language is Italian.' }
]
});
Use cases:
- Resume stored sessions after browser restart
- N-shot prompting
- Set assistant personality/tone
- Provide conversation history
Multimodal Input (Origin Trial)
expectedInputs & expectedOutputs
const session = await LanguageModel.create({
expectedInputs: [
{
type: "text", // or "image", "audio"
languages: ["en" /* system */, "ja" /* user prompt */]
}
],
expectedOutputs: [
{ type: "text", languages: ["ja"] }
]
});
Input types: text, image, audio
Output types: text only
Throws NotSupportedError if unsupported modality.
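A common pattern is to probe for a modality and fall back to a text-only session when it is rejected. A sketch under the option shapes shown above (`createImageOrTextSession` is a hypothetical helper name):

```javascript
// Probe image support; fall back to a text-only session on rejection.
async function createImageOrTextSession() {
  const withImages = { expectedInputs: [{ type: 'image' }] };
  if (await LanguageModel.availability(withImages) !== 'unavailable') {
    try {
      return { session: await LanguageModel.create(withImages), images: true };
    } catch (err) {
      if (err.name !== 'NotSupportedError') throw err;
      // Image modality rejected; fall through to text-only.
    }
  }
  return { session: await LanguageModel.create(), images: false };
}
```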
Using with Images/Audio
const session = await LanguageModel.create({
initialPrompts: [{
role: 'system',
content: 'Analyze images for patterns.'
}],
expectedInputs: [{ type: 'image' }]
});
// Append image
await session.append([{
role: 'user',
content: [
{ type: 'text', value: 'Analyze this image' },
{ type: 'image', value: fileInput.files[0] }
]
}]);
Prompting Methods
Non-Streaming (Short Responses)
const result = await session.prompt('Write me a haiku!');
console.log(result);
Streaming (Long Responses)
const stream = session.promptStreaming('Write me a long poem!');
for await (const chunk of stream) {
console.log(chunk); // Partial results as they arrive
}
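Because each streamed chunk is a delta, accumulating into a string (and optionally pushing partial text to the UI) looks like this sketch; `streamToString` is a hypothetical helper name:

```javascript
// Collect streamed chunks into the full response text.
// Each chunk is a delta, so concatenate rather than overwrite.
async function streamToString(session, prompt, onPartial) {
  let text = '';
  for await (const chunk of session.promptStreaming(prompt)) {
    text += chunk;
    if (onPartial) onPartial(text); // e.g. update a DOM element
  }
  return text;
}
```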
With AbortSignal
const controller = new AbortController();
stopButton.onclick = () => controller.abort();
const result = await session.prompt('Write a poem', {
signal: controller.signal
});
Response Constraints (Structured Output)
Pass JSON Schema to get predictable JSON responses:
const schema = {
type: "object",
properties: {
hashtags: {
type: "array",
maxItems: 3,
items: {
type: "string",
pattern: "^#[^\\s#]+$"
}
}
},
required: ["hashtags"],
additionalProperties: false
};
const result = await session.prompt(
`Generate hashtags for: ${post}`,
{ responseConstraint: schema }
);
const data = JSON.parse(result); // Guaranteed valid JSON
Simple boolean example:
const schema = { "type": "boolean" };
const result = await session.prompt(
`Is this about pottery?\n\n${text}`,
{ responseConstraint: schema }
);
console.log(JSON.parse(result)); // true or false
Measure input quota usage:
const usage = await session.measureInputUsage(prompt, { responseConstraint: schema });
Omit schema from input quota:
const result = await session.prompt(
`Summarize as JSON { rating } with 0-5 number:`,
{
responseConstraint: schema,
omitResponseConstraintInput: true
}
);
Response Prefixes
Guide model output format by prefilling assistant response:
const result = await session.prompt([
{ role: 'user', content: 'Create a TOML character sheet' },
{ role: 'assistant', content: '```toml\n', prefix: true }
]);
// Model continues from "```toml\n"
append() Method
Add messages to session without immediate response (useful for multimodal):
await session.append([
{
role: 'user',
content: [
{ type: 'text', value: 'First context message' },
{ type: 'image', value: imageFile }
]
}
]);
// Later, prompt with accumulated context
const result = await session.prompt('Analyze the images');
The returned promise fulfills once the input has been validated and appended to the session.
Session Management
Check Quota
console.log(`${session.inputUsage}/${session.inputQuota}`);
const remaining = session.inputQuota - session.inputUsage;
When quota exceeded, oldest messages lost from context.
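Pairing `measureInputUsage()` with the quota fields lets you warn before context is silently evicted. A sketch (`promptWithQuotaCheck` is a hypothetical helper name):

```javascript
// Warn before a prompt would evict old context from the session.
// measureInputUsage() reports the tokens the input would consume.
async function promptWithQuotaCheck(session, prompt, options = {}) {
  const needed = await session.measureInputUsage(prompt, options);
  const remaining = session.inputQuota - session.inputUsage;
  if (needed > remaining) {
    console.warn(`Prompt needs ${needed} tokens but only ${remaining} remain; ` +
                 'oldest messages will be dropped from context.');
  }
  return session.prompt(prompt, options);
}
```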
Clone Session
Clones inherit parameters, initial prompts, and history:
const mainSession = await LanguageModel.create({
initialPrompts: [{ role: 'system', content: 'Speak like a pirate' }]
});
const clone1 = await mainSession.clone();
const clone2 = await mainSession.clone({ signal: controller.signal });
// Independent conversations with same setup
await clone1.prompt('Tell me a joke about parrots');
await clone2.prompt('Tell me a joke about treasure');
Use for: Parallel conversations, "what if" scenarios, resource efficiency
Restore Session from localStorage
let sessionData = getFromLocalStorage(uuid) || {
initialPrompts: [],
topK: (await LanguageModel.params()).defaultTopK,
temperature: (await LanguageModel.params()).defaultTemperature
};
const session = await LanguageModel.create(sessionData);
// Track conversation
const stream = session.promptStreaming(prompt);
let result = '';
for await (const chunk of stream) {
result += chunk; // each chunk is a delta, so accumulate
}
sessionData.initialPrompts.push(
{ role: 'user', content: prompt },
{ role: 'assistant', content: result }
);
localStorage.setItem(uuid, JSON.stringify(sessionData));
Destroy Session
session.destroy();
// Frees resources, aborts ongoing execution
// Session unusable after destroy
Best practice: Keep one empty session alive to keep model loaded. Destroy only when truly done.
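The keep-alive practice can be sketched as a lazy singleton that stays warm while real conversations are cloned from it (`getWarmSession` and `startConversation` are hypothetical helper names):

```javascript
// Lazily create one long-lived "warm" session so the model stays loaded;
// clone from it for each actual conversation.
let warmSession = null;

async function getWarmSession() {
  if (!warmSession) {
    warmSession = await LanguageModel.create();
  }
  return warmSession;
}

async function startConversation() {
  const warm = await getWarmSession();
  return warm.clone(); // independent history, model already in memory
}
```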
User Activation Requirement
Model download requires user interaction:
if (navigator.userActivation.isActive) {
const session = await LanguageModel.create();
}
Sticky activation events: click, tap, keydown, mousedown
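In practice this means deferring the first `create()` to a gesture handler. A sketch assuming a button element in the page (`wireUpActivation` is a hypothetical helper name):

```javascript
// Defer the first create() to a user gesture so the model download
// is permitted; onReady receives the new session.
function wireUpActivation(button, onReady) {
  button.addEventListener('click', async () => {
    if (navigator.userActivation.isActive) {
      const session = await LanguageModel.create();
      onReady(session);
    }
  });
}
```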
Permission Policy & iframes
Default: Top-level windows + same-origin iframes only
Grant cross-origin iframe access:
<iframe src="https://cross-origin.example.com/"
allow="language-model">
</iframe>
Not available in Web Workers (permission policy complexity)
Local Development (localhost)
- Go to chrome://flags/#prompt-api-for-gemini-nano-multimodal-input
- Select Enabled
- Relaunch Chrome
- Open DevTools and run: await LanguageModel.availability();
- It should return "available"
Troubleshoot:
- Restart Chrome
- Check chrome://on-device-internals → Model Status tab
- Wait for model download to complete
Chrome Extensions
Available in Chrome 138+ stable for extensions.
Remove expired origin trial permissions:
{
"permissions": ["aiLanguageModelOriginTrial"] // REMOVE THIS
}
Register for current origin trial if needed.
Best Practices
Session management:
- Clone sessions to preserve initial prompts
- Use AbortController to let users stop responses
- Track inputUsage to warn before quota exceeded
- Destroy unused sessions to free memory
- Keep one empty session alive to keep model ready
Performance:
- Use streaming for long responses
- append() messages in advance for multimodal
- Monitor download progress, inform users
- Check availability before creating sessions
Structured output:
- Use JSON Schema for predictable responses
- LLMs good at generating schemas from descriptions
- Validate with JSON Schema validators
- Use prefix for format guidance
Context preservation:
- Store conversation in localStorage for restoration
- Use initialPrompts for session context
- Clone for parallel conversations with same context
Error Handling
try {
const session = await LanguageModel.create();
const result = await session.prompt('Hello');
} catch (err) {
if (err.name === 'AbortError') {
// User stopped generation
} else if (err.name === 'NotSupportedError') {
// Unsupported modality/language
} else {
console.error(err.name, err.message);
}
}
Key Limitations
- Desktop only (no mobile)
- 22 GB storage required
- Unmetered network for download
- No Web Workers support
- Output is text-only (input can be multimodal)
- Context window has token limits
- Model removed if storage <10 GB after download
Use Cases
- AI-powered search on page content
- Content classification/filtering
- Event extraction from pages
- Contact information extraction
- Chatbots with conversation memory
- Offline AI features
- Privacy-sensitive data processing
- Content summarization
- Translation assistance
- Image/audio analysis (multimodal)