Prompt API
A W3C proposal enabling web developers to access browser-provided or OS-provided language models directly via a JavaScript API. Supports prompt-completion workflows with stateful sessions, streaming, structured outputs, and multimodal inputs — all running on-device with no cloud dependency.
Spec repo: https://github.com/webmachinelearning/prompt-api
Part of the WebMachineLearning ecosystem.
Status
Experimental in Chrome and Microsoft Edge (2026). Under active W3C development.
Core Capabilities
| Feature | Description |
|---|---|
| Session management | Stateful conversations across multiple prompts |
| Streaming | promptStreaming() for real-time token output |
| System prompts | Configurable system context per session |
| Multimodal inputs | Text, images, and audio |
| Structured outputs | JSON Schema or regex constraints on model output |
| Tool calling | Invoke JavaScript functions from model decisions |
| Token tracking | Monitor context usage and handle overflow |
| Abort signal | Cancel in-flight requests |
| Availability check | Detect whether a model is available before using |
| Download progress | Monitor model download state |
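The availability check and download progress rows can be combined into one setup step. The following is a minimal sketch, assuming the global `LanguageModel` entry point, its `availability()` and `create({ monitor })` methods, and the `downloadprogress` event described in the spec repo's explainer. The helper name `createSessionWithProgress` and the injectable `lm` parameter are hypothetical, added only so the logic can be exercised outside a browser.

```javascript
// Sketch: check availability, then create a session while reporting download progress.
// `lm` defaults to the browser global; injecting it is an assumption made for testability.
async function createSessionWithProgress(onProgress, lm = globalThis.LanguageModel) {
  if (!lm) {
    return null; // The API is not exposed in this environment.
  }

  // Per the explainer, the result is one of:
  // "unavailable", "downloadable", "downloading", "available".
  const availability = await lm.availability();
  if (availability === "unavailable") {
    return null; // This device or browser cannot run the model.
  }

  // create() triggers the model download if needed; the monitor callback
  // receives an event target that emits "downloadprogress" events.
  return lm.create({
    monitor(m) {
      // `loaded` is assumed to be a fraction between 0 and 1.
      m.addEventListener("downloadprogress", (e) => onProgress(e.loaded));
    },
  });
}
```

A caller might pass `(fraction) => console.log(Math.round(fraction * 100) + "%")` as `onProgress` and fall back to a server-side model when the function returns `null`.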
Key Design Principles
- Local-first: model inference happens on device; no data transmitted to external servers
- Privacy by default: prompts and responses never leave the machine
- Standardized contract: one API across browsers (once standardized)
- Model abstraction: developers don't manage model weights directly
Usage Pattern
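```js
// Create a session; the system prompt is passed as the first initial prompt.
// (Earlier experimental builds exposed this as ai.languageModel.create({ systemPrompt }).)
const session = await LanguageModel.create({
  initialPrompts: [{ role: "system", content: "You are a helpful assistant." }]
});

// Stream the response chunk by chunk as the model generates it.
const stream = session.promptStreaming("Summarize this text: ...");
for await (const chunk of stream) {
  console.log(chunk);
}
```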
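Structured outputs and the abort signal fit the same session object. Here is a minimal sketch, assuming `session.prompt()` accepts `responseConstraint` (a JSON Schema object) and `signal` options as described in the spec repo's explainer. The function name `classifySentiment`, the schema, and the 10-second default timeout are illustrative assumptions, not part of the API.

```javascript
// Sketch: constrain the reply to a JSON Schema and cancel the request after a timeout.
// `session` is assumed to be a Prompt API session like the one created above.
async function classifySentiment(session, text, timeoutMs = 10000) {
  // Hypothetical schema: the model must return {"sentiment": "..."}.
  const schema = {
    type: "object",
    properties: {
      sentiment: { type: "string", enum: ["positive", "neutral", "negative"] },
    },
    required: ["sentiment"],
  };

  // Abort the in-flight request if it takes longer than timeoutMs.
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);

  try {
    const reply = await session.prompt(
      `Classify the sentiment of this text: ${text}`,
      { responseConstraint: schema, signal: controller.signal }
    );
    // The reply is a string that should parse as JSON matching the schema.
    return JSON.parse(reply).sentiment;
  } finally {
    clearTimeout(timer); // Avoid a dangling timer once the call settles.
  }
}
```

Because the constraint is enforced during generation, the `JSON.parse` call should not need defensive cleanup of the reply, though wrapping it in a `try`/`catch` is still reasonable while the API is experimental.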
Relationship to Sibling APIs
- WebNN API: lower-level neural network operations; the Prompt API operates at a higher level of abstraction
- Writing Assistance APIs: higher-level task-specific wrappers (Summarizer, Writer, Rewriter)
Related
- WebMachineLearning
- Large Language Models (LLMs)
- AI Inference
- AI Privacy
- WebNN API
- Writing Assistance APIs
- Browser-Provided Language Models
- On-Device Machine Learning
- Gemini Nano
- LLM Tool Calling
- LLM Structured Outputs
- LLM Streaming
- Edge AI
About Sébastien
I'm Sébastien Dubois, and I'm on a mission to help knowledge workers escape information overload. After 20+ years in IT and seeing too many brilliant minds drowning in digital chaos, I've decided to help people build systems that actually work. Through the Knowii Community, my courses, products & services and my Website/Newsletter, I share practical and battle-tested systems.