Workers AI provider for the Vercel AI SDK. Run Workers AI models, and route any catalog model through AI Gateway with unified billing, server-side fallback, and resumable streaming (coming soon).
- Package README (quickstart): packages/workers-ai-provider
- Concepts: gateway routing · resume (coming soon) · binding vs REST
🚧 Resumable streaming is coming soon. It is not generally available yet because the AI Gateway resume backend is still rolling out. The
resume / onResumeExpired options below are in place so you can adopt them early, but treat resume as experimental until the rollout completes.
import { createWorkersAI } from "workers-ai-provider";
import { generateText } from "ai";

const workersai = createWorkersAI({ binding: env.AI });

const { text } = await generateText({
  model: workersai("@cf/zai-org/glm-5.2"),
  prompt: "Write a haiku about Cloudflare Workers.",
});

Supports chat (tool calling, reasoning), image generation, embeddings,
transcription, text-to-speech, and reranking. Works with a binding or with
accountId + apiKey (REST).
Wrap any @ai-sdk/* provider so its requests are routed through your gateway
(host-stripped to the gateway-native endpoint, with cf-aig-* headers applied).
Useful for provider-native or non-chat providers the slug delegate can't
auto-wire, or when you want full control over the model instance.
import { createGatewayProvider } from "workers-ai-provider";
import { createOpenAI } from "@ai-sdk/openai";
import { generateText } from "ai";

const openai = createGatewayProvider(createOpenAI, {
  binding: env.AI,
  gateway: "my-gateway",
  byok: true, // forward the provider key; omit to use unified billing / a stored key
  apiKey: env.OPENAI_API_KEY,
});

const { text } = await generateText({ model: openai("gpt-5"), prompt: "Hello" });

The provider id is detected from the request URL. Outside a Worker, the lower-level
createGatewayFetch from workers-ai-provider/gateway does the same with explicit
credentials.
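
To make "host-stripped to the gateway-native endpoint" concrete, here is a minimal sketch of URL-based provider detection and rewriting. It is not the package's implementation: `RULES`, `rewriteForGateway`, the per-provider `stripPrefix` values, and the gateway URL layout are all assumptions made for illustration, and the real registry may cover more hosts and apply different path rules.

```typescript
// Hypothetical sketch: detect the provider from the upstream host, then
// rebuild the request URL against an assumed gateway endpoint layout.
interface ProviderRule {
  id: string; // provider id used in the gateway path
  stripPrefix: string; // assumed upstream path prefix dropped by the gateway
}

// Assumed host table; the package's shared registry may differ.
const RULES: Record<string, ProviderRule> = {
  "api.openai.com": { id: "openai", stripPrefix: "/v1" },
  "api.anthropic.com": { id: "anthropic", stripPrefix: "" },
};

function rewriteForGateway(
  upstreamUrl: string,
  accountId: string,
  gatewayId: string,
): { provider: string; url: string } {
  const parsed = new URL(upstreamUrl);
  const rule = RULES[parsed.hostname];
  if (!rule) {
    throw new Error(`No gateway provider known for host ${parsed.hostname}`);
  }
  // Drop the upstream host, and the provider-specific prefix when present.
  const path = parsed.pathname.startsWith(rule.stripPrefix)
    ? parsed.pathname.slice(rule.stripPrefix.length)
    : parsed.pathname;
  const base = `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}`;
  return { provider: rule.id, url: `${base}/${rule.id}${path}${parsed.search}` };
}
```

With a binding the package may dispatch through the binding rather than a public URL, so treat the URL shape here as a stand-in for the routing idea only.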
Configure createWorkersAI with provider plugins, then pass a vendor/model
slug (instead of a @cf/... id) to route it through AI Gateway with
capability-driven transport selection (run path vs gateway path), resume,
caching, and fallback.
import { createWorkersAI } from "workers-ai-provider";
import { openai } from "workers-ai-provider/openai";
import { anthropic } from "workers-ai-provider/anthropic";

const gatewayAi = createWorkersAI({
  binding: env.AI,
  gateway: { id: "my-gateway" },
  providers: [openai, anthropic],
});

// Run path + resume (default for unified-catalog providers):
const model = gatewayAi("openai/gpt-5");

// Cross-vendor server-side fallback (one gateway run, cf-aig-step picks winner):
const resilient = gatewayAi("openai/gpt-5", {
  fallback: { mode: "server", models: ["anthropic/claude-sonnet-4-5"] },
});

Provider plugins are imported from sub-paths so the @ai-sdk/* packages stay
optional peers: workers-ai-provider/openai, /anthropic, /google. One plugin
covers a whole wire format — the openai plugin serves the entire
OpenAI-compatible long tail. A @cf/... id on the same provider still routes to
Workers AI directly, so one provider instance handles both.
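
The split between @cf/... ids and vendor/model slugs can be pictured as a small classifier. This is a sketch under assumptions, not the package's code: `Route` and `classifyModelId` are hypothetical names, and the real provider presumably also checks that the vendor is covered by a configured plugin.

```typescript
// Hypothetical sketch: decide where a model id is routed, based only on its shape.
type Route =
  | { kind: "workers-ai"; model: string }
  | { kind: "gateway"; vendor: string; model: string };

function classifyModelId(id: string): Route {
  // "@cf/..." ids go straight to Workers AI.
  if (id.startsWith("@cf/")) {
    return { kind: "workers-ai", model: id };
  }
  // Anything else is expected to be a "vendor/model" catalog slug.
  const slash = id.indexOf("/");
  if (slash <= 0 || slash === id.length - 1) {
    throw new Error(`Expected "@cf/..." or "vendor/model", got "${id}"`);
  }
  return { kind: "gateway", vendor: id.slice(0, slash), model: id.slice(slash + 1) };
}
```

For example, classifyModelId("openai/gpt-5") yields a gateway route with vendor "openai", while classifyModelId("@cf/zai-org/glm-5.2") yields a Workers AI route.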
| Option | Effect |
|---|---|
| resume | (Coming soon) Resumable streaming on the run path (default true). |
| onResumeExpired | (Coming soon) "error" (default) or "accept-partial" on resume-buffer eviction. |
| fallback | { mode: "client" \| "server", models: string[] }. client keeps resume per leg; server uses the gateway path. |
| cacheTtl / skipCache | Gateway response caching (forces the gateway path). |
| transport | Force "run" or "gateway". |
| gateway | Override the delegate's gateway for this call (id or full options). |
| metadata | Custom gateway-log metadata (merged over gateway.metadata; bigint coerced to string). |
| collectLog | Force gateway log collection on/off. |
| byok | Forward the upstream provider key (BYOK) instead of stripping it. |
| extraHeaders | Extra request headers. |
| onDispatch | Called once per dispatch with the resolved transport + gateway headers (incl. runId, cfStep, cacheStatus, logId). |
| onProgress | Run path: fired with the cumulative SSE event offset as resume advances. |
See gateway routing for how transport,
fallback, and cacheTtl interact with transport selection.
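
As a rough mental model of the interaction described in the table, transport selection could be sketched as below. `SketchOptions` and `selectTransport` are hypothetical names, the local `Transport` type only mirrors the two values named above, and the precedence order (explicit transport first, then caching, then server fallback) is an assumption; the gateway routing doc is the authority on the real rules.

```typescript
// Hypothetical sketch of capability-driven transport selection.
type Transport = "run" | "gateway";

interface SketchOptions {
  transport?: Transport;
  cacheTtl?: number;
  skipCache?: boolean;
  fallback?: { mode: "client" | "server"; models: string[] };
}

function selectTransport(options: SketchOptions = {}): Transport {
  // Assumption: an explicit transport always wins.
  if (options.transport !== undefined) return options.transport;
  // Caching options force the gateway path.
  if (options.cacheTtl !== undefined || options.skipCache === true) return "gateway";
  // Server-side fallback uses the gateway path; client fallback stays on the run path.
  if (options.fallback?.mode === "server") return "gateway";
  // Default for unified-catalog providers: run path (with resume).
  return "run";
}
```

Under these assumptions, a call with only a client-mode fallback stays on the run path and keeps resume, while adding cacheTtl moves it to the gateway path.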
A single fetch handler that streams a resilient, resumable response. Workers AI
models and catalog slugs share the same provider instance.
import { createWorkersAI } from "workers-ai-provider";
import { openai } from "workers-ai-provider/openai";
import { anthropic } from "workers-ai-provider/anthropic";
import { streamText } from "ai";

export default {
  async fetch(req: Request, env: { AI: Ai }) {
    const workersai = createWorkersAI({
      binding: env.AI,
      providers: [openai, anthropic],
      gateway: { id: "my-gateway" },
    });

    const { prompt } = (await req.json()) as { prompt: string };

    const result = streamText({
      // Run path + resume by default; client-side fallback keeps resume per leg.
      model: workersai("openai/gpt-5", {
        fallback: { mode: "client", models: ["anthropic/claude-sonnet-4-5"] },
        onResumeExpired: "accept-partial",
      }),
      prompt,
    });

    return result.toTextStreamResponse();
  },
};

- createWorkersAI (catalog slug routing is enabled by passing providers), createGatewayProvider
- GATEWAY_PROVIDERS, findProviderBySlug, detectProviderByUrl, wireableProviders (the shared registry)
- createResumableStream, GatewayDelegateError, WorkersAIGatewayError, WorkersAIFallbackError, createClientFallbackModel
- DelegateCallOptions, ProviderPlugin, Transport, WireFormat (types)
- Provider plugins from workers-ai-provider/{openai,anthropic,google}
The shared core is bundled into each consumer, so shared error classes are
distinct per bundle — cross-package instanceof would fail. Match shared errors
by .name / a kind discriminant rather than instanceof. Catching the error
class exported by this package (single bundle) works fine.
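
The instanceof caveat can be reproduced without the package at all. In this sketch, `makeBundle` is a hypothetical stand-in for two separately bundled copies of the shared core, and `isNamedError` is an illustrative helper, not an export of this package; only the error name WorkersAIGatewayError comes from the list above.

```typescript
// Hypothetical sketch: each call simulates one bundle carrying its own copy
// of the "same" error class.
function makeBundle() {
  class WorkersAIGatewayError extends Error {
    constructor(message: string) {
      super(message);
      this.name = "WorkersAIGatewayError";
    }
  }
  return { WorkersAIGatewayError };
}

const bundleA = makeBundle();
const bundleB = makeBundle();

// Match by name instead of by class identity.
function isNamedError(err: unknown, name: string): err is Error {
  return err instanceof Error && err.name === name;
}

const thrownByA: unknown = new bundleA.WorkersAIGatewayError("gateway rejected the request");

// Class identity differs across bundles, so this check is false.
const crossBundleInstanceof = thrownByA instanceof bundleB.WorkersAIGatewayError;

// The name survives bundling, so this check is true.
const matchedByName = isNamedError(thrownByA, "WorkersAIGatewayError");
```

A kind discriminant, where the error carries one, can be matched the same way as the name.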