Think

@cloudflare/think lets you build a stateful AI chat agent — one that streams replies, remembers the conversation, and calls tools — by extending a single base class. You provide a model with getModel(), and Think wires up the rest of the chat lifecycle for you: the agentic loop (the model calls tools, reads the results, and keeps going until it has an answer), message persistence, streaming, client tools, stream resumption, and extensions — all backed by Durable Object SQLite.
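The agentic loop described above can be sketched in a few lines. This is a minimal illustration, not Think's implementation: the types, the `runAgenticLoop` function, the `maxSteps` budget, and the synchronous stub model are all assumptions made so the example runs without a network or a real model.

```typescript
// Hypothetical types for illustration; these are not Think's real interfaces.
type ToolCall = { tool: string; input: string };
type ModelReply =
  | { kind: "tool-call"; call: ToolCall }
  | { kind: "answer"; text: string };
type LoopTranscript = string[];
type StubModel = (transcript: LoopTranscript) => ModelReply;
type ToolSet = Record<string, (input: string) => string>;

// Ask the model, run any tool it requests, feed the result back, and stop
// when the model answers or the step budget is exhausted.
function runAgenticLoop(
  model: StubModel,
  tools: ToolSet,
  prompt: string,
  maxSteps = 5,
): { answer: string | null; transcript: LoopTranscript } {
  const transcript: LoopTranscript = [`user: ${prompt}`];
  for (let step = 0; step < maxSteps; step++) {
    const reply = model(transcript);
    if (reply.kind === "answer") {
      transcript.push(`assistant: ${reply.text}`);
      return { answer: reply.text, transcript };
    }
    const tool = tools[reply.call.tool];
    const result = tool
      ? tool(reply.call.input)
      : `unknown tool ${reply.call.tool}`;
    transcript.push(`tool(${reply.call.tool}): ${result}`);
  }
  return { answer: null, transcript };
}

// Stub model: requests the "add" tool once, then answers with its result.
const stubModel: StubModel = (transcript) => {
  const prefix = "tool(add): ";
  const last = transcript[transcript.length - 1] ?? "";
  if (last.startsWith(prefix)) {
    return { kind: "answer", text: `The sum is ${last.slice(prefix.length)}` };
  }
  return { kind: "tool-call", call: { tool: "add", input: "2,3" } };
};

const stubTools: ToolSet = {
  add: (input) =>
    String(input.split(",").map(Number).reduce((a, b) => a + b, 0)),
};

const loopResult = runAgenticLoop(stubModel, stubTools, "What is 2 + 3?");
console.log(loopResult.answer); // prints "The sum is 5"
```

In Think you do not write this loop yourself; the sketch only shows the shape of the work that the default chat lifecycle takes on.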

Think works as both a top-level agent (WebSocket chat to browser clients via useAgentChat) and a sub-agent (a child agent that another agent drives over RPC via chat()).

Quick start

Install

npm install @cloudflare/think @cloudflare/ai-chat agents ai @cloudflare/shell zod workers-ai-provider

import { Think } from "@cloudflare/think";
import { createWorkersAI } from "workers-ai-provider";
import { routeAgentRequest } from "agents";

export class MyAgent extends Think {
  getModel() {
    return createWorkersAI({ binding: this.env.AI })(
      "@cf/moonshotai/kimi-k2.6",
    );
  }
}

export default {
  async fetch(request, env) {
    return (
      (await routeAgentRequest(request, env)) ||
      new Response("Not found", { status: 404 })
    );
  },
};

import { Think } from "@cloudflare/think";
import { createWorkersAI } from "workers-ai-provider";
import { routeAgentRequest } from "agents";

export class MyAgent extends Think<Env> {
  getModel() {
    return createWorkersAI({ binding: this.env.AI })(
      "@cf/moonshotai/kimi-k2.6",
    );
  }
}

export default {
  async fetch(request: Request, env: Env) {
    return (
      (await routeAgentRequest(request, env)) ||
      new Response("Not found", { status: 404 })
    );
  },
} satisfies ExportedHandler<Env>;

That is it. Think handles the WebSocket chat protocol, message persistence, the agentic loop, message sanitization, stream resumption, client tool support, and workspace file tools.

Client

JavaScript
TypeScript

import { useAgent } from "agents/react";
import { useAgentChat } from "@cloudflare/ai-chat/react";

function Chat() {
  const agent = useAgent({ agent: "MyAgent" });
  const { messages, sendMessage, status } = useAgentChat({ agent });

  return (
    <div>
      {messages.map((msg) => (
        <div key={msg.id}>
          <strong>{msg.role}:</strong>
          {msg.parts.map((part, i) =>
            part.type === "text" ? <span key={i}>{part.text}</span> : null,
          )}
        </div>
      ))}

      <form
        onSubmit={(e) => {
          e.preventDefault();
          const input = e.currentTarget.elements.namedItem("input");
          sendMessage({ text: input.value });
          input.value = "";
        }}
      >
        <input name="input" placeholder="Send a message..." />
        <button type="submit">Send</button>
      </form>
    </div>
  );
}

import { useAgent } from "agents/react";
import { useAgentChat } from "@cloudflare/ai-chat/react";

function Chat() {
  const agent = useAgent({ agent: "MyAgent" });
  const { messages, sendMessage, status } = useAgentChat({ agent });

  return (
    <div>
      {messages.map((msg) => (
        <div key={msg.id}>
          <strong>{msg.role}:</strong>
          {msg.parts.map((part, i) =>
            part.type === "text" ? <span key={i}>{part.text}</span> : null,
          )}
        </div>
      ))}

      <form
        onSubmit={(e) => {
          e.preventDefault();
          const input = e.currentTarget.elements.namedItem(
            "input",
          ) as HTMLInputElement;
          sendMessage({ text: input.value });
          input.value = "";
        }}
      >
        <input name="input" placeholder="Send a message..." />
        <button type="submit">Send</button>
      </form>
    </div>
  );
}

Configuration

wrangler.jsonc
wrangler.toml

{
  "$schema": "./node_modules/wrangler/config-schema.json",
  // Set this to today's date
  "compatibility_date": "2026-06-27",
  "compatibility_flags": [
    "nodejs_compat"
  ],
  "ai": {
    "binding": "AI"
  },
  "durable_objects": {
    "bindings": [
      {
        "class_name": "MyAgent",
        "name": "MyAgent"
      }
    ]
  },
  "migrations": [
    {
      "new_sqlite_classes": [
        "MyAgent"
      ],
      "tag": "v1"
    }
  ]
}

# Set this to today's date
compatibility_date = "2026-06-27"
compatibility_flags = ["nodejs_compat"]

[ai]
binding = "AI"

[[durable_objects.bindings]]
class_name = "MyAgent"
name = "MyAgent"

[[migrations]]
new_sqlite_classes = ["MyAgent"]
tag = "v1"

Think vs AIChatAgent

Both Think and AIChatAgent extend Agent and speak the same cf_agent_chat_* WebSocket protocol. They serve different goals.

AIChatAgent is a protocol adapter. You override onChatMessage and are responsible for calling streamText, wiring tools, converting messages, and returning a Response. AIChatAgent handles the plumbing — message persistence, streaming, abort, resume — but the LLM call is entirely your concern.

Think is an opinionated framework. It makes decisions for you: getModel() returns the model, getSystemPrompt() or configureSession() sets the prompt, getTools() returns tools. The default onChatMessage runs the complete agentic loop. You override individual pieces, not the whole pipeline.

Concern	AIChatAgent	Think
Minimal subclass	~15 lines (wire `streamText` + tools + system prompt + response)	3 lines (`getModel()` only)
Storage	Flat SQL table	Session: tree-structured messages, context blocks, compaction, FTS5
Regeneration	Destructive (old response deleted)	Non-destructive branching (old responses preserved)
Context management	Manual	Context blocks with LLM-writable persistent memory
Sub-agent RPC	Not built in	`chat()` with `StreamCallback`
Programmatic turns	`saveMessages()`	`saveMessages()`, `submitMessages()`, `continueLastTurn()`
Compaction	`maxPersistedMessages` (deletes oldest)	Non-destructive summaries via overlays
Search	Not available	FTS5 full-text search per-session and cross-session
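To make the "Regeneration" row concrete, the sketch below models non-destructive branching over tree-structured messages: a regenerated response is stored as a sibling of the old one rather than replacing it. The `MessageTree` class and its node shape are hypothetical and only illustrate the idea; Think's actual Session storage is not shown on this page.

```typescript
// Hypothetical node shape, assumed for illustration only.
type MessageNode = {
  id: string;
  parentId: string | null;
  role: "user" | "assistant";
  text: string;
};
type NewMessage = Omit<MessageNode, "parentId">;

class MessageTree {
  private nodes = new Map<string, MessageNode>();
  private leafId: string | null = null;

  // Append under the given parent (default: the current leaf) and make the
  // new node the active leaf.
  append(node: NewMessage, parentId: string | null = this.leafId): void {
    this.nodes.set(node.id, { ...node, parentId });
    this.leafId = node.id;
  }

  // Regenerate: add a sibling of the old response instead of deleting it.
  regenerate(oldId: string, replacement: NewMessage): void {
    const old = this.nodes.get(oldId);
    if (!old) throw new Error(`unknown message ${oldId}`);
    this.append(replacement, old.parentId);
  }

  // Walk from the active leaf back to the root to get the visible path.
  activePath(): MessageNode[] {
    const path: MessageNode[] = [];
    let cursor = this.leafId;
    while (cursor !== null) {
      const node = this.nodes.get(cursor);
      if (!node) break;
      path.unshift(node);
      cursor = node.parentId;
    }
    return path;
  }

  has(id: string): boolean {
    return this.nodes.has(id);
  }
}

const tree = new MessageTree();
tree.append({ id: "u1", role: "user", text: "Hi" });
tree.append({ id: "a1", role: "assistant", text: "Hello!" });
tree.regenerate("a1", { id: "a2", role: "assistant", text: "Hey there!" });

const activeIds = tree.activePath().map((node) => node.id);
console.log(activeIds.join(" -> ")); // prints "u1 -> a2"
console.log(tree.has("a1")); // prints "true": the old response is preserved
```

A flat table that overwrites the old row would lose `a1`; keeping a parent pointer per message is what allows both branches to coexist.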

When to use AIChatAgent

You need full control over the LLM call (RAG, multi-model, custom streaming)
You want the Response return type for HTTP middleware or testing
You are building a simple chatbot with no memory requirements

When to use Think

You want to ship fast (3-line subclass with everything wired)
You need persistent memory (context blocks the model can read and write)
You need long conversations (non-destructive compaction)
You need conversation search (FTS5)
You are building a sub-agent system (parent-child RPC with streaming)
You need proactive agents (programmatic turns from scheduled tasks or webhooks)
You need durable async submission for webhook or RPC callers

Choose a turn API

Think has several ways to start or continue a turn. They all funnel through one public entry point — runTurn(options) — and the older methods remain as convenience shortcuts.

runTurn()

runTurn() is the unified turn-admission API. One method, three modes, selected by options.mode:

Mode	Use when	Returns	Shortcut for
`"wait"` (default)	The caller can block until the model response is finished	`Promise<TurnResult>`	`saveMessages()`
`"submit"`	The caller needs fast, durable acceptance and a later status	`Promise<SubmitMessagesResult>`	`submitMessages()`
`"stream"`	The caller wants the response streamed to a callback (RPC)	`Promise<void>`	`chat()`

The input accepts a string, a UIMessage, an array of messages, or — in wait and stream modes — a function (current) => UIMessage[] evaluated at admission. (submit does not accept function input.)
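One way to picture these input forms is a normalizer that turns each of them into an array of messages. The sketch below is an assumption-laden illustration, not Think's code: `UIMessageLike`, `normalizeInput`, and the injected `newId` helper are hypothetical, and treating a bare string as a single user text message is assumed rather than stated on this page.

```typescript
// Simplified stand-in for a UIMessage; only text parts are modeled here.
type TextPart = { type: "text"; text: string };
type UIMessageLike = {
  id: string;
  role: "user" | "assistant" | "system";
  parts: TextPart[];
};
type TurnInput =
  | string
  | UIMessageLike
  | UIMessageLike[]
  | ((current: UIMessageLike[]) => UIMessageLike[]);
type TurnMode = "wait" | "submit" | "stream";

// Normalize every accepted input form to an array of messages.
function normalizeInput(
  input: TurnInput,
  mode: TurnMode,
  current: UIMessageLike[],
  newId: () => string,
): UIMessageLike[] {
  if (typeof input === "function") {
    // Function input is evaluated at admission and is rejected in submit mode.
    if (mode === "submit") {
      throw new TypeError("submit mode does not accept function input");
    }
    return input(current);
  }
  if (typeof input === "string") {
    // Assumption: a bare string becomes one user text message.
    return [{ id: newId(), role: "user", parts: [{ type: "text", text: input }] }];
  }
  return Array.isArray(input) ? input : [input];
}

let nextId = 0;
const makeId = () => `msg-${++nextId}`;

const fromString = normalizeInput("Summarize the thread", "wait", [], makeId);
const fromFunction = normalizeInput(
  (current) => [
    {
      id: makeId(),
      role: "user",
      parts: [{ type: "text", text: `Seen ${current.length} messages` }],
    },
  ],
  "stream",
  fromString,
  makeId,
);
console.log(fromString[0].parts[0].text); // prints "Summarize the thread"
console.log(fromFunction[0].parts[0].text); // prints "Seen 1 messages"
```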

JavaScript
TypeScript

export class Assistant extends Think {
  async examples(inboundEventId) {
    // wait — block for the result
    const result = await this.runTurn({ input: "Summarize the latest thread" });
    if (result.status === "completed") {
      // result.message is the assistant message; result.continuation is false
    }

    // submit — durable acceptance, check status later
    const submission = await this.runTurn({
      mode: "submit",
      input: "Process this webhook",
      idempotencyKey: inboundEventId, // dedupe; safe to retry
    });
    // submission.accepted is true on first accept; submission.status is "pending"

    // stream — drive a callback (the same surface as chat())
    await this.runTurn({
      mode: "stream",
      input: "Stream me",
      callback: {
        onStart({ requestId }) {},
        onEvent(json) {}, // UIMessageChunk JSON
        onDone() {},
        onError(error) {},
      },
    });

    // continuation — continue the last assistant turn instead of sending input
    await this.runTurn({ continuation: true });
  }
}

export class Assistant extends Think<Env> {
  async examples(inboundEventId: string) {
    // wait — block for the result
    const result = await this.runTurn({ input: "Summarize the latest thread" });
    if (result.status === "completed") {
      // result.message is the assistant message; result.continuation is false
    }

    // submit — durable acceptance, check status later
    const submission = await this.runTurn({
      mode: "submit",
      input: "Process this webhook",
      idempotencyKey: inboundEventId, // dedupe; safe to retry
    });
    // submission.accepted is true on first accept; submission.status is "pending"

    // stream — drive a callback (the same surface as chat())
    await this.runTurn({
      mode: "stream",
      input: "Stream me",
      callback: {
        onStart({ requestId }) {},
        onEvent(json) {}, // UIMessageChunk JSON
        onDone() {},
        onError(error) {},
      },
    });

    // continuation — continue the last assistant turn instead of sending input
    await this.runTurn({ continuation: true });
  }
}

Key behaviors:

Blocking modes cannot nest. Calling wait/stream/continuation (or the equivalent shortcut) from inside an active turn — for example, from a tool's execute — throws, because it would deadlock the turn queue. From inside a turn, use runTurn({ mode: "submit" }) (durable, runs after the current turn frees the queue) or addMessages() (transcript only, no inference).
submit is idempotent. Pass submissionId and/or idempotencyKey; re-submitting a known key returns the existing record with accepted: false instead of starting a second turn. See Programmatic submissions.
Recovery-safe. When chatRecovery is enabled, the wait, stream, and drained submit paths all run inference inside a recovery fiber, so an interrupted turn resumes after eviction.
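The idempotency rule for submit can be illustrated with a small in-memory model. This is a sketch under stated assumptions: `SubmissionLog` and its record shape are hypothetical, and Think persists submissions durably rather than in a `Map`. The only behavior it demonstrates is the one described above, where a repeated key returns the existing record with `accepted: false`.

```typescript
// Hypothetical record shapes, assumed for illustration only.
type SubmissionStatus = "pending" | "running" | "completed";
type SubmissionRecord = {
  submissionId: string;
  status: SubmissionStatus;
  input: string;
};
type SubmissionResult = SubmissionRecord & { accepted: boolean };

class SubmissionLog {
  private byKey = new Map<string, SubmissionRecord>();
  private counter = 0;

  submit(input: string, idempotencyKey: string): SubmissionResult {
    const existing = this.byKey.get(idempotencyKey);
    if (existing) {
      // Known key: return the stored record and do not queue a second turn.
      return { ...existing, accepted: false };
    }
    const record: SubmissionRecord = {
      submissionId: `sub-${++this.counter}`,
      status: "pending",
      input,
    };
    this.byKey.set(idempotencyKey, record);
    return { ...record, accepted: true };
  }

  get size(): number {
    return this.byKey.size;
  }
}

const log = new SubmissionLog();
const first = log.submit("Process this webhook", "event-123");
const retry = log.submit("Process this webhook", "event-123");

console.log(first.accepted, first.status); // prints "true pending"
console.log(retry.accepted, retry.submissionId === first.submissionId); // prints "false true"
```

Because the retry resolves to the same record, a caller that timed out can resubmit with the same key without risking a duplicate turn.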

runTurn() ships with exported option and result types: RunTurnOptions, RunTurnWait, RunTurnSubmit, RunTurnStream, TurnInputMessages, and TurnResult.

Pick a shortcut

The table below maps each scenario to the most direct call. The shortcuts keep their existing signatures; reach for one when you want the narrower surface, or use runTurn() when you want one mental model.

Use case	API
A browser user sends chat messages	`useAgentChat` over the WebSocket chat protocol
Server code can wait for the model response	`saveMessages()`
Server code needs fast durable acceptance and later status	`submitMessages()`
Code should create recurring prompt-driven turns or handlers	`getScheduledTasks()`
Parent code needs direct streaming RPC to a specific child	`subAgent(...).chat()`
A parent delegates work to a retained child agent	`agentTool()` or `runAgentTool()`
Surround a turn with idempotent app-owned side effects	`startFiber()`
Coordinate multi-step durable orchestration	Workflows
Add context or messages without starting a model turn	`addMessages()`
Advanced subclass or recovery code continues an assistant turn	`continueLastTurn()`

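Use saveMessages() when the caller owns the trigger and can wait for the turn to finish. Use submitMessages() when the caller may time out before the turn finishes and a blind retry could start a duplicate turn; the idempotent submission record makes the retry safe.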

Add messages without a turn

Use addMessages() to write to the transcript without starting a model turn — for importing prior history or injecting background context the next turn should see:

JavaScript
TypeScript

export class Assistant extends Think {
  async importContext() {
    await this.addMessages([
      {
        id: crypto.randomUUID(),
        role: "user",
        parts: [{ type: "text", text: "Imported context" }],
      },
    ]);
  }
}

export class Assistant extends Think<Env> {
  async importContext() {
    await this.addMessages([
      {
        id: crypto.randomUUID(),
        role: "user",
        parts: [{ type: "text", text: "Imported context" }],
      },
    ]);
  }
}

addMessages() appends (or upserts) into the Session tree:

It does not run inference and does not enter the turn queue, so it is safe to call from inside a tool's execute without deadlocking.
Array entries are appended linearly (each attaches under the previous one), so imported history stays a single path. By default the first message attaches to the latest committed leaf; pass parentId to attach elsewhere, or null for a root message.
Appends are idempotent by message id. Pass { mode: "upsert" } to update an existing message in place instead.
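The three rules above (linear append, idempotency by id, and upsert) can be modeled in memory as follows. This is a hedged sketch rather than Think's implementation: `SketchTranscript`, `SketchMessage`, and the option shape are hypothetical, and two details are assumptions, namely that an upsert keeps the message's original parent and that later array entries chain under an already-known message.

```typescript
// Hypothetical shapes, assumed for illustration only.
type SketchMessage = { id: string; role: "user" | "assistant"; text: string };
type StoredMessage = SketchMessage & { parentId: string | null };
type AddOptions = { mode?: "append" | "upsert"; parentId?: string | null };

class SketchTranscript {
  readonly byId = new Map<string, StoredMessage>();
  latestLeaf: string | null = null;

  addMessages(messages: SketchMessage[], options: AddOptions = {}): void {
    // The first message attaches to the explicit parent (null means root),
    // or to the latest leaf when no parent is given.
    let parent =
      options.parentId !== undefined ? options.parentId : this.latestLeaf;
    for (const message of messages) {
      const existing = this.byId.get(message.id);
      if (existing) {
        // Known id: a plain append is a no-op; upsert updates in place.
        if (options.mode === "upsert") {
          this.byId.set(message.id, { ...message, parentId: existing.parentId });
        }
      } else {
        this.byId.set(message.id, { ...message, parentId: parent });
        this.latestLeaf = message.id;
      }
      // Each entry attaches under the previous one, keeping a single path.
      parent = message.id;
    }
  }
}

const imported = new SketchTranscript();
const importedHistory: SketchMessage[] = [
  { id: "m1", role: "user", text: "Imported question" },
  { id: "m2", role: "assistant", text: "Imported answer" },
];

imported.addMessages(importedHistory);
imported.addMessages(importedHistory); // same ids: no duplicates are created
imported.addMessages(
  [{ id: "m2", role: "assistant", text: "Edited answer" }],
  { mode: "upsert" },
);

console.log(imported.byId.size); // prints "2"
console.log(imported.byId.get("m2")?.parentId); // prints "m1"
console.log(imported.byId.get("m2")?.text); // prints "Edited answer"
```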

The supported pattern is "add context, then run a turn": call addMessages(), then runTurn().

Use chat() for low-level parent-to-child streaming when your code owns forwarding, cancellation, and replay policy. Use Agents as tools when a parent model or workflow delegates to a child agent and you want retained child runs, event replay, abort bridging, and UI drill-in.

Use startFiber() outside Think when the durable unit is an application job around a turn: accepting a webhook once, restoring a serialized channel or thread target, posting a visible reply, or recording app-level recovery policy. Think submissions own conversation admission and turn serialization; managed fibers own external job acceptance, idempotent side effects, and application recovery.

In this section

Getting started: Build a Think agent step by step.

Configuration: Configuration overrides, dynamic configuration, and Session integration.

Tools: Workspace tools, code execution, browser tools, and extensions.

Actions: Server actions with idempotency, approvals, authorization, and reply attachments.

Channels: Per-channel policy, channel selection, and out-of-band notices.

Lifecycle hooks: beforeTurn, beforeStep, onStepFinish, onChatResponse, and more.

Client tools: Browser-side tools, approvals, and concurrency.

Messengers: Receive and reply to Chat SDK messenger webhooks.

Scheduled tasks: Declarative recurring prompts and handlers.

Workflows: Durable model-driven reasoning steps inside Cloudflare Workflows.

Sub-agent RPC: chat() streaming, saveMessages, continueLastTurn, and abort.

Programmatic submissions: Durable turn admission for webhooks and RPC callers.

Durable recovery: Chat recovery, stream-stall watchdog, and stability detection.

Agent Skills: On-demand instructions, resources, and scripts via getSkills().

Acknowledgments

Think's design is inspired by Pi.

Example

Assistant example: Explore a multi-session Think assistant with sub-agent routing, shared workspace, MCP, chat recovery, and GitHub OAuth.

Sessions — context blocks, compaction, search, multi-session (the storage layer Think builds on)
Sub-agents — subAgent(), abortSubAgent(), deleteSubAgent() (the base Agent methods for spawning children)
Chat agents — AIChatAgent for when you need full control over the LLM call
Long-running agents — sub-agent delegation patterns for multi-week agent lifetimes
Durable execution — runFiber() and crash recovery (used by chatRecovery)
Browse the web — full CDP helper API reference