      <button onClick={isListening ? stop : start}>
        {isListening ? "Stop" : "Dictate"}
      </button>
    </div>
  );
}
```

## Client API: `VoiceClient`

Framework-agnostic client for environments without React.

JavaScript

```
import { VoiceClient } from "@cloudflare/voice/client";

const client = new VoiceClient({ agent: "MyAgent" });

client.addEventListener("statuschange", (status) => {
  console.log("Status:", status);
});
client.addEventListener("transcriptchange", (messages) => {
  console.log("Transcript:", messages);
});
client.addEventListener("error", (err) => {
  console.error("Error:", err);
});

client.connect();
await client.startCall();

// Switch assistant playback without reconnecting the call.
await client.setOutputDevice(selectedSpeakerId);

// Later:
client.endCall();
client.disconnect();
```

TypeScript

```
import { VoiceClient } from "@cloudflare/voice/client";

const client = new VoiceClient({ agent: "MyAgent" });

client.addEventListener("statuschange", (status) => {
  console.log("Status:", status);
});
client.addEventListener("transcriptchange", (messages) => {
  console.log("Transcript:", messages);
});
client.addEventListener("error", (err) => {
  console.error("Error:", err);
});

client.connect();
await client.startCall();

// Switch assistant playback without reconnecting the call.
await client.setOutputDevice(selectedSpeakerId);

// Later:
client.endCall();
client.disconnect();
```

### Events

| Event             | Data type             | Description                           |
| ----------------- | --------------------- | ------------------------------------- |
| statuschange      | VoiceStatus           | Pipeline state changed                |
| transcriptchange  | TranscriptMessage\[\] | Transcript updated                    |
| interimtranscript | string \| null        | Interim transcript from streaming STT |
| metricschange     | VoicePipelineMetrics  | Pipeline timing metrics               |
| audiolevelchange  | number                | Mic audio level (0–1)                 |
| connectionchange  | boolean               | WebSocket connected/disconnected      |
| mutechange        | boolean               | Mute state changed                    |
| error             | string \| null        | Error occurred                        |
| outputdeviceerror | string \| null        | Non-fatal speaker routing issue       |
| custommessage     | unknown               | Non-voice message from server         |
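
As one way to consume `audiolevelchange`, the sketch below turns the documented 0–1 level into a fixed-width text meter. `levelToMeter` is a hypothetical helper, not part of `@cloudflare/voice`, and the clamping is a defensive assumption rather than documented behavior.

```typescript
// Hypothetical helper: maps a 0-1 mic level to a text meter such as "#####-----".
function levelToMeter(level: number, width = 10): string {
  // Treat non-finite values as silence, then clamp to the documented 0-1 range.
  const safe = Number.isFinite(level) ? level : 0;
  const clamped = Math.min(1, Math.max(0, safe));
  const filled = Math.round(clamped * width);
  return "#".repeat(filled) + "-".repeat(width - filled);
}
```

Assuming the listener receives the number directly, as in the examples above, it could be wired up with `client.addEventListener("audiolevelchange", (level) => console.log(levelToMeter(level)))`.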

### Advanced options

| Option          | Type             | Description                                           |
| --------------- | ---------------- | ----------------------------------------------------- |
| transport       | VoiceTransport   | Custom transport (default: WebSocket via PartySocket) |
| audioInput      | VoiceAudioInput  | Custom mic capture (default: built-in AudioWorklet)   |
| preferredFormat | VoiceAudioFormat | Hint for server audio format (advisory only)          |
| outputDeviceId  | string           | Preferred audiooutput device for assistant playback   |

## Providers

### Built-in (Workers AI)

No API keys are required. Use your Workers AI binding:

| Class             | Type           | Default model       | Recommended for |
| ----------------- | -------------- | ------------------- | --------------- |
| WorkersAIFluxSTT  | Continuous STT | @cf/deepgram/flux   | withVoice       |
| WorkersAINova3STT | Continuous STT | @cf/deepgram/nova-3 | withVoiceInput  |
| WorkersAITTS      | TTS            | @cf/deepgram/aura-1 | Both            |

JavaScript

```
import { Agent } from "agents";
import {
  withVoice,
  WorkersAIFluxSTT,
  WorkersAINova3STT,
  WorkersAITTS,
} from "@cloudflare/voice";

const VoiceAgent = withVoice(Agent);

// Default usage
export class MyAgent extends VoiceAgent {
  transcriber = new WorkersAIFluxSTT(this.env.AI);
  tts = new WorkersAITTS(this.env.AI);
}

// Custom options
export class CustomAgent extends VoiceAgent {
  transcriber = new WorkersAIFluxSTT(this.env.AI, {
    eotThreshold: 0.8,
    keyterms: ["Cloudflare", "Workers"],
  });
  tts = new WorkersAITTS(this.env.AI, {
    model: "@cf/deepgram/aura-1",
    speaker: "asteria",
  });
}
```

TypeScript

```
import { Agent } from "agents";
import {
  withVoice,
  WorkersAIFluxSTT,
  WorkersAINova3STT,
  WorkersAITTS,
} from "@cloudflare/voice";

const VoiceAgent = withVoice(Agent);

// Default usage
export class MyAgent extends VoiceAgent<Env> {
  transcriber = new WorkersAIFluxSTT(this.env.AI);
  tts = new WorkersAITTS(this.env.AI);
}

// Custom options
export class CustomAgent extends VoiceAgent<Env> {
  transcriber = new WorkersAIFluxSTT(this.env.AI, {
    eotThreshold: 0.8,
    keyterms: ["Cloudflare", "Workers"],
  });
  tts = new WorkersAITTS(this.env.AI, {
    model: "@cf/deepgram/aura-1",
    speaker: "asteria",
  });
}
```

### Third-party providers

| Package                      | Class         | Description             |
| ---------------------------- | ------------- | ----------------------- |
| @cloudflare/voice-deepgram   | DeepgramSTT   | Continuous STT          |
| @cloudflare/voice-elevenlabs | ElevenLabsTTS | High-quality TTS        |
| @cloudflare/voice-twilio     | TwilioAdapter | Telephony (phone calls) |

**ElevenLabs TTS:**

JavaScript

```
import { ElevenLabsTTS } from "@cloudflare/voice-elevenlabs";

export class MyAgent extends VoiceAgent {
  transcriber = new WorkersAIFluxSTT(this.env.AI);
  tts = new ElevenLabsTTS({
    apiKey: this.env.ELEVENLABS_API_KEY,
    voiceId: "21m00Tcm4TlvDq8ikWAM",
  });
}
```

TypeScript

```
import { ElevenLabsTTS } from "@cloudflare/voice-elevenlabs";

export class MyAgent extends VoiceAgent<Env> {
  transcriber = new WorkersAIFluxSTT(this.env.AI);
  tts = new ElevenLabsTTS({
    apiKey: this.env.ELEVENLABS_API_KEY,
    voiceId: "21m00Tcm4TlvDq8ikWAM",
  });
}
```

**Deepgram STT:**

JavaScript

```
import { DeepgramSTT } from "@cloudflare/voice-deepgram";

export class MyAgent extends VoiceAgent {
  transcriber = new DeepgramSTT({
    apiKey: this.env.DEEPGRAM_API_KEY,
  });
  tts = new WorkersAITTS(this.env.AI);
}
```

TypeScript

```
import { DeepgramSTT } from "@cloudflare/voice-deepgram";

export class MyAgent extends VoiceAgent<Env> {
  transcriber = new DeepgramSTT({
    apiKey: this.env.DEEPGRAM_API_KEY,
  });
  tts = new WorkersAITTS(this.env.AI);
}
```

## Telephony (Twilio)

Connect phone calls to your voice agent using the Twilio adapter:

```
npm install @cloudflare/voice-twilio
```

The adapter bridges Twilio Media Streams to your VoiceAgent:

```
Phone → Twilio → WebSocket → TwilioAdapter → WebSocket → VoiceAgent
```

`WorkersAITTS` returns MP3, which cannot be decoded to PCM in the Workers runtime. When using the Twilio adapter, use a TTS provider that outputs raw PCM (for example, ElevenLabs with `outputFormat: "pcm_16000"`).

## Text messages

`withVoice` agents can also receive text messages, bypassing STT entirely. This is useful for chat-style input alongside voice.

```
const { sendText } = useVoiceAgent({ agent: "MyAgent" });

// Send text: goes straight to onTurn() without STT
sendText("What is the weather like today?");
```

Text messages work both during and outside of active calls. During a call, the response is spoken aloud via TTS. Outside a call, the response is sent as text-only transcript messages.

## Custom messages

Send and receive application-level JSON messages alongside voice protocol messages. Non-voice messages pass through to your `onMessage` handler on the server and emit `custommessage` events on the client.

**Server:**

JavaScript

```
export class MyAgent extends VoiceAgent {
  onMessage(connection, message) {
    const data = JSON.parse(message);
    if (data.type === "kick_speaker") {
      this.forceEndCall(connection);
    }
  }
}
```

TypeScript

```
export class MyAgent extends VoiceAgent<Env> {
  onMessage(connection: Connection, message: WSMessage) {
    const data = JSON.parse(message as string);
    if (data.type === "kick_speaker") {
      this.forceEndCall(connection);
    }
  }
}
```

**Client:**

```
const { sendJSON, lastCustomMessage } = useVoiceAgent({ agent: "MyAgent" });

sendJSON({ type: "kick_speaker" });

useEffect(() => {
  if (lastCustomMessage) {
    console.log("Custom message:", lastCustomMessage);
  }
}, [lastCustomMessage]);
```
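
The server examples above call `JSON.parse` directly, which throws on malformed input and does not handle binary frames. A minimal sketch of a defensive parser follows. `AppMessage`, `parseAppMessage`, and the `set_topic` message are hypothetical names for illustration; only `kick_speaker` comes from the examples above.

```typescript
// Hypothetical application message shapes. Only "kick_speaker" appears in the docs above.
type AppMessage =
  | { type: "kick_speaker" }
  | { type: "set_topic"; topic: string };

// Returns a typed message, or null when the input is not a recognized JSON message.
function parseAppMessage(raw: unknown): AppMessage | null {
  // Binary frames are not JSON text, so skip them.
  if (typeof raw !== "string") return null;

  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // Malformed JSON
  }

  if (typeof data !== "object" || data === null) return null;
  const record = data as Record<string, unknown>;

  if (record["type"] === "kick_speaker") {
    return { type: "kick_speaker" };
  }
  if (record["type"] === "set_topic" && typeof record["topic"] === "string") {
    return { type: "set_topic", topic: record["topic"] };
  }
  return null;
}
```

Inside `onMessage`, a handler could call `parseAppMessage(message)` and act only on a non-null result, so unexpected payloads are ignored instead of raising.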

## Single-speaker enforcement

Use `beforeCallStart` to restrict who can start a call. This example enforces a single speaker: only one connection can be the active speaker at a time.

JavaScript

```
export class MyAgent extends VoiceAgent {
  #speakerId = null;

  beforeCallStart(connection) {
    if (this.#speakerId !== null) {
      return false;
    }
    this.#speakerId = connection.id;
    return true;
  }

  onCallEnd(connection) {
    if (this.#speakerId === connection.id) {
      this.#speakerId = null;
    }
  }
}
```

TypeScript

```
import { type Connection } from "agents";

export class MyAgent extends VoiceAgent<Env> {
  #speakerId: string | null = null;

  beforeCallStart(connection: Connection) {
    if (this.#speakerId !== null) {
      return false;
    }
    this.#speakerId = connection.id;
    return true;
  }

  onCallEnd(connection: Connection) {
    if (this.#speakerId === connection.id) {
      this.#speakerId = null;
    }
  }
}
```
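
The gating logic above can be exercised without the agent runtime. The sketch below extracts it into a standalone class. `SpeakerGate` and `ConnectionLike` are hypothetical names, and the sketch assumes only that a connection exposes a string `id`, as the example above does.

```typescript
// Hypothetical stand-in for the one connection field this sketch needs.
interface ConnectionLike {
  id: string;
}

class SpeakerGate {
  private speakerId: string | null = null;

  // Mirrors beforeCallStart: admit a caller only while the speaker slot is free.
  tryAcquire(connection: ConnectionLike): boolean {
    if (this.speakerId !== null) {
      return false;
    }
    this.speakerId = connection.id;
    return true;
  }

  // Mirrors onCallEnd: only the current speaker can free the slot.
  release(connection: ConnectionLike): void {
    if (this.speakerId === connection.id) {
      this.speakerId = null;
    }
  }
}
```

The identity check in `release` matters: a rejected connection that later disconnects must not clear the slot held by the active speaker.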

## Pipeline metrics

`withVoice` agents emit timing metrics after each turn:

```
const { metrics } = useVoiceAgent({ agent: "MyAgent" });
// metrics: {
//   llm_ms: 850,
//   tts_ms: 200,
//   first_audio_ms: 950,
//   total_ms: 1200,
// }
```
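
Per-turn metrics are often easier to read as an average over a session. The sketch below assumes the four fields shown in the example above. `TurnMetrics` and `averageMetrics` are hypothetical names, and the library's actual metrics type may carry additional fields.

```typescript
// Field names follow the example above. The interface name is an assumption.
interface TurnMetrics {
  llm_ms: number;
  tts_ms: number;
  first_audio_ms: number;
  total_ms: number;
}

// Averages each timing field across turns. Returns null when there are no turns.
function averageMetrics(turns: TurnMetrics[]): TurnMetrics | null {
  if (turns.length === 0) return null;

  const sum: TurnMetrics = { llm_ms: 0, tts_ms: 0, first_audio_ms: 0, total_ms: 0 };
  for (const turn of turns) {
    sum.llm_ms += turn.llm_ms;
    sum.tts_ms += turn.tts_ms;
    sum.first_audio_ms += turn.first_audio_ms;
    sum.total_ms += turn.total_ms;
  }

  const count = turns.length;
  return {
    llm_ms: sum.llm_ms / count,
    tts_ms: sum.tts_ms / count,
    first_audio_ms: sum.first_audio_ms / count,
    total_ms: sum.total_ms / count,
  };
}
```

A component could append each new `metrics` value to an array and pass that array to `averageMetrics` to display a running average.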

## Conversation history

`withVoice` automatically persists conversation messages to SQLite. Access history in your `onTurn` via `context.messages`, or directly:

JavaScript

```
const history = this.getConversationHistory(20);
this.saveMessage("assistant", "Welcome! How can I help?");
```

TypeScript

```
const history = this.getConversationHistory(20);
this.saveMessage("assistant", "Welcome! How can I help?");
```

History survives Durable Object restarts and client reconnections. Voice agents use `keepAlive` to prevent eviction during active calls.
