🦞
Tutorial

Voice Mode (Talk Mode)

Have natural voice conversations with OpenClaw. Speak naturally, and hear responses through high-quality text-to-speech powered by ElevenLabs.

🎙️ What is Talk Mode?

Talk Mode enables continuous voice conversations with your OpenClaw assistant. Instead of typing, you speak, and each reply is played back as synthesized speech.

The Voice Cycle:

1. Listen
2. Transcribe
3. Process
4. TTS Playback

OpenClaw uses ElevenLabs for text-to-speech, providing natural-sounding voices with emotional nuance and proper intonation.
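The four stages of the voice cycle can be sketched as a single function that chains them together. This is only an illustration of the flow, not OpenClaw's implementation: `run_voice_turn` and its four callable parameters are hypothetical names, and the stand-ins at the bottom replace the microphone, the transcriber, the assistant, and the speaker so the sketch runs offline.

```python
from typing import Callable, List


def run_voice_turn(
    listen: Callable[[], bytes],
    transcribe: Callable[[bytes], str],
    process: Callable[[str], str],
    speak: Callable[[str], None],
) -> str:
    """Run one pass of the cycle: listen, transcribe, process, TTS playback."""
    audio = listen()           # 1. Listen: capture audio from the microphone
    text = transcribe(audio)   # 2. Transcribe: turn the audio into text
    reply = process(text)      # 3. Process: the assistant produces a reply
    speak(reply)               # 4. TTS playback: synthesize and play the reply
    return reply


# Stand-ins (hypothetical) so the sketch runs without audio hardware or network.
spoken: List[str] = []
reply = run_voice_turn(
    listen=lambda: b"fake-audio",
    transcribe=lambda audio: "what time is it",
    process=lambda text: f"You asked: {text}",
    speak=spoken.append,
)
```

A continuous conversation would simply call `run_voice_turn` in a loop until the user ends the session.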

Requirements

  • API Key: ElevenLabs API key (free tier available)
  • Platform: macOS app, iOS, or Android (an OpenClaw node with audio)
  • Permissions: microphone access (required for voice input)

Setup Steps

1. Get an ElevenLabs API Key

Sign up at ElevenLabs and generate an API key from your dashboard.

  • Visit elevenlabs.io and create an account
  • Go to Profile Settings > API Keys
  • Generate a new API key and copy it
  • The free tier includes a limited number of characters per month

2. Configure openclaw.json

Add the talk configuration to enable voice mode:

{
  "talk": {
    "voiceId": "EXAVITQu4vr4xnSDxMaL",
    "modelId": "eleven_v3",
    "outputFormat": "mp3_44100_128",
    "apiKey": "${ELEVENLABS_API_KEY}",
    "interruptOnSpeech": true
  }
}
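
3. Set Environment Variable (Alternative)

Instead of pasting the key directly into openclaw.json, export it as an environment variable. The "${ELEVENLABS_API_KEY}" placeholder in the configuration above refers to this variable.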

export ELEVENLABS_API_KEY="your_api_key_here"

4. Start Talk Mode

Launch OpenClaw and activate voice mode.

  • On macOS: Click the microphone icon in the overlay
  • On mobile: Tap the voice button in the app
  • Via command: openclaw talk
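
Steps 2 and 3 rely on the "${ELEVENLABS_API_KEY}" placeholder being replaced with the value of the environment variable. How OpenClaw performs that substitution is not described here, so the following is only a sketch of the general technique; `expand_env` and `PLACEHOLDER` are hypothetical names, and failing on an unset variable is an assumed design choice.

```python
import json
import os
import re

# Matches ${NAME}, where NAME looks like a shell variable name.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=os.environ):
    """Recursively replace ${NAME} in strings with env[NAME]; fail if NAME is unset."""
    if isinstance(value, str):
        def substitute(match):
            name = match.group(1)
            if name not in env:
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return PLACEHOLDER.sub(substitute, value)
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    return value  # numbers, booleans, and null pass through unchanged


raw = json.loads('{"talk": {"apiKey": "${ELEVENLABS_API_KEY}", "interruptOnSpeech": true}}')
# A plain dict stands in for the real environment so the example is repeatable.
config = expand_env(raw, env={"ELEVENLABS_API_KEY": "your_api_key_here"})
```

Raising on a missing variable surfaces a forgotten `export` immediately instead of sending an empty key to the TTS provider.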

Full Configuration Example

Complete talk configuration with all options:

{
  "talk": {
    "voiceId": "EXAVITQu4vr4xnSDxMaL",
    "modelId": "eleven_v3",
    "outputFormat": "mp3_44100_128",
    "apiKey": "${ELEVENLABS_API_KEY}",
    "interruptOnSpeech": true,
    "stability": 0.5,
    "similarityBoost": 0.75,
    "style": 0.5,
    "speakerBoost": true
  }
}
  • voiceId — ElevenLabs voice identifier
  • modelId — TTS model (eleven_v3 recommended)
  • outputFormat — Audio format and quality
  • interruptOnSpeech — Stop playback when user speaks
  • stability — Voice consistency (0-1)
  • similarityBoost — Voice clarity (0-1)
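
To make the options more concrete, here is a sketch of how a talk configuration could be turned into an ElevenLabs text-to-speech request. The endpoint, the `xi-api-key` header, the `output_format` query parameter, and the `voice_settings` field names follow the public ElevenLabs API; the mapping from OpenClaw's camelCase option names is an assumption, and `build_tts_request` is a hypothetical helper that only describes the request without sending it.

```python
from urllib.parse import quote, urlencode

# Assumed mapping: option name in openclaw.json -> field in ElevenLabs voice_settings.
SETTING_NAMES = {
    "stability": "stability",
    "similarityBoost": "similarity_boost",
    "style": "style",
    "speakerBoost": "use_speaker_boost",
}


def build_tts_request(talk: dict, text: str) -> dict:
    """Describe the HTTP request for one utterance; nothing is sent."""
    # The two ranges documented above are checked before building the request.
    for key in ("stability", "similarityBoost"):
        if key in talk and not 0 <= talk[key] <= 1:
            raise ValueError(f"{key} must be between 0 and 1, got {talk[key]}")
    voice_settings = {
        api_name: talk[option]
        for option, api_name in SETTING_NAMES.items()
        if option in talk
    }
    query = urlencode({"output_format": talk["outputFormat"]})
    voice = quote(talk["voiceId"], safe="")
    return {
        "method": "POST",
        "url": f"https://api.elevenlabs.io/v1/text-to-speech/{voice}?{query}",
        "headers": {"xi-api-key": talk["apiKey"], "Content-Type": "application/json"},
        "json": {"text": text, "model_id": talk["modelId"], "voice_settings": voice_settings},
    }


talk = {
    "voiceId": "EXAVITQu4vr4xnSDxMaL",
    "modelId": "eleven_v3",
    "outputFormat": "mp3_44100_128",
    "apiKey": "your_api_key_here",
    "stability": 0.5,
    "similarityBoost": 0.75,
    "style": 0.5,
    "speakerBoost": True,
}
request = build_tts_request(talk, "Hello from OpenClaw")
```

Keeping request construction separate from the network call makes the option mapping easy to inspect and test without spending any character quota.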

Voice Aliases

Map friendly names to voice IDs for easy switching during conversations:

{
  "talk": {
    "voiceId": "default",
    "voices": {
      "default": "EXAVITQu4vr4xnSDxMaL",
      "professional": "21m00Tcm4TlvDq8ikWAM",
      "friendly": "AZnzlk1XvdvUeBnXmlld",
      "narrator": "pNInz6obpgDQGcFmaJgB"
    }
  }
}

Available Voice Aliases:

  • default: Sarah (clear female voice)
  • professional: Rachel (professional tone)
  • friendly: Domi (warm and engaging)
  • narrator: Adam (deep narrator voice)

To switch voices during a conversation, ask for an alias by name, for example: "Use the narrator voice for this response."
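
Alias lookup itself is a small dictionary operation. The sketch below assumes that a name missing from the "voices" map is treated as a raw ElevenLabs voice ID; that fallback and the `resolve_voice` helper are illustrative assumptions rather than documented OpenClaw behavior.

```python
from typing import Optional


def resolve_voice(talk: dict, name: Optional[str] = None) -> str:
    """Return a voice ID for `name`, or for talk["voiceId"] when no name is given."""
    requested = name if name is not None else talk["voiceId"]
    # Known aliases map to their IDs; anything else is assumed to already be an ID.
    return talk.get("voices", {}).get(requested, requested)


talk = {
    "voiceId": "default",
    "voices": {
        "default": "EXAVITQu4vr4xnSDxMaL",
        "professional": "21m00Tcm4TlvDq8ikWAM",
        "friendly": "AZnzlk1XvdvUeBnXmlld",
        "narrator": "pNInz6obpgDQGcFmaJgB",
    },
}

default_id = resolve_voice(talk)               # follows "voiceId": "default"
narrator_id = resolve_voice(talk, "narrator")  # explicit alias
```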

Platform Features

macOS App
  • Always-on floating overlay
  • Visual audio level indicators
  • System-wide hotkey activation
  • Native audio processing

iOS / Android
  • Streaming audio playback
  • Background voice processing
  • Push-to-talk option
  • Optimized for mobile bandwidth

Voice Directives in Replies

Control voice settings dynamically through JSON directives in responses:

// Per-reply voice control
{
  "voice": "narrator",
  "speed": 1.1,
  "stability": 0.8
}

This response will be spoken in the narrator voice at a slightly faster speed.

Directive Options:

  • voice — Voice alias or ID for this reply
  • speed — Playback speed (0.5 - 2.0)
  • stability — Override stability for this reply
  • persist — Keep settings for future replies
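
One way to picture the directive options is as overrides merged onto the current voice settings, with persist deciding whether the overrides outlive the reply. The following sketch assumes exactly that merge behavior; `apply_directive` is a hypothetical helper, and only the speed range (0.5 to 2.0) comes from the list above.

```python
from typing import Tuple


def apply_directive(defaults: dict, directive: dict) -> Tuple[dict, dict]:
    """Return (settings for this reply, defaults for later replies)."""
    overrides = {key: value for key, value in directive.items() if key != "persist"}
    if "speed" in overrides and not 0.5 <= overrides["speed"] <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    current = {**defaults, **overrides}  # directive values win for this reply
    # With persist set, the merged settings become the new defaults.
    next_defaults = current if directive.get("persist") else defaults
    return current, next_defaults


defaults = {"voice": "default", "speed": 1.0, "stability": 0.5}
directive = {"voice": "narrator", "speed": 1.1, "stability": 0.8}

current, later = apply_directive(defaults, directive)
sticky_current, sticky_later = apply_directive(defaults, {**directive, "persist": True})
```

Without persist, `later` is still the original defaults; with it, `sticky_later` carries the narrator settings into subsequent replies.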

Text-to-Speech (TTS) for Messages

Enable automatic TTS for messages outside of Talk Mode:

{
  "tts": {
    "enabled": true,
    "mode": "tagged",
    "provider": "elevenlabs",
    "voiceId": "EXAVITQu4vr4xnSDxMaL"
  }
}

Auto TTS Modes:

  • always: every response is spoken aloud. Best for hands-free environments.
  • inbound: only incoming messages trigger TTS. Best for notifications from integrations.
  • tagged: only messages with the #speak tag are voiced. Best for selective voice responses.
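
The three modes amount to a simple decision per message. The sketch below is one reading of them, not OpenClaw's code: `should_speak` is a hypothetical function, the `inbound` flag stands for "this message arrived from outside", and the tag check assumes #speak appears as a separate whitespace-delimited word.

```python
def should_speak(mode: str, text: str, inbound: bool) -> bool:
    """Decide whether a message is voiced under the given auto TTS mode."""
    if mode == "always":
        return True
    if mode == "inbound":
        return inbound
    if mode == "tagged":
        # Assumption: the tag must stand alone, so "#speaker" does not match.
        return "#speak" in text.split()
    raise ValueError(f"unknown TTS mode: {mode!r}")


tagged_hit = should_speak("tagged", "Stand-up in 5 minutes #speak", inbound=False)
tagged_miss = should_speak("tagged", "Stand-up in 5 minutes", inbound=False)
inbound_only = should_speak("inbound", "New email from Sam", inbound=True)
```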

Provider Options:

  • ElevenLabs — High-quality, natural voices (recommended)
  • OpenAI — Good quality, simple pricing

💡 Tips for Best Results
  • Use a quality microphone — Better input leads to better transcription accuracy
  • Minimize background noise — Find a quiet environment for voice conversations
  • Speak naturally — No need to speak slowly; the system handles normal speech
  • Monitor usage — ElevenLabs charges by character; watch your quota
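
Because billing is tied to the number of characters synthesized, a rough local tally can warn you before the quota runs out. This is a minimal sketch: `CharacterBudget` is a hypothetical helper, the limit of 10,000 is a placeholder to replace with your plan's allowance, and actual billing may count differently depending on the model, so treat the ElevenLabs dashboard as the source of truth.

```python
class CharacterBudget:
    """Keep a running count of characters sent for synthesis."""

    def __init__(self, monthly_limit: int) -> None:
        self.monthly_limit = monthly_limit
        self.used = 0

    def record(self, text: str) -> int:
        """Add the length of `text` to the tally and return what remains."""
        self.used += len(text)
        return self.monthly_limit - self.used


budget = CharacterBudget(monthly_limit=10_000)  # placeholder limit
remaining = budget.record("Hello from OpenClaw")
```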

Ready to Talk?

Set up your ElevenLabs account and start having voice conversations with OpenClaw.