Get Started With ElevenLabs Text to Speech API in Minutes. The quickest guide to getting started with the ElevenLabs Text to Speech API.

August 23, 2024 4 min read

ElevenLabs Text to Speech API Quickstart Guide

The ElevenLabs Text to Speech API allows developers to convert text into high-quality AI voices, with applications ranging from podcasts to audiobooks and chatbots. Here’s how to use the API and integrate it into your Python workflow.

Authentication

To access the API, use your API key in the request headers:

xi-api-key: <your_api_key>
content-type: application/json
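
To keep the key out of your source code, you can build these two headers from an environment variable. The sketch below is one way to do that; the variable name ELEVENLABS_API_KEY and the helper build_headers are assumptions chosen for illustration, not names defined by the API.

```python
import os


def build_headers(env_var: str = "ELEVENLABS_API_KEY") -> dict:
    """Return the two request headers, reading the API key from the environment.

    The environment variable name is a hypothetical convention for this guide.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"xi-api-key": api_key, "Content-Type": "application/json"}
```

Calling build_headers() after exporting the variable gives you a dict you can pass straight to requests.post(..., headers=...).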

Making a Request

To generate speech, send a POST request to:

https://api.elevenlabs.io/v1/text-to-speech/{voice_id}

Here’s an example in Python:

import requests

# Voice IDs are case-sensitive; this one is the premade "Rachel" voice.
url = "https://api.elevenlabs.io/v1/text-to-speech/21m00Tcm4TlvDq8ikWAM"
headers = {
    "xi-api-key": "your_api_key",  # replace with your real key
    "Content-Type": "application/json"
}
data = {
    "text": "Hello, welcome to the future of voice generation.",
    "voice_settings": {
        "stability": 0.7,
        "similarity_boost": 0.8
    }
}
response = requests.post(url, headers=headers, json=data)
response.raise_for_status()  # stop on an HTTP error instead of saving an error body as audio
with open("output.mp3", "wb") as f:
    f.write(response.content)

This script sends a request to generate speech using the voice_id and saves the audio file locally.

Voice Customization

ElevenLabs provides voice cloning and custom voice settings such as:

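  • stability: Controls how consistent the voice sounds from one generation to the next. Lower values allow more variation and expressiveness.
  • similarity_boost: Adjusts how closely the output sticks to the reference voice.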
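
If you let users tune these two settings, it can help to validate them before sending a request. The sketch below assumes both values are expected to fall between 0.0 and 1.0, which matches the examples in this guide; the helper make_voice_settings is hypothetical and not part of any ElevenLabs library.

```python
def make_voice_settings(stability: float = 0.7, similarity_boost: float = 0.8) -> dict:
    """Return a voice_settings dict, rejecting values outside the assumed 0.0-1.0 range."""
    for name, value in (("stability", stability), ("similarity_boost", similarity_boost)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    return {"stability": stability, "similarity_boost": similarity_boost}
```

The returned dict can be used as the "voice_settings" value in the request body.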

Use Cases

  • Podcasts: Automate voiceovers with real-time speech generation.
  • Audiobooks: Generate engaging voices for different characters.
  • Chatbots: Enhance conversations with low latency and natural voices.

Multilingual Support

The API supports multilingual voices, including English, Portuguese, and more.

Example with JSON

Here’s the request body on its own, as JSON:

{
  "text": "This is an example of ElevenLabs TTS API.",
  "voice_settings": {
    "stability": 0.5,
    "similarity_boost": 0.9
  }
}
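
In Python you would usually build this payload as a dict and let the HTTP library serialize it. A minimal sketch, assuming a hypothetical helper named build_payload with the same default settings as the JSON above:

```python
import json


def build_payload(text: str, stability: float = 0.5, similarity_boost: float = 0.9) -> dict:
    """Build the request body shown above as a plain Python dict."""
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "text": text,
        "voice_settings": {
            "stability": stability,
            "similarity_boost": similarity_boost,
        },
    }


payload = build_payload("This is an example of ElevenLabs TTS API.")
print(json.dumps(payload, indent=2))  # prints the same structure as the JSON example
```

Passing payload as json=payload to requests.post sends it with the right encoding.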

Headers & Response Format

The API expects:

  • xi-api-key: Your API key.
  • Content-Type: application/json.

The response contains an audio stream or audio file in MP3 format.
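
Because a failed request returns an error body rather than audio, it is worth checking the response before writing it to disk. The sketch below assumes a successful response has status 200 and a Content-Type that starts with "audio/" (typically audio/mpeg for MP3); save_audio is a hypothetical helper.

```python
from pathlib import Path


def save_audio(status_code: int, content_type: str, body: bytes, path: str) -> int:
    """Write body to path only when the response looks like audio; return bytes written."""
    if status_code != 200:
        raise RuntimeError(f"request failed with HTTP {status_code}")
    if not content_type.lower().startswith("audio/"):
        raise RuntimeError(f"expected audio, got {content_type!r}")
    Path(path).write_bytes(body)
    return len(body)
```

With requests, you could call it as save_audio(response.status_code, response.headers.get("Content-Type", ""), response.content, "output.mp3").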

Pricing & Subscription

ElevenLabs offers various pricing tiers based on usage, including access to the voice library and different levels of API concurrency.

Open-Source Integrations

The API can be integrated into open-source platforms, allowing seamless text-to-speech functionality in custom applications.

Advanced Features

  • Turbo Mode for faster audio generation.
  • Real-time speech synthesis for dynamic content.
  • Speech-to-speech capabilities for advanced workflows.
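
For real-time use you would typically consume the audio in chunks as it arrives instead of waiting for the whole file. The sketch below shows only the chunk-handling side, with a hypothetical helper write_chunks. It assumes you obtain the chunks from a streamed response, for example requests.post(..., stream=True) followed by response.iter_content(chunk_size=4096), and that a streaming variant of the endpoint is available (check the official API Docs for its exact path).

```python
from typing import Iterable


def write_chunks(chunks: Iterable[bytes], path: str) -> int:
    """Write audio chunks to a file as they arrive; return the total bytes written."""
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            if chunk:  # skip empty keep-alive chunks
                f.write(chunk)
                total += len(chunk)
    return total
```

The same loop body could hand each chunk to an audio player instead of a file if you need playback to start immediately.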

Conclusion

The ElevenLabs TTS API is a powerful tool for generating AI voices for a wide range of applications. From voiceovers to chatbots, the API supports real-time interaction, multilingual content, and customization through advanced voice settings.

For further documentation, explore the official API Docs.

ElevenLabs Voice and Endpoint Expansion

ElevenLabs Voice

ElevenLabs voice options offer various customizable AI voices, including voice cloning, tailored voice settings, and a comprehensive voice library for projects. These voices are suitable for podcasts, audiobooks, and voiceovers, with personalization through settings like stability and similarity_boost.

API Endpoint

The main endpoint for generating speech via the ElevenLabs API is:

https://api.elevenlabs.io/v1/text-to-speech/{voice_id}

This endpoint takes the voice_id in the URL path and the text, optional model_id, and voice settings in the JSON body, and returns a high-quality audio file or stream.
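
Since the voice_id is part of the URL path, a small helper can build the full URL and guard against an empty or malformed ID. This is a sketch only; tts_url and BASE_URL are names introduced here for illustration.

```python
from urllib.parse import quote

BASE_URL = "https://api.elevenlabs.io/v1/text-to-speech"


def tts_url(voice_id: str) -> str:
    """Return the text-to-speech URL for a voice, percent-encoding the ID."""
    if not voice_id:
        raise ValueError("voice_id is required")
    return f"{BASE_URL}/{quote(voice_id, safe='')}"
```

For example, tts_url("your_voice_id") returns the endpoint above with {voice_id} filled in.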

OpenAI Integration

ElevenLabs’ API can be integrated with OpenAI tools for more complex workflows. Combining OpenAI models with ElevenLabs’ text to speech technology can enhance applications in chatbots, real-time interaction, and automated voiceovers.
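
One simple way to structure such a workflow is to treat text generation and speech synthesis as two separate steps and chain them. The sketch below shows only that wiring. Both generate_text and synthesize are hypothetical callables that you would implement yourself, the first wrapping your language model call and the second wrapping the text-to-speech request shown earlier.

```python
from typing import Callable


def reply_as_speech(
    prompt: str,
    generate_text: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Generate a text reply for prompt, then convert that reply to audio bytes."""
    reply = generate_text(prompt).strip()
    if not reply:
        raise ValueError("language model returned an empty reply")
    return synthesize(reply)
```

Keeping the two steps behind plain callables makes it easy to swap models or test the pipeline without network access.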

Tutorial Example

A basic tutorial for using ElevenLabs TTS involves setting up the endpoint, passing your API key, and sending a JSON request. Here’s a Python example:

import requests

voice_id = "your_voice_id"  # placeholder: substitute a real voice ID
url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
headers = {"xi-api-key": "your_api_key", "Content-Type": "application/json"}
data = {"text": "Your text", "voice_settings": {"stability": 0.7, "similarity_boost": 0.8}}
response = requests.post(url, headers=headers, json=data)
response.raise_for_status()  # fail loudly rather than writing an error body to disk
with open("audio.mp3", "wb") as f:
    f.write(response.content)

A List of Features

  • Multilingual support (e.g., English, Portuguese)
  • Voice cloning for unique, customized voices
  • Low latency for real-time interaction
  • Integration with OpenAI
  • Customizable voice settings

Model ID Explanation

The model_id selects which speech synthesis model handles the request; the voice itself is chosen by the voice_id in the URL. You pass the model_id in the JSON body along with your text. Different models trade off quality, latency, and language coverage, which allows for a flexible range of audio generation, from conversational tones to specific character voiceovers.
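
As a rough illustration, the model_id goes into the JSON body next to the text. In the sketch below, build_body is a hypothetical helper, and the model name in the usage note is an assumption; check the official API Docs for the identifiers available to your account.

```python
from typing import Optional


def build_body(text: str, model_id: Optional[str] = None) -> dict:
    """Build a request body, adding model_id only when one is given."""
    body = {"text": text}
    if model_id is not None:
        body["model_id"] = model_id
    return body
```

For instance, build_body("Olá, mundo", model_id="eleven_multilingual_v2") would request a multilingual model, assuming that identifier is valid; omitting model_id leaves the choice to the API's default.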
