ElevenLabs
Orate supports ElevenLabs' speech services.
ElevenLabs creates emotionally and contextually aware AI voices. Its voice AI responds to emotional cues in text and adapts its delivery to suit both the immediate content and the wider context. This lets the voices achieve a high emotional range and avoid logical errors.
Setup
The ElevenLabs provider is available by default in Orate. To import it, you can use the following code:
Configuration
You can use ElevenLabs by creating a new instance of the ElevenLabs class:
By default, the constructor reads your API key from the ELEVENLABS_API_KEY environment variable. If that variable is not set, you can pass your API key as an argument to the constructor.
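The lookup described above can be sketched as a small helper. This is only an illustration: resolveApiKey and the Env type are hypothetical names, not part of Orate, and the rule that an explicit key wins over the environment variable is an assumption the text does not state.

```typescript
// Hypothetical helper for illustration; it is not part of Orate or the ElevenLabs SDK.
type Env = Record<string, string | undefined>;

function resolveApiKey(explicitKey: string | undefined, env: Env): string {
  // Assumption: an explicitly passed key takes precedence over the environment variable.
  const key = explicitKey ?? env["ELEVENLABS_API_KEY"];
  if (!key) {
    throw new Error(
      "No ElevenLabs API key: set ELEVENLABS_API_KEY or pass a key explicitly"
    );
  }
  return key;
}
```

In Node.js you would typically pass process.env as the second argument. Taking the environment as a parameter keeps the helper easy to test.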
Usage
The ElevenLabs provider exposes a single interface for all of ElevenLabs' speech and transcription services.
Text to Speech
The ElevenLabs provider exposes a tts function that creates a text-to-speech synthesis function using ElevenLabs. By default, the tts function uses the multilingual_v2 model and the aria voice.
You can specify the model and voice to use by passing them as arguments to the tts function.
The voice can be the name of a default voice, e.g. charlotte, or the ID of a custom voice, e.g. rxQ8sHg3rojjgBilXbSC.
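One way to picture the defaults and the name-or-ID rule is the sketch below. It is not Orate's implementation: selectTts, TtsSelection and DEFAULT_VOICE_IDS are hypothetical names, the IDs in the table are placeholders rather than real ElevenLabs voice IDs, and the fallback rule (any unknown string is treated as a custom voice ID) is an assumption.

```typescript
// Hypothetical name-to-ID table. The IDs are placeholders, not real ElevenLabs voice IDs.
const DEFAULT_VOICE_IDS: Record<string, string> = {
  aria: "placeholder-id-aria",
  charlotte: "placeholder-id-charlotte",
};

interface TtsSelection {
  model: string;
  voiceId: string;
}

// Defaults mirror the ones named in the text: multilingual_v2 and aria.
function selectTts(
  model: string = "multilingual_v2",
  voice: string = "aria"
): TtsSelection {
  // Own-property check so that keys like "toString" are not mistaken for voice names.
  const known = Object.prototype.hasOwnProperty.call(DEFAULT_VOICE_IDS, voice)
    ? DEFAULT_VOICE_IDS[voice]
    : undefined;
  // Assumption: anything that is not a known default name is a custom voice ID.
  return { model, voiceId: known ?? voice };
}
```

Calling selectTts() with no arguments yields the default model and the placeholder ID for aria, while selectTts("multilingual_v2", "rxQ8sHg3rojjgBilXbSC") passes the custom ID through unchanged.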
You can also set ElevenLabs-specific properties by passing them as an argument to the tts function.
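The text does not list which properties are accepted. As a rough sketch, assume they form a plain object that is merged over a set of defaults. The fields stability and similarity_boost are voice settings from the ElevenLabs API; the default values, the withProperties helper, and the idea that Orate forwards them in this shape are all assumptions.

```typescript
// Two voice settings from the ElevenLabs API; Orate may accept a different shape.
interface VoiceSettings {
  stability: number;
  similarity_boost: number;
}

// Hypothetical defaults, chosen only for this sketch.
const BASE_SETTINGS: VoiceSettings = { stability: 0.5, similarity_boost: 0.75 };

function withProperties(overrides: Partial<VoiceSettings> = {}): VoiceSettings {
  // Later spreads win, so caller-supplied values replace the defaults.
  return { ...BASE_SETTINGS, ...overrides };
}
```

With this merge, withProperties({ stability: 0.2 }) changes only stability and leaves similarity_boost at its default.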
You can also stream the speech.
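When speech is streamed, audio arrives as a sequence of chunks rather than as one complete file. How Orate exposes those chunks is not shown here, so the sketch below only covers the consumer side under one assumption: each chunk is a Uint8Array. The concatChunks helper is a hypothetical name.

```typescript
// Join audio chunks (assumed to be Uint8Array) into one contiguous buffer.
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  // First pass: compute the total size so the output is allocated once.
  let total = 0;
  for (const chunk of chunks) {
    total += chunk.length;
  }
  const out = new Uint8Array(total);
  // Second pass: copy each chunk at its running offset.
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}
```

Joining chunks like this is only needed when you want the complete audio at the end, for example to write it to a file. A player would instead hand each chunk to the audio output as it arrives.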
Speech to Text
The ElevenLabs provider exposes an stt function that creates a speech-to-text transcription function using ElevenLabs. By default, the stt function uses the scribe_v1 model.
You can also set ElevenLabs-specific properties by passing them as an argument to the stt function.
You can also stream the transcription.
Speech to Speech
The ElevenLabs provider exposes an sts function that changes the voice of the audio. By default, the sts function uses the eleven_multilingual_sts_v2 model and the aria voice.
You can specify the model and voice to use by passing them as arguments to the sts function.
You can also set ElevenLabs-specific properties by passing them as an argument to the sts function.
Speech Isolation
The ElevenLabs provider exposes an isl function that isolates the speech from the audio.
You can also pass ElevenLabs-specific properties as an argument to the isl function, though none are available right now.
You can also stream the isolation.