Cartesia
Orate supports Cartesia's speech services.
Cartesia is a platform for real-time, multimodal intelligence. It helps you generate seamless speech, power voice applications, and fine-tune your own voice models on the fastest real-time AI platform.
Setup
The Cartesia provider is available by default in Orate. To import it, you can use the following code:
Configuration
You can use Cartesia by creating a new instance of the Cartesia class:
By default, the constructor reads your API key from the CARTESIA_API_KEY environment variable. If that variable is not set, you can pass your API key as an argument to the constructor.
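The lookup order described above can be sketched as a small standalone helper. This is a minimal illustration, not Orate's code: resolveApiKey and the Env type are hypothetical names, and letting an explicitly passed key take precedence over the environment variable is an assumption, since the text only covers the case where the variable is unset.

```typescript
// Hypothetical helper, not part of Orate: it only mirrors the documented lookup order.
type Env = Record<string, string | undefined>;

function resolveApiKey(explicitKey: string | undefined, env: Env): string {
  // Assumption: an explicitly passed key wins; otherwise fall back to CARTESIA_API_KEY.
  const key = explicitKey ?? env['CARTESIA_API_KEY'];
  if (!key) {
    throw new Error('No API key: set CARTESIA_API_KEY or pass a key explicitly.');
  }
  return key;
}
```

In a Node.js program the second argument would typically be process.env.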
Usage
The Cartesia provider exposes a single interface for all of Cartesia's speech and transcription services.
Text to Speech
The Cartesia provider exposes a tts function that creates a text-to-speech synthesis function using Cartesia. By default, the tts function uses the sonic-2 model and the Griffin voice.
You can specify the model and voice to use by passing them as arguments to the tts function.
The voice can be the name of a default voice, e.g. Silas, or the ID of a custom voice, e.g. rxQ8sHg3rojjgBilXbSC.
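One way to picture how the model and voice arguments could be resolved is the sketch below. It is not Orate's implementation: resolveTts and DEFAULT_VOICES are hypothetical names, the voice IDs in the table are placeholders rather than real Cartesia IDs, and treating any unrecognized string as a custom voice ID is an assumption. Only the sonic-2 and Griffin defaults come from the text above.

```typescript
// Hypothetical table of built-in voice names. The IDs are placeholders, not real Cartesia IDs.
const DEFAULT_VOICES = new Map<string, string>([
  ['Griffin', 'placeholder-id-griffin'],
  ['Silas', 'placeholder-id-silas'],
]);

// Defaults named in the documentation for the tts function.
const DEFAULT_MODEL = 'sonic-2';
const DEFAULT_VOICE = 'Griffin';

function resolveTts(
  model: string = DEFAULT_MODEL,
  voice: string = DEFAULT_VOICE,
): { model: string; voiceId: string } {
  // A known name maps to its ID; anything else is assumed to be a custom voice ID.
  const voiceId = DEFAULT_VOICES.get(voice) ?? voice;
  return { model, voiceId };
}
```

Calling resolveTts() with no arguments falls back to both defaults, while resolveTts('sonic-2', 'rxQ8sHg3rojjgBilXbSC') passes the custom ID through unchanged.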
You can also pass Cartesia-specific properties as an argument to the tts function.
You can also stream the speech.
Speech to Speech
The Cartesia provider exposes an sts function that changes the voice of the audio. By default, the sts function uses the Silas voice.
You can specify the voice to use by passing it as an argument to the sts function.
You can also pass Cartesia-specific properties as an argument to the sts function.
You can also stream the speech.