ElevenLabs

ElevenLabs creates emotionally & contextually aware AI voices. Their voice AI responds to emotional cues in text and adapts its delivery to suit both the immediate content and the wider context. This lets their AI voices achieve high emotional range and avoid making logical errors.

Setup

The ElevenLabs provider is available by default in Orate. To import it, you can use the following code:

import { ElevenLabs } from 'orate/elevenlabs';

Configuration

You can use ElevenLabs by creating a new instance of the ElevenLabs class:

const elevenlabs = new ElevenLabs();

This will use the ELEVENLABS_API_KEY environment variable. If you don't have this variable set, you can pass your API key as an argument to the constructor.

const elevenlabs = new ElevenLabs('your_api_key');

Usage

The ElevenLabs provider provides a single interface for all of ElevenLabs' speech and transcription services.

Text to Speech

The ElevenLabs provider provides a tts function that allows you to create a text-to-speech synthesis function using ElevenLabs. By default, the tts function uses the multilingual_v2 model and the aria voice.

import { speak } from 'orate';
import { ElevenLabs } from 'orate/elevenlabs';
 
const speech = await speak({
  model: new ElevenLabs().tts(),
  prompt: 'Hello, world!',
});

You can specify the model and voice to use by passing them as arguments to the tts function.

const speech = await speak({
  model: new ElevenLabs().tts('eleven_english_sts_v2', 'charlotte'),
  prompt: 'Hello, world!',
});

The voice can be the name of a default voice e.g. charlotte or the ID of a custom voice e.g. rxQ8sHg3rojjgBilXbSC.

You can also specify specific ElevenLabs properties by passing them as an argument to the tts function.

const speech = await speak({
  model: new ElevenLabs().tts('eleven_english_sts_v2', 'charlotte', {
    voice_settings: {
      stability: 0.5,
    },
  }),
  prompt: 'Hello, world!',
});

You can also stream the speech.

const speech = await speak({
  model: new ElevenLabs().tts(),
  prompt: 'Hello, world!',
  stream: true,
});

Speech to Text

The ElevenLabs provider provides a stt function that allows you to create a speech-to-text transcription function using ElevenLabs. By default, the stt function uses the scribe_v1 model.

import { transcribe } from 'orate';
import { ElevenLabs } from 'orate/elevenlabs';
 
const text = await transcribe({
  model: new ElevenLabs().stt(),
  audio: new File(...),
});

You can also specify specific ElevenLabs properties by passing them as an argument to the stt function.

const text = await transcribe({
  model: new ElevenLabs().stt('scribe_v1', {
    diarize: true,
  }),
  audio: new File(...),
});

You can also stream the transcription.

const text = await transcribe({
  model: new ElevenLabs().stt(),
  audio: new File(...),
  stream: true,
});

Speech to Speech

The ElevenLabs provider provides a sts function that allows you to change the voice of the audio. By default, the sts function uses the eleven_multilingual_sts_v2 model and the aria voice.

import { change } from 'orate';
import { ElevenLabs } from 'orate/elevenlabs';
 
const speech = await change({
  model: new ElevenLabs().sts(),
  audio: new File([], 'test.mp3', { type: 'audio/mp3' }),
});

You can specify the model and voice to use by passing them as arguments to the sts function.

const speech = await change({
  model: new ElevenLabs().sts('eleven_english_sts_v2', 'charlotte'),
  audio: new File([], 'test.mp3', { type: 'audio/mp3' }),
});

You can also specify specific ElevenLabs properties by passing them as an argument to the sts function.

const speech = await change({
  model: new ElevenLabs().sts('eleven_english_sts_v2', 'charlotte', {
    remove_background_noise: true,
  }),
  audio: new File([], 'test.mp3', { type: 'audio/mp3' }),
});

Speech Isolation

The ElevenLabs provider provides a isl function that allows you to isolate the speech from the audio.

import { isolate } from 'orate';
import { ElevenLabs } from 'orate/elevenlabs';
 
const speech = await isolate({
  model: new ElevenLabs().isl(),
  audio: new File([], 'test.mp3', { type: 'audio/mp3' }),
});

You can also specify specific ElevenLabs properties by passing them as an argument to the isl function (though there are none right now).

const speech = await isolate({
  model: new ElevenLabs().isl({ }),
});

You can also stream the isolation.

const speech = await isolate({
  model: new ElevenLabs().isl(),
  audio: new File([], 'test.mp3', { type: 'audio/mp3' }),
  stream: true,
});