{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# LLM: ElevenLabs Text-To-Speech Endpoint Basic Examples\n",
"\n",
"This notebook demonstrates how to use the `ElevenLabsSpeechClient` in dapr-agents for basic tasks with the [ElevenLabs Text-To-Speech Endpoint](https://elevenlabs.io/docs/api-reference/text-to-speech/convert). We will explore:\n",
"\n",
"* Initializing the `ElevenLabsSpeechClient`.\n",
"* Generating speech from text and saving it as an MP3 file.."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Install Required Libraries\n",
"\n",
"Ensure you have the required library installed:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install dapr-agents python-dotenv elevenlabs"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load Environment Variables"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from dotenv import load_dotenv\n",
"load_dotenv()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Enable Logging"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import logging\n",
"\n",
"logging.basicConfig(level=logging.INFO)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialize ElevenLabsSpeechClient\n",
"\n",
"Initialize the `ElevenLabsSpeechClient`. By default the voice is set to: `voice_id=EXAVITQu4vr4xnSDxMaL\",name=\"Sarah\"`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from dapr_agents import ElevenLabsSpeechClient\n",
"\n",
"client = ElevenLabsSpeechClient(\n",
" model=\"eleven_multilingual_v2\", # Default model\n",
" voice=\"JBFqnCBsd6RMkjVDRZzb\" # 'name': 'George', 'language': 'en', 'labels': {'accent': 'British', 'description': 'warm', 'age': 'middle aged', 'gender': 'male', 'use_case': 'narration'}\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Generate Speech from Text\n",
"\n",
"### Manual File Creation\n",
"\n",
"This section demonstrates how to generate speech from a given text input and save it as an MP3 file."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Define the text to convert to speech\n",
"text = \"Hello Roberto! This is an example of text-to-speech generation.\"\n",
"\n",
"# Create speech from text\n",
"audio_bytes = client.create_speech(\n",
" text=text,\n",
" output_format=\"mp3_44100_128\" # default output format, mp3 with 44.1kHz sample rate at 128kbps.\n",
")\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Save the audio to an MP3 file\n",
"output_path = \"output_speech.mp3\"\n",
"with open(output_path, \"wb\") as audio_file:\n",
" audio_file.write(audio_bytes)\n",
"\n",
"print(f\"Audio saved to {output_path}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Automatic File Creation\n",
"\n",
"The audio file is saved directly by providing the file_name parameter."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Define the text to convert to speech\n",
"text = \"Hello Roberto! This is another example of text-to-speech generation.\"\n",
"\n",
"# Create speech from text\n",
"client.create_speech(\n",
" text=text,\n",
" output_format=\"mp3_44100_128\", # default output format, mp3 with 44.1kHz sample rate at 128kbps.,\n",
" file_name='output_speech_auto.mp3'\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}