{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# LLM: ElevenLabs Text-To-Speech Endpoint Basic Examples\n", "\n", "This notebook demonstrates how to use the `ElevenLabsSpeechClient` in dapr-agents for basic tasks with the [ElevenLabs Text-To-Speech Endpoint](https://elevenlabs.io/docs/api-reference/text-to-speech/convert). We will explore:\n", "\n", "* Initializing the `ElevenLabsSpeechClient`.\n", "* Generating speech from text and saving it as an MP3 file." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Install Required Libraries\n", "\n", "Ensure you have the required libraries installed:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install dapr-agents python-dotenv elevenlabs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load Environment Variables" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from dotenv import load_dotenv\n", "load_dotenv()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Enable Logging" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import logging\n", "\n", "logging.basicConfig(level=logging.INFO)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Initialize ElevenLabsSpeechClient\n", "\n", "Initialize the `ElevenLabsSpeechClient`. 
By default, the voice is set to `voice_id=\"EXAVITQu4vr4xnSDxMaL\", name=\"Sarah\"`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from dapr_agents import ElevenLabsSpeechClient\n", "\n", "client = ElevenLabsSpeechClient(\n", " model=\"eleven_multilingual_v2\", # Default model\n", " voice=\"JBFqnCBsd6RMkjVDRZzb\" # 'name': 'George', 'language': 'en', 'labels': {'accent': 'British', 'description': 'warm', 'age': 'middle aged', 'gender': 'male', 'use_case': 'narration'}\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Generate Speech from Text\n", "\n", "### Manual File Creation\n", "\n", "This section demonstrates how to generate speech from a given text input and save it as an MP3 file." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define the text to convert to speech\n", "text = \"Hello Roberto! This is an example of text-to-speech generation.\"\n", "\n", "# Create speech from text\n", "audio_bytes = client.create_speech(\n", " text=text,\n", " output_format=\"mp3_44100_128\" # Default output format: MP3 with a 44.1 kHz sample rate at 128 kbps\n", ")\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Save the audio to an MP3 file\n", "output_path = \"output_speech.mp3\"\n", "with open(output_path, \"wb\") as audio_file:\n", " audio_file.write(audio_bytes)\n", "\n", "print(f\"Audio saved to {output_path}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Automatic File Creation\n", "\n", "Providing the `file_name` parameter saves the audio file directly." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define the text to convert to speech\n", "text = \"Hello Roberto! 
This is another example of text-to-speech generation.\"\n", "\n", "# Create speech from text\n", "client.create_speech(\n", " text=text,\n", " output_format=\"mp3_44100_128\", # Default output format: MP3 with a 44.1 kHz sample rate at 128 kbps\n", " file_name='output_speech_auto.mp3'\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.1" } }, "nbformat": 4, "nbformat_minor": 2 }