dapr-agents/cookbook/llm/elevenlabs_speech_basic.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# LLM: ElevenLabs Text-To-Speech Endpoint Basic Examples\n",
    "\n",
    "This notebook demonstrates how to use the `ElevenLabsSpeechClient` in dapr-agents for basic tasks with the [ElevenLabs Text-To-Speech Endpoint](https://elevenlabs.io/docs/api-reference/text-to-speech/convert). We will explore:\n",
    "\n",
    "* Initializing the `ElevenLabsSpeechClient`.\n",
    "* Generating speech from text and saving it as an MP3 file.."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Install Required Libraries\n",
    "\n",
    "Ensure you have the required library installed:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install dapr-agents python-dotenv elevenlabs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load Environment Variables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from dotenv import load_dotenv\n",
    "load_dotenv()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Enable Logging"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import logging\n",
    "\n",
    "logging.basicConfig(level=logging.INFO)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Initialize ElevenLabsSpeechClient\n",
    "\n",
    "Initialize the `ElevenLabsSpeechClient`. By default the voice is set to: `voice_id=EXAVITQu4vr4xnSDxMaL\",name=\"Sarah\"`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from dapr_agents import ElevenLabsSpeechClient\n",
    "\n",
    "client = ElevenLabsSpeechClient(\n",
    "    model=\"eleven_multilingual_v2\", # Default model\n",
    "    voice=\"JBFqnCBsd6RMkjVDRZzb\" # 'name': 'George', 'language': 'en', 'labels': {'accent': 'British', 'description': 'warm', 'age': 'middle aged', 'gender': 'male', 'use_case': 'narration'}\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Generate Speech from Text\n",
    "\n",
    "### Manual File Creation\n",
    "\n",
    "This section demonstrates how to generate speech from a given text input and save it as an MP3 file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define the text to convert to speech\n",
    "text = \"Hello Roberto! This is an example of text-to-speech generation.\"\n",
    "\n",
    "# Create speech from text\n",
    "audio_bytes = client.create_speech(\n",
    "    text=text,\n",
    "    output_format=\"mp3_44100_128\" # default output format, mp3 with 44.1kHz sample rate at 128kbps.\n",
    ")\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Save the audio to an MP3 file\n",
    "output_path = \"output_speech.mp3\"\n",
    "with open(output_path, \"wb\") as audio_file:\n",
    "    audio_file.write(audio_bytes)\n",
    "\n",
    "print(f\"Audio saved to {output_path}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Automatic File Creation\n",
    "\n",
    "The audio file is saved directly by providing the file_name parameter."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define the text to convert to speech\n",
    "text = \"Hello Roberto! This is another example of text-to-speech generation.\"\n",
    "\n",
    "# Create speech from text\n",
    "client.create_speech(\n",
    "    text=text,\n",
    "    output_format=\"mp3_44100_128\", # default output format, mp3 with 44.1kHz sample rate at 128kbps.,\n",
    "    file_name='output_speech_auto.mp3'\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}