Text-to-speech with multiple providers
## Multi-Provider TTS MCP Server: Text-to-Speech Synthesis

The **Multi-Provider TTS MCP Server** integrates multiple text-to-speech engines into Google Antigravity, offering flexible voice synthesis across providers like ElevenLabs, Amazon Polly, Google Cloud, Azure, and local models.

### Why Multi-Provider TTS MCP?

- **Provider Flexibility**: Switch between TTS providers based on quality, cost, and latency needs
- **Voice Variety**: Access hundreds of voices across different providers and languages
- **Cost Optimization**: Route requests to cheaper providers for bulk generation
- **Fallback Support**: Automatic failover to alternative providers if the primary is unavailable
- **Local Options**: Include offline TTS engines for privacy-sensitive applications

### Key Features

#### 1. Voice Synthesis

```python
# `mcp` is assumed to be an already-connected client for this server.
# Generate speech from text
audio = await mcp.synthesize(
    text="Welcome to our application!",
    provider="elevenlabs",
    voice="rachel",
    output_format="mp3"
)

# Save audio file
with open("welcome.mp3", "wb") as f:
    f.write(audio)
```

#### 2. Voice Selection

```python
# List available voices across providers
voices = await mcp.list_voices(
    provider="all",  # or a specific provider
    language="en-US"
)

for voice in voices:
    print(f"{voice['provider']}: {voice['name']} - {voice['description']}")
```

#### 3. SSML Support

```python
# Advanced speech control with SSML
ssml_text = """
<speak>
    <prosody rate="slow" pitch="+2st">
        Welcome to the tutorial.
    </prosody>
    <break time="500ms"/>
    Let us begin.
</speak>
"""

audio = await mcp.synthesize(
    text=ssml_text,
    provider="azure",
    voice="en-US-JennyNeural",
    format="ssml"
)
```

#### 4. Streaming Output

```python
# Stream audio for real-time playback.
# `long_article_text` and `audio_player` are assumed to be defined by the caller.
async for chunk in mcp.stream_synthesis(
    text=long_article_text,
    provider="google",
    voice="en-US-Neural2-F",
    chunk_size=4096
):
    audio_player.play(chunk)
```

### Configuration

```json
{
  "mcpServers": {
    "tts-multi": {
      "command": "npx",
      "args": ["-y", "@anthropic/mcp-tts-multi"],
      "env": {
        "ELEVENLABS_API_KEY": "your-elevenlabs-key",
        "AWS_ACCESS_KEY_ID": "your-aws-key",
        "AWS_SECRET_ACCESS_KEY": "your-aws-secret",
        "GOOGLE_APPLICATION_CREDENTIALS": "/path/to/credentials.json",
        "AZURE_SPEECH_KEY": "your-azure-key",
        "DEFAULT_PROVIDER": "elevenlabs"
      }
    }
  }
}
```

### Use Cases

**Content Narration**: Generate professional voiceovers for videos, e-learning courses, and podcasts with natural-sounding voices.

**Accessibility Features**: Add text-to-speech capabilities to applications for visually impaired users with customizable voice options.

**Notification Systems**: Create dynamic voice alerts and announcements for IoT devices, kiosks, and other alerting systems.

**Multilingual Support**: Generate audio content in multiple languages using native-speaking voices for global audiences.

The Multi-Provider TTS MCP enables flexible, high-quality voice synthesis across providers within your development workflow.
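
The fallback and cost-routing behavior listed under "Why Multi-Provider TTS MCP?" can be illustrated with a small client-side sketch. This is not the server's implementation: `synthesize_with_fallback`, `AllProvidersFailed`, and `fake_synthesize` are hypothetical names, and the `(text, provider)` call signature is an assumption made for the example. The idea is only that an ordered provider list is tried in turn, so placing the cheapest provider first gives cost-based routing with failover.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Sequence, Tuple

# Hypothetical signature for illustration: (text, provider) -> audio bytes.
SynthesizeFn = Callable[[str, str], Awaitable[bytes]]


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed."""


async def synthesize_with_fallback(
    synthesize: SynthesizeFn,
    text: str,
    providers: Sequence[str],
    timeout_s: float = 10.0,
) -> Tuple[str, bytes]:
    """Try each provider in order; return (provider, audio) from the first success."""
    errors: Dict[str, Exception] = {}
    for provider in providers:
        try:
            # A slow provider counts as a failure once the timeout expires.
            audio = await asyncio.wait_for(synthesize(text, provider), timeout_s)
            return provider, audio
        except Exception as exc:  # includes timeouts and connection errors
            errors[provider] = exc
    raise AllProvidersFailed(f"all providers failed: {errors}")


async def fake_synthesize(text: str, provider: str) -> bytes:
    """Stand-in backend for the demo: 'elevenlabs' is simulated as unavailable."""
    if provider == "elevenlabs":
        raise ConnectionError("simulated outage")
    return f"{provider}:{text}".encode()


def demo() -> Tuple[str, bytes]:
    """Run the fallback chain against the stand-in backend."""
    return asyncio.run(
        synthesize_with_fallback(fake_synthesize, "Hello", ["elevenlabs", "polly", "local"])
    )
```

In real use, `synthesize` would wrap whatever call the client exposes for this server, and the order of `providers` encodes the preference (quality first for narration, cheapest first for bulk jobs).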

A minimal server entry using the `mcp-tts` package:

```json
{
  "mcpServers": {
    "tts": {
      "command": "npx",
      "args": ["-y", "mcp-tts"],
      "env": {
        "OPENAI_API_KEY": "YOUR_KEY",
        "ELEVENLABS_API_KEY": "YOUR_KEY"
      }
    }
  }
}
```
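
Before launching the server, it can help to confirm that no placeholder credentials remain in the configuration. The sketch below is one possible check, not part of the server: `find_placeholder_env` and the `PLACEHOLDER_MARKERS` list are assumptions chosen to match the placeholder styles used in the examples on this page (`YOUR_KEY`, `your-...-key`, `/path/to/...`), and the second key in `EXAMPLE` is a dummy value.

```python
import json
from typing import Dict, List

# Assumed placeholder patterns, based on the sample values shown in this document.
PLACEHOLDER_MARKERS = ("YOUR_", "your-", "/path/to/")


def find_placeholder_env(config_text: str) -> Dict[str, List[str]]:
    """Map each server name to the env variables that still hold placeholder values."""
    config = json.loads(config_text)
    report: Dict[str, List[str]] = {}
    for name, server in config.get("mcpServers", {}).items():
        env = server.get("env", {})
        unset = [
            key
            for key, value in env.items()
            # Empty, non-string, or placeholder-looking values are all flagged.
            if not isinstance(value, str)
            or not value
            or any(marker in value for marker in PLACEHOLDER_MARKERS)
        ]
        if unset:
            report[name] = sorted(unset)
    return report


# Dummy configuration for the demo; "not-a-real-key" stands in for a filled-in value.
EXAMPLE = """
{
  "mcpServers": {
    "tts": {
      "command": "npx",
      "args": ["-y", "mcp-tts"],
      "env": {
        "OPENAI_API_KEY": "YOUR_KEY",
        "ELEVENLABS_API_KEY": "not-a-real-key"
      }
    }
  }
}
"""

if __name__ == "__main__":
    print(find_placeholder_env(EXAMPLE))
```

Providers whose keys are reported here can be left out of a fallback chain until real credentials are supplied.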