Pipecat integration
Learn how to integrate Speechmatics STT with Pipecat.
Pipecat is an open-source framework for building voice agents. With Speechmatics STT integrated into Pipecat, you can build real-time voice and multimodal conversational agents tailored to your needs.
Pipecat is perfect for:
- Voice AI: Voice assistants, chatbots, and IVR systems
- Transcription: Realtime transcription of live events or media
- Accessibility applications: Screen readers and assistive technologies
- Content creation: Podcasts, dubbing, audiobooks, and voice-overs
- Media production: News broadcasts and automated announcements
Features
- Realtime transcription: low-latency speech-to-text for responsive agents
- Speaker diarization: track who’s speaking in multi-participant sessions
- Turn detection: capture natural speech boundaries automatically
- Noise robustness: maintain accuracy in challenging environments
- Custom vocabularies: boost recognition for domain-specific terms
- Flexible deployment: use on-device, cloud, or hybrid Pipecat setups
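To illustrate what speaker diarization output makes possible, here is a minimal sketch that merges consecutive segments from the same speaker into one turn. The `Segment` class and `merge_turns` function are hypothetical and exist only for this illustration; they are not Pipecat or Speechmatics types, and the real result objects may be shaped differently.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Segment:
    """Hypothetical transcript segment; not a Pipecat or Speechmatics type."""
    speaker: str
    text: str


def merge_turns(segments):
    """Join consecutive segments from the same speaker into a single turn."""
    return [
        (speaker, " ".join(s.text for s in group))
        for speaker, group in groupby(segments, key=lambda s: s.speaker)
    ]


segments = [
    Segment("S1", "Hello"),
    Segment("S1", "there."),
    Segment("S2", "Hi."),
]
for speaker, text in merge_turns(segments):
    print(f"{speaker}: {text}")
```

`groupby` only merges adjacent items, so a speaker who talks again later starts a new turn rather than being folded into an earlier one.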
Quickstart
Requirements
- Python 3.10 or later
- uv package manager installed
- Pipecat >= 1.2
- Speechmatics account. You can create one here.
- Speechmatics API key. You can generate one in the Portal.
Installation
pip install "pipecat-ai[speechmatics]"
Usage
Set the environment variable SPEECHMATICS_API_KEY to your Speechmatics API key.
export SPEECHMATICS_API_KEY=your_api_key
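If the variable is missing, reading it with `os.environ[...]` raises a bare `KeyError`. The following is a minimal sketch of a friendlier check using only the standard library. `load_api_key` is a hypothetical helper introduced here, not part of Pipecat or any Speechmatics package.

```python
import os


def load_api_key(env=None, name="SPEECHMATICS_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper: `env` defaults to os.environ and can be any
    mapping, which keeps the function easy to test.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before starting the agent.")
    return key
```

Calling `load_api_key()` at startup makes a missing or empty key fail early, with a message that names the variable.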
import asyncio
import os

from pipecat.services.speechmatics import SpeechmaticsSTTService


async def main():
    # Create the STT service with the API key from the environment.
    stt = SpeechmaticsSTTService(
        api_key=os.environ["SPEECHMATICS_API_KEY"],
    )

    async def audio_stream():
        # Replace with your real audio source.
        # Async generators cannot use "yield from", so loop explicitly.
        for chunk in [b"fake_audio_chunk_1", b"fake_audio_chunk_2"]:
            yield chunk

    # Print each transcript result, labelled by speaker when available.
    async for result in stt.transcribe(audio_stream()):
        speaker = f"Speaker {result.speaker}" if result.speaker else "Unknown"
        print(f"{speaker}: {result.text}")


if __name__ == "__main__":
    asyncio.run(main())
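The placeholder `audio_stream` above yields fake byte strings. One way to replace it for local testing is to read a WAV file in small chunks, using only the standard library. This is a sketch under assumptions: `wav_chunks` is a hypothetical helper, and the sample rate and encoding the service expects are not stated here, so check them against your own configuration.

```python
import asyncio
import wave


async def wav_chunks(path, chunk_ms=20):
    """Yield raw PCM frames from a WAV file in chunks of roughly chunk_ms."""
    with wave.open(path, "rb") as wav:
        # Number of audio frames that cover chunk_ms at this sample rate.
        frames_per_chunk = max(1, wav.getframerate() * chunk_ms // 1000)
        while True:
            data = wav.readframes(frames_per_chunk)
            if not data:
                break
            yield data
            # Hand control back to the event loop between chunks.
            await asyncio.sleep(0)
```

The generator does not pace itself to real time. To simulate a live source, sleep for `chunk_ms / 1000` seconds between chunks instead of `0`.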
For detailed examples, please see the Speechmatics Academy.