LiveKit quickstart
Build a real-time voice AI agent with Speechmatics and LiveKit in minutes.
LiveKit Agents is a framework for building voice AI applications using WebRTC. With the Speechmatics plugin, you get accurate speech recognition and natural text-to-speech for your voice agents.
Features
- Real-time transcription — Low-latency speech-to-text as users speak
- Speaker diarization — Identify and track multiple speakers
- Smart turn detection — Know when the user has finished speaking
- Natural TTS voices — Choose from multiple voice options
- Noise robustness — Accurate recognition in challenging audio environments
- Global language support — Works with diverse accents and dialects
Prerequisites
- Python 3.10+
- Speechmatics API key
- LiveKit Cloud account (free tier available)
- OpenAI API key (for the LLM)
Setup
This guide assumes LiveKit Cloud. If you want to self-host LiveKit instead, follow LiveKit's self-hosting guide and configure LIVEKIT_URL, LIVEKIT_API_KEY, and LIVEKIT_API_SECRET for your deployment: https://docs.livekit.io/transport/self-hosting/
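For a self-hosted deployment, the same three variables point at your own server instead of LiveKit Cloud. A sketch with placeholder values (your URL, key, and secret will differ):

```
LIVEKIT_URL=ws://your-server:7880
LIVEKIT_API_KEY=your_api_key
LIVEKIT_API_SECRET=your_api_secret
```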
1. Create project
mkdir voice-agent && cd voice-agent
2. Install dependencies
uv init
uv add "livekit-agents[speechmatics,openai,silero]==1.4.2" python-dotenv
3. Install and authenticate the LiveKit CLI
Install the LiveKit CLI. For additional installation options, see the LiveKit CLI setup guide: https://docs.livekit.io/home/cli/cli-setup/
macOS:
brew install livekit-cli
Linux:
curl -sSL https://get.livekit.io/cli | bash
Windows:
winget install LiveKit.LiveKitCLI
Authenticate and link your LiveKit Cloud project:
lk cloud auth
4. Configure environment
Use the LiveKit CLI to write your LiveKit Cloud credentials to a .env.local file:
lk app env -w
The first three values below are filled in for you. Add your Speechmatics and OpenAI keys so the file contains all five:
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=...
LIVEKIT_API_SECRET=...
SPEECHMATICS_API_KEY=your_speechmatics_key
OPENAI_API_KEY=your_openai_key
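Before moving on, it can help to confirm the file has every key the agent will read. The helper below is a minimal sketch, not part of the guide's code: the `check_env.py` name and `missing_keys` function are illustrative, and it parses the file with the standard library only.

```python
# check_env.py -- hypothetical helper; the name and logic are illustrative
from pathlib import Path

# The five keys main.py relies on
REQUIRED = {
    "LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET",
    "SPEECHMATICS_API_KEY", "OPENAI_API_KEY",
}

def missing_keys(path: str = ".env.local") -> set[str]:
    """Return the required keys that are absent or empty in the env file."""
    present = set()
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        # Skip blank lines and comments; keep only KEY=value pairs
        if line and not line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            if value.strip():
                present.add(key.strip())
    return REQUIRED - present
```

Calling `missing_keys()` returns an empty set when the file is complete, or the names of any keys still left to fill in.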
5. Create your agent
Create a main.py file:
from dotenv import load_dotenv
from livekit import agents
from livekit.agents import AgentSession, Agent, RoomInputOptions
from livekit.plugins import openai, silero, speechmatics
from livekit.plugins.speechmatics import TurnDetectionMode
load_dotenv(".env.local")
class VoiceAssistant(Agent):
    def __init__(self):
        super().__init__(
            instructions="You are a helpful voice assistant. Be concise and friendly."
        )

async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()

    # Speech-to-text: Speechmatics
    stt = speechmatics.STT(
        turn_detection_mode=TurnDetectionMode.SMART_TURN,
    )

    # Language model: OpenAI
    llm = openai.LLM(model="gpt-4o-mini")

    # Text-to-speech: Speechmatics
    tts = speechmatics.TTS()

    # Voice activity detection: Silero
    vad = silero.VAD.load()

    # Create and start the session
    session = AgentSession(
        stt=stt,
        llm=llm,
        tts=tts,
        vad=vad,
    )
    await session.start(
        room=ctx.room,
        agent=VoiceAssistant(),
        room_input_options=RoomInputOptions(),
    )
    await session.generate_reply(
        instructions="Say a short hello and ask how you can help."
    )

if __name__ == "__main__":
    agents.cli.run_app(
        agents.WorkerOptions(entrypoint_fnc=entrypoint),
    )
6. Run your agent
Run your agent in dev mode to connect it to LiveKit and make it available from anywhere on the internet:
python main.py dev
Open the LiveKit Agents Playground to test your agent.
Run your agent in console mode to speak to it locally in your terminal:
python main.py console
Next steps
- Speech to text — Configure diarization, turn detection, and more
- Text to speech — Choose voices and adjust settings
- Speechmatics Academy — Full working examples
- LiveKit deployment — Deploy to production