Getting started
Deployments: SaaS | Status: Early Access
Flow is our Conversational AI API that allows you to add responsive, real-time speech-to-speech interactions to any product.
Flow is engineered to engage in natural and fluid conversations by automatically handling interruptions, responding to multiple speakers, and understanding different dialects and accents.
This page shows you how to use the Flow Conversational AI API to build natural, intuitive voice interactions, with code examples you can copy and adapt.
Set Up
- Create an account on the Speechmatics On-Demand Portal.
- Navigate to Manage > API Keys page in the Speechmatics On-Demand Portal.
- Enter a name for your API key, then store the generated key somewhere safe, for example in an environment variable as sketched below.
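The examples on this page paste the API key directly into the code. A safer pattern is to read it from an environment variable. Below is a minimal Python sketch, assuming you have exported the key as SPEECHMATICS_API_KEY (an illustrative name, not one the portal prescribes):

import os

# Read the Speechmatics API key from the environment instead of hardcoding it.
# SPEECHMATICS_API_KEY is an assumed variable name; use whichever name you exported.
AUTH_TOKEN = os.environ.get("SPEECHMATICS_API_KEY")
if not AUTH_TOKEN:
    raise RuntimeError("Set SPEECHMATICS_API_KEY before running the examples")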
Code Examples
- CLI
- Python
- NodeJS
The Speechmatics Flow library and CLI can be found on GitHub and installed using pip:
pip3 install speechmatics-flow
Speak with Flow directly from your terminal. Just copy in your API key to get started!
speechmatics-flow --url wss://flow.api.speechmatics.com/v1/flow --ssl-mode insecure --auth-token $API_KEY -
The Speechmatics Flow library and CLI can be found on GitHub and installed using pip:
pip3 install speechmatics-flow
Speak with Flow using this Python code example. Just copy in your API key to get started! The script reads audio from stdin and plays the agent's replies through your speakers.
import asyncio
import io
import sys

import pyaudio

from speechmatics_flow.client import WebsocketClient
from speechmatics_flow.models import (
    ConnectionSettings,
    Interaction,
    AudioSettings,
    ConversationConfig,
    ServerMessageType,
)

AUTH_TOKEN = "Place your auth token here"

# Create a websocket client
client = WebsocketClient(
    ConnectionSettings(
        url="wss://flow.api.speechmatics.com/v1/flow",
        auth_token=AUTH_TOKEN,
    )
)

# Create a buffer to store binary messages sent from the server
audio_buffer = io.BytesIO()


# Create a callback function which adds binary messages to the audio buffer
def binary_msg_handler(msg: bytes):
    if isinstance(msg, (bytes, bytearray)):
        audio_buffer.write(msg)


# Register the callback, which will be called
# when the client receives an audio message from the server
client.add_event_handler(ServerMessageType.AddAudio, binary_msg_handler)


async def audio_playback(buffer):
    """Read from the buffer and play audio back to the user."""
    p = pyaudio.PyAudio()
    stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, output=True)
    try:
        while True:
            # Get the current value from the buffer
            audio_to_play = buffer.getvalue()
            # Only proceed if there is audio data to play
            if audio_to_play:
                # Write the audio to the stream
                stream.write(audio_to_play)
                buffer.seek(0)
                buffer.truncate(0)
            # Pause briefly before checking the buffer again
            await asyncio.sleep(0.05)
    finally:
        stream.stop_stream()
        stream.close()
        p.terminate()


async def main():
    tasks = [
        # Use the websocket to connect to the Flow Service and start a conversation
        asyncio.create_task(
            client.run(
                interactions=[Interaction(sys.stdin.buffer)],
                audio_settings=AudioSettings(),
                conversation_config=ConversationConfig(),
            )
        ),
        # Run the audio playback handler, which streams audio from the audio buffer
        asyncio.create_task(audio_playback(audio_buffer)),
    ]

    await asyncio.gather(*tasks)


asyncio.run(main())
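The example above passes an empty ConversationConfig(), which uses the default assistant. As a sketch, you can select a specific assistant template instead, mirroring the NodeJS example below; the field names here are assumptions based on the speechmatics_flow.models module, so check the version you have installed:

# A sketch: select a specific assistant template instead of the default.
# template_id / template_variables mirror the NodeJS example below; check
# speechmatics_flow.models for the exact fields in your installed version.
conversation_config = ConversationConfig(
    template_id="flow-service-assistant-amelia",
    template_variables={},
)

# Then pass it to the client:
# client.run(..., conversation_config=conversation_config)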
The Speechmatics Flow client can be found on GitHub along with a React client.
The Flow client can be installed using NPM:
npm i @speechmatics/flow-client
This library can be used in the browser and in backend runtimes like NodeJS and Bun! Check out our examples on GitHub.
Below is an example of how to use the Flow client SDK. For a more complete example, check out our NextJS sample app.
import { FlowClient, AgentAudioEvent } from "@speechmatics/flow-client";

const flowClient = new FlowClient('wss://flow.api.speechmatics.com', { appId: "example" });

async function fetchCredentials() {
  const resp = await fetch(
    'https://mp.speechmatics.com/v1/api_keys?type=flow',
    {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${process.env.API_KEY}`,
      },
      body: JSON.stringify({
        ttl: 3600,
      }),
    },
  );
  if (!resp.ok) {
    throw new Error('Bad response from API', { cause: resp });
  }
  // The temporary key (a JWT) is returned in the key_value field
  const { key_value } = await resp.json();
  return key_value;
}

// Name the handler so it can be removed later in onSessionEnd
const onAgentAudio = (audio: AgentAudioEvent) => {
  // audio.data is pcm_s16le data. How you play this depends on your environment
  myAudioPlayFunction(audio.data);
};
flowClient.addEventListener("agentAudio", onAgentAudio);

// Fetch a JWT for authentication, and start a conversation:
const jwt = await fetchCredentials();
await flowClient.startConversation(jwt, {
  config: {
    template_id: "flow-service-assistant-amelia",
    template_variables: {},
  },
  // Optional, this is the default
  audioFormat: {
    type: 'raw',
    encoding: 'pcm_s16le',
    sample_rate: 16000,
  },
});

// PCM audio can be sent to the server (either f32 or int16, depending on the audio_format defined above)
function onPCMAudio(audio: Int16Array) {
  flowClient.sendAudio(audio);
}

function onSessionEnd() {
  // Ends the conversation and closes the websocket
  flowClient.endConversation();
  // Event listeners can also be removed, like so:
  flowClient.removeEventListener("agentAudio", onAgentAudio);
}