Build a conversational AI app with React Native and Flow
Learn how to create a mobile application that integrates the Speechmatics Flow service using React Native. This guide demonstrates how to build the app with the Expo framework, implementing real-time audio communication with Flow's servers.
Prerequisites
Before getting started, ensure you have:
- Node.js (LTS) installed on your system
- A development environment configured for local iOS (Xcode) and/or Android (Android Studio) builds
Project Setup
Start by creating a fresh Expo project:
npx create-expo-app@latest
To remove the example code and start with a clean slate:
npm run reset-project
This command preserves the example files by moving them to an 'app-example' directory while creating a new clean app directory. You can safely remove the 'app-example' directory if you don't need it for reference.
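After the reset, the project root should look roughly like this (a sketch; exact contents depend on the template version):
app/            # clean directory where our new code will live
app-example/    # the original starter code, safe to delete
package.json
...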
Essential Dependencies
Install the following packages to enable Flow integration and audio handling:
# React version of Flow client
npm i @speechmatics/flow-client-react
# Polyfill for the EventTarget class
npm i event-target-polyfill
# Expo native module to handle audio
npm i @speechmatics/expo-two-way-audio
Building the User Interface
Let's create a minimal user interface.
Start by clearing the app/ directory and creating a new index.tsx file with a basic UI structure:
import { useState } from "react";
import { Button, StyleSheet, View, Text } from "react-native";

export default function Index() {
  const [isConnecting, setIsConnecting] = useState(false);
  const [isConnected, setIsConnected] = useState(false);
  return (
    <View style={styles.container}>
      <Text>Talk to Flow!</Text>
      <Button
        title={isConnected ? "Disconnect" : "Connect"}
        disabled={isConnecting}
      />
    </View>
  );
}

const styles = StyleSheet.create({
  container: {
    flex: 1,
    justifyContent: "center",
    alignItems: "center",
  },
});
The view above just renders a button that will let us establish or close a connection with the Flow servers.
Let's run it in the simulator to see how it looks:
# For iOS simulator
npx expo run:ios
# For Android emulator
npx expo run:android
This will launch the Metro bundler and show a few options. If it reports Using Expo Go, we need to switch to a development build: press s to switch, then r to reload the app. Some features we are going to add, such as the native module for handling audio, don't work properly in Expo Go.
Implementing Flow Connection
It's time to add some functionality to this example. We'll start by implementing the connect and disconnect logic, using the @speechmatics/flow-client-react package.
The Flow client uses EventTarget, which is available in browsers but not in React Native, so we also need a polyfill: event-target-polyfill.
npm i @speechmatics/flow-client-react
npm i event-target-polyfill
# Just for the purpose of this example. See comment above `createSpeechmaticsJWT`
npm i @speechmatics/auth
// The polyfill should be the first import in the app
import "event-target-polyfill";
import { Button, StyleSheet, View, Text } from "react-native";
import { FlowProvider, useFlow } from "@speechmatics/flow-client-react";

import { createSpeechmaticsJWT } from "@speechmatics/auth";

export default function Index() {
  return (
    <FlowProvider
      appId="react-native-flow-guide"
      websocketBinaryType="arraybuffer"
    >
      <Flow />
    </FlowProvider>
  );
}

function Flow() {
  const { startConversation, endConversation, sendAudio, socketState } =
    useFlow();

  const obtainJwt = async () => {
    const apiKey = process.env.EXPO_PUBLIC_SPEECHMATICS_API_KEY;
    if (!apiKey) {
      throw new Error("API key not found");
    }
    // WARNING: This is just an example app.
    // In a real app you should obtain the JWT from your server.
    // For example, `createSpeechmaticsJWT` could be used on a server running JS.
    // Otherwise, you will expose your API key to the client.
    return await createSpeechmaticsJWT({
      type: "flow",
      apiKey,
      ttl: 60,
    });
  };

  const handleToggleConnect = async () => {
    if (socketState === "open") {
      endConversation();
    } else {
      try {
        const jwt = await obtainJwt();
        await startConversation(jwt, {
          config: {
            template_id: "flow-service-assistant-humphrey",
            template_variables: {
              timezone: "Europe/London",
            },
          },
        });
      } catch (e) {
        console.log("Error connecting to Flow: ", e);
      }
    }
  };

  return (
    <View style={styles.container}>
      <Text>Talk to Flow!</Text>
      <Button
        title={socketState === "open" ? "Disconnect" : "Connect"}
        disabled={socketState === "connecting" || socketState === "closing"}
        onPress={handleToggleConnect}
      />
    </View>
  );
}

const styles = StyleSheet.create({
  container: {
    flex: 1,
    justifyContent: "center",
    alignItems: "center",
  },
});
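The warning in obtainJwt is worth repeating: minting the JWT on the device keeps this example self-contained, but a production app should fetch the token from your own backend so the API key never ships with the client. Here is a minimal sketch of such an endpoint, assuming a Node server with Express (the /flow-jwt route and port are hypothetical):
// server.ts — hypothetical token endpoint; the Speechmatics API key stays server-side
import express from "express";
import { createSpeechmaticsJWT } from "@speechmatics/auth";

const app = express();

app.get("/flow-jwt", async (_req, res) => {
  try {
    const jwt = await createSpeechmaticsJWT({
      type: "flow",
      apiKey: process.env.SPEECHMATICS_API_KEY ?? "",
      ttl: 60, // token lifetime in seconds
    });
    res.json({ jwt });
  } catch {
    res.status(500).json({ error: "Could not create JWT" });
  }
});

app.listen(3000);
With that in place, the app's obtainJwt reduces to a fetch against your endpoint:
const obtainJwt = async () => {
  const res = await fetch("https://your-server.example/flow-jwt");
  const { jwt } = await res.json();
  return jwt;
};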
Audio Integration
The final step is implementing two-way audio communication. This involves three crucial tasks:
- Microphone input capture in PCM format
- Speaker output routing for Flow responses
- Acoustic Echo Cancellation (AEC) to prevent audio feedback
We'll use the Speechmatics Expo Two Way Audio module to handle these requirements efficiently:
npm i @speechmatics/expo-two-way-audio
In order to allow microphone access, we need to add some configuration to the app.json file in the root of our project. For iOS we add an infoPlist entry, and for Android a permissions entry. Since these are native settings, rebuild the development build afterwards (for example, rerun npx expo run:ios) for the changes to take effect.
{
"expo": {
...
"ios": {
"infoPlist": {
"NSMicrophoneUsageDescription": "Allow Speechmatics to access your microphone"
},
...
},
"android": {
"permissions": ["RECORD_AUDIO", "MODIFY_AUDIO_SETTINGS"],
...
}
}
...
}
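If your project defines its configuration in app.config.ts rather than app.json, the same settings can be expressed there. A minimal sketch (the name and slug values are placeholders):
// app.config.ts — equivalent native configuration for TypeScript-based configs
import type { ExpoConfig } from "expo/config";

const config: ExpoConfig = {
  name: "my-app",
  slug: "my-app",
  ios: {
    infoPlist: {
      NSMicrophoneUsageDescription:
        "Allow Speechmatics to access your microphone",
    },
  },
  android: {
    permissions: ["RECORD_AUDIO", "MODIFY_AUDIO_SETTINGS"],
  },
};

export default config;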
// The polyfill should be the first import in the whole app
import "event-target-polyfill";
import { useCallback, useEffect, useState } from "react";
import { Button, StyleSheet, View, Text } from "react-native";
import {
  FlowProvider,
  useFlow,
  useFlowEventListener,
} from "@speechmatics/flow-client-react";

import { createSpeechmaticsJWT } from "@speechmatics/auth";

import {
  type MicrophoneDataCallback,
  initialize,
  playPCMData,
  toggleRecording,
  useExpoTwoWayAudioEventListener,
  useIsRecording,
  useMicrophonePermissions,
} from "@speechmatics/expo-two-way-audio";

export default function Index() {
  const [micPermission, requestMicPermission] = useMicrophonePermissions();
  if (!micPermission?.granted) {
    return (
      <View style={styles.container}>
        <Text>Mic permission: {micPermission?.status}</Text>
        <Button
          title={
            micPermission?.canAskAgain
              ? "Request permission"
              : "Cannot request permissions"
          }
          disabled={!micPermission?.canAskAgain}
          onPress={requestMicPermission}
        />
      </View>
    );
  }
  return (
    <FlowProvider
      appId="react-native-flow-guide"
      websocketBinaryType="arraybuffer"
    >
      <Flow />
    </FlowProvider>
  );
}

function Flow() {
  const [audioInitialized, setAudioInitialized] = useState(false);
  const { startConversation, endConversation, sendAudio, socketState } =
    useFlow();
  const isRecording = useIsRecording();

  // Initialize Expo Two Way Audio
  useEffect(() => {
    const initializeAudio = async () => {
      await initialize();
      setAudioInitialized(true);
    };

    initializeAudio();
  }, []);

  // Set up a handler for the "agentAudio" event from the Flow API
  useFlowEventListener("agentAudio", (audio) => {
    // Even though Int16Array is a more natural representation for PCM16_sle,
    // the Expo Modules API uses a convertible type for arrays of bytes and needs Uint8Array on the JS side.
    // This is converted to a `Data` type in Swift and to a `kotlin.ByteArray` in Kotlin.
    // More info here: https://docs.expo.dev/modules/module-api/#convertibles
    // For this reason, the Expo Two Way Audio library requires a Uint8Array argument for the `playPCMData` function.
    const byteArray = new Uint8Array(audio.data.buffer);
    playPCMData(byteArray);
  });

  // Set up a handler for the "onMicrophoneData" event from the Expo Two Way Audio module
  useExpoTwoWayAudioEventListener(
    "onMicrophoneData",
    useCallback<MicrophoneDataCallback>(
      (event) => {
        // We send the audio bytes to the Flow API
        sendAudio(event.data.buffer);
      },
      [sendAudio],
    ),
  );

  const obtainJwt = async () => {
    const apiKey = process.env.EXPO_PUBLIC_SPEECHMATICS_API_KEY;
    if (!apiKey) {
      throw new Error("API key not found");
    }
    // WARNING: This is just an example app.
    // In a real app you should obtain the JWT from your server.
    // `createSpeechmaticsJWT` could be used on a server running JS.
    // Otherwise, you will expose your API key to the client.
    return await createSpeechmaticsJWT({
      type: "flow",
      apiKey,
      ttl: 60,
    });
  };

  const handleToggleConnect = async () => {
    if (socketState === "open") {
      endConversation();
    } else {
      try {
        const jwt = await obtainJwt();
        await startConversation(jwt, {
          config: {
            template_id: "flow-service-assistant-humphrey",
            template_variables: {
              timezone: "Europe/London",
            },
          },
        });
      } catch (e) {
        console.log("Error connecting to Flow: ", e);
      }
    }
  };

  // Handle presses on the 'Mute/Unmute' button
  const handleToggleMute = useCallback(() => {
    toggleRecording(!isRecording);
  }, [isRecording]);

  return (
    <View style={styles.container}>
      <Text>Talk to Flow!</Text>
      <Button
        title={socketState === "open" ? "Disconnect" : "Connect"}
        disabled={socketState === "connecting" || socketState === "closing"}
        onPress={handleToggleConnect}
      />
      <Button
        title={isRecording ? "Mute" : "Unmute"}
        disabled={socketState !== "open" || !audioInitialized}
        onPress={handleToggleMute}
      />
    </View>
  );
}

const styles = StyleSheet.create({
  container: {
    flex: 1,
    justifyContent: "center",
    alignItems: "center",
  },
});
Now our app can unmute the microphone and send audio samples to the Flow server, and audio messages coming back from the Flow server are played through the speaker.
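As written, nothing is streamed until the user taps Unmute. If you would rather start capturing as soon as a conversation opens and stop when it closes, a small effect inside the Flow component can drive toggleRecording from the socket state (a sketch using only the hooks already imported above):
// Optional: tie recording to the conversation lifecycle
useEffect(() => {
  if (!audioInitialized) return;
  // Start capturing when the socket opens; mute again when it closes
  toggleRecording(socketState === "open");
}, [socketState, audioInitialized]);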
Testing on Physical Devices
While simulators are great for initial testing, features like Acoustic Echo Cancellation only work properly on physical devices. To deploy to a real device, first configure your development environment for local builds with physical devices, then run one of the following commands:
# For iOS devices
npx expo run:ios --device --configuration Release
# For Android devices
npx expo run:android --device --variant release
Additional resources
Dive deeper into the tools used in this guide: