
Build a conversational AI app with React Native and Flow

Learn how to create a mobile application that integrates Speechmatics Flow service using React Native. This guide demonstrates how to build the app using the Expo framework, implementing real-time audio communication with Flow's servers.

Prerequisites

Before getting started, ensure you have:

  - A working React Native development environment with Expo
  - A Speechmatics account and API key

Project Setup

Start by creating a fresh Expo project:

npx create-expo-app@latest

To remove the example code and start with a clean slate:

npm run reset-project

This command preserves the example files by moving them to an 'app-example' directory while creating a new clean app directory. You can safely remove the 'app-example' directory if you don't need it for reference.

Essential Dependencies

Install the following packages to enable Flow integration and audio handling:

# React version of Flow client
npm i @speechmatics/flow-client-react

# Polyfill for the EventTarget class
npm i event-target-polyfill

# Expo native module to handle audio
npm i @speechmatics/expo-two-way-audio

Building the User Interface

Let's create a minimal user interface. Start by clearing the app/ directory and creating a new index.tsx file with a basic UI structure:

import { useState } from "react";
import { Button, StyleSheet, View, Text } from "react-native";

export default function Index() {
  const [isConnecting, setIsConnecting] = useState(false);
  const [isConnected, setIsConnected] = useState(false);
  return (
    <View style={styles.container}>
      <Text>Talk to Flow!</Text>
      <Button
        title={isConnected ? "Disconnect" : "Connect"}
        disabled={isConnecting}
      />
    </View>
  );
}

const styles = StyleSheet.create({
  container: {
    flex: 1,
    justifyContent: "center",
    alignItems: "center",
  },
});

The view above renders a single button that lets us open or close a connection to the Flow servers.

Let's run it in the simulator to see how it looks:

# For iOS simulator
npx expo run:ios

# For Android emulator
npx expo run:android

This launches the Metro Bundler and shows several options. If it reports "Using Expo Go", we need to switch to a development build: press s to switch, then press r to reload the app. Some features we are going to add, such as the native module for handling audio, don't work properly in Expo Go.

Implementing Flow Connection

It's time to add some functionality to this example. We'll start by implementing the connect and disconnect logic, using the @speechmatics/flow-client-react package.

The Flow client uses EventTarget, which is available in browsers but not in React Native. For that reason we need a polyfill: event-target-polyfill.
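To illustrate what the polyfill provides, here is a minimal sketch of the EventTarget API that the Flow client relies on. Node and browsers already supply this global; in React Native, importing event-target-polyfill before anything else installs the same behavior:

```typescript
// Minimal sketch of the EventTarget API the Flow client depends on.
// In React Native, `import "event-target-polyfill"` (as the very first
// import) provides this global; here we use the built-in one.
const target = new EventTarget();

let receivedType = "";
target.addEventListener("message", (event) => {
  // Flow events carry payloads; here we just record the event type.
  receivedType = event.type;
});

// dispatchEvent invokes listeners synchronously.
target.dispatchEvent(new Event("message"));
console.log(receivedType); // "message"
```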

npm i @speechmatics/flow-client-react
npm i event-target-polyfill

# Just for the purpose of this example. See comment above `createSpeechmaticsJWT`
npm i @speechmatics/auth

// The polyfill should be the first import in the app
import "event-target-polyfill";
import { Button, StyleSheet, View, Text } from "react-native";
import { FlowProvider, useFlow } from "@speechmatics/flow-client-react";

import { createSpeechmaticsJWT } from "@speechmatics/auth";

export default function Index() {
  return (
    <FlowProvider
      appId="react-native-flow-guide"
      websocketBinaryType="arraybuffer"
    >
      <Flow />
    </FlowProvider>
  );
}

function Flow() {
  const { startConversation, endConversation, sendAudio, socketState } =
    useFlow();

  const obtainJwt = async () => {
    const apiKey = process.env.EXPO_PUBLIC_SPEECHMATICS_API_KEY;
    if (!apiKey) {
      throw new Error("API key not found");
    }
    // WARNING: This is just an example app.
    // In a real app you should obtain the JWT from your server.
    // For example, `createSpeechmaticsJWT` could be used on a server running JS.
    // Otherwise, you will expose your API key to the client.
    return await createSpeechmaticsJWT({
      type: "flow",
      apiKey,
      ttl: 60,
    });
  };

  const handleToggleConnect = async () => {
    if (socketState === "open") {
      endConversation();
    } else {
      try {
        const jwt = await obtainJwt();
        await startConversation(jwt, {
          config: {
            template_id: "flow-service-assistant-humphrey",
            template_variables: {
              timezone: "Europe/London",
            },
          },
        });
      } catch (e) {
        console.log("Error connecting to Flow: ", e);
      }
    }
  };

  return (
    <View style={styles.container}>
      <Text>Talk to Flow!</Text>
      <Button
        title={socketState === "open" ? "Disconnect" : "Connect"}
        disabled={socketState === "connecting" || socketState === "closing"}
        onPress={handleToggleConnect}
      />
    </View>
  );
}

const styles = StyleSheet.create({
  container: {
    flex: 1,
    justifyContent: "center",
    alignItems: "center",
  },
});

Audio Integration

The final step is implementing two-way audio communication. This involves three crucial tasks:

  1. Microphone input capture in PCM format
  2. Speaker output routing for Flow responses
  3. Acoustic Echo Cancellation (AEC) to prevent audio feedback
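The Expo module below captures microphone input in the PCM16 little-endian format that Flow expects, so you don't need to encode audio yourself. As a sketch of what that encoding involves, here is how Float32 samples in the range [-1, 1] map to 16-bit little-endian bytes (floatToPcm16le is an illustrative helper of ours, not part of any SDK):

```typescript
// Hedged sketch: encode Float32 samples in [-1, 1] as PCM16 little-endian
// bytes (the "PCM16_sle" format mentioned in the Flow docs).
// `floatToPcm16le` is an illustrative name, not an SDK function.
function floatToPcm16le(samples: Float32Array): Uint8Array {
  const out = new Uint8Array(samples.length * 2);
  const view = new DataView(out.buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1], then scale to the signed 16-bit range.
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    const value = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(i * 2, value, true); // true = little-endian
  }
  return out;
}

const bytes = floatToPcm16le(new Float32Array([0, 1, -1]));
console.log(Array.from(bytes)); // [0, 0, 255, 127, 0, 128]
```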

We'll use the Speechmatics Expo Two Way Audio module to handle these requirements efficiently:

npm i @speechmatics/expo-two-way-audio

To allow microphone access, we need to add some configuration to the app.json file at the root of the project: an infoPlist entry for iOS and a permissions entry for Android.

{
  "expo": {
    ...
    "ios": {
      ...
      "infoPlist": {
        "NSMicrophoneUsageDescription": "Allow Speechmatics to access your microphone"
      }
    },
    "android": {
      ...
      "permissions": ["RECORD_AUDIO", "MODIFY_AUDIO_SETTINGS"]
    }
  }
}

// The polyfill should be the first import in the whole app
import "event-target-polyfill";
import { useCallback, useEffect, useState } from "react";
import { Button, StyleSheet, View, Text } from "react-native";
import {
  FlowProvider,
  useFlow,
  useFlowEventListener,
} from "@speechmatics/flow-client-react";

import { createSpeechmaticsJWT } from "@speechmatics/auth";

import {
  type MicrophoneDataCallback,
  initialize,
  playPCMData,
  toggleRecording,
  useExpoTwoWayAudioEventListener,
  useIsRecording,
  useMicrophonePermissions,
} from "@speechmatics/expo-two-way-audio";

export default function Index() {
  const [micPermission, requestMicPermission] = useMicrophonePermissions();
  if (!micPermission?.granted) {
    return (
      <View style={styles.container}>
        <Text>Mic permission: {micPermission?.status}</Text>
        <Button
          title={
            micPermission?.canAskAgain
              ? "Request permission"
              : "Cannot request permissions"
          }
          disabled={!micPermission?.canAskAgain}
          onPress={requestMicPermission}
        />
      </View>
    );
  }
  return (
    <FlowProvider
      appId="react-native-flow-guide"
      websocketBinaryType="arraybuffer"
    >
      <Flow />
    </FlowProvider>
  );
}

function Flow() {
  const [audioInitialized, setAudioInitialized] = useState(false);
  const { startConversation, endConversation, sendAudio, socketState } =
    useFlow();
  const isRecording = useIsRecording();

  // Initialize Expo Two Way Audio
  useEffect(() => {
    const initializeAudio = async () => {
      await initialize();
      setAudioInitialized(true);
    };

    initializeAudio();
  }, []);

  // Set up a handler for the "agentAudio" event from the Flow API
  useFlowEventListener("agentAudio", (audio) => {
    // Even though Int16Array is a more natural representation for PCM16_sle,
    // the Expo Modules API uses a convertible type for arrays of bytes and needs Uint8Array on the JS side.
    // This is converted to a `Data` type in Swift and to a `kotlin.ByteArray` in Kotlin.
    // More info here: https://docs.expo.dev/modules/module-api/#convertibles
    // For this reason, the Expo Two Way Audio library requires a Uint8Array argument for the `playPCMData` function.
    const byteArray = new Uint8Array(audio.data.buffer);
    playPCMData(byteArray);
  });

  // Set up a handler for the "onMicrophoneData" event from the Expo Two Way Audio module
  useExpoTwoWayAudioEventListener(
    "onMicrophoneData",
    useCallback<MicrophoneDataCallback>(
      (event) => {
        // Send the audio bytes to the Flow API
        sendAudio(event.data.buffer);
      },
      [sendAudio],
    ),
  );

  const obtainJwt = async () => {
    const apiKey = process.env.EXPO_PUBLIC_SPEECHMATICS_API_KEY;
    if (!apiKey) {
      throw new Error("API key not found");
    }
    // WARNING: This is just an example app.
    // In a real app you should obtain the JWT from your server.
    // `createSpeechmaticsJWT` could be used on a server running JS.
    // Otherwise, you will expose your API key to the client.
    return await createSpeechmaticsJWT({
      type: "flow",
      apiKey,
      ttl: 60,
    });
  };

  const handleToggleConnect = async () => {
    if (socketState === "open") {
      endConversation();
    } else {
      try {
        const jwt = await obtainJwt();
        await startConversation(jwt, {
          config: {
            template_id: "flow-service-assistant-humphrey",
            template_variables: {
              timezone: "Europe/London",
            },
          },
        });
      } catch (e) {
        console.log("Error connecting to Flow: ", e);
      }
    }
  };

  // Handle presses on the 'Mute/Unmute' button
  const handleToggleMute = useCallback(() => {
    toggleRecording(!isRecording);
  }, [isRecording]);

  return (
    <View style={styles.container}>
      <Text>Talk to Flow!</Text>
      <Button
        title={socketState === "open" ? "Disconnect" : "Connect"}
        disabled={socketState === "connecting" || socketState === "closing"}
        onPress={handleToggleConnect}
      />
      <Button
        title={isRecording ? "Mute" : "Unmute"}
        disabled={socketState !== "open" || !audioInitialized}
        onPress={handleToggleMute}
      />
    </View>
  );
}

const styles = StyleSheet.create({
  container: {
    flex: 1,
    justifyContent: "center",
    alignItems: "center",
  },
});

Now our app can unmute the microphone to send audio samples to the Flow server. Audio messages coming back from the Flow server are played through the speaker.
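Note that the Uint8Array conversion in the agentAudio handler above is a byte-level view over the same buffer, not a copy. A small sketch of that relationship (typed array views use the platform's native byte order, which is little-endian on virtually all mobile hardware):

```typescript
// PCM16 samples as 16-bit signed integers.
const samples = new Int16Array([1000, -2000, 3000]);

// View the same memory as raw bytes: no copying, 2 bytes per sample.
const bytes = new Uint8Array(samples.buffer);
console.log(bytes.length); // 6

// Round trip: rebuilding an Int16Array over the bytes recovers the samples.
const roundTrip = new Int16Array(bytes.buffer);
console.log(Array.from(roundTrip)); // [1000, -2000, 3000]
```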

Testing on Physical Devices

While simulators are great for initial testing, features like Acoustic Echo Cancellation (AEC) require a physical device to work properly. To deploy to real devices, first configure your development environment for local builds on physical devices:

Then run one of the following commands:

# For iOS devices
npx expo run:ios --device --configuration Release

# For Android devices
npx expo run:android --device --variant release

Additional resources

Dive deeper into the tools used in this guide:

Speechmatics JS SDK

Expo Two Way Audio