
Build a conversational AI app with React Native and Flow

Learn how to create a mobile application that integrates Speechmatics Flow service using React Native. This guide demonstrates how to build the app using the Expo framework, implementing real-time audio communication with Flow's servers.

Prerequisites

Before getting started, ensure you have:

  - A working React Native development environment with Expo
  - A Speechmatics account and API key

Project Setup

Start by creating a fresh Expo project:

npx create-expo-app@latest

To remove the example code and start with a clean slate:

npm run reset-project

This command preserves the example files by moving them to an 'app-example' directory while creating a new clean app directory. You can safely remove the 'app-example' directory if you don't need it for reference.

Essential Dependencies

Install the following packages to enable Flow integration and audio handling:

# React version of Flow client
npm i @speechmatics/flow-client-react

# Polyfill for the EventTarget class
npm i event-target-polyfill

# Expo native module to handle audio
npm i @speechmatics/expo-two-way-audio

Building the User Interface

Let's create a minimal user interface. Start by clearing the app/ directory and creating a new index.tsx file with a basic UI structure:

import { useState } from "react";
import { Button, StyleSheet, View, Text } from "react-native";

export default function Index() {
  const [isConnecting, setIsConnecting] = useState(false);
  const [isConnected, setIsConnected] = useState(false);
  return (
    <View style={styles.container}>
      <Text>Talk to Flow!</Text>
      <Button
        title={isConnected ? "Disconnect" : "Connect"}
        disabled={isConnecting}
      />
    </View>
  );
}

const styles = StyleSheet.create({
  container: {
    flex: 1,
    justifyContent: "center",
    alignItems: "center",
  },
});

The view above renders a single button that lets us open or close a connection to the Flow servers.

Let's run it in the simulator to see how it looks:

# For iOS simulator
npx expo run:ios

# For Android emulator
npx expo run:android

This launches the Metro Bundler and shows several options. If it reports "Using Expo Go", we need to switch to a development build: press s to switch, then press r to reload the app. Some features we are going to add, such as the native module for handling audio, don't work properly in Expo Go.

Implementing Flow Connection

It's time to add some functionality to this example. We'll start by implementing the connect and disconnect logic, using the @speechmatics/flow-client-react package.

The Flow client uses EventTarget, which is available in browsers but not in React Native. For that reason we need a polyfill: event-target-polyfill.
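To illustrate what the polyfill provides, here is a minimal sketch of the EventTarget API that the Flow client relies on. Node and browsers already supply this global; in React Native, importing event-target-polyfill before anything else installs the same behavior:

```typescript
// Minimal sketch of the EventTarget API the Flow client depends on.
// In React Native, `import "event-target-polyfill"` (as the very first
// import) provides this global; here we use the built-in one.
const target = new EventTarget();

let receivedType = "";
target.addEventListener("message", (event) => {
  // Flow events carry payloads; here we just record the event type.
  receivedType = event.type;
});

// dispatchEvent invokes listeners synchronously.
target.dispatchEvent(new Event("message"));
console.log(receivedType); // "message"
```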

npm i @speechmatics/flow-client-react
npm i event-target-polyfill

# Just for the purpose of this example. See comment above `createSpeechmaticsJWT`
npm i @speechmatics/auth

// The polyfill should be the first import in the app
import "event-target-polyfill";
import { Button, StyleSheet, View, Text } from "react-native";
import { FlowProvider, useFlow } from "@speechmatics/flow-client-react";

import { createSpeechmaticsJWT } from "@speechmatics/auth";

export default function Index() {
  return (
    <FlowProvider
      appId="react-native-flow-guide"
      websocketBinaryType="arraybuffer"
    >
      <Flow />
    </FlowProvider>
  );
}

function Flow() {
  const { startConversation, endConversation, sendAudio, socketState } =
    useFlow();

  const obtainJwt = async () => {
    const apiKey = process.env.EXPO_PUBLIC_SPEECHMATICS_API_KEY;
    if (!apiKey) {
      throw new Error("API key not found");
    }
    // WARNING: This is just an example app.
    // In a real app you should obtain the JWT from your server.
    // For example, `createSpeechmaticsJWT` could be used on a server running JS.
    // Otherwise, you will expose your API key to the client.
    return await createSpeechmaticsJWT({
      type: "flow",
      apiKey,
      ttl: 60,
    });
  };

  const handleToggleConnect = async () => {
    if (socketState === "open") {
      endConversation();
    } else {
      try {
        const jwt = await obtainJwt();
        await startConversation(jwt, {
          config: {
            template_id: "flow-service-assistant-humphrey",
            template_variables: {
              timezone: "Europe/London",
            },
          },
        });
      } catch (e) {
        console.log("Error connecting to Flow: ", e);
      }
    }
  };

  return (
    <View style={styles.container}>
      <Text>Talk to Flow!</Text>
      <Button
        title={socketState === "open" ? "Disconnect" : "Connect"}
        disabled={socketState === "connecting" || socketState === "closing"}
        onPress={handleToggleConnect}
      />
    </View>
  );
}

const styles = StyleSheet.create({
  container: {
    flex: 1,
    justifyContent: "center",
    alignItems: "center",
  },
});

Audio Integration

The final step is implementing two-way audio communication. This involves three crucial tasks:

  1. Microphone input capture in PCM format
  2. Speaker output routing for Flow responses
  3. Acoustic Echo Cancellation (AEC) to prevent audio feedback
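The Expo module below captures microphone input in the PCM16 little-endian format that Flow expects, so you don't need to encode audio yourself. As a sketch of what that encoding involves, here is how Float32 samples in the range [-1, 1] map to 16-bit little-endian bytes (floatToPcm16le is an illustrative helper of ours, not part of any SDK):

```typescript
// Hedged sketch: encode Float32 samples in [-1, 1] as PCM16 little-endian
// bytes (the "PCM16_sle" format mentioned in the Flow docs).
// `floatToPcm16le` is an illustrative name, not an SDK function.
function floatToPcm16le(samples: Float32Array): Uint8Array {
  const out = new Uint8Array(samples.length * 2);
  const view = new DataView(out.buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1], then scale to the signed 16-bit range.
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    const value = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(i * 2, value, true); // true = little-endian
  }
  return out;
}

const bytes = floatToPcm16le(new Float32Array([0, 1, -1]));
console.log(Array.from(bytes)); // [0, 0, 255, 127, 0, 128]
```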

We'll use the Speechmatics Expo Two Way Audio module to handle these requirements efficiently:

npm i @speechmatics/expo-two-way-audio

To allow microphone access, we need to add some configuration to the app.json file at the root of the project: an infoPlist entry for iOS and a permissions entry for Android.

{
  "expo": {
    ...
    "ios": {
      ...
      "infoPlist": {
        "NSMicrophoneUsageDescription": "Allow Speechmatics to access your microphone"
      }
    },
    "android": {
      ...
      "permissions": ["RECORD_AUDIO", "MODIFY_AUDIO_SETTINGS"]
    }
  }
}

// The polyfill should be the first import in the whole app
import "event-target-polyfill";
import { useCallback, useEffect, useState } from "react";
import { Button, StyleSheet, View, Text } from "react-native";
import {
  FlowProvider,
  useFlow,
  useFlowEventListener,
} from "@speechmatics/flow-client-react";

import { createSpeechmaticsJWT } from "@speechmatics/auth";

import {
  type MicrophoneDataCallback,
  initialize,
  playPCMData,
  toggleRecording,
  useExpoTwoWayAudioEventListener,
  useIsRecording,
  useMicrophonePermissions,
} from "@speechmatics/expo-two-way-audio";

export default function Index() {
  const [micPermission, requestMicPermission] = useMicrophonePermissions();
  if (!micPermission?.granted) {
    return (
      <View style={styles.container}>
        <Text>Mic permission: {micPermission?.status}</Text>
        <Button
          title={
            micPermission?.canAskAgain
              ? "Request permission"
              : "Cannot request permissions"
          }
          disabled={!micPermission?.canAskAgain}
          onPress={requestMicPermission}
        />
      </View>
    );
  }
  return (
    <FlowProvider
      appId="react-native-flow-guide"
      websocketBinaryType="arraybuffer"
    >
      <Flow />
    </FlowProvider>
  );
}

function Flow() {
  const [audioInitialized, setAudioInitialized] = useState(false);
  const { startConversation, endConversation, sendAudio, socketState } =
    useFlow();
  const isRecording = useIsRecording();

  // Initialize Expo Two Way Audio
  useEffect(() => {
    const initializeAudio = async () => {
      await initialize();
      setAudioInitialized(true);
    };

    initializeAudio();
  }, []);

  // Set up a handler for the "agentAudio" event from the Flow API
  useFlowEventListener("agentAudio", (audio) => {
    // Even though Int16Array is a more natural representation for PCM16_sle,
    // the Expo Modules API uses a convertible type for arrays of bytes and needs Uint8Array on the JS side.
    // This is converted to a `Data` type in Swift and to a `kotlin.ByteArray` in Kotlin.
    // More info here: https://docs.expo.dev/modules/module-api/#convertibles
    // For this reason, the Expo Two Way Audio library requires a Uint8Array argument for the `playPCMData` function.
    const byteArray = new Uint8Array(audio.data.buffer);
    playPCMData(byteArray);
  });

  // Set up a handler for the "onMicrophoneData" event from the Expo Two Way Audio module
  useExpoTwoWayAudioEventListener(
    "onMicrophoneData",
    useCallback<MicrophoneDataCallback>(
      (event) => {
        // Send the audio bytes to the Flow API
        sendAudio(event.data.buffer);
      },
      [sendAudio],
    ),
  );

  const obtainJwt = async () => {
    const apiKey = process.env.EXPO_PUBLIC_SPEECHMATICS_API_KEY;
    if (!apiKey) {
      throw new Error("API key not found");
    }
    // WARNING: This is just an example app.
    // In a real app you should obtain the JWT from your server.
    // `createSpeechmaticsJWT` could be used on a server running JS.
    // Otherwise, you will expose your API key to the client.
    return await createSpeechmaticsJWT({
      type: "flow",
      apiKey,
      ttl: 60,
    });
  };

  const handleToggleConnect = async () => {
    if (socketState === "open") {
      endConversation();
    } else {
      try {
        const jwt = await obtainJwt();
        await startConversation(jwt, {
          config: {
            template_id: "flow-service-assistant-humphrey",
            template_variables: {
              timezone: "Europe/London",
            },
          },
        });
      } catch (e) {
        console.log("Error connecting to Flow: ", e);
      }
    }
  };

  // Handle presses on the 'Mute/Unmute' button
  const handleToggleMute = useCallback(() => {
    toggleRecording(!isRecording);
  }, [isRecording]);

  return (
    <View style={styles.container}>
      <Text>Talk to Flow!</Text>
      <Button
        title={socketState === "open" ? "Disconnect" : "Connect"}
        disabled={socketState === "connecting" || socketState === "closing"}
        onPress={handleToggleConnect}
      />
      <Button
        title={isRecording ? "Mute" : "Unmute"}
        disabled={socketState !== "open" || !audioInitialized}
        onPress={handleToggleMute}
      />
    </View>
  );
}

const styles = StyleSheet.create({
  container: {
    flex: 1,
    justifyContent: "center",
    alignItems: "center",
  },
});

Now our app can unmute the microphone to send audio samples to the Flow server. Audio messages coming back from the Flow server are played through the speaker.
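Note that the Uint8Array conversion in the agentAudio handler above is a byte-level view over the same buffer, not a copy. A small sketch of that relationship (typed array views use the platform's native byte order, which is little-endian on virtually all mobile hardware):

```typescript
// PCM16 samples as 16-bit signed integers.
const samples = new Int16Array([1000, -2000, 3000]);

// View the same memory as raw bytes: no copying, 2 bytes per sample.
const bytes = new Uint8Array(samples.buffer);
console.log(bytes.length); // 6

// Round trip: rebuilding an Int16Array over the bytes recovers the samples.
const roundTrip = new Int16Array(bytes.buffer);
console.log(Array.from(roundTrip)); // [1000, -2000, 3000]
```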

Testing on Physical Devices

While simulators are great for initial testing, features like Acoustic Echo Cancellation (AEC) require a physical device to work properly. To deploy to real devices, first configure your development environment for local builds on physical devices:

Then run one of the following commands:

# For iOS devices
npx expo run:ios --device --configuration Release

# For Android devices
npx expo run:android --device --variant release

Additional resources

Dive deeper into the tools used in this guide:

Speechmatics JS SDK

Expo Two Way Audio