
Function Calling

Learn about the function calling feature for Flow

Function Calling allows you to connect Flow to external tools and systems. This unlocks Flow's ability to act in the real world and better serve the needs of your users.

This could involve real-time information such as opening and closing times, validation services for authentication, or action APIs that control a fast food system while a drive-thru order is being placed.

Based on what the user says in the conversation, Flow recognises the user's intentions and extracts the key information that your system needs to complete the function call.

For example, you may want Flow to add reminders in a user's calendar:

{
    "name": "add_reminder",
    "description": "Use this to schedule reminders. Needs a confirmation.",
    "parameters": {
        "type": "object",
        "properties": {
            "date" : {
                "type" : "string",
                "description" : "The date for the reminder in dd/mm/yyyy format"
            },
            "time" : {
                "type": "string",
                "description" : "The time for the reminder in 24 hour hh:mm format"
            },
            "title" : {
                "type": "string",
                "description" : "The title for the reminder"
            },
            "project": {
                "type": "string",
                "description": "Which project the reminder is related to. If not provided, leave blank."
            }
        },
        "required": ["project"]
    }
}

Configuring Function Calling

An agent can be configured to use function calling in two ways:

In code: when starting a session with the StartConversation message
(coming soon) In the portal: when configuring an agent

In the portal

Create an agent in the portal and enable function calling in the agent settings.

In `StartConversation`

Functions must be declared within a list of tools when your client sends the StartConversation message. Each function must be defined with the following:

name (string, required)

The name of the function that should be called. This name is passed as a field in the ToolInvoke message.

description (string)

A natural language string that instructs the LLM about the condition in which the function must be called.

parameters (object)

An object containing the properties of the function call which should be collected from the conversation. Each parameter is defined by:

parameters.type (string)

The type of the parameters object (currently always object).

Possible values: [object]

parameters.required (string[])

(optional) The list of input parameters for the function which are required.

parameters.properties (object)

Properties of the function parameter object.

parameters.properties.[property name: string] (object)

parameters.properties.[property name].type (string, required)

The type of the parameter.

Possible values: [integer, number, string, boolean]

parameters.properties.[property name].description (string)

A description of the parameter.

parameters.properties.[property name].enum (array)

parameters.properties.[property name].example (string)

An example value for the parameter.
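The field rules above can be checked before a session starts. The following is a minimal sketch, not part of the Flow SDK: `validate_function_definition` and `ALLOWED_PROPERTY_TYPES` are hypothetical names, `parameters` is treated as optional because only `name` is marked required above, and the check that every entry in `required` is declared in `properties` is an extra sanity check rather than a documented rule.

```python
# Parameter types listed as possible values in the field reference above.
ALLOWED_PROPERTY_TYPES = {"integer", "number", "string", "boolean"}


def validate_function_definition(definition: dict) -> list:
    """Return a list of problems; an empty list means no problem was found."""
    problems = []

    name = definition.get("name")
    if not isinstance(name, str) or not name:
        problems.append("'name' is required and must be a non-empty string")

    # ASSUMPTION: 'parameters' may be omitted, since it is not marked required.
    parameters = definition.get("parameters")
    if parameters is None:
        return problems

    if parameters.get("type") != "object":
        problems.append("'parameters.type' must be 'object'")

    properties = parameters.get("properties", {})
    for prop_name, prop in properties.items():
        if prop.get("type") not in ALLOWED_PROPERTY_TYPES:
            problems.append(
                f"property '{prop_name}' has unsupported type {prop.get('type')!r}"
            )

    # Extra sanity check (not a documented rule): required names must be declared.
    for required_name in parameters.get("required", []):
        if required_name not in properties:
            problems.append(
                f"required parameter '{required_name}' is not declared in 'properties'"
            )

    return problems
```

Running a check like this on each definition before adding it to the tools list can surface typos locally instead of at session start.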

Example

import asyncio
import io
import sys
import json

import pyaudio

from speechmatics_flow.client import WebsocketClient
from speechmatics_flow.models import (
    ConnectionSettings,
    Interaction,
    AudioSettings,
    ConversationConfig,
    ServerMessageType,
    ClientMessageType,
)
from speechmatics_flow.tool_function_param import ToolFunctionParam

AUTH_TOKEN = "Place your auth token here"

# Example configuration which could add a reminder to a calendar.
reminder_config = ToolFunctionParam(
    type="function",
    function={
        "name": "add_reminder",
        "description": "Use this to schedule reminders. Needs a confirmation.",
        "parameters": {
            "type": "object",
            "properties": {
                "date": {
                    "type": "string",
                    "description": "The date for the reminder in dd/mm/yyyy format",
                },
                "time": {
                    "type": "string",
                    "description": "The time for the reminder in 24 hour hh:mm format",
                },
                "title": {
                    "type": "string",
                    "description": "The title for the reminder",
                },
                "project": {
                    "type": "string",
                    "description": "Which project the reminder is related to. If not provided, leave blank.",
                },
            },
            "required": ["project"],
        },
    },
)


# Callback for handling reminder ToolInvoke in your system.
async def reminder_handler(msg: dict):
    print("Attempting to add reminder")
    print(msg)
    response_message = {
        "message": ClientMessageType.ToolResult,
        "id": msg["id"],
        "status": "ok",  # Used to inform user the status of the function call. Could be "failed" or "rejected".
        "content": "Added reminder successfully to calendar",  # LLM response helper message
    }

    await client.websocket.send(json.dumps(response_message))


# Create a websocket client
client = WebsocketClient(
    ConnectionSettings(
        url="wss://flow.api.speechmatics.com/v1/flow",
        auth_token=AUTH_TOKEN,
    )
)

# Create a buffer to store binary messages sent from the server
audio_buffer = io.BytesIO()


# Create callback function which adds binary messages to audio buffer
def binary_msg_handler(msg: bytes):
    if isinstance(msg, (bytes, bytearray)):
        audio_buffer.write(msg)


# Register the callback which will be called
# when the client receives an audio message from the server
client.add_event_handler(ServerMessageType.AddAudio, binary_msg_handler)

# Handling ToolInvoke message
client.add_event_handler(ServerMessageType.ToolInvoke, reminder_handler)


async def audio_playback(buffer):
    """Read from buffer and play audio back to the user"""
    p = pyaudio.PyAudio()
    stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, output=True)
    try:
        while True:
            # Get the current value from the buffer
            audio_to_play = buffer.getvalue()
            # Only proceed if there is audio data to play
            if audio_to_play:
                # Write the audio to the stream
                stream.write(audio_to_play)
                buffer.seek(0)
                buffer.truncate(0)
            # Pause briefly before checking the buffer again
            await asyncio.sleep(0.05)
    finally:
        # Stop the stream before closing it
        stream.stop_stream()
        stream.close()
        p.terminate()


async def main():
    print("Starting...")
    tasks = [
        # Use the websocket to connect to Flow Service and start a conversation
        asyncio.create_task(
            client.run(
                interactions=[Interaction(sys.stdin.buffer)],
                audio_settings=AudioSettings(),
                conversation_config=ConversationConfig(),
                tools=[reminder_config],
            )
        ),
        # Run audio playback handler which streams audio from audio buffer
        asyncio.create_task(audio_playback(audio_buffer)),
    ]
    await asyncio.gather(*tasks)


asyncio.run(main())
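The example above registers a single handler for every ToolInvoke message. When several functions are declared, one option is to route each invocation by function name. The sketch below illustrates that idea under stated assumptions: the `HANDLERS` registry, the `tool` decorator and `dispatch` are hypothetical helpers, the layout of the ToolInvoke message (name and arguments nested under a `function` key) is assumed rather than taken from this page, and the literal `"ToolResult"` stands in for `ClientMessageType.ToolResult`.

```python
import asyncio

# Hypothetical registry mapping function names to async handlers.
HANDLERS = {}


def tool(name):
    """Decorator that registers an async handler under a function name."""
    def register(func):
        HANDLERS[name] = func
        return func
    return register


@tool("add_reminder")
async def add_reminder(arguments: dict) -> str:
    # Placeholder for a call into your own calendar system.
    return f"Added reminder '{arguments.get('title', '')}'"


async def dispatch(msg: dict) -> dict:
    """Build a ToolResult payload for one ToolInvoke message."""
    # ASSUMPTION: name and arguments are nested under a "function" key.
    function = msg.get("function", {})
    handler = HANDLERS.get(function.get("name"))
    if handler is None:
        return {
            "message": "ToolResult",
            "id": msg["id"],
            "status": "failed",
            "content": "Unknown function",
        }
    content = await handler(function.get("arguments", {}))
    return {
        "message": "ToolResult",
        "id": msg["id"],
        "status": "ok",
        "content": content,
    }
```

In the example above, a single ToolInvoke callback could await `dispatch(msg)` and send the returned dictionary with `json.dumps`, so adding a function only requires a new decorated handler.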

Considerations

Function status - The client must inform the service of whether the function call succeeded or not. This allows the service to inform the user of the result. There is no automatic timeout on the Flow API.
Asynchronous - Function calling is fully asynchronous. Once the client is informed of the function call, the conversation will continue to progress until a function call status update is received from the client. This is to continue providing a natural conversational experience to the customer.
Completion Message - Flow can play a message on completion of the function call. The client can switch this off by passing <NO_RESPONSE_REQUIRED/> in the content field of the ToolResult message.
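Because the Flow API applies no automatic timeout, the client may want to enforce its own and always report a status. The following sketch shows one way to do that with `asyncio.wait_for`. The helpers `tool_result` and `run_with_timeout` are hypothetical names, the 5-second default is an arbitrary choice, and the literal `"ToolResult"` is assumed to match `ClientMessageType.ToolResult`.

```python
import asyncio

# Content value that suppresses the completion message, as described above.
NO_RESPONSE_REQUIRED = "<NO_RESPONSE_REQUIRED/>"


def tool_result(invoke_id: str, status: str, content: str) -> dict:
    """Build a ToolResult payload; status is "ok", "failed" or "rejected"."""
    return {
        "message": "ToolResult",
        "id": invoke_id,
        "status": status,
        "content": content,
    }


async def run_with_timeout(invoke_id, work, timeout_s=5.0, silent=False):
    """Await the coroutine `work` and always return a ToolResult payload."""
    try:
        content = await asyncio.wait_for(work, timeout=timeout_s)
    except asyncio.TimeoutError:
        # The client-side timeout expired before the work finished.
        return tool_result(invoke_id, "failed", "The request timed out")
    except Exception as exc:
        return tool_result(invoke_id, "failed", f"The request failed: {exc}")
    # With silent=True, Flow is asked not to play a completion message.
    return tool_result(invoke_id, "ok", NO_RESPONSE_REQUIRED if silent else content)
```

Since the conversation continues while the function runs, a wrapper like this can be started as a background task and its result sent over the websocket once it resolves.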

Because LLMs are instructed semantically, function definitions that are complete, narrow and unambiguous, with simple descriptions, give the most reliable customer experience. Handle complex business logic within your client.
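As an illustration of keeping business logic in the client, the sketch below validates the arguments of the `add_reminder` function against the formats requested in its description (dd/mm/yyyy and 24-hour hh:mm). `parse_reminder` is a hypothetical helper, and it assumes the extracted arguments arrive as a plain dictionary keyed by parameter name.

```python
from datetime import datetime


def parse_reminder(arguments: dict) -> dict:
    """Validate reminder arguments and return a normalised record.

    Raises ValueError if the date or time does not match the requested
    formats, and KeyError if date, time or title is missing.
    """
    when = datetime.strptime(
        f"{arguments['date']} {arguments['time']}", "%d/%m/%Y %H:%M"
    )
    return {
        "when": when,
        "title": arguments["title"].strip(),
        # A blank project is treated as "no project".
        "project": arguments.get("project", "").strip() or None,
    }
```

A handler could catch `ValueError` or `KeyError` from this helper and reply with a "failed" status, which keeps the function description short while the client enforces the exact rules.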
