The legacy V1 API has been removed from the Batch Virtual Appliance. From this release onwards, the V2 API is the only supported API. For details on how to use the V2 API, see the docs (https://docs.speechmatics.com/en/batch-appliance/api-guide/api-howto/).
The Speechmatics Batch Virtual Appliance exposes a REST Speech API that lets a client application communicate with the appliance over an HTTP or HTTPS connection. The API converts a media file into a text transcript that includes words, speaker, and timing information.
For the purposes of this guide the following terms are used.
| Client | An application connecting to the Batch Virtual Appliance using the Transcription API. The client will provide audio containing speech, and process the transcripts received as a result. |
| Management API | The REST API that allows administrators to manage the virtual appliance over port 8080 (or 443 for secure access). To access the documentation you can use the following endpoints: |
| Speech V2 API | The REST API that allows users of the appliance to submit ASR jobs over port 8082 (or 443 for secure access). The endpoints |
| Batch Virtual Appliance | The appliance (VM) that provides ASR transcription capability. |
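The port conventions in the definitions above can be captured in a small shell helper. This is a minimal sketch, not part of the appliance: the function name `speech_api_base` and the host name `appliance.example.com` are hypothetical, and the only facts it relies on are the ports stated above (8082 for plain HTTP, 443 for secure access). It deliberately stops at the host and port and does not assume any endpoint paths.

```shell
# Build the base URL for the Speech V2 API from an appliance host name.
# Assumption: plain HTTP uses port 8082 and secure access uses port 443.
# speech_api_base is a hypothetical helper name used for illustration only.
speech_api_base() {
  host=$1
  mode=${2:-http}   # "http" (default) or "https"
  case "$mode" in
    https) printf 'https://%s\n' "$host" ;;       # 443 is the default HTTPS port
    http)  printf 'http://%s:8082\n' "$host" ;;
    *)     echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

# Example calls with a placeholder host name.
speech_api_base appliance.example.com
speech_api_base appliance.example.com https
```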
To use the REST Speech API, you need access to a Batch Virtual Appliance. See the Speechmatics Virtual Appliance Installation and Admin Guide for how to install, configure, and license the appliance.
You do not need user credentials (such as an authorization token) to use the Speech API with the Batch Virtual Appliance.
A variety of input audio formats are supported. You do not need to specify the audio format when submitting a file for transcription: the Batch Virtual Appliance detects the format automatically and handles it using the correct decoder. The following audio formats are currently supported:
Note: the native formats are 16 kHz or 8 kHz (PCM32 LE) WAV; for the best results and performance, we recommend that you submit files in one of those formats.
All customers should use the V2 API to submit media and retrieve transcripts on the Batch Virtual Appliance.
The maximum supported file size is 4 GB, and the maximum supported length is 2 hours. Anything larger must be split into smaller sections before it can be transcribed successfully.
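A client can check the size limit before submitting a file. The following is a minimal sketch under stated assumptions: `check_media_size` is a hypothetical helper, 4 GB is interpreted as 4 × 1024³ bytes, and only the file size is checked (checking the 2 hour duration limit would need a media inspection tool, which is not covered here).

```shell
# Check a media file against the size limit before submitting it.
# check_media_size is a hypothetical helper name.
# Assumption: "4 GB" is taken to mean 4 * 1024^3 bytes.
MAX_BYTES=$((4 * 1024 * 1024 * 1024))

check_media_size() {
  file=$1
  limit=${2:-$MAX_BYTES}   # optional second argument overrides the limit
  [ -f "$file" ] || { echo "not a file: $file" >&2; return 2; }
  # wc -c reads from stdin here so that only the byte count is printed;
  # tr strips the padding that some platforms add.
  size=$(wc -c < "$file" | tr -d ' ')
  if [ "$size" -gt "$limit" ]; then
    echo "$file is $size bytes, over the $limit byte limit; split it first" >&2
    return 1
  fi
  return 0
}

# Usage (example.wav is a placeholder file name):
#   check_media_size example.wav && echo "OK to submit"
```

The function returns 0 when the file is within the limit, 1 when it is too large, and 2 when the path is not a regular file, so it can be used directly in an `if` or `&&` chain.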
In the V2 API, three output formats are available: json-v2 (the default), txt, and srt. The current version of the json-v2 output is 2.7. If the output format is set to txt, the file is returned in plain text rather than JSON format. If the output format is set to srt, the file is returned in the SubRip subtitle format instead.
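When a client saves transcripts to disk, it can be useful to validate the requested format name and pick a matching file extension. The sketch below assumes only the V2 format names json-v2, txt, and srt; the helper name `transcript_extension` and the choice of extensions are illustrative conventions for the client, not something the appliance defines.

```shell
# Map a V2 output format name to a file extension for the saved transcript.
# transcript_extension is a hypothetical helper name; the extensions are a
# client-side naming choice.
transcript_extension() {
  # With no argument, fall back to json-v2, the default format.
  case "${1:-json-v2}" in
    json-v2) echo "json" ;;
    txt)     echo "txt" ;;
    srt)     echo "srt" ;;
    *)       echo "unsupported output format: $1" >&2; return 1 ;;
  esac
}

# Example: build an output file name for an SRT transcript.
echo "transcript.$(transcript_extension srt)"
```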
In the V1 API, four output formats for transcription are available:
json (the default),
srt. If you want JSON output it is recommended to use
If you have problems making a call, ensure that you are using exactly the same URI format as shown in this document. For instance, omitting the trailing '/' character from a URI causes a 302 redirect response to be sent; if your client does not follow redirects, the request may fail.
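One way to avoid the redirect is to normalize URIs before making a request. The following is a minimal sketch: `with_trailing_slash` is a hypothetical helper, the URL in the example is a placeholder rather than a documented endpoint, and the function assumes the URI has no query string.

```shell
# Append a trailing '/' to a URI if it does not already end with one.
# with_trailing_slash is a hypothetical helper name.
# Assumption: the URI has no query string.
with_trailing_slash() {
  case "$1" in
    */) printf '%s\n' "$1" ;;
    *)  printf '%s/\n' "$1" ;;
  esac
}

# Example with a placeholder URL.
with_trailing_slash "http://appliance.example.com:8082/some/endpoint"
```

curl can also follow redirects itself with the `-L` (`--location`) option. Note that when curl follows a 302 response to a POST request, it switches to GET by default, so using the correct URI in the first place is the more reliable approach.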
The easiest way to access the APIs and online help is via the following URL on the appliance:
This page allows you to access the documentation from the browser as well as providing links to the APIs.
On a Windows PC, you can use these download and installation links to get curl and jq:
On Linux, use the package manager for your distribution, which will be one of the following:
$ apt install curl jq
$ yum install curl jq
On the Mac, the easiest way to install these utilities is using Homebrew:
$ brew install curl jq
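After running the install command for your platform, you can confirm that both tools are on your PATH. This is a minimal sketch; `missing_tools` is a hypothetical helper name, and `command -v` is the standard POSIX way to look up a command.

```shell
# Print the name of each given tool that is not found on the PATH.
# missing_tools is a hypothetical helper name.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools curl jq)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Please install:" $missing
else
  echo "curl and jq are available"
fi
```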