How to use the V1 API

Deprecation Note

The V1 API is now deprecated and will be removed by February 2022. We recommend that all customers move to the V2 API; see the section How to use the V2 API.

The Speech API is a REST API that enables you to create and manage transcription jobs by uploading audio files to the Speechmatics Batch Virtual Appliance, and downloading the resulting transcriptions.

User ID in request URL

Although you do not need an API auth token, you do need to supply a User ID in the URL. It can be any positive integer, and can be used to track transcription requests since the ID that you use will be returned in any job response.

The base URI for the Speech API requests looks like this (HTTP):

http://${APPLIANCE_HOST}:8082/v1/user/1/jobs/

or this (HTTPS):

https://${APPLIANCE_HOST}/v1/user/1/jobs/

Here ${APPLIANCE_HOST} is the IP address or hostname of the appliance you want to use. Over HTTP you must use port 8082. Over HTTPS the appliance uses port 443; as this is the default HTTPS port, you do not need to specify it. All the examples in this document use a user ID of 1, although any positive integer works. This guide shows HTTP URLs, but since appliance version 3.4.0 you can also use HTTPS URLs.
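The URL rules above can be sketched as a small Python helper. This is an illustration only: the function name jobs_url and the host appliance.example.com are hypothetical, and the port handling simply mirrors the description in this section.

```python
def jobs_url(host: str, user_id: int = 1, https: bool = False) -> str:
    """Return the V1 jobs base URI for an appliance host (hypothetical helper)."""
    if user_id < 1:
        raise ValueError("user_id must be a positive integer")
    if https:
        # Port 443 is the HTTPS default, so it is left out of the URL.
        return f"https://{host}/v1/user/{user_id}/jobs/"
    # HTTP requests must target port 8082.
    return f"http://{host}:8082/v1/user/{user_id}/jobs/"


print(jobs_url("appliance.example.com"))
print(jobs_url("appliance.example.com", user_id=7, https=True))
```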

You can access a dashboard on the appliance from any browser by navigating to the following URL:

http://${APPLIANCE_HOST}:8080/help

Submitting a Job with the V1 API

Sample Request

The simplest example to get going is to submit an audio file for transcription. This is done by making a POST request with the audio file and the language model you want to use:

curl -X POST "http://${APPLIANCE_HOST}:8082/v1/user/1/jobs/" \
  -H 'Content-Type: multipart/form-data' \
  -H 'Accept: application/json' \
  -F data_file=@example.wav \
  -F 'config={ "type": "transcription", "transcription_config": { "language": "en" } }'

The data_file form field is used to submit the audio, and a config object is passed in with details of how to transcribe the file (in this case, using the English language model).

Example Response

On successful submission of the job, the appliance returns a 200 OK status along with JSON output showing the id of the job and, in the check_wait property, how many seconds to wait before checking whether the transcription is done.

The response headers returned will look like this:

{
  "date": "Wed, 24 Jul 2019 16:35:25 GMT",
  "server": "nginx/1.15.10",
  "connection": "keep-alive",
  "content-length": "1479",
  "content-type": "application/json"
}

And the body will comprise a JSON object like this:

{
  "balance": 0,
  "check_wait": 30,
  "cost": 0,
  "id": 111
}

Note: The balance and cost properties are not used by the Batch Virtual Appliance. They will always return zero values.
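If you script against the API, the two fields of interest in this response are id and check_wait. The following Python sketch reads them from a response body; the function name parse_submit_response is hypothetical, and the sample body is the one shown above.

```python
import json


def parse_submit_response(body: str) -> tuple:
    """Return (job_id, check_wait_seconds) from a job submission response body."""
    payload = json.loads(body)
    # balance and cost are ignored: the appliance always returns zero for them.
    return payload["id"], payload["check_wait"]


body = '{"balance": 0, "check_wait": 30, "cost": 0, "id": 111}'
job_id, check_wait = parse_submit_response(body)
print(f"job {job_id}: check again in {check_wait} seconds")
```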

Retrieving job status

curl -X GET "http://${APPLIANCE_HOST}:8082/v1/user/1/jobs/111/"

Expected response:

{
  "job": {
    "check_wait": null,
    "created_at": "Thu May 28 13:16:47 2020",
    "duration": 4,
    "id": 111,
    "job_status": "done",
    "job_type": "transcription",
    "lang": "en",
    "meta": null,
    "name": "example.wav",
    "next_check": 0,
    "notification": "none",
    "transcription": "output.json",
    "user_id": 1
  }
}
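A polling loop built on this endpoint might look like the Python sketch below. It rests on several assumptions: this guide only shows the "done" status, so every other job_status value is treated as "not finished yet" (a real client would also want to detect failure states); the 30-second fallback wait is taken from the submission example; and the names fetch_job and wait_until_done are hypothetical.

```python
import json
import time
import urllib.request


def fetch_job(url: str) -> dict:
    """GET the job status URL and return the inner "job" object."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)["job"]


def wait_until_done(url, fetch=fetch_job, sleep=time.sleep, max_checks=20):
    """Poll the job until job_status is "done", then return the job object."""
    for _ in range(max_checks):
        job = fetch(url)
        if job["job_status"] == "done":
            return job
        # Assumption: wait check_wait seconds if the appliance supplies a value,
        # otherwise fall back to 30 seconds.
        sleep(job.get("check_wait") or 30)
    raise TimeoutError(f"job not done after {max_checks} checks")
```

Calling wait_until_done("http://<appliance host>:8082/v1/user/1/jobs/111/") would block until the job reports "done". The fetch and sleep parameters exist so the loop can be exercised without a live appliance.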

Retrieving transcript

The transcript can be requested in any of the formats described in the Output Formats section.

curl -X GET "http://${APPLIANCE_HOST}:8082/v1/user/1/jobs/111/transcript?format=txt"
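The same URL can be assembled in Python, as sketched below. The helper name transcript_url is hypothetical. The format names are the callback_format values listed in this guide; only format=txt is shown for this endpoint, so treating the others as valid here is an assumption.

```python
from urllib.parse import urlencode

# Assumption: the transcript endpoint accepts the same format names
# as the callback_format parameter.
KNOWN_FORMATS = ("json", "json-v2", "txt", "srt")


def transcript_url(host: str, job_id: int, fmt: str = "txt", user_id: int = 1) -> str:
    """Return the HTTP URL for retrieving a job's transcript in the given format."""
    if fmt not in KNOWN_FORMATS:
        raise ValueError(f"unknown transcript format: {fmt!r}")
    query = urlencode({"format": fmt})
    return f"http://{host}:8082/v1/user/{user_id}/jobs/{job_id}/transcript?{query}"


print(transcript_url("appliance.example.com", 111))
```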

Supplying a Job Configuration

The config object parameter passes details of the transcription job, including the features you want to use, to the appliance. This is the recommended way to supply job information. The simplest config object specifies just the transcription language, for example:

{
  "type": "transcription",
  "transcription_config": { "language": "en" }
}
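When the config is generated by a script rather than typed by hand, serializing a dictionary avoids quoting mistakes. A minimal Python sketch, with a hypothetical function name, that produces the config string shown above:

```python
import json


def transcription_config(language: str) -> str:
    """Return the simplest config object, as a JSON string, for the given language."""
    config = {
        "type": "transcription",
        "transcription_config": {"language": language},
    }
    return json.dumps(config)


print(transcription_config("en"))
# prints {"type": "transcription", "transcription_config": {"language": "en"}}
```

The resulting string is what you would pass as the value of the config form field.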

Example Configurations

This section gives examples of configurations for features specific to the V1 API. Configurations that are shared by the V1 and V2 APIs are documented above in the V2 API section.

Supported Formats

The default job transcription output format is legacy json. The json-v2 output format is also available and carries a richer set of information; we recommend this format whenever you use the V1 API. You can also use plain text transcription output (TXT) if you do not need timing information. The SubRip subtitling (SRT) format provides timing information as well as text. This text corresponds with broadcasting best practice on maximum character output per line and maximum line length. If you wish, you can alter these parameters via the transcription config.

Parameters

The following legacy parameters are available with the Batch Virtual Appliance:

| Parameter | Description | Notes |
|---|---|---|
| model | Language model used to process the job. | Replaced by the language property in the config object. |
| notification | How you would like to be notified of your job finishing. | The email notification type is not supported by the Batch Virtual Appliance. |
| callback | If set, and notification is set to 'callback', the appliance will make a POST request to this URL when the job completes. | |
| callback_format | The format to be used by the appliance for the callback POST request. | Available formats are: 'srt', 'txt', 'json' and 'json-v2'. |
| meta | Metadata about the job you would like to be able to view later. | |
| diarization | Controls whether speaker diarization is used. | Replaced by the "diarization": "speaker" property in the config object. |
| diarisation | A synonym for diarization. | See above. |

We recommend using the config object to pass the job configuration; the other methods of specifying job configuration will be deprecated at some point in the future. However, if you want to pass metadata with the job, or use a notification callback, you can do so using the legacy API parameters.

Passing metadata and using callback notifications

The next sections describe how to pass metadata and use notification callbacks using the legacy API in the event you need to use this functionality. These features will be covered by additional parameters in the config object in a future release.

Passing Metadata

You can use the meta parameter in the legacy API to associate metadata with the job, and use this to track the job through your workflow. For instance, you can attach your own asset tag or job number, and retrieve it later when you process the JSON transcript.

curl -X POST "http://${APPLIANCE_HOST}:8082/v1/user/1/jobs/" \
  -H 'Content-Type: multipart/form-data' \
  -H 'Accept: application/json' \
  -F data_file=@example.wav \
  -F 'meta=asset-id=29309231123' \
  -F 'config={ "type": "transcription", "transcription_config": { "language": "en" } }'

You'll then see meta information when you query the job, or retrieve the (JSON) transcript:

 {
    "format": "2.4",
    "job": {
        "created_at": "Tue Nov 19 17:34:41 2019",
        "duration": 383,
        "id": 4,
        "lang": "en",
        "meta": "asset-id=29309231123",
        "name": "en.mp3",
        "user_id": 1
    },
    "metadata": {
        "created_at": "2019-11-19T17:36:04.525Z",
        "transcription_config": {
            "language": "en"
        },
        "type": "transcription"
    },
[...]
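To recover the metadata when processing the transcript, read the meta property of the job object. The Python sketch below uses hypothetical function names and a cut-down sample transcript. Splitting the value on "=" reflects the key=value layout chosen in this example, not a format imposed by the appliance; as far as this guide shows, the string is returned unchanged.

```python
import json


def job_meta(transcript_json: str):
    """Return the meta string stored under job.meta in a JSON transcript, or None."""
    return json.loads(transcript_json)["job"].get("meta")


def split_meta(meta: str) -> tuple:
    """Split a "key=value" meta string into (key, value).

    The key=value layout is the caller's own convention for this example.
    """
    key, _, value = meta.partition("=")
    return key, value


# Cut-down sample containing only the fields this sketch reads.
transcript = '{"format": "2.4", "job": {"id": 4, "meta": "asset-id=29309231123"}}'
print(split_meta(job_meta(transcript)))
```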

Callbacks Usage

If you want the appliance to notify you when a job finishes, so that you don't have to keep polling the jobs endpoint, use the notification and callback parameters in the legacy API. The Batch Appliance will then send a POST request to an HTTP server once the job is complete. Typically, you would run a service on that HTTP server that listens for these POST requests and then processes the transcription (for example by writing it into a database, or copying it to a file for further processing). Here is an example of how to set up a callback:

curl -X POST "http://${APPLIANCE_HOST}:8082/v1/user/1/jobs/" \
  -H 'Content-Type: multipart/form-data' \
  -H 'Accept: application/json' \
  -F data_file=@example.wav \
  -F model=en \
  -F notification=callback \
  -F callback=http://www.example.com/transcript_callback \
  -F callback_format=txt

The callback appends the job ID as a query string parameter with name id. As an example, if the job ID is 546, you'd see the following POST request:

POST /transcript_callback?id=546 HTTP/1.1
Host: www.example.com

The user agent is Speechmatics-API/1.0.
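A receiving service can be very small. The Python sketch below uses only the standard library and is a starting point under stated assumptions, not a production server: the names job_id_from_path, CallbackHandler and serve are hypothetical, port 8000 is arbitrary, and it assumes the POST body carries the transcript in the requested callback_format with a Content-Length header.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def job_id_from_path(path: str):
    """Return the id query parameter from a callback request path, or None."""
    values = parse_qs(urlparse(path).query).get("id")
    return values[0] if values else None


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        job_id = job_id_from_path(self.path)
        # Assumption: the body is the transcript and Content-Length is present.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        print(f"job {job_id}: received {len(body)} bytes")
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Listen for callback POST requests until interrupted."""
    HTTPServer(("", port), CallbackHandler).serve_forever()
```

Calling serve() starts the listener; the callback parameter in the job submission would then point at this host and port.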