Speechmatics ASR REST API (2.0.0)
The Speechmatics Automatic Speech Recognition REST API is used to submit ASR jobs and receive the results. The supported job type is transcription of audio files.
Create a new job.
header Parameters
Authorization required | string Customer API token |
X-SM-EAR-Tag | string Early Access Release Tag |
Request Body schema: multipart/form-data
config required | string JSON containing a JobConfig model that defines the type of job and the processing parameters to use. |
data_file | string <binary> The data file to be processed. Alternatively the data file can be fetched from a URL specified in fetch_data. |
text_file | string <binary> For alignment jobs, the text file that the data file should be aligned to. |
Responses
Response Schema:
id required | string The unique ID assigned to the job. Keep a record of this for later retrieval of your completed job. |
Request samples
- Python
- cURL
- CLI
from speechmatics.batch_client import BatchClient

# Open the client using a context manager
with BatchClient("YOUR_API_KEY") as client:
    job_id = client.submit_job(
        audio="PATH_TO_FILE",
    )
    print(job_id)
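The sample above uses the speechmatics Python SDK. For clients that call the REST endpoint directly, the sketch below shows the same request made with the requests library. It is illustrative only: the POST https://asr.api.speechmatics.com/v2/jobs path is an assumption based on the base URL used in this page's cURL samples, and the transcription config shown is a minimal example of the JobConfig model described later.

import json
import requests

API_KEY = "YOUR_API_KEY"
# Assumed endpoint, following the base URL used in this page's cURL samples.
JOBS_URL = "https://asr.api.speechmatics.com/v2/jobs"

# The "config" form field carries the JobConfig JSON; "data_file" carries the audio.
config = {
    "type": "transcription",
    "transcription_config": {"language": "en"},
}

with open("PATH_TO_FILE", "rb") as audio_file:
    response = requests.post(
        JOBS_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        data={"config": json.dumps(config)},
        files={"data_file": audio_file},
    )

response.raise_for_status()
print(response.json()["id"])  # e.g. "a1b2c3d4e5"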
Response samples
- 201
- 400
- 401
- 403
- 410
- 500
{- "id": "a1b2c3d4e5"
}
List all jobs.
query Parameters
created_before | string <date-time> UTC Timestamp cursor for paginating request response. Filters jobs based on creation time to the nearest millisecond. Accepts up to nanosecond precision, truncating to millisecond precision. By default, the response will start with the most recent job. |
limit | integer [ 1 .. 100 ] Limit for paginating the request response. Defaults to 100. |
include_deleted | boolean Specifies whether deleted jobs should be included in the response. Defaults to false. |
header Parameters
Authorization required | string Customer API token |
X-SM-EAR-Tag | string Early Access Release Tag |
Responses
Response Schema:
required | Array of objects (JobDetails) |
Request samples
- Python
- cURL
- CLI
from speechmatics.batch_client import BatchClient

with BatchClient("YOUR_API_KEY") as client:
    jobs_list = client.list_jobs()

    # Here, we get and print out the name
    # of the first job if it exists
    if len(jobs_list):
        first_job_name = jobs_list["jobs"][0]["data_name"]
        print(first_job_name)
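Because created_before acts as a cursor and limit caps each page at 100 jobs, older jobs are retrieved by paging. The sketch below is one way to do this with the requests library; the GET https://asr.api.speechmatics.com/v2/jobs path is assumed from the cURL samples elsewhere on this page.

import requests

API_KEY = "YOUR_API_KEY"
JOBS_URL = "https://asr.api.speechmatics.com/v2/jobs"  # assumed path
headers = {"Authorization": f"Bearer {API_KEY}"}

jobs = []
cursor = None
while True:
    params = {"limit": 100}
    if cursor:
        # Ask for jobs created before the oldest job seen so far.
        params["created_before"] = cursor
    page = requests.get(JOBS_URL, headers=headers, params=params).json()["jobs"]
    jobs.extend(page)
    if len(page) < 100:
        break  # a short page means there is nothing older left
    cursor = page[-1]["created_at"]

print(f"Fetched {len(jobs)} jobs")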
Response samples
- 200
- 401
- 422
- 500
{- "jobs": [
- {
- "created_at": "2018-01-09T12:29:01.853047Z",
- "data_name": "recording.mp3",
- "duration": 244,
- "id": "a1b2c3d4e5",
- "status": "transcribing",
- "type": "transcription",
- "tracking": {
- "title": "ACME Q12018 Statement",
- "reference": "/data/clients/ACME/statements/segs/2018Q1-seg8",
- "tags": [
- "quick-review",
- "segment"
], - "details": {
- "client": "ACME Corp",
- "segment": 8,
- "seg_start": 963.201,
- "seg_end": 1091.481
}
}, - "transcription_config": {
- "language": "en",
- "additional_vocab": [
- {
- "content": "Speechmatics",
- "sounds_like": [
- "speechmatics"
]
}, - {
- "content": "gnocchi",
- "sounds_like": [
- "nyohki",
- "nokey",
- "nochi"
]
}, - {
- "content": "CEO",
- "sounds_like": [
- "C.E.O."
]
}, - {
- "content": "financial crisis"
}
], - "diarization": "channel",
- "channel_diarization_labels": [
- "Agent",
- "Caller"
]
}, - "notification_config": [
- {
- "contents": [
- "transcript",
- "data"
], - "auth_headers": [
- "Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VySWQiOiJiMDhmODZhZi0zNWRhLTQ4ZjItOGZhYi1jZWYzOTA0NjYwYmQifQ.-xN_h82PHVTCMA9vdoHrcZxH-x5mb11y1537t3rGzcM"
]
}
]
}, - {
- "created_at": "2018-01-09T11:23:42.984612Z",
- "data_name": "hello.wav",
- "duration": 130,
- "id": "084d1f86-9fe9-11e8-9c91-00155d019c0b",
- "status": "aligning",
- "type": "alignment",
- "text_name": "hello.txt",
- "alignment_config": {
- "language": "en"
}, - "tracking": {
- "title": "Project X Intro",
- "reference": "/data/projects/X/overview/audio/hello.wav"
}
}
]
}
Get job details, including progress and any error reports.
path Parameters
jobid required | string Example: a1b2c3d4e5 ID of the job. |
header Parameters
Authorization required | string Customer API token |
X-SM-EAR-Tag | string Early Access Release Tag |
Responses
Response Schema:
required | object (JobDetails) Document describing a job. JobConfig will be present in JobDetails returned for GET jobs/{jobid} requests. |
Request samples
- Python
- cURL
- CLI
from speechmatics.batch_client import BatchClient

# This example shows how to check the duration of the file
with BatchClient("YOUR_API_KEY") as client:
    job_response = client.check_job_status("YOUR_JOB_ID")
    job_duration = job_response["job"]["duration"]
    print(job_duration)
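A common pattern is to poll this endpoint until the job leaves the running state. The sketch below does this with the requests library; the GET https://asr.api.speechmatics.com/v2/jobs/{jobid} path is assumed from the base URL in this page's cURL samples, and the status values come from the JobDetails model described below.

import time
import requests

API_KEY = "YOUR_API_KEY"
JOB_ID = "YOUR_JOB_ID"
url = f"https://asr.api.speechmatics.com/v2/jobs/{JOB_ID}"  # assumed path
headers = {"Authorization": f"Bearer {API_KEY}"}

while True:
    job = requests.get(url, headers=headers).json()["job"]
    if job["status"] != "running":
        break
    time.sleep(5)  # wait before polling again

print(job["status"])          # e.g. "done" or "rejected"
print(job.get("errors", []))  # any errors reported for the job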
Response samples
- 200
- 401
- 404
- 410
- 500
{- "job": {
- "created_at": "2018-01-09T12:29:01.853047Z",
- "data_name": "recording.mp3",
- "duration": 244,
- "id": "a1b2c3d4e5",
- "status": "transcribing",
- "type": "transcription",
- "transcription_config": {
- "language": "en",
- "additional_vocab": [
- {
- "content": "Speechmatics",
- "sounds_like": [
- "speechmatics"
]
}, - {
- "content": "gnocchi",
- "sounds_like": [
- "nyohki",
- "nokey",
- "nochi"
]
}, - {
- "content": "CEO",
- "sounds_like": [
- "C.E.O."
]
}, - {
- "content": "financial crisis"
}
], - "diarization": "channel",
- "channel_diarization_labels": [
- "Agent",
- "Caller"
]
}, - "notification_config": [
- {
- "contents": [
- "transcript",
- "data"
], - "auth_headers": [
- "Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VySWQiOiJiMDhmODZhZi0zNWRhLTQ4ZjItOGZhYi1jZWYzOTA0NjYwYmQifQ.-xN_h82PHVTCMA9vdoHrcZxH-x5mb11y1537t3rGzcM"
]
}
], - "tracking": {
- "title": "ACME Q12018 Statement",
- "reference": "/data/clients/ACME/statements/segs/2018Q1-seg8",
- "tags": [
- "quick-review",
- "segment"
], - "details": {
- "client": "ACME Corp",
- "segment": 8,
- "seg_start": 963.201,
- "seg_end": 1091.481
}
}
}
}
Delete a job and remove all associated resources.
path Parameters
jobid required | string Example: a1b2c3d4e5 ID of the job to delete. |
query Parameters
force | boolean When set, a running job will be force terminated. When unset (default), a running job will not be terminated and request will return HTTP 423 Locked. |
header Parameters
Authorization required | string Customer API token |
X-SM-EAR-Tag | string Early Access Release Tag |
Responses
Response Schema:
required | object (JobDetails) Document describing a job. JobConfig will be present in JobDetails returned for GET jobs/{jobid} requests. |
Request samples
- Python
- cURL
- CLI
from speechmatics.batch_client import BatchClient

with BatchClient("YOUR_API_KEY") as client:
    client.delete_job("YOUR_JOB_ID")
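When calling the REST endpoint directly, the force query parameter controls what happens to a job that is still running. A sketch with the requests library, assuming the DELETE https://asr.api.speechmatics.com/v2/jobs/{jobid} path implied by this page's cURL samples:

import requests

API_KEY = "YOUR_API_KEY"
JOB_ID = "YOUR_JOB_ID"
url = f"https://asr.api.speechmatics.com/v2/jobs/{JOB_ID}"  # assumed path
headers = {"Authorization": f"Bearer {API_KEY}"}

# force=true terminates a running job; without it a running job returns 423 Locked.
response = requests.delete(url, headers=headers, params={"force": "true"})

if response.status_code == 423:
    print("Job is still running and was not deleted")
else:
    response.raise_for_status()
    print(response.json()["job"]["status"])  # "deleted"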
Response samples
- 200
- 401
- 404
- 410
- 423
- 500
{- "job": {
- "created_at": "2018-01-09T12:29:01.853047Z",
- "data_name": "recording.mp3",
- "duration": 244,
- "id": "a1b2c3d4e5",
- "status": "deleted",
- "type": "transcription",
- "transcription_config": {
- "language": "en",
- "additional_vocab": [
- {
- "content": "Speechmatics",
- "sounds_like": [
- "speechmatics"
]
}, - {
- "content": "gnocchi",
- "sounds_like": [
- "nyohki",
- "nokey",
- "nochi"
]
}, - {
- "content": "CEO",
- "sounds_like": [
- "C.E.O."
]
}, - {
- "content": "financial crisis"
}
], - "diarization": "channel",
- "channel_diarization_labels": [
- "Agent",
- "Caller"
]
}, - "notification_config": [
- {
- "contents": [
- "transcript",
- "data"
], - "auth_headers": [
- "Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VySWQiOiJiMDhmODZhZi0zNWRhLTQ4ZjItOGZhYi1jZWYzOTA0NjYwYmQifQ.-xN_h82PHVTCMA9vdoHrcZxH-x5mb11y1537t3rGzcM"
]
}
], - "tracking": {
- "title": "ACME Q12018 Statement",
- "reference": "/data/clients/ACME/statements/segs/2018Q1-seg8",
- "tags": [
- "quick-review",
- "segment"
], - "details": {
- "client": "ACME Corp",
- "segment": 8,
- "seg_start": 963.201,
- "seg_end": 1091.481
}
}
}
}
Get the transcript for a transcription job.
path Parameters
jobid required | string Example: a1b2c3d4e5 ID of the job. |
query Parameters
format | string Enum: "json-v2" "txt" "srt" The transcription format (by default the json-v2 format is returned). |
header Parameters
Authorization required | string Customer API token |
X-SM-EAR-Tag | string Early Access Release Tag |
Responses
Response Schema:
format required | string Example: "2.1" Speechmatics JSON transcript format version number. |
required | object (JobInfo) Summary information about an ASR job, to support identification and tracking. |
required | object (RecognitionMetadata) Summary information about the output from an ASR job, comprising the job type and configuration parameters used when generating the output. |
required | Array of objects (RecognitionResult) Example: [[{"channel":"channel_1","start_time":0.55,"end_time":1.2,"type":"word","volume":0.5,"alternatives":[{"confidence":0.95,"content":"Hello","language":"en","speaker":"S1","display":{"direction":"ltr"}}]}]] |
object Example: {"de":[{"start_time":0.5,"end_time":1.3,"content":"Guten Tag, wie geht es dir?","speaker":"UU"}],"fr":[{"start_time":0.5,"end_time":1.3,"content":"Bonjour, comment ça va?","speaker":"UU"}]} Translations of the transcript into other languages. It is a map of ISO language codes to arrays of translated sentences. Configured using translation_config. |
object (SummarizationResult) Example: {"content":"this is a summary"} Summary of the transcript, configured using summarization_config. |
object (SentimentAnalysisResult) Example: {"segments":[{"text":"I am happy with the product.","start_time":0,"end_time":5,"sentiment":"positive","speaker":"John Doe","channel":"Chat","confidence":0.9},{"text":"I don't like the customer service.","start_time":6,"end_time":12,"sentiment":"negative","speaker":"John Doe","channel":"Chat","confidence":0.8}],"summary":{"overall":{"positive_count":1,"negative_count":1,"neutral_count":0},"speakers":[{"speaker":"John Doe","positive_count":1,"negative_count":1,"neutral_count":0}],"channels":[{"channel":"Chat","positive_count":1,"negative_count":1,"neutral_count":0}]}} The main object that holds sentiment analysis data. | |
object (TopicDetectionResult) Example: {"segments":[{"text":"I am happy with the product.","start_time":0,"end_time":5,"topics":[{"topic":"product"}]},{"text":"We will deploy this container for Spanish.","start_time":6,"end_time":12,"topics":[{"topic":"deployment"},{"topic":"languages"}]}],"summary":{"overall":{"deployment":1,"languages":1,"product":1}}} Main object that holds topic detection results. | |
Array of objects (AutoChaptersResult) Example: [{"title":"Part 1","summary":"Summary of part 1","start_time":0,"end_time":5},{"title":"Part 2","summary":"Summary of part 2","start_time":5,"end_time":10}] An array of objects that represent summarized chapters of the transcript | |
Array of objects (AudioEventItem) Timestamped audio events, only set if audio_events_config is specified. |
object Summary statistics per event type, keyed by event type. |
Request samples
- Python
- cURL
- CLI
from speechmatics.batch_client import BatchClient

# This example shows how to unpack various things from the transcript
with BatchClient("YOUR_API_KEY") as client:
    transcript = client.get_job_result("YOUR_JOB_ID")

    # Print out the first word of the transcript
    first_word = transcript["results"][0][0]["alternatives"][0]
    print(first_word)

    # Supposing we had submitted a translation, we might get the first sentence
    translation_sentence = transcript["translations"]["de"][0]["content"]
    print(translation_sentence)

    # If we wanted a summary
    summary = transcript["summary"]["content"]
    print(summary)

    # If we wanted to check for sentiment analysis
    first_sentiment = transcript["sentiment_analysis"]["segments"][0]["sentiment"]
    print(first_sentiment)
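The SDK sample above works with the default json-v2 transcript. To retrieve the plain-text or subtitle renderings instead, the format query parameter can be passed on a direct REST call, as in the sketch below; the GET https://asr.api.speechmatics.com/v2/jobs/{jobid}/transcript path is assumed from this page's cURL samples.

import requests

API_KEY = "YOUR_API_KEY"
JOB_ID = "YOUR_JOB_ID"
url = f"https://asr.api.speechmatics.com/v2/jobs/{JOB_ID}/transcript"  # assumed path
headers = {"Authorization": f"Bearer {API_KEY}"}

# Plain-text transcript
txt = requests.get(url, headers=headers, params={"format": "txt"}).text
print(txt)

# SRT subtitles written straight to a file
srt = requests.get(url, headers=headers, params={"format": "srt"}).text
with open(f"{JOB_ID}.srt", "w", encoding="utf-8") as f:
    f.write(srt)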
Response samples
- 200
- 401
- 404
- 410
- 500
{- "format": "2.1",
- "job": {
- "created_at": "2018-01-09T12:29:01.853047Z",
- "data_name": "string",
- "duration": 0,
- "id": "a1b2c3d4e5",
- "text_name": "string",
- "tracking": {
- "title": "ACME Q12018 Earnings Call",
- "reference": "/data/clients/ACME/statements/segs/2018Q1-seg8",
- "tags": [
- "quick-review",
- "segment"
], - "details": {
- "client": "ACME Corp",
- "segment": 8,
- "seg_start": 963.201,
- "seg_end": 1091.481
}
}
}, - "metadata": {
- "created_at": "2018-01-09T12:29:01.853047Z",
- "type": "alignment",
- "transcription_config": {
- "language": "en",
- "output_locale": "en-GB",
- "additional_vocab": [
- {
- "content": "Speechmatics",
- "sounds_like": [
- "speechmatics"
]
}, - {
- "content": "gnocchi",
- "sounds_like": [
- "nyohki",
- "nokey",
- "nochi"
]
}, - {
- "content": "CEO",
- "sounds_like": [
- "C.E.O."
]
}, - {
- "content": "financial crisis"
}
], - "diarization": "channel",
- "channel_diarization_labels": [
- "Caller",
- "Agent"
]
}, - "translation_errors": [
- {
- "type": "translation_failed",
- "message": "string"
}
], - "summarization_errors": [
- {
- "type": "summarization_failed",
- "message": "string"
}
], - "sentiment_analysis_errors": [
- {
- "type": "sentiment_analysis_failed",
- "message": "string"
}
], - "topic_detection_errors": [
- {
- "type": "topic_detection_failed",
- "message": "string"
}
], - "auto_chapters_errors": [
- {
- "type": "auto_chapters_failed",
- "message": "string"
}
], - "alignment_config": {
- "language": "en"
}, - "output_config": {
- "srt_overrides": {
- "max_line_length": 0,
- "max_lines": 0
}
}, - "language_pack_info": {
- "language_description": "string",
- "word_delimiter": "string",
- "writing_direction": "left-to-right",
- "itn": true,
- "adapted": true
}, - "language_identification": {
- "results": [
- {
- "alternatives": [
- {
- "language": "en",
- "confidence": 0.98
}, - {
- "language": "fr",
- "confidence": 0.02
}
], - "start_time": 0,
- "end_time": 5.5
}, - {
- "alternatives": [
- {
- "language": "en",
- "confidence": 0.95
}, - {
- "language": "fr",
- "confidence": 0.05
}
], - "start_time": 5.6,
- "end_time": 10
}
]
}
}, - "results": [
- [
- {
- "channel": "channel_1",
- "start_time": 0.55,
- "end_time": 1.2,
- "type": "word",
- "volume": 0.5,
- "alternatives": [
- {
- "confidence": 0.95,
- "content": "Hello",
- "language": "en",
- "speaker": "S1",
- "display": {
- "direction": "ltr"
}
}
]
}
]
], - "translations": {
- "de": [
- {
- "start_time": 0.5,
- "end_time": 1.3,
- "content": "Guten Tag, wie geht es dir?",
- "speaker": "UU"
}
], - "fr": [
- {
- "start_time": 0.5,
- "end_time": 1.3,
- "content": "Bonjour, comment ça va?",
- "speaker": "UU"
}
]
}, - "summary": {
- "content": "this is a summary"
}, - "sentiment_analysis": {
- "segments": [
- {
- "text": "I am happy with the product.",
- "start_time": 0,
- "end_time": 5,
- "sentiment": "positive",
- "speaker": "John Doe",
- "channel": "Chat",
- "confidence": 0.9
}, - {
- "text": "I don't like the customer service.",
- "start_time": 6,
- "end_time": 12,
- "sentiment": "negative",
- "speaker": "John Doe",
- "channel": "Chat",
- "confidence": 0.8
}
], - "summary": {
- "overall": {
- "positive_count": 1,
- "negative_count": 1,
- "neutral_count": 0
}, - "speakers": [
- {
- "speaker": "John Doe",
- "positive_count": 1,
- "negative_count": 1,
- "neutral_count": 0
}
], - "channels": [
- {
- "channel": "Chat",
- "positive_count": 1,
- "negative_count": 1,
- "neutral_count": 0
}
]
}
}, - "topics": {
- "segments": [
- {
- "text": "I am happy with the product.",
- "start_time": 0,
- "end_time": 5,
- "topics": [
- {
- "topic": "product"
}
]
}, - {
- "text": "We will deploy this container for Spanish.",
- "start_time": 6,
- "end_time": 12,
- "topics": [
- {
- "topic": "deployment"
}, - {
- "topic": "languages"
}
]
}
], - "summary": {
- "overall": {
- "deployment": 1,
- "languages": 1,
- "product": 1
}
}
}, - "chapters": [
- {
- "title": "Part 1",
- "summary": "Summary of part 1",
- "start_time": 0,
- "end_time": 5
}, - {
- "title": "Part 2",
- "summary": "Summary of part 2",
- "start_time": 5,
- "end_time": 10
}
], - "audio_events": [
- {
- "type": "string",
- "start_time": 0.1,
- "end_time": 0.1,
- "confidence": 0.1,
- "channel": "string"
}
], - "audio_event_summary": {
- "overall": {
- "property1": {
- "total_duration": 0.1,
- "count": 0
}, - "property2": {
- "total_duration": 0.1,
- "count": 0
}
}, - "channels": {
- "property1": {
- "property1": {
- "total_duration": 0.1,
- "count": 0
}, - "property2": {
- "total_duration": 0.1,
- "count": 0
}
}, - "property2": {
- "property1": {
- "total_duration": 0.1,
- "count": 0
}, - "property2": {
- "total_duration": 0.1,
- "count": 0
}
}
}
}
}
Get the aligned text file for an alignment job.
path Parameters
jobid required | string Example: a1b2c3d4e5 ID of the job. |
query Parameters
tags | string Enum: "word_start_and_end" "one_per_line" Control how timing information is added to the text file provided as input to the alignment job. If set to word_start_and_end, a timing tag is added at the start and end of every word; if set to one_per_line, a single timing tag is added per line of the input text. |
header Parameters
Authorization required | string Customer API token |
X-SM-EAR-Tag | string Early Access Release Tag |
Responses
Response Schema:
Request samples
- cURL
JOB_ID="YOUR_JOB_ID"
API_KEY="YOUR_API_KEY"

curl -L -X GET "https://asr.api.speechmatics.com/v2/jobs/${JOB_ID}/alignment" \
  -H "Authorization: Bearer ${API_KEY}"
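The same request can be made from Python; the sketch below mirrors the cURL sample and adds the tags query parameter described above.

import requests

API_KEY = "YOUR_API_KEY"
JOB_ID = "YOUR_JOB_ID"
url = f"https://asr.api.speechmatics.com/v2/jobs/{JOB_ID}/alignment"
headers = {"Authorization": f"Bearer {API_KEY}"}

# tags controls how timing information is added to the returned text (see above).
response = requests.get(url, headers=headers, params={"tags": "one_per_line"})
response.raise_for_status()

with open("aligned.txt", "w", encoding="utf-8") as f:
    f.write(response.text)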
Response samples
- 200
- 401
- 404
- 410
- 500
Get the usage statistics.
query Parameters
since | string <date> Include usage after the given date (inclusive). This is an ISO-8601 calendar date format: YYYY-MM-DD. |
until | string <date> Include usage before the given date (inclusive). This is an ISO-8601 calendar date format: YYYY-MM-DD. |
header Parameters
Authorization required | string Customer API token |
X-SM-EAR-Tag | string Early Access Release Tag |
Responses
Response Schema: application/json
since required | string <date-time> Example: "2021-10-14T00:55:00Z" |
until required | string <date-time> Example: "2022-12-01T00:00:00Z" |
required | Array of objects (UsageDetails) |
required | Array of objects (UsageDetails) |
Request samples
- cURL
API_KEY="YOUR_API_KEY"

curl -L -X GET "https://asr.api.speechmatics.com/v2/usage" \
  -H "Authorization: Bearer ${API_KEY}"
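A Python equivalent of the cURL sample, adding the since and until query parameters described above:

import requests

API_KEY = "YOUR_API_KEY"
url = "https://asr.api.speechmatics.com/v2/usage"
headers = {"Authorization": f"Bearer {API_KEY}"}

# Both dates are inclusive ISO-8601 calendar dates (YYYY-MM-DD).
params = {"since": "2021-09-12", "until": "2022-01-01"}
usage = requests.get(url, headers=headers, params=params).json()

for row in usage["summary"]:
    print(row["mode"], row["type"], row["count"], row["duration_hrs"])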
Response samples
- 200
- 401
- 403
- 500
{- "since": "2021-09-12T00:00:00Z",
- "until": "2022-01-01T23:59:59Z",
- "summary": [
- {
- "mode": "batch",
- "type": "transcription",
- "count": 5,
- "duration_hrs": 1.53
}, - {
- "mode": "batch",
- "type": "alignment",
- "count": 1,
- "duration_hrs": 0.1
}
], - "details": [
- {
- "mode": "batch",
- "type": "transcription",
- "language": "sv",
- "operating_point": "standard",
- "count": 4,
- "duration_hrs": 1.33
}, - {
- "mode": "batch",
- "type": "transcription",
- "language": "de",
- "operating_point": "enhanced",
- "count": 1,
- "duration_hrs": 0.2
}, - {
- "mode": "batch",
- "type": "alignment",
- "language": "en",
- "count": 1,
- "duration_hrs": 0.1
}
]
}
This model should be used when you create a new job. It is also returned as part of the response to a number of requests, including when you get job details or get the transcript for a transcription job.
Based on the value of type, a type-specific object such as transcription_config must be present to specify all configuration settings or parameters needed to process the job inputs as expected.
If the results of the job are to be forwarded on completion, notification_config can be provided with a list of callbacks to be made; no assumptions should be made about the order in which they will occur. For more details, please refer to Notifications in the documentation.
Customer-specific job details or metadata can be supplied in tracking, and this information will be available where possible in the job results and in callbacks.
type required | string (JobType) Enum: "alignment" "transcription" |
object (DataFetchConfig) | |
object (DataFetchConfig) | |
object (AlignmentConfig) Example: {"language":"en"} | |
object (TranscriptionConfig) Example: {"language":"en","output_locale":"en-GB","additional_vocab":[{"content":"Speechmatics","sounds_like":["speechmatics"]},{"content":"gnocchi","sounds_like":["nyohki","nokey","nochi"]},{"content":"CEO","sounds_like":["C.E.O."]},{"content":"financial crisis"}],"diarization":"channel","channel_diarization_labels":["Caller","Agent"]} | |
Array of objects (NotificationConfig) Example: [[{"url":"https://collector.example.org/callback","contents":["transcript:json-v2"],"auth_headers":["Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VySWQiOiJiMDhmODZhZi0zNWRhLTQ4ZjItOGZhYi1jZWYzOTA0NjYwYmQifQ.-xN_h82PHVTCMA9vdoHrcZxH-x5mb11y1537t3rGzcM"]}]] | |
object (TrackingData) Example: {"title":"ACME Q12018 Earnings Call","reference":"/data/clients/ACME/statements/segs/2018Q1-seg8","tags":["quick-review","segment"],"details":{"client":"ACME Corp","segment":8,"seg_start":963.201,"seg_end":1091.481}} | |
object (OutputConfig) | |
object (TranslationConfig) | |
object (LanguageIdentificationConfig) | |
object (SummarizationConfig) | |
sentiment_analysis_config | object (SentimentAnalysisConfig) |
object (TopicDetectionConfig) | |
auto_chapters_config | object (AutoChaptersConfig) |
object (AudioEventsConfig) |
{- "type": "alignment",
- "fetch_data": {
- "url": "string",
- "auth_headers": [
- "string"
]
}, - "fetch_text": {
- "url": "string",
- "auth_headers": [
- "string"
]
}, - "alignment_config": {
- "language": "en"
}, - "transcription_config": {
- "language": "en",
- "output_locale": "en-GB",
- "additional_vocab": [
- {
- "content": "Speechmatics",
- "sounds_like": [
- "speechmatics"
]
}, - {
- "content": "gnocchi",
- "sounds_like": [
- "nyohki",
- "nokey",
- "nochi"
]
}, - {
- "content": "CEO",
- "sounds_like": [
- "C.E.O."
]
}, - {
- "content": "financial crisis"
}
], - "diarization": "channel",
- "channel_diarization_labels": [
- "Caller",
- "Agent"
]
}, - "notification_config": [
- [
- {
- "contents": [
- "transcript:json-v2"
], - "auth_headers": [
- "Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VySWQiOiJiMDhmODZhZi0zNWRhLTQ4ZjItOGZhYi1jZWYzOTA0NjYwYmQifQ.-xN_h82PHVTCMA9vdoHrcZxH-x5mb11y1537t3rGzcM"
]
}
]
], - "tracking": {
- "title": "ACME Q12018 Earnings Call",
- "reference": "/data/clients/ACME/statements/segs/2018Q1-seg8",
- "tags": [
- "quick-review",
- "segment"
], - "details": {
- "client": "ACME Corp",
- "segment": 8,
- "seg_start": 963.201,
- "seg_end": 1091.481
}
}, - "output_config": {
- "srt_overrides": {
- "max_line_length": 0,
- "max_lines": 0
}
}, - "translation_config": {
- "target_languages": [
- "string"
]
}, - "language_identification_config": {
- "expected_languages": [
- "string"
], - "low_confidence_action": "allow",
- "default_language": "string"
}, - "summarization_config": {
- "content_type": "auto",
- "summary_length": "brief",
- "summary_type": "paragraphs"
}, - "sentiment_analysis_config": { },
- "topic_detection_config": {
- "topics": [
- "string"
]
}, - "auto_chapters_config": { },
- "audio_events_config": {
- "types": [
- "string"
]
}
}
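In practice the JobConfig is built as a JSON document and submitted as the config form field when creating a job. The sketch below shows a typical transcription configuration with a callback; it is illustrative only: the callback URL and file name are placeholders, and the POST /v2/jobs path is assumed from this page's cURL samples.

import json
import requests

# Illustrative JobConfig for a transcription job with a callback.
config = {
    "type": "transcription",
    "transcription_config": {
        "language": "en",
        "diarization": "channel",
        "channel_diarization_labels": ["Caller", "Agent"],
    },
    "notification_config": [
        {
            "url": "https://collector.example.org/callback",
            "contents": ["transcript:json-v2"],
        }
    ],
    "tracking": {"reference": "/data/clients/ACME/statements/segs/2018Q1-seg8"},
}

# The serialized JobConfig goes in the "config" form field of POST /v2/jobs.
with open("recording.mp3", "rb") as audio_file:
    response = requests.post(
        "https://asr.api.speechmatics.com/v2/jobs",  # assumed path
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        data={"config": json.dumps(config)},
        files={"data_file": audio_file},
    )
print(response.json())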
Returned when you get job details, list all jobs or delete a job. This model includes the status and config that was used.
created_at required | string <date-time> Example: "2018-01-09T12:29:01.853047Z" The UTC date time the job was created. |
data_name required | string Name of the data file submitted for job. |
text_name | string Name of the text file submitted to be aligned to audio. |
duration | integer >= 0 The file duration (in seconds). May be missing for fetch URL jobs. |
id required | string Example: "a1b2c3d4e5" The unique id assigned to the job. |
status required | string Enum: "running" "done" "rejected" "deleted" "expired" The status of the job. |
object (JobConfig) JSON object that contains various groups of job configuration parameters. Based on the value of type, a type-specific object such as transcription_config must be present to specify the processing parameters. If the results of the job are to be forwarded on completion, notification_config can be provided with a list of callbacks to be made. Customer-specific job details or metadata can be supplied in tracking. |
lang | string Optional parameter used for backwards compatibility with the v1 API |
Array of objects (JobDetailError) Optional list of errors that have occurred in user interaction, for example: audio could not be fetched or notification could not be sent. |
{- "created_at": "2018-01-09T12:29:01.853047Z",
- "data_name": "string",
- "text_name": "string",
- "duration": 0,
- "id": "a1b2c3d4e5",
- "status": "running",
- "config": {
- "type": "alignment",
- "fetch_data": {
- "url": "string",
- "auth_headers": [
- "string"
]
}, - "fetch_text": {
- "url": "string",
- "auth_headers": [
- "string"
]
}, - "alignment_config": {
- "language": "en"
}, - "transcription_config": {
- "language": "en",
- "output_locale": "en-GB",
- "additional_vocab": [
- {
- "content": "Speechmatics",
- "sounds_like": [
- "speechmatics"
]
}, - {
- "content": "gnocchi",
- "sounds_like": [
- "nyohki",
- "nokey",
- "nochi"
]
}, - {
- "content": "CEO",
- "sounds_like": [
- "C.E.O."
]
}, - {
- "content": "financial crisis"
}
], - "diarization": "channel",
- "channel_diarization_labels": [
- "Caller",
- "Agent"
]
}, - "notification_config": [
- [
- {
- "contents": [
- "transcript:json-v2"
], - "auth_headers": [
- "Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VySWQiOiJiMDhmODZhZi0zNWRhLTQ4ZjItOGZhYi1jZWYzOTA0NjYwYmQifQ.-xN_h82PHVTCMA9vdoHrcZxH-x5mb11y1537t3rGzcM"
]
}
]
], - "tracking": {
- "title": "ACME Q12018 Earnings Call",
- "reference": "/data/clients/ACME/statements/segs/2018Q1-seg8",
- "tags": [
- "quick-review",
- "segment"
], - "details": {
- "client": "ACME Corp",
- "segment": 8,
- "seg_start": 963.201,
- "seg_end": 1091.481
}
}, - "output_config": {
- "srt_overrides": {
- "max_line_length": 0,
- "max_lines": 0
}
}, - "translation_config": {
- "target_languages": [
- "string"
]
}, - "language_identification_config": {
- "expected_languages": [
- "string"
], - "low_confidence_action": "allow",
- "default_language": "string"
}, - "summarization_config": {
- "content_type": "auto",
- "summary_length": "brief",
- "summary_type": "paragraphs"
}, - "sentiment_analysis_config": { },
- "topic_detection_config": {
- "topics": [
- "string"
]
}, - "auto_chapters_config": { },
- "audio_events_config": {
- "types": [
- "string"
]
}
}, - "lang": "string",
- "errors": [
- {
- "timestamp": "2021-07-14T11:53:49.242Z",
- "message": "Audio fetch error, http status 418"
}
]
}
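A short sketch of acting on JobDetails returned by the SDK call shown earlier: branch on status and surface any errors. Field names follow the model above.

from speechmatics.batch_client import BatchClient

with BatchClient("YOUR_API_KEY") as client:
    job = client.check_job_status("YOUR_JOB_ID")["job"]

if job["status"] == "done":
    print(f"{job['data_name']} finished ({job.get('duration', 0)}s of audio)")
elif job["status"] == "rejected":
    # errors lists user-facing problems, e.g. the audio could not be fetched
    for error in job.get("errors", []):
        print(error["timestamp"], error["message"])
else:
    print(f"Job is currently {job['status']}")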
Returned when you get the transcript for a transcription job. It includes metadata about the job, such as the transcription config that was used.
format required | string Example: "2.1" Speechmatics JSON transcript format version number. |
required | object (JobInfo) Summary information about an ASR job, to support identification and tracking. |
required | object (RecognitionMetadata) Summary information about the output from an ASR job, comprising the job type and configuration parameters used when generating the output. |
required | Array of objects (RecognitionResult) Example: [[{"channel":"channel_1","start_time":0.55,"end_time":1.2,"type":"word","volume":0.5,"alternatives":[{"confidence":0.95,"content":"Hello","language":"en","speaker":"S1","display":{"direction":"ltr"}}]}]] |
object Example: {"de":[{"start_time":0.5,"end_time":1.3,"content":"Guten Tag, wie geht es dir?","speaker":"UU"}],"fr":[{"start_time":0.5,"end_time":1.3,"content":"Bonjour, comment ça va?","speaker":"UU"}]} Translations of the transcript into other languages. It is a map of ISO language codes to arrays of translated sentences. Configured using translation_config. |
object (SummarizationResult) Example: {"content":"this is a summary"} Summary of the transcript, configured using summarization_config. |
object (SentimentAnalysisResult) Example: {"segments":[{"text":"I am happy with the product.","start_time":0,"end_time":5,"sentiment":"positive","speaker":"John Doe","channel":"Chat","confidence":0.9},{"text":"I don't like the customer service.","start_time":6,"end_time":12,"sentiment":"negative","speaker":"John Doe","channel":"Chat","confidence":0.8}],"summary":{"overall":{"positive_count":1,"negative_count":1,"neutral_count":0},"speakers":[{"speaker":"John Doe","positive_count":1,"negative_count":1,"neutral_count":0}],"channels":[{"channel":"Chat","positive_count":1,"negative_count":1,"neutral_count":0}]}} The main object that holds sentiment analysis data. | |
object (TopicDetectionResult) Example: {"segments":[{"text":"I am happy with the product.","start_time":0,"end_time":5,"topics":[{"topic":"product"}]},{"text":"We will deploy this container for Spanish.","start_time":6,"end_time":12,"topics":[{"topic":"deployment"},{"topic":"languages"}]}],"summary":{"overall":{"deployment":1,"languages":1,"product":1}}} Main object that holds topic detection results. | |
Array of objects (AutoChaptersResult) Example: [{"title":"Part 1","summary":"Summary of part 1","start_time":0,"end_time":5},{"title":"Part 2","summary":"Summary of part 2","start_time":5,"end_time":10}] An array of objects that represent summarized chapters of the transcript | |
Array of objects (AudioEventItem) Timestamped audio events, only set if audio_events_config is specified. |
object Summary statistics per event type, keyed by event type. |
{- "format": "2.1",
- "job": {
- "created_at": "2018-01-09T12:29:01.853047Z",
- "data_name": "string",
- "duration": 0,
- "id": "a1b2c3d4e5",
- "text_name": "string",
- "tracking": {
- "title": "ACME Q12018 Earnings Call",
- "reference": "/data/clients/ACME/statements/segs/2018Q1-seg8",
- "tags": [
- "quick-review",
- "segment"
], - "details": {
- "client": "ACME Corp",
- "segment": 8,
- "seg_start": 963.201,
- "seg_end": 1091.481
}
}
}, - "metadata": {
- "created_at": "2018-01-09T12:29:01.853047Z",
- "type": "alignment",
- "transcription_config": {
- "language": "en",
- "output_locale": "en-GB",
- "additional_vocab": [
- {
- "content": "Speechmatics",
- "sounds_like": [
- "speechmatics"
]
}, - {
- "content": "gnocchi",
- "sounds_like": [
- "nyohki",
- "nokey",
- "nochi"
]
}, - {
- "content": "CEO",
- "sounds_like": [
- "C.E.O."
]
}, - {
- "content": "financial crisis"
}
], - "diarization": "channel",
- "channel_diarization_labels": [
- "Caller",
- "Agent"
]
}, - "translation_errors": [
- {
- "type": "translation_failed",
- "message": "string"
}
], - "summarization_errors": [
- {
- "type": "summarization_failed",
- "message": "string"
}
], - "sentiment_analysis_errors": [
- {
- "type": "sentiment_analysis_failed",
- "message": "string"
}
], - "topic_detection_errors": [
- {
- "type": "topic_detection_failed",
- "message": "string"
}
], - "auto_chapters_errors": [
- {
- "type": "auto_chapters_failed",
- "message": "string"
}
], - "alignment_config": {
- "language": "en"
}, - "output_config": {
- "srt_overrides": {
- "max_line_length": 0,
- "max_lines": 0
}
}, - "language_pack_info": {
- "language_description": "string",
- "word_delimiter": "string",
- "writing_direction": "left-to-right",
- "itn": true,
- "adapted": true
}, - "language_identification": {
- "results": [
- {
- "alternatives": [
- {
- "language": "en",
- "confidence": 0.98
}, - {
- "language": "fr",
- "confidence": 0.02
}
], - "start_time": 0,
- "end_time": 5.5
}, - {
- "alternatives": [
- {
- "language": "en",
- "confidence": 0.95
}, - {
- "language": "fr",
- "confidence": 0.05
}
], - "start_time": 5.6,
- "end_time": 10
}
]
}
}, - "results": [
- [
- {
- "channel": "channel_1",
- "start_time": 0.55,
- "end_time": 1.2,
- "type": "word",
- "volume": 0.5,
- "alternatives": [
- {
- "confidence": 0.95,
- "content": "Hello",
- "language": "en",
- "speaker": "S1",
- "display": {
- "direction": "ltr"
}
}
]
}
]
], - "translations": {
- "de": [
- {
- "start_time": 0.5,
- "end_time": 1.3,
- "content": "Guten Tag, wie geht es dir?",
- "speaker": "UU"
}
], - "fr": [
- {
- "start_time": 0.5,
- "end_time": 1.3,
- "content": "Bonjour, comment ça va?",
- "speaker": "UU"
}
]
}, - "summary": {
- "content": "this is a summary"
}, - "sentiment_analysis": {
- "segments": [
- {
- "text": "I am happy with the product.",
- "start_time": 0,
- "end_time": 5,
- "sentiment": "positive",
- "speaker": "John Doe",
- "channel": "Chat",
- "confidence": 0.9
}, - {
- "text": "I don't like the customer service.",
- "start_time": 6,
- "end_time": 12,
- "sentiment": "negative",
- "speaker": "John Doe",
- "channel": "Chat",
- "confidence": 0.8
}
], - "summary": {
- "overall": {
- "positive_count": 1,
- "negative_count": 1,
- "neutral_count": 0
}, - "speakers": [
- {
- "speaker": "John Doe",
- "positive_count": 1,
- "negative_count": 1,
- "neutral_count": 0
}
], - "channels": [
- {
- "channel": "Chat",
- "positive_count": 1,
- "negative_count": 1,
- "neutral_count": 0
}
]
}
}, - "topics": {
- "segments": [
- {
- "text": "I am happy with the product.",
- "start_time": 0,
- "end_time": 5,
- "topics": [
- {
- "topic": "product"
}
]
}, - {
- "text": "We will deploy this container for Spanish.",
- "start_time": 6,
- "end_time": 12,
- "topics": [
- {
- "topic": "deployment"
}, - {
- "topic": "languages"
}
]
}
], - "summary": {
- "overall": {
- "deployment": 1,
- "languages": 1,
- "product": 1
}
}
}, - "chapters": [
- {
- "title": "Part 1",
- "summary": "Summary of part 1",
- "start_time": 0,
- "end_time": 5
}, - {
- "title": "Part 2",
- "summary": "Summary of part 2",
- "start_time": 5,
- "end_time": 10
}
], - "audio_events": [
- {
- "type": "string",
- "start_time": 0.1,
- "end_time": 0.1,
- "confidence": 0.1,
- "channel": "string"
}
], - "audio_event_summary": {
- "overall": {
- "property1": {
- "total_duration": 0.1,
- "count": 0
}, - "property2": {
- "total_duration": 0.1,
- "count": 0
}
}, - "channels": {
- "property1": {
- "property1": {
- "total_duration": 0.1,
- "count": 0
}, - "property2": {
- "total_duration": 0.1,
- "count": 0
}
}, - "property2": {
- "property1": {
- "total_duration": 0.1,
- "count": 0
}, - "property2": {
- "total_duration": 0.1,
- "count": 0
}
}
}
}
}
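As a final illustration of the results structure, the sketch below collects the top alternative of every word into a plain string. It assumes the nested results layout shown in the example above and joins words with spaces; the actual delimiter for a language pack is given by metadata.language_pack_info.word_delimiter.

from speechmatics.batch_client import BatchClient

with BatchClient("YOUR_API_KEY") as client:
    transcript = client.get_job_result("YOUR_JOB_ID")

# Each results group holds recognition items; the first alternative carries the text.
words = []
for group in transcript["results"]:
    for item in group:
        if item["type"] == "word":
            words.append(item["alternatives"][0]["content"])

# Joining with spaces is a simplification; see metadata.language_pack_info.word_delimiter.
print(" ".join(words))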