Supported Files and Limits

Transcription: Batch, Real-Time | Deployments: All

Supported File Types

The following input types are supported for transcription:

Raw Audio streaming (RT only):

  • PCM F32 LE raw audio stream (32-bit float)
  • PCM S16 LE raw audio stream (16-bit signed int)
  • mu-law

Files (RT and Batch):

  • wav
  • mp3
  • aac
  • ogg
  • mpeg
  • amr
  • m4a
  • mp4
  • flac

The list above is exhaustive: any file format not on it is not supported.

Only files where the type can be determined by data inspection are supported. Raw audio formats where the codec is not embedded in the file cannot be processed in batch mode. This includes files commonly given extensions like ".raw" or ".g729" where the codec is only hinted at in the name.
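As an illustration of what "determined by data inspection" can mean, the sketch below guesses a container from the first bytes of a file. It is a client-side heuristic only, not the detection logic Speechmatics uses; the function name `sniff_container` is hypothetical, and the signature list covers only some of the supported formats.

```python
from typing import Optional


def sniff_container(header: bytes) -> Optional[str]:
    """Guess a container from leading bytes. Heuristic and not exhaustive."""
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header[:4] == b"fLaC":
        return "flac"
    if header[:4] == b"OggS":
        return "ogg"
    if header[:5] == b"#!AMR":
        return "amr"
    if header[4:8] == b"ftyp":
        return "mp4"  # ISO base media file; covers both mp4 and m4a
    if header[:3] == b"ID3":
        return "mp3"  # ID3v2 tag; usually, but not always, an mp3 file
    # Headerless data (for example raw PCM) carries no signature to inspect.
    return None
```

A file that returns `None` here is not necessarily unsupported (mp3 or aac without a leading tag would also fall through), but headerless raw audio will never match, whatever its extension.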

Rate Limiting and Fair Usage

Speechmatics Batch SaaS applies rate limiting and fair queueing to provide a consistently high quality of service to all users.

If you make a large number of requests in a short period of time, some of these requests may fail with the response HTTP 429 - Rate Limited. To minimize the possibility of encountering rate limiting errors, we recommend that you do not exceed the following rates:

  • 10 new jobs per second (POST API calls)
  • 50 job status requests per second (GET API calls). Speechmatics recommends using Notifications rather than polling for job status updates in production.

Aside from rate limiting, there is no limit to the number of jobs that you can submit. However, Speechmatics Batch SaaS applies a fair queueing policy which means that if you have a large number of jobs in progress at one time, the most recently submitted jobs may take longer to complete.
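One way a client might stay under the recommended rates and recover from an HTTP 429 is to space out its calls and retry with exponential backoff. The sketch below is a minimal example under stated assumptions: `Throttle` and `call_with_backoff` are illustrative names, `send` is a hypothetical callable that performs one request and returns its HTTP status code, and the backoff delays are arbitrary choices rather than values recommended by Speechmatics.

```python
import time
from typing import Callable


class Throttle:
    """Spaces calls so that at most `rate` calls start per second."""

    def __init__(self, rate: float, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / rate
        self.clock = clock
        self.sleep = sleep
        self.next_slot = 0.0

    def wait(self) -> None:
        now = self.clock()
        if now < self.next_slot:
            # Too early for the next call: sleep until the slot opens.
            self.sleep(self.next_slot - now)
            now = self.next_slot
        self.next_slot = now + self.interval


def call_with_backoff(
    send: Callable[[], int],
    throttle: Throttle,
    max_attempts: int = 5,
    base_delay: float = 1.0,
) -> int:
    """Call `send` and retry on HTTP 429, doubling the delay each time."""
    status = 429
    for attempt in range(max_attempts):
        throttle.wait()
        status = send()
        if status != 429:
            return status
        if attempt < max_attempts - 1:
            throttle.sleep(base_delay * 2 ** attempt)
    return status  # still 429 after the final attempt
```

A job submitter would share one `Throttle(rate=10)` across all POST calls and a separate `Throttle(rate=50)` across GET status calls.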

Hourly Usage Limits

Speechmatics limits the number of hours of audio users can process each month to help manage load on our servers. The current limits (in hours) by account type are listed in the table below:

Account type   Batch Standard   Batch Enhanced   Real-Time   Flow
Free Tier      2                2                4           50
Paid Tier      1000             1000             1000        50
Enterprise     Custom           Custom           Custom      Custom

Please reach out to Support if you need to increase the above limits.

Concurrency Limits

Speechmatics applies the following limits to the number of concurrent sessions based on your account type:

Account type   Real-Time   Flow
Free Tier      2           1
Paid Tier      10          1
Enterprise     Custom      Custom

Please reach out to Support if you need to increase the above limits.

Real-Time Session Limits

Info: From 01 March 2025, the real-time session limits described below will be strictly enforced. Please update or write your client as if the limits are already in place.

Real-time SaaS sessions may be automatically ended if any of the following criteria are met:

  • Session duration reaches 48 hours
  • No audio data (AddAudio messages) sent for 1 hour
  • No audio or ping/pong messages sent for 3 minutes

When a session is automatically ended, the Real-time SaaS service will send an in-band WebSocket message.

Guidance for users

Clients can disconnect a session before it is automatically terminated and immediately reconnect a new session. Note that new sessions will typically start in less than a second. If seamless transition is required, the new session can be connected a few seconds before disconnecting the old session.

Since unpredictable network issues can cause WebSocket connections to be dropped, we recommend graceful handling of session termination for long-running sessions.
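To rotate a session before a limit is hit, a client can track three timestamps and check them against the limits listed above. The following sketch is one possible approach; the class `SessionClock`, its field names, and the 30-second safety margin are assumptions made for illustration and are not part of any Speechmatics SDK.

```python
from dataclasses import dataclass
from typing import Optional

MAX_SESSION_S = 48 * 3600   # session duration limit: 48 hours
MAX_NO_AUDIO_S = 3600       # no AddAudio messages for 1 hour
MAX_IDLE_S = 180            # no audio or ping/pongs for 3 minutes


@dataclass
class SessionClock:
    started_at: float        # when the session was opened (seconds)
    last_audio_at: float     # when audio was last sent
    last_activity_at: float  # when audio or a ping/pong was last sent

    def limit_due(self, now: float, margin: float = 30.0) -> Optional[str]:
        """Return which limit is within `margin` seconds, or None."""
        if now - self.started_at >= MAX_SESSION_S - margin:
            return "max_duration"
        if now - self.last_audio_at >= MAX_NO_AUDIO_S - margin:
            return "no_audio"
        if now - self.last_activity_at >= MAX_IDLE_S - margin:
            return "idle"
        return None
```

A client loop could call `limit_due(time.monotonic())` periodically: on `"max_duration"` it opens a replacement session a few seconds before closing the old one, and on `"idle"` it sends a ping or closes the session deliberately.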

File Size Limits

If you submit your media file in the body of the /jobs POST request to Speechmatics Batch SaaS, the file must be less than 1 GB in size or the job will be rejected. To process files larger than 1 GB, you can provide the URL of the audio file in the job config as described in the Fetch URL documentation.
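The choice between uploading in the request body and supplying a URL can be made from the file size alone. The sketch below shows that decision, assuming that 1 GB means 10^9 bytes (the stricter reading) and that the job config takes the URL under a `fetch_data` object; both the helper name `build_job_config` and the config field names should be checked against the Fetch URL documentation before use.

```python
from typing import Optional

MAX_UPLOAD_BYTES = 10**9  # assumption: "1 GB" read as 10^9 bytes


def build_job_config(
    size_bytes: int, url: Optional[str] = None, language: str = "en"
) -> dict:
    """Build a job config, adding a fetch URL when the file is too large."""
    config = {
        "type": "transcription",
        "transcription_config": {"language": language},
    }
    if size_bytes < MAX_UPLOAD_BYTES:
        return config  # small enough to send in the POST body
    if url is None:
        raise ValueError("file is too large to upload; a fetch URL is required")
    # Assumed field names; verify against the Fetch URL documentation.
    config["fetch_data"] = {"url": url}
    return config
```

For a local file, `size_bytes` can be taken from `os.path.getsize(path)`.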

Data Retention Limits

Audio files, transcripts, and configuration data are stored in Speechmatics Batch SaaS for 7 days. Any request to retrieve a transcript or file more than 7 days after it was processed will receive an HTTP 404 error message and a status of expired. You can delete audio or transcripts before this 7-day period ends; see the API Reference for details.
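A client that archives transcripts can compute the retention deadline locally instead of discovering it through a 404. This is a minimal sketch; the helper names are hypothetical, and it assumes the client records the time at which each job was processed as a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

RETENTION = timedelta(days=7)  # Batch SaaS retention period


def retention_deadline(processed_at: datetime) -> datetime:
    """Time after which the transcript and audio can no longer be retrieved."""
    return processed_at + RETENTION


def is_expired(processed_at: datetime, now: Optional[datetime] = None) -> bool:
    """True if more than 7 days have passed since the job was processed."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now > retention_deadline(processed_at)
```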

Speechmatics Real-Time SaaS does not store audio files, transcripts, or configuration data.