Language Identification (SaaS)

Learn about Speechmatics Language ID

Detect the predominant language spoken and transcribe using the appropriate language.

To deploy Language Identification On-Prem, follow our On-Prem documentation.

Automatic Language Identification can be enabled when calling the Speechmatics transcription API. You can also try it for free, with no code, in the Speechmatics On-Demand Portal.

If you're new to Speechmatics, please see our guide on Transcribing a File through our API.

Once you are set up, set language to auto in the transcription_config to use Automatic Language Identification:

{
  "type": "transcription",
  "transcription_config": {
    "language": "auto"
  }
}

To reliably identify the predominant language, the file should contain at least 60 seconds of speech in that language.

Enabling Automatic Language Identification for a transcription job slightly increases the total turnaround time.

Configuration

Expected Languages

If you expect the audio to be one of a restricted set of languages, you can provide this information through the expected_languages parameter:

{
  "type": "transcription",
  "transcription_config": {
    "language": "auto"
  },
  "language_identification_config": {
    "expected_languages": ["en", "es", "de", "fr"]
  }
}

If the language detected is not in the expected_languages list, the job will be rejected.

A list of possible Language Codes can be found here. The following languages are not supported for Language Identification: Interlingua (ia), Esperanto (eo), Uyghur (ug), Cantonese (yue), Irish (ga), Maltese (mt), Urdu (ur), Bengali (bn), Swahili (sw).
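As a rough sketch, the expected languages can be checked on the client before a job is submitted, so that a code from the unsupported list above is caught early. The helper name build_config_with_expected_languages and the constant UNSUPPORTED_FOR_LANGUAGE_ID are hypothetical and not part of any Speechmatics SDK. The service performs its own validation regardless.

```python
import json

# Codes this page lists as not supported for Language Identification.
UNSUPPORTED_FOR_LANGUAGE_ID = {"ia", "eo", "ug", "yue", "ga", "mt", "ur", "bn", "sw"}


def build_config_with_expected_languages(expected_languages):
    """Build a job config restricted to the given expected languages.

    Hypothetical helper: raises ValueError if any code is on the
    unsupported list above. It does not check that the remaining
    codes are valid Speechmatics language codes.
    """
    unsupported = sorted(set(expected_languages) & UNSUPPORTED_FOR_LANGUAGE_ID)
    if unsupported:
        raise ValueError(
            f"Not supported for Language Identification: {unsupported}"
        )
    return {
        "type": "transcription",
        "transcription_config": {"language": "auto"},
        "language_identification_config": {
            "expected_languages": list(expected_languages),
        },
    }


config = build_config_with_expected_languages(["en", "es", "de", "fr"])
print(json.dumps(config, indent=2))
```

The printed JSON has the same shape as the expected_languages example above and can be submitted as the job config.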

Low Confidence Action

By default, the job will be rejected if no language is identified with high enough confidence.

To prevent the job from being rejected, you can set a low_confidence_action with one of two options:

allow - Use the highest confidence identified language
use_default_language - Use your predefined Default Language

To configure a job which would use the highest confidence identified language:

{
  "type": "transcription",
  "transcription_config": {
    "language": "auto"
  },
  "language_identification_config": {
    "low_confidence_action": "allow"
  }
}

To configure a job which would use your predefined Default Language:

{
  "type": "transcription",
  "transcription_config": {
    "language": "auto"
  },
  "language_identification_config": {
    "low_confidence_action": "use_default_language",
    "default_language": "es"
  }
}
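The two configurations above differ only in the language_identification_config block, so they can be generated from one small function. This is a minimal sketch: build_low_confidence_config is a hypothetical helper, and the rule that use_default_language must be paired with a default_language is an assumption drawn from the example above, not a documented constraint.

```python
import json

# The two actions documented in this section.
LOW_CONFIDENCE_ACTIONS = {"allow", "use_default_language"}


def build_low_confidence_config(action, default_language=None):
    """Build a job config with a low_confidence_action (hypothetical helper)."""
    if action not in LOW_CONFIDENCE_ACTIONS:
        raise ValueError(f"Unknown low_confidence_action: {action!r}")
    # Assumption: use_default_language is only meaningful when a
    # default_language is supplied, as in the example above.
    if action == "use_default_language" and default_language is None:
        raise ValueError("use_default_language needs a default_language")

    language_id = {"low_confidence_action": action}
    if default_language is not None:
        language_id["default_language"] = default_language

    return {
        "type": "transcription",
        "transcription_config": {"language": "auto"},
        "language_identification_config": language_id,
    }


print(json.dumps(build_low_confidence_config("allow"), indent=2))
print(json.dumps(build_low_confidence_config("use_default_language", "es"), indent=2))
```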

If no language is identified with sufficient confidence and a low_confidence_action is set, the job will succeed. When you get the Job Details or Transcript, the job metadata will contain an error message:

{
  "transcription_config": {
    "language": "auto"
  },
  "metadata": {
      "created_at": "2023-10-10T14:51:12.051413Z",
      "language_identification": {
          "error": "LOW_CONFIDENCE",
          "message": "Language identification could not identify any language with sufficient confidence."
      }
  },
  ...,
  "results": []
}

Default Language

By default, the job will be rejected if no speech is detected.

To prevent the job from being rejected, you can set a default_language. The default language is also used when the Low Confidence Action is set to use_default_language.

To configure a job with Default Language:

{
  "type": "transcription",
  "transcription_config": {
    "language": "auto"
  },
  "language_identification_config": {
    "default_language": "es"
  }
}

If no speech is detected and a default_language is set, the job will succeed. When you get the Job Details or Transcript, the job metadata will contain an error message:

{
  "transcription_config": {
    "language": "auto"
  },
  "language_identification_config": {
      "default_language": "es"
  },
  "metadata": {
      "created_at": "2023-10-10T14:51:12.051413Z",
      "language_identification": {
          "error": "NO_SPEECH",
          "message": "No speech found for language identification"
      }
  },
  ...,
}
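Both the LOW_CONFIDENCE and NO_SPEECH cases are reported under metadata.language_identification in an otherwise successful job, so a client may want to check that field before trusting the transcript language. A minimal sketch, assuming the response has already been parsed into a Python dict (language_id_error is a hypothetical helper name):

```python
def language_id_error(response):
    """Return (error, message) if language identification fell back, else None.

    Reads metadata.language_identification as shown in the examples
    on this page. Hypothetical helper, not part of any SDK.
    """
    language_id = response.get("metadata", {}).get("language_identification", {})
    if "error" in language_id:
        return language_id["error"], language_id.get("message", "")
    return None


# Sample response trimmed from the NO_SPEECH example above.
sample = {
    "metadata": {
        "created_at": "2023-10-10T14:51:12.051413Z",
        "language_identification": {
            "error": "NO_SPEECH",
            "message": "No speech found for language identification",
        },
    },
}

problem = language_id_error(sample)
if problem is not None:
    code, message = problem
    print(f"Language identification fell back: {code}: {message}")
```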

Transcription Result

You can determine the language used to transcribe the file from the language field of the first word in the response results.

{
  "job": { ... },
  "metadata": {
    "transcription_config": { "language": "auto" },
    "language_identification_config": {
      "expected_languages": ["en", "es", "de", "fr"]
    },
    "type": "transcription",
    "created_at": "2023-02-24T18:22:22.563358Z"
  },
  "results": [
    {
      "alternatives": [
        {
          "confidence": 1.0,
          "content": "It",
          "language": "en",
          "speaker": "UU"
        }
      ],
      "end_time": 0.72,
      "start_time": 0.6,
      "type": "word"
    },
    ...
  ]
}
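Reading the language from the first word can be sketched as follows. The function assumes the transcript has been parsed into a Python dict with the structure shown above. transcript_language is a hypothetical helper name.

```python
def transcript_language(response):
    """Return the language of the first word in results, or None if there is none."""
    for item in response.get("results", []):
        if item.get("type") != "word":
            continue  # skip any non-word results
        alternatives = item.get("alternatives", [])
        if alternatives and "language" in alternatives[0]:
            return alternatives[0]["language"]
    return None


# Sample trimmed from the response above.
sample = {
    "results": [
        {
            "alternatives": [
                {"confidence": 1.0, "content": "It", "language": "en", "speaker": "UU"}
            ],
            "end_time": 0.72,
            "start_time": 0.6,
            "type": "word",
        }
    ]
}

print(transcript_language(sample))
```

An empty results list yields None, which a caller can treat as "no language available" rather than an error.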

Usage with Other Features

Keep the following considerations in mind when using Automatic Language Identification with other Speechmatics features.

Custom Dictionary

Custom Dictionary can be used with Automatic Language Identification.

The Custom Dictionary will be used with the identified language. Some language-specific features such as sounds_like might not behave as expected.

Output Locale

Output Locale is currently not supported in combination with Automatic Language Identification. Jobs that use both features will be rejected.

Translation

Translation can be used with Automatic Language Identification.

If the identified transcription language and target translation language match, then the translation will contain the transcription sentences.

To reduce friction when using Automatic Language Identification, the translation target language is not validated when the job is submitted. For each translation target language that is not supported for the identified language, there will be an error in the translation_errors field of the job metadata. For more information, see Errors When Used with Translation.

Note that if the transcription language is specified explicitly rather than set to auto, and an unsupported translation target language is selected, the job will be rejected.

Error Responses

Unsupported Expected Language

If one or more of the expected languages are not supported, an HTTP 400 error response is returned.

Language ID is supported for all of Speechmatics' languages except Interlingua (ia), Esperanto (eo), Uyghur (ug), Cantonese (yue), Irish (ga), Maltese (mt), Urdu (ur), Bengali (bn), Swahili (sw).

Example bad config:

{
  "type": "transcription",
  "transcription_config": {
    "language": "en"
  },
  "language_identification_config": {
    "expected_languages": ["zz"]
  }
}

Response:

{
  "code": 400,
  "detail": "Job config JSON is invalid. Error: Language(s) [zz] are not supported for language id",
  "error": "Job rejected"
}
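A client can turn a rejection body like the one above into a single readable message. This is only a sketch: format_rejection is a hypothetical helper, and it assumes the body has the code, detail, and error fields shown in the response above.

```python
import json


def format_rejection(body_text):
    """Summarise a job rejection body as one line (hypothetical helper)."""
    body = json.loads(body_text)
    error = body.get("error", "Error")
    code = body.get("code", "unknown")
    detail = body.get("detail", "")
    return f"{error} ({code}): {detail}"


# Response body copied from the example above.
rejection = """
{
  "code": 400,
  "detail": "Job config JSON is invalid. Error: Language(s) [zz] are not supported for language id",
  "error": "Job rejected"
}
"""

print(format_rejection(rejection))
```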

Language Not in Expected Languages List

If the predicted language is not one of your expected languages, the job will be rejected.

In this example the expected languages are German or Spanish, but the predicted language is English.