Supported Languages
Transcription: Batch, Real-Time. Deployments: All.
This page lists the range of languages supported by Speechmatics. For more information on how to use these, please refer to the guide on Accuracy and Language.
To automatically identify the language in an audio file, use our Language Identification feature.
To dynamically update your system with the latest languages and features offered by Speechmatics, use our Feature Discovery endpoint.
Languages
Speechmatics supports the following languages. Your ability to use any or all of the languages will depend on what languages you are contracted to use.
Speechmatics takes a global-first approach to our languages. In a single language pack, we aim to support many different accents and dialects. This simplifies language selection, because you do not need to know upfront which accent is spoken in your audio. With this approach we still achieve very high accuracy compared to accent-specific language packs.
Language | Language Code | Description |
---|---|---|
Automatic | auto | Automatically detect the language using our Language Identification feature. |
Arabic | ar | Our global Arabic gives high-accuracy transcription across many different accents and dialects including (but not limited to) Modern Standard Arabic (MSA) and Arabic spoken in the Gulf, Egypt and the Levant. |
Bashkir | ba | |
Basque | eu | |
Belarusian | be | |
Bulgarian | bg | |
Cantonese | yue | |
Catalan | ca | |
Croatian | hr | |
Czech | cs | |
Danish | da | |
Dutch | nl | |
English | en | Our global English gives high-accuracy transcription across many different accents including (but not limited to) English spoken in the United Kingdom, United States, Australia, New Zealand and non-native speakers. |
Esperanto | eo | |
Estonian | et | |
Finnish | fi | |
French | fr | Our global French gives high-accuracy transcription across many different accents including (but not limited to) French spoken in France, Canada and Belgium. |
Galician | gl | |
German | de | Our global German gives high-accuracy transcription across many different accents including (but not limited to) German spoken in Germany, Austria and Switzerland. |
Greek | el | |
Hebrew | he | |
Hindi | hi | |
Hungarian | hu | |
Irish | ga | |
Interlingua | ia | |
Italian | it | |
Indonesian | id | |
Japanese | ja | |
Korean | ko | |
Latvian | lv | |
Lithuanian | lt | |
Maltese | mt | |
Malay | ms | |
Mandarin | cmn | Our global Mandarin can output Traditional or Simplified characters and gives high-accuracy transcription across many different accents including (but not limited to) Mandarin spoken in China, Taiwan, Singapore and Malaysia. |
Marathi | mr | |
Mongolian | mn | |
Norwegian | no | |
Persian | fa | |
Polish | pl | |
Portuguese | pt | Our global Portuguese gives high-accuracy transcription across many different accents including (but not limited to) Portuguese spoken in Portugal and Brazil. |
Romanian | ro | |
Russian | ru | |
Slovakian | sk | |
Slovenian | sl | |
Spanish | es | Our global Spanish gives high-accuracy transcription across many different accents including (but not limited to) Spanish spoken in Spain, US, Mexico, Colombia, Argentina, Venezuela, Chile and Peru. |
Spanish & English bilingual | es (with domain='bilingual-en') | Ideal when transcribing Spanish and English in the same media file or stream. Supports all accents and dialects listed under English and Spanish. Requires the domain config to be set. |
Swedish | sv | |
Tamil | ta | |
Thai | th | |
Turkish | tr | |
Urdu | ur | |
Uyghur | ug | |
Ukrainian | uk | |
Vietnamese | vi | |
Welsh | cy | Welsh must be explicitly added to the expected languages list when using our Language Identification feature; otherwise, an error is returned stating that the language is not supported for transcription. |
Each language above is uniquely identified by a two-letter code (ISO 639-1) or three-letter code (ISO 639-3) in API requests and responses.
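As a rough illustration of how a language code from the table might be passed in a request, the sketch below builds a job config as a Python dictionary. The field names (`type`, `transcription_config`, `language_identification_config`, `expected_languages`) and the helper `build_transcription_config` are assumptions made for this example, so check them against the API reference before relying on them.

```python
import json

# A few codes copied from the table above; extend this set with the
# languages you are contracted to use.
KNOWN_CODES = {"auto", "ar", "cmn", "cy", "de", "en", "es", "fr", "pt", "yue"}


def build_transcription_config(language, expected_languages=None):
    """Return a job config dict for a language code (assumed schema)."""
    if language not in KNOWN_CODES:
        raise ValueError(f"unknown language code: {language!r}")
    config = {
        "type": "transcription",
        "transcription_config": {"language": language},
    }
    if expected_languages is not None:
        # An expected-languages list only makes sense with automatic detection.
        if language != "auto":
            raise ValueError("expected_languages requires language='auto'")
        config["language_identification_config"] = {
            "expected_languages": list(expected_languages)
        }
    return config


if __name__ == "__main__":
    # Welsh (cy) is listed explicitly, as the table above requires.
    print(json.dumps(build_transcription_config("auto", ["en", "cy"]), indent=2))
```

Keeping the known codes in one set makes it easy to reject a typo such as `"zh"` before a request is ever sent.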
Domain Language
The Speechmatics SaaS supports specialized language packs, called domains, that optimize the requested transcription language for a particular field. Domain packs build on our global languages to give an extra boost to accuracy in specific areas. For details, see How to use domain config.
Domain | Supported Languages | Description |
---|---|---|
bilingual-en | es | Supports transcribing bilingual Spanish and English content in the same media file or stream. |
finance | en | Improves accuracy for audio containing financial terms, such as those found in earnings calls or financial broadcasts. |
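The domain table above can be turned into a small validation step before a request is built. This is a minimal sketch: the `domain` key sitting next to `language` inside `transcription_config` is an assumption based on the `domain='bilingual-en'` notation used on this page, and `build_domain_config` is a hypothetical helper.

```python
# Domain-to-language mapping copied from the table above.
DOMAIN_LANGUAGES = {
    "bilingual-en": {"es"},
    "finance": {"en"},
}


def build_domain_config(language, domain):
    """Return a job config with a domain set, rejecting unsupported pairs."""
    supported = DOMAIN_LANGUAGES.get(domain)
    if supported is None:
        raise ValueError(f"unknown domain: {domain!r}")
    if language not in supported:
        raise ValueError(
            f"domain {domain!r} is not available for language {language!r}"
        )
    return {
        "type": "transcription",
        # Assumed layout: the domain is set alongside the language code.
        "transcription_config": {"language": language, "domain": domain},
    }


if __name__ == "__main__":
    # Spanish and English bilingual transcription uses the Spanish pack.
    print(build_domain_config("es", "bilingual-en"))
```

Checking the pair locally gives a clearer error than waiting for the service to reject, for example, `finance` combined with `es`.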
Translation Languages
Translation is supported for the majority of Speechmatics' languages. The supported translation pairs are listed below. For more details, see Translation.
Audio Language | Translation Target Language |
---|---|
English (en) | Bulgarian (bg), Catalan (ca), Mandarin (cmn), Czech (cs), Danish (da), German (de), Greek (el), Spanish (es), Estonian (et), Finnish (fi), French (fr), Galician (gl), Hindi (hi), Croatian (hr), Hungarian (hu), Indonesian (id), Italian (it), Japanese (ja), Korean (ko), Lithuanian (lt), Latvian (lv), Malay (ms), Dutch (nl), Norwegian (no), Polish (pl), Portuguese (pt), Romanian (ro), Russian (ru), Slovakian (sk), Slovenian (sl), Swedish (sv), Turkish (tr), Ukrainian (uk), Vietnamese (vi) |
Bulgarian (bg), Catalan (ca), Mandarin (cmn), Czech (cs), Danish (da), German (de), Greek (el), Spanish (es), Estonian (et), Finnish (fi), French (fr), Galician (gl), Hindi (hi), Croatian (hr), Hungarian (hu), Indonesian (id), Italian (it), Japanese (ja), Korean (ko), Lithuanian (lt), Latvian (lv), Malay (ms), Dutch (nl), Norwegian (no), Polish (pl), Portuguese (pt), Romanian (ro), Russian (ru), Slovakian (sk), Slovenian (sl), Swedish (sv), Turkish (tr), Ukrainian (uk), Vietnamese (vi) | English (en) |
Norwegian Bokmål (no) | Norwegian Nynorsk (nn) |
Speechmatics languages that do not currently support translation: Arabic, Bashkir, Belarusian, Welsh, Esperanto, Basque, Hebrew, Interlingua, Irish, Maltese, Mongolian, Marathi, Tamil, Thai, Uyghur, Cantonese.
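The translation pairs listed above follow a simple rule: English pairs with each listed language in both directions, and Norwegian Bokmål pairs with Nynorsk. The sketch below encodes that rule so a pair can be checked before submitting a job. The `translation_config` and `target_languages` field names, and both helper functions, are assumptions made for illustration.

```python
# Codes that pair with English in both directions, copied from the table above.
PAIRED_WITH_ENGLISH = {
    "bg", "ca", "cmn", "cs", "da", "de", "el", "es", "et", "fi", "fr", "gl",
    "hi", "hr", "hu", "id", "it", "ja", "ko", "lt", "lv", "ms", "nl", "no",
    "pl", "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk", "vi",
}


def is_supported_pair(source, target):
    """Return True if the table above lists source -> target."""
    if source == "en":
        return target in PAIRED_WITH_ENGLISH
    if target == "en":
        return source in PAIRED_WITH_ENGLISH
    # The only pair that does not involve English.
    return source == "no" and target == "nn"


def build_translation_config(source, targets):
    """Return a job config requesting translation (assumed schema)."""
    unsupported = [t for t in targets if not is_supported_pair(source, t)]
    if unsupported:
        raise ValueError(f"unsupported targets for {source!r}: {unsupported}")
    return {
        "type": "transcription",
        "transcription_config": {"language": source},
        "translation_config": {"target_languages": list(targets)},
    }


if __name__ == "__main__":
    print(build_translation_config("en", ["de", "ja"]))
```

Note that two non-English languages, such as German to French, are not a listed pair, so the check rejects them.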