Languages and models

An overview of the languages and models Speechmatics supports for transcription and translation.

Operating points

Choose between two accuracy models when configuring your transcription session:

Standard: optimized for faster turnaround with strong accuracy. Recommended when speed and efficiency are your priorities.
Enhanced: our highest-accuracy model, with strong turnaround times. Recommended when precision is critical, especially for complex audio (e.g. noisy environments, varied accents).

The standard operating point is used by default. To use the enhanced operating point, set operating_point in the transcription config. For example:

{
  "type": "transcription",
  "transcription_config": {
    "language": "en",
    "operating_point": "enhanced"
  }
}
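
As a minimal sketch, the request body above could be built programmatically as follows. The helper name build_transcription_config is hypothetical and is not part of any Speechmatics SDK; the only values it relies on are the two operating points and the default described above.

```python
import json

# Operating points named in the text above; "standard" is the documented default.
OPERATING_POINTS = ("standard", "enhanced")


def build_transcription_config(language: str, operating_point: str = "standard") -> dict:
    """Hypothetical helper: return a dict shaped like the JSON example above."""
    if operating_point not in OPERATING_POINTS:
        raise ValueError(f"unknown operating point: {operating_point!r}")
    return {
        "type": "transcription",
        "transcription_config": {
            "language": language,
            "operating_point": operating_point,
        },
    }


# Serialize the enhanced English config for use as a request body.
print(json.dumps(build_transcription_config("en", "enhanced"), indent=2))
```

Rejecting unknown values locally is a design choice for this sketch; it surfaces a typo before the request is sent.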

Transcription languages

To automatically identify the language in an audio file, use our Language Identification feature.

To dynamically update your system with the latest languages and features offered by Speechmatics, use our Feature Discovery endpoint.

Speechmatics supports the following languages. Which of them you can use depends on the languages included in your contract.

Speechmatics takes a global-first approach to languages. Each language pack aims to support many different accents and dialects, so you do not need to know in advance which accent is spoken in your audio when selecting a language. This approach still achieves very high accuracy compared to accent-specific language packs.

Language	Language Code	Description
Automatic	auto	Automatically detect the language using our Language Identification feature. Please note, this is currently only supported with Batch Transcriptions.
Arabic	ar	Our global Arabic gives high-accuracy transcription across many different accents and dialects including (but not limited to) Modern Standard Arabic (MSA) and Arabic spoken in the Gulf, Egypt and the Levant.
Arabic & English bilingual	ar_en	Ideal when transcribing Arabic and English in the same media file or stream. Supports all accents and dialects listed under Arabic and English.
Bashkir	ba
Basque	eu
Belarusian	be
Bengali	bn
Bulgarian	bg
Cantonese	yue
Catalan	ca
Croatian	hr
Czech	cs
Danish	da
Dutch	nl
English	en	Our global English gives high-accuracy transcription across many different accents including (but not limited to) English spoken in the United Kingdom, United States, Australia and New Zealand, and by non-native speakers. To standardise spelling, we recommend specifying the Output Locale.
Esperanto	eo
Estonian	et
Finnish	fi
French	fr	Our global French gives high-accuracy transcription across many different accents including (but not limited to) French spoken in France, Canada and Belgium.
Galician	gl
German	de	Our global German gives high-accuracy transcription across many different accents including (but not limited to) German spoken in Germany, Austria and Switzerland.
Greek	el
Hebrew	he
Hindi	hi
Hungarian	hu
Indonesian	id
Interlingua	ia
Irish	ga
Italian	it
Japanese	ja
Korean	ko
Latvian	lv
Lithuanian	lt
Malay	ms
Malay & English bilingual	en_ms	Ideal when transcribing Malay and English in the same media file or stream. Supports all accents and dialects listed under Malay and English.
Maltese	mt
Mandarin	cmn	Our global Mandarin can output Traditional or Simplified characters and gives high-accuracy transcription across many different accents including (but not limited to) Mandarin spoken in China, Taiwan, Singapore and Malaysia.
Mandarin & English bilingual	cmn_en	Ideal when transcribing Mandarin and English in the same media file or stream. Supports all accents and dialects listed under Mandarin and English.
Mandarin Malay Tamil & English multilingual	cmn_en_ms_ta	Ideal when transcribing Mandarin, Malay, Tamil and English in the same media file or stream. Supports all accents and dialects listed under Mandarin, Malay, Tamil and English.
Marathi	mr
Mongolian	mn
Norwegian	no
Persian	fa
Polish	pl
Portuguese	pt	Our global Portuguese gives high-accuracy transcription across many different accents including (but not limited to) Portuguese spoken in Portugal and Brazil.
Romanian	ro
Russian	ru
Slovakian	sk
Slovenian	sl
Spanish	es	Our global Spanish gives high-accuracy transcription across many different accents including (but not limited to) Spanish spoken in Spain, US, Mexico, Colombia, Argentina, Venezuela, Chile and Peru.
Spanish & English bilingual	es (with domain='bilingual-en')	Ideal when transcribing Spanish and English in the same media file or stream. Supports all accents and dialects listed under English and Spanish. Requires the domain config to be set.
Swahili	sw
Swedish	sv
Tagalog (Filipino) & English bilingual	tl	Ideal when transcribing Tagalog (Filipino) and English in the same media file or stream. Supports all accents and dialects listed under English.
Tamil	ta
Tamil & English bilingual	en_ta	Ideal when transcribing Tamil and English in the same media file or stream. Supports all accents and dialects listed under Tamil and English.
Thai	th
Turkish	tr
Ukrainian	uk
Urdu	ur
Uyghur	ug
Vietnamese	vi
Welsh	cy	Welsh must be explicitly added to the expected languages list when using our Language Identification feature; otherwise a "language not supported for transcription" error will be returned.

Each single language above is uniquely identified in API requests and responses by a two-letter code (ISO 639-1) or a three-letter code (ISO 639-3). Bilingual and multilingual packs use combined codes such as ar_en.

Translation languages

Translation is supported for the majority of Speechmatics' languages. The supported translation pairs are listed below. For more details, see Translation.

Audio Language	Translation Target Language
English (en)	Bulgarian (bg), Catalan (ca), Mandarin (cmn), Czech (cs), Danish (da), German (de), Greek (el), Spanish (es), Estonian (et), Finnish (fi), French (fr), Galician (gl), Hindi (hi), Croatian (hr), Hungarian (hu), Indonesian (id), Italian (it), Japanese (ja), Korean (ko), Lithuanian (lt), Latvian (lv), Malay (ms), Dutch (nl), Norwegian (no), Polish (pl), Portuguese (pt), Romanian (ro), Russian (ru), Slovakian (sk), Slovenian (sl), Swedish (sv), Turkish (tr), Ukrainian (uk), Vietnamese (vi)
Bulgarian (bg), Catalan (ca), Mandarin (cmn), Czech (cs), Danish (da), German (de), Greek (el), Spanish (es), Estonian (et), Finnish (fi), French (fr), Galician (gl), Hindi (hi), Croatian (hr), Hungarian (hu), Indonesian (id), Italian (it), Japanese (ja), Korean (ko), Lithuanian (lt), Latvian (lv), Malay (ms), Dutch (nl), Norwegian (no), Polish (pl), Portuguese (pt), Romanian (ro), Russian (ru), Slovakian (sk), Slovenian (sl), Swedish (sv), Turkish (tr), Ukrainian (uk), Vietnamese (vi)	English (en)
Norwegian Bokmål (no)	Norwegian Nynorsk (nn)
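
The pairs in the table above can be checked before submitting a job. The following is a rough sketch: the function name is_supported_translation is hypothetical, and the code set is copied by hand from the table, so it may go out of date as languages are added.

```python
# Language codes that pair with English (in either direction) in the table above.
ENGLISH_PAIRED = frozenset({
    "bg", "ca", "cmn", "cs", "da", "de", "el", "es", "et", "fi", "fr", "gl",
    "hi", "hr", "hu", "id", "it", "ja", "ko", "lt", "lv", "ms", "nl", "no",
    "pl", "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk", "vi",
})


def is_supported_translation(source: str, target: str) -> bool:
    """Hypothetical helper: True if the table above lists source -> target."""
    if source == "en":
        return target in ENGLISH_PAIRED
    if target == "en":
        return source in ENGLISH_PAIRED
    # The only listed pair that does not involve English.
    return (source, target) == ("no", "nn")


print(is_supported_translation("en", "ja"))
print(is_supported_translation("fr", "de"))
```

Note that the table lists Norwegian Bokmål to Nynorsk in one direction only, which is why the final check compares the ordered pair.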

Multilingual speech-to-text

Multilingual language packs are ideal when transcribing multiple languages in the same media file or stream with high accuracy. For more information on the supported languages, see Supported Language Packs.

Supported multilingual packs are:

Language Pack	Transcription config
Arabic and English	`{"language": "ar_en"}`
Malay and English	`{"language": "en_ms"}`
Mandarin and English	`{"language": "cmn_en"}`
Mandarin Malay Tamil and English	`{"language": "cmn_en_ms_ta"}`
Spanish and English	`{"language": "es", "domain": "bilingual-en"}`
Tamil and English	`{"language": "en_ta"}`
Tagalog (Filipino) and English	`{"language": "tl"}`

Bilingual (excluding Spanish and English) example:

{
  "type": "transcription",
  "transcription_config": {
    "language": "cmn_en"
  }
}

Bilingual Spanish and English example:

{
  "type": "transcription",
  "transcription_config": {
    "language": "es",
    "domain": "bilingual-en"
  }
}
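
The pack table and the two examples above can be combined into a single lookup. This is an illustrative sketch only: MULTILINGUAL_PACKS and multilingual_request are hypothetical names, and the entries simply mirror the transcription configs listed in the table.

```python
# Transcription config per pack, copied from the table above.
MULTILINGUAL_PACKS = {
    "Arabic and English": {"language": "ar_en"},
    "Malay and English": {"language": "en_ms"},
    "Mandarin and English": {"language": "cmn_en"},
    "Mandarin Malay Tamil and English": {"language": "cmn_en_ms_ta"},
    # Spanish and English is the one pack that also needs a domain.
    "Spanish and English": {"language": "es", "domain": "bilingual-en"},
    "Tamil and English": {"language": "en_ta"},
    "Tagalog (Filipino) and English": {"language": "tl"},
}


def multilingual_request(pack_name: str) -> dict:
    """Hypothetical helper: build a request body for the named pack."""
    try:
        transcription_config = MULTILINGUAL_PACKS[pack_name]
    except KeyError:
        raise ValueError(f"unknown multilingual pack: {pack_name!r}") from None
    # Copy the entry so callers cannot mutate the shared table.
    return {"type": "transcription", "transcription_config": dict(transcription_config)}


print(multilingual_request("Spanish and English"))
```

Keeping the Spanish and English exception in data rather than in branching code means callers do not need to remember which pack requires the domain setting.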

Healthcare transcription

Speechmatics offers domain-specific medical transcription models which provide unparalleled accuracy for medical use cases such as ambient scribes and dictation tools.

These models are kept up to date using officially maintained data sources. This brings significant improvements in recognition of medical terminology such as names of procedures, medications, conditions, and anatomy.

For languages without a medical transcription model, Speechmatics still offers industry-leading accuracy in the healthcare domain when using the general-purpose enhanced operating point.

The medical domain-specific model must be used with the enhanced operating point.

Medical domain example:

{
  "type": "transcription",
  "transcription_config": {
    "language": "en",
    "operating_point": "enhanced",
    "domain": "medical"
  }
}

The medical model is available for the following languages:

Language	Realtime	Batch
Arabic English	Available	Available
Danish	Available	Available
Dutch	Available	Available
English	Available	Available
Finnish	Available	Available
French	Available	Available
German	Available	Available
Norwegian	Available	Available
Spanish	Available	Available
Swedish	Available	Available
Additional languages	Contact us for more information
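
The rules in this section (medical domain only with the enhanced operating point, general-purpose enhanced otherwise) can be sketched as a small helper. The name healthcare_transcription_config is hypothetical, and mapping the language names in the table above to codes is an assumption, in particular reading "Arabic English" as the ar_en pack.

```python
# Assumed codes for the languages listed in the availability table above.
MEDICAL_LANGUAGES = frozenset(
    {"ar_en", "da", "nl", "en", "fi", "fr", "de", "no", "es", "sv"}
)


def healthcare_transcription_config(language: str) -> dict:
    """Hypothetical helper: pick a transcription_config for healthcare audio."""
    # The medical domain requires the enhanced operating point, and enhanced is
    # also the suggested fallback for languages without a medical model.
    config = {"language": language, "operating_point": "enhanced"}
    if language in MEDICAL_LANGUAGES:
        config["domain"] = "medical"
    return config


print(healthcare_transcription_config("en"))
print(healthcare_transcription_config("it"))
```

For a language outside the table, the helper omits the domain rather than raising, which follows the fallback described above.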
