This release provides new, improved language packs for all of Speechmatics' commercially available languages. Two new language packs, Cantonese (yue) and Indonesian (id), are released. This release improves existing punctuation in several of these languages, as well as existing custom dictionary features. Common concepts in 11 words called entities are now output in a consistent and predictable fashion. Additional data about these entities can be requested via the API. Transcript segments can flex in length when an entity is detected to ensure accurate output, but fixed behaviour can also now be requested using a new API parameter.
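As a rough illustration of how the two API options could be set together, the sketch below builds a session-start message as a plain Python dict. The parameter names `enable_entities` and `max_delay_mode`, and the values `flexible` and `fixed`, are taken from these notes. The message name `StartRecognition` and the nesting under `transcription_config` are assumptions made for the example and should be checked against the API reference.

```python
import json


def build_start_recognition(language, fixed_delay=False, entities=True):
    """Build an illustrative session-start message as a plain dict.

    Assumption: the options live under a "transcription_config" object
    inside a "StartRecognition" message. Only the option names and values
    themselves come from the release notes.
    """
    transcription_config = {
        "language": language,
        # Request additional entity data in the output.
        "enable_entities": entities,
        # "flexible" is the default in this release; "fixed" keeps the
        # previous max delay behaviour.
        "max_delay_mode": "fixed" if fixed_delay else "flexible",
    }
    return {
        "message": "StartRecognition",  # assumed message name
        "transcription_config": transcription_config,
    }


if __name__ == "__main__":
    print(json.dumps(build_start_recognition("en", fixed_delay=True), indent=2))
```

Passing `fixed_delay=True` is only needed by clients that rely on the previous max delay behaviour; otherwise the default can be left in place.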
It is now necessary to use processors that support Advanced Vector Extensions 2 (AVX2) when running the container in order to take advantage of the latest performance optimisations.
When using the enhanced model, it is also recommended to use hardware that supports the AVX512_VNNI flag for optimal processing performance. For more information, please see the quick start guide.
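One way to check a Linux host for these CPU features before deploying is to inspect the `flags` line of `/proc/cpuinfo`, where the features appear as `avx2` and `avx512_vnni`. The following is a minimal sketch, not an official tool; the function names are hypothetical, and it assumes an x86 Linux host.

```python
import os


def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def check_cpu(cpuinfo_text):
    """Report whether the AVX2 and AVX512_VNNI flags are present."""
    flags = cpu_flags(cpuinfo_text)
    return {
        "avx2": "avx2" in flags,
        "avx512_vnni": "avx512_vnni" in flags,
    }


if __name__ == "__main__" and os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as handle:
        result = check_cpu(handle.read())
    print("AVX2 supported:", result["avx2"])
    print("AVX512_VNNI supported (recommended for the enhanced model):",
          result["avx512_vnni"])
```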
Additional data about entities can be requested using the enable_entities config parameter. For more information, please see our documentation for entities.
max_delay_mode defaults to flexible, which introduces a change in max delay behaviour to improve the accuracy of entities. To maintain the previous behaviour, set max_delay_mode to fixed.
The following are known issues in this release:
Issue ID | Summary | Detailed Description and Possible Workarounds |
---|---|---|
REQ-10634 | Putting "-" as an item in additional vocab configuration will cause the container to fail | Do not enter a "-" on its own in Custom Dictionary, either as an additional vocab item or in the sounds_like property. Hyphens are still supported when entered as part of phrases or words |
REQ-13240 | Chinese (cmn) container crashes occasionally when using certain additional vocabulary | Do not use whitespace characters in additional vocabulary sounds_like |
REQ-16256 | Swapping audio between 8kHz and 16kHz causes a memory leak | Repeatedly swapping between 8kHz and 16kHz audio files can increase memory usage over very long periods, eventually causing the container to crash. If memory usage in this scenario becomes excessive, it is recommended to restart the container |
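The workarounds for REQ-10634 and REQ-13240 can be applied on the client side by checking the custom dictionary before it is sent. The sketch below assumes each additional vocab item is a dict with a `content` string and an optional `sounds_like` list; that shape and the function name are assumptions for illustration, so adapt them to the configuration actually in use.

```python
def find_vocab_problems(additional_vocab, language):
    """Return human-readable problems found in an additional vocab list.

    Assumed item shape: {"content": str, "sounds_like": [str, ...]}.
    """
    problems = []
    for item in additional_vocab:
        content = item.get("content", "")
        # REQ-10634: a lone "-" as a vocab item makes the container fail.
        if content.strip() == "-":
            problems.append(f"content is a lone hyphen: {content!r}")
        for alternative in item.get("sounds_like", []):
            # REQ-10634 also applies to sounds_like entries.
            if alternative.strip() == "-":
                problems.append(f"sounds_like is a lone hyphen for {content!r}")
            # REQ-13240: whitespace in sounds_like can crash the cmn container.
            elif language == "cmn" and any(ch.isspace() for ch in alternative):
                problems.append(
                    f"sounds_like contains whitespace for {content!r}: {alternative!r}"
                )
    return problems
```

Hyphens inside words or phrases, such as `co-op`, pass the check, in line with the note that they remain supported.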
The following is a list of any resolved issues within this release:
Issue ID | Summary | Resolution Description |
---|---|---|
REQ-17771 | Wide-space Unicode characters in Custom Dictionary cause jobs to fail | This is now fixed and wide-space characters should be accepted |
These are the General Availability (GA) release notes for the Real-time ASR container images. The following languages are supported:
Language | ISO Code |
---|---|
Arabic | ar |
Bulgarian | bg |
Catalan | ca |
Mandarin | cmn |
Czech | cs |
Danish | da |
German | de |
Greek | el |
Global English | en |
Global Spanish | es |
Finnish | fi |
French | fr |
Hindi | hi |
Croatian | hr |
Hungarian | hu |
Indonesian | id |
Italian | it |
Japanese | ja |
Korean | ko |
Lithuanian | lt |
Latvian | lv |
Malay | ms |
Dutch | nl |
Norwegian | no |
Polish | pl |
Portuguese | pt |
Romanian | ro |
Russian | ru |
Slovakian | sk |
Slovenian | sl |
Swedish | sv |
Turkish | tr |
Cantonese | yue |
Container images are labelled using the following scheme, where language codes adhere to the ISO 639 standard:
rt-asr-transcriber-<language>:<version>
For example,
rt-asr-transcriber-en:2.0.0
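The labelling scheme can be generated programmatically, for example when scripting deployments for several languages. This is a small sketch: the helper name is hypothetical, and the set of codes is copied from the supported languages table above.

```python
# ISO codes from the supported languages table in these release notes.
SUPPORTED_LANGUAGES = {
    "ar", "bg", "ca", "cmn", "cs", "da", "de", "el", "en", "es", "fi",
    "fr", "hi", "hr", "hu", "id", "it", "ja", "ko", "lt", "lv", "ms",
    "nl", "no", "pl", "pt", "ro", "ru", "sk", "sl", "sv", "tr", "yue",
}


def image_name(language, version):
    """Return the image label rt-asr-transcriber-<language>:<version>."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    return f"rt-asr-transcriber-{language}:{version}"


if __name__ == "__main__":
    print(image_name("en", "2.0.0"))
```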
Docker 17.06.0+
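To confirm that a host meets the Docker 17.06.0+ requirement, the version string printed by `docker --version` can be compared against the minimum. The sketch below parses only the text; the function names are hypothetical, and it assumes the usual `Docker version X.Y.Z, build ...` output format.

```python
import re

MINIMUM_DOCKER = (17, 6, 0)


def parse_docker_version(output):
    """Extract (major, minor, patch) from `docker --version` output."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"no version number found in: {output!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(output, minimum=MINIMUM_DOCKER):
    """Return True if the reported Docker version is at least `minimum`."""
    return parse_docker_version(output) >= minimum
```

The input string could be obtained with, for instance, `subprocess.run(["docker", "--version"], capture_output=True, text=True).stdout` on a machine where Docker is installed.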
Pull the container image from the Speechmatics Docker registry.