Batch SaaS Release Notes

This page documents updates to Batch SaaS including information about new or updated features, bug fixes, known issues, and deprecated functionality.

2024-08-12

New

New languages: Irish (ga) and Maltese (mt)

Improvements

Ursa2 models released, giving a broad accuracy uplift across languages:

Enhanced operating point: all languages, including a major improvement for Arabic dialects
Standard operating point: Basque (eu), Estonian (et), Polish (pl), Swedish (sv), Tamil (ta), Turkish (tr), Uyghur (ug)

2024-07-31

Fixes

Fix for occasional incorrect repetition of words in transcription output

2024-07-24

Removed

The legacy Speaker Change Detection feature is now obsolete. Any jobs using the speaker_change and channel_and_speaker_change parameters will be rejected
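A pre-submission check can catch job configs that still ask for the removed feature. The sketch below is illustrative only: it assumes the two names appear as values of a `diarization` field inside `transcription_config`, which these notes do not state, and the helper name is hypothetical.

```python
# Settings named as removed in the note above.
REMOVED_DIARIZATION_MODES = {"speaker_change", "channel_and_speaker_change"}


def uses_removed_speaker_change(job_config: dict) -> bool:
    """Return True if the config requests a removed Speaker Change mode.

    Assumes the mode is set at transcription_config.diarization; that
    location is an assumption of this sketch.
    """
    transcription = job_config.get("transcription_config", {})
    return transcription.get("diarization") in REMOVED_DIARIZATION_MODES


# Hypothetical example configs.
legacy = {
    "type": "transcription",
    "transcription_config": {"language": "en", "diarization": "speaker_change"},
}
current = {
    "type": "transcription",
    "transcription_config": {"language": "en", "diarization": "speaker"},
}

print(uses_removed_speaker_change(legacy))   # True
print(uses_removed_speaker_change(current))  # False
```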

Fixes

Written form for negative percentages in German transcription is now output as "%" instead of "Prozent"

2024-07-01

Improvements

Improved transcription of music lyrics and a decrease in missing spoken words in transcripts

2024-06-25

Improvements

Initial improvements from our Ursa2 accuracy uplift; note that further improvements are on the way in the next few weeks
- Improved transcription accuracy and updated vocabulary for 31 languages (Enhanced Operating Point only): Bashkir (ba), Basque (eu), Belarusian (be), Bulgarian (bg), Cantonese (yue), Catalan (ca), Danish (da), Esperanto (eo), Estonian (et), Finnish (fi), French (fr), Galician (gl), Greek (el), Hindi (hi), Indonesian (id), Interlingua (ia), Japanese (ja), Korean (ko), Latvian (lv), Malay (ms), Marathi (mr), Mongolian (mn), Norwegian (no), Romanian (ro), Slovenian (sl), Spanish (es), Swedish (sv), Turkish (tr), Ukrainian (uk), Uyghur (ug), Vietnamese (vi)
- Updated vocabulary for English (Enhanced Operating Point only)
Improved music detection accuracy in Audio Events

Fixes

Fix for occasional incorrect repetition of words in transcription output when Audio Events is enabled
Fix for missing volume tags in output when Audio Filtering is enabled along with Speaker Diarization

2024-06-18

New

Audio Filtering: pre-process audio to remove low-volume background speech which might otherwise be detected and transcribed. Refer to documentation here to get started
Disfluency removal: automatically remove disfluencies from your transcript. Refer to documentation here to get started

2024-05-01

New

Chapters now supports all 50 languages supported by our Batch SaaS transcription API

Improvements

Chapters are now processed in the same geographic region as the Batch SaaS endpoint where the job was submitted (EU or USA)

2024-04-19

New

Audio Events: Detection of music, laughter and applause in media files now supported. Refer to documentation here to get started

Improvements

Accuracy improvements for Romanian (ro)

Fixes

Fix for issue affecting recognition of English words ending in 'erm'
Fix to address accuracy degradations observed in a small number of transcriptions

2024-02-20

New

New transcription language: Hebrew (he)
Automatic Language ID support for Hebrew
Increased language support for Summaries: all 50 languages supported by our Batch SaaS transcription API (including Hebrew) can now be summarized

Improvements

Improved Summaries: in addition to overall quality improvements, our Summaries now capture considerably more factual data such as names, emails, phone numbers, addresses, dates and prices
Improved accuracy when transcribing audio with periods of silence

Fixes

Fix for profanity tagging in bilingual Spanish & English
Fixes for specific transcription accuracy issues in English, German, Swedish and Norwegian

2023 Release notes

2023-12-20

Spanish & English bilingual transcription now available (language='es' with domain='bilingual-en'). Ideal when transcribing Spanish and English in the same media file or stream. Supports all accents and dialects listed under English and Spanish. Requires the domain config to be set.
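As a rough illustration, a job config for bilingual transcription might be built as below. The `language` and `domain` values come from the note above; the surrounding `type` and `transcription_config` wrapper fields and the function name are assumptions of this sketch.

```python
import json


def bilingual_es_en_config() -> dict:
    """Build a job config for Spanish & English bilingual transcription."""
    return {
        "type": "transcription",       # assumed wrapper field
        "transcription_config": {      # assumed wrapper object
            "language": "es",          # from the release note
            "domain": "bilingual-en",  # from the release note; must be set
        },
    }


print(json.dumps(bilingual_es_en_config(), indent=2))
```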

2023-12-07

Improved models for English transcription (Standard and Enhanced operating points):
- Enhanced transcription of disfluencies in English. The model now more accurately captures common disfluencies like "um" and "uh". This change makes our ASR even more accurate for verbatim transcription, great for use cases such as audio editing, analytics on hesitations for call centers and legal transcription. For details on how to identify disfluencies in output, see the documentation here
- More accurate transcription of short utterances of the word "I" in English
- More accurate transcription of acronyms in English
Channel Diarization now supports up to 100 separate input channels

2023-11-02

Introducing Chapters, the newest addition to our Speech Capabilities. As part of a transcription request you can now request automatic detection of chapters. For each chapter, we publish the start and end times, an auto-generated title and a short summary. Refer to documentation here to get started

2023-10-24

New transcription language: Persian (fa)
Automatic Language ID support for Persian

2023-10-20

Ursa models for Bashkir (ba), Basque (eu), Belarusian (be), Bulgarian (bg), Croatian (hr), Esperanto (eo), Estonian (et), Galician (gl), Interlingua (ia), Indonesian (id), Latvian (lv), Lithuanian (lt), Marathi (mr), Mongolian (mn), Romanian (ro), Slovakian (sk), Slovenian (sl), Tamil (ta), Turkish (tr), Uyghur (ug), Ukrainian (uk), Welsh (cy). Ursa models are now available for all 48 supported languages, bringing improvements to both Standard and Enhanced operating points:
- Major transcription accuracy gains
- Major improvement in Speaker Diarization accuracy
- Faster transcription

2023-10-19

Ursa models for Cantonese (yue), Catalan (ca), Czech (cs), Finnish (fi), Greek (el), Hindi (hi), Hungarian (hu), Malay (ms), Polish (pl), Russian (ru), Thai (th), Vietnamese (vi), bringing improvements to both Standard and Enhanced operating points:
- Major transcription accuracy gains
- Major improvement in Speaker Diarization accuracy
- Faster transcription

2023-10-16

Ursa models for Arabic (ar), Danish (da), Dutch (nl), Italian (it), Japanese (ja), Korean (ko), Mandarin (cmn), Norwegian (no), Portuguese (pt), Swedish (sv), bringing improvements to both Standard and Enhanced operating points:
- Major transcription accuracy gains
- Major improvement in Speaker Diarization accuracy
- Faster transcription

2023-10-12

Introducing Topics, which extends our API for speech understanding. As part of a transcription request you can now get insights into the topics discussed in your media, or even supply your own list of topics to detect. Refer to documentation here to get started
Improved English transcription accuracy around capitalization
Transcription accuracy improvements for German (including Swiss and Austrian) and French (including French Canadian)

2023-10-11

New configuration options for Automatic Language Identification for more flexible handling of errors. When submitting transcription jobs you can now
- Use a default language to transcribe with if there is a LOW_CONFIDENCE or NO_SPEECH error
- Ignore LOW_CONFIDENCE errors and pick the top predicted language to transcribe with
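The two options above could be expressed in a job config along these lines. This is a sketch under stated assumptions: the field names `language_identification_config`, `low_confidence_action` and `default_language`, the action values, and the use of `"auto"` as the language are not given in these notes and should be checked against the Language Identification documentation.

```python
from typing import Optional

# Action names are assumptions of this sketch, not taken from the notes.
ACTIONS = {"reject", "allow", "use_default_language"}


def language_id_job_config(action: str, default_language: Optional[str] = None) -> dict:
    """Build a job config that sets how Language ID errors are handled."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    lid_config = {"low_confidence_action": action}
    if action == "use_default_language":
        # A fallback only makes sense if a language is supplied.
        if not default_language:
            raise ValueError("use_default_language needs a default_language")
        lid_config["default_language"] = default_language
    return {
        "type": "transcription",
        "transcription_config": {"language": "auto"},  # assumed value
        "language_identification_config": lid_config,
    }


# Fall back to English on a LOW_CONFIDENCE or NO_SPEECH error.
fallback = language_id_job_config("use_default_language", "en")
# Ignore LOW_CONFIDENCE and use the top predicted language.
top_prediction = language_id_job_config("allow")
print(fallback["language_identification_config"])
print(top_prediction["language_identification_config"])
```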

2023-09-20

Improved accuracy for translation of short sentences
Updated the translation failure behaviour to enable the transcription job to complete. Refer to documentation here for the new behaviour
Resolved an issue with transcription accuracy for a small number of files when using the Enhanced Operating Point

2023-09-07

Improved accuracy for Automatic Language Identification

2023-08-29

Major accuracy gains for French and German transcription (Standard and Enhanced operating points)
Major improvement in Speaker Diarization accuracy for French and German (Standard and Enhanced operating points)
Faster transcription for French and German (Standard and Enhanced operating points)

2023-08-01

Introducing Sentiment, which extends our API for speech understanding. As part of a transcription request you can now get insight into the sentiment of what was spoken in your media, providing an understanding of whether the expressed sentiment is positive, negative, or neutral. Refer to documentation here to get started
The JSON-v2 output version is now 2.9 for all jobs

2023-07-12

Improved speaker diarization accuracy for noisy audio (English and Spanish only, Standard and Enhanced operating points)
Faster transcription for audio files with duration over 5 minutes

2023-06-19

Fix for transcribed words returned during non-speech audio when custom dictionary is used

2023-06-05

Introducing Summaries. As part of your transcription request you can now generate a concise summary of your audio, including for meetings, calls, lectures, interviews, and podcasts. Refer to documentation here to get started

2023-05-26

Major improvement in Speaker Diarization accuracy for English and Spanish (Standard and Enhanced operating points)
Major accuracy gains for Spanish transcription (Standard and Enhanced operating points)
Faster transcription for Spanish (Standard and Enhanced operating points)
Improved transcription accuracy for Basque, Belarusian, Estonian, Mongolian, Thai, Vietnamese, and Welsh
Improvements to capitalization for English transcription
Fix for zero-duration word timings

2023-05-11

Resolved an issue where decades from "twenties" to "nineties" could be incorrectly transcribed in some contexts
Fix for mismatch between word timings for spoken and written forms when entity metadata is enabled

2023-03-02

Major accuracy gains for English transcription (Standard and Enhanced operating points)
Improved Speaker Diarization accuracy for English (Standard and Enhanced operating points)
Improved numeral formatting in English
- Improved formatting for common telephone numbers, measurements, websites, email addresses and credit cards
- Alphanumerics now have upper-case letters
- Added regional handling for en-AU and en-US output locale to keep 'pounds' as words
- A number of other improvements and fixes for better readability
Faster transcription for English using the Standard Operating Point
Resolved an issue where words would occasionally be fully upper-cased

2023-02-28

Introducing Automatic Language Identification, enabling you to automatically transcribe your audio in the correct language. Refer to documentation here to get started

2023-02-07

Fix for missing accented characters in Dutch transcription

2023-01-31

Introducing the new Translation API, tightly integrated with the Transcription API. Translate your audio to one or more languages through a single API call. Refer to documentation here to get started
Translation will be offered at no additional cost until 30th April 2023.
Translate speech to and from English for 34 languages
Translate from Norwegian Bokmål to Nynorsk
The JSON-v2 output version is now 2.9 for jobs with translation enabled

2022 Release notes

2022-11-30

New Batch SaaS environment AU1 (au1.asr.api.speechmatics.com) hosted in Australia. Refer to documentation here for details of all supported endpoints and regions
New egress IP addresses to allow notifications for new AU1 environment. Refer to documentation here for more details

2022-11-09

Language vocabulary improvements for French (fr), Italian (it), Hindi (hi), and Korean (ko)
Remodelled German (de) language pack to utilize subwords, separating words into smaller segments to reduce word error rate

2022-09-28

Language vocabulary improvements for Latvian (lv), Swedish (sv), Hungarian (hu), Portuguese (pt), Polish (pl), Mandarin Chinese (cmn), Arabic (ar), Dutch (nl), Slovak (sk), Bulgarian (bg), Romanian (ro), Slovenian (sl), Lithuanian (lt), Croatian (hr), Malay (ms), Catalan (ca), Czech (cs), Danish (da), Greek (el), Turkish (tr)
Improved formatting of numeric entities such as dates, currencies and large numbers for Swedish (sv), Norwegian (no), and Dutch (nl)
The JSON-v2 output version is now 2.8, specific changes are:
- Additional language pack information has been added to the metadata section of the transcription results. There is now more detailed information about properties of the language being used, such as writing direction and word delimiter.
- We now also record the correct attachment direction for punctuation (e.g. before or after a space) in a new attaches_to field.
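One use of the attachment direction is rebuilding readable text from the result items. The following is a minimal sketch: the item layout (`alternatives[0].content`), the `attaches_to` values (`previous`, `next`, `both`, `none`) and the sample data are assumptions for illustration, and the delimiter would normally come from the language pack information in the metadata.

```python
def join_tokens(results: list, delimiter: str = " ") -> str:
    """Join result items into text, honouring each item's attaches_to."""
    text = ""
    glue_next = False  # True when the previous item attaches to what follows
    for item in results:
        content = item["alternatives"][0]["content"]
        attaches_to = item.get("attaches_to", "none")
        glue_prev = attaches_to in ("previous", "both")
        # Insert the word delimiter unless either neighbour asks to be attached.
        if text and not glue_prev and not glue_next:
            text += delimiter
        text += content
        glue_next = attaches_to in ("next", "both")
    return text


# Invented sample data, not real API output.
sample = [
    {"type": "word", "alternatives": [{"content": "Hello"}]},
    {"type": "punctuation", "attaches_to": "previous", "alternatives": [{"content": ","}]},
    {"type": "word", "alternatives": [{"content": "world"}]},
    {"type": "punctuation", "attaches_to": "previous", "alternatives": [{"content": "."}]},
]
print(join_tokens(sample))  # Hello, world.
```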

2022-09-05

14 new languages: Bashkir, Basque, Belarusian, Esperanto, Estonian, Galician, Interlingua, Marathi, Mongolian, Tamil, Thai, Uyghur, Vietnamese, and Welsh
Resolved an issue where the French word où (where) was recognised as ou (or)

2022-08-11

New SaaS environments EU2 (eu2.asr.api.speechmatics.com) and US2 (us2.asr.api.speechmatics.com) available in EU and US regions respectively. Refer to documentation here for more details
New egress IP addresses to allow notifications for new EU2 and US2 environments. Refer to documentation here for more details

2022-06-21

New English finance domain language pack. Provides accuracy improvements when specific financial jargon is spoken in your audio. Refer to documentation here for more details
16 Languages updated with additional punctuation marks for improved readability
- The following languages now support (. ? , !): Bulgarian, Catalan, Czech, Greek, Finnish, Croatian, Hungarian, Lithuanian, Latvian, Norwegian, Polish, Romanian, Slovak, Slovenian, Ukrainian, Korean
Improved accuracy for French, including more data for Canadian French (fr-ca)
Improved accuracy for Portuguese, including more data for Brazilian Portuguese (pt-br)
Improved Standard Operating Point accuracy for Romanian, Hungarian, Danish, Slovakian, Croatian, Bulgarian, Finnish, Slovenian, Lithuanian
Updated Danish, Norwegian and Swedish to remove undesired character sets
Improved accuracy in localised spelling for English output locale feature
Fixes for English and Italian written form numeric entities
Improved accuracy of percentage symbol recognition in French

2022-05-25

New parameter added for controlling Speaker Diarization sensitivity: speaker_sensitivity. Refer to documentation here for more details
New Ukrainian (uk) language pack
Resolved an issue where a small number of files with multiple audio channels were mistakenly detected as containing inverted audio, which led to no transcription being returned. The check for inverted audio is now more robust.

2022-03-24

Resolved an issue where Profanity and Disfluency Tagging were not output when Speaker Diarization was enabled

2022-03-16

Improved accuracy for all 31 language packs. Gains apply to both Standard and Enhanced operating points
- Biggest gains: Danish, Dutch, Norwegian, Lithuanian and Turkish
New Cantonese (yue) and Indonesian (id) language packs
Improved formatting of numeric entities such as dates, currencies and large numbers for 11 languages (cmn, de, en, es, fr, hi, it, ja, pt, ru, yue). Additional metadata about these entities can be requested by using the new enable_entities config parameter. For more information please see documentation here.
Improvements to Speaker Diarization in scenarios where two speakers were labelled although only a single speaker was present
Improvements to custom dictionary functionality. Custom dictionary entries should now produce fewer false positives
Languages updated with additional punctuation marks
- Japanese (。、)
- Italian (. ? , !)
- Portuguese (. ? , !)
- Russian (. ? , !)
- Mandarin (。？！、)
- Hindi (। ? , !)
The JSON-v2 output version is now 2.7
Non-breaking spaces are now possible in a single word
Speaker Diarization sensitivity parameters (previously deprecated in March 2021) are now removed from the API
- Jobs will now be rejected if these parameters are included in the job config
- This includes speaker_diarization_params, new_speaker_sensitivity, segment_boundary_sensitivity
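Because jobs carrying these parameters are rejected, it can help to scan stored configs before submitting them. The sketch below walks a config recursively, since these notes do not say where in the config the parameters used to live; the function name and the sample config are hypothetical.

```python
# Parameter names listed as removed in the note above.
REMOVED_KEYS = {
    "speaker_diarization_params",
    "new_speaker_sensitivity",
    "segment_boundary_sensitivity",
}


def find_removed_keys(node, path: str = "") -> list:
    """Return dotted paths of every removed parameter found in the config."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            here = f"{path}.{key}" if path else key
            if key in REMOVED_KEYS:
                found.append(here)
            found.extend(find_removed_keys(value, here))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            found.extend(find_removed_keys(value, f"{path}[{index}]"))
    return found


# Hypothetical legacy config; the nesting is invented for the example.
old_config = {
    "type": "transcription",
    "transcription_config": {
        "language": "en",
        "diarization": "speaker",
        "speaker_diarization_params": {"new_speaker_sensitivity": 0.6},
    },
}
for hit in find_removed_keys(old_config):
    print("removed parameter:", hit)
```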

2021 Release notes

2021-12-13

New resource allowing you to retrieve details of your SaaS usage. For more information please see documentation here.
Option to cancel and delete a running job
Updated IP address allowlist

2021-09-07

Enhanced model available for all 31 language packs
- Please contact your account manager if you would like access to the Enhanced model
General improvements in pop culture terms recognition for the English language pack
Removal of foreign characters from English and German language packs

2021-08-23

New language packs for all 31 languages. By default, each language pack will contain a Standard and an Enhanced model. The Standard model is now available to use, with no user change required. The Enhanced model will be released in September. Please see the API how-to guide for how to request the Enhanced model to prepare your integration in advance
Profanity tagging in Italian and Spanish
The Chinese Mandarin language pack now supports Traditional as well as Simplified Mandarin. Please see 'Configuring the Job Request' for guidelines on how to do so.

2021-08-10

Error information added in API response for Fetch URL and Notification failures

2021-03-24

Improved Speaker Diarization
- Speaker Diarization has been completely re-designed internally and should now be significantly more accurate
- Instead of gendered speaker labels (M1, F2), speaker labels will now be S1, S2 etc. in the json-v2 and txt output. Speaker gender identification is no longer a supported feature
- If you request output in txt format without diarization, there will be no Speaker:UU at the start of the transcript
- Users may still request Speaker Diarization as before via the configuration object
- Beta sensitivity parameters will be removed. The parameters will remain within the API but will not have any effect
- This update to the Speaker Diarization feature may in some cases increase the turnaround time for your transcript
Improved Swedish and Arabic language packs, both now have advanced punctuation enabled (Swedish supports . ? , ! and Arabic supports . ؟ ، !)
For the English language pack only, a new tag, [disfluency] has been added to a pre-set list of words that imply hesitation or interjection in the JSON-v2 output only. Examples include 'hmm' and 'umm'. Customers may use this tag to carry out their own post-processing
The json-v2 API schema has been updated to v2.6
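The post-processing mentioned for the [disfluency] tag could look like the sketch below. It assumes each result item carries a `tags` list on its first alternative and that the tag appears there as the string `disfluency`; both the layout and the sample data are assumptions for illustration, not taken from these notes.

```python
def strip_disfluencies(results: list) -> list:
    """Return the result items whose first alternative is not tagged as a disfluency."""
    kept = []
    for item in results:
        tags = item["alternatives"][0].get("tags", [])
        if "disfluency" not in tags:
            kept.append(item)
    return kept


# Invented sample data, not real API output.
sample_results = [
    {"type": "word", "alternatives": [{"content": "I", "tags": []}]},
    {"type": "word", "alternatives": [{"content": "umm", "tags": ["disfluency"]}]},
    {"type": "word", "alternatives": [{"content": "agree"}]},
]
cleaned = strip_disfluencies(sample_results)
print(" ".join(item["alternatives"][0]["content"] for item in cleaned))  # I agree
```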

Known Issues

Issue ID	Summary	Detailed Description and Possible Workarounds
REQ-20261	The Japanese language pack may output fewer punctuation marks in certain scenarios	In some cases, users may see a decreased output in punctuation marks when transcribing in Japanese. Please report this if this is the case
