Batch SaaS Release Notes

This page documents updates to Batch SaaS including information about new or updated features, bug fixes, known issues, and deprecated functionality.

2024-03-28

  • Audio Events: Detection of music, laughter and applause in media files now supported. Refer to documentation here to get started

2024-02-20

  • New transcription language: Hebrew (he)
  • Automatic Language ID support for Hebrew
  • Increased language support for Summaries: Now Summarizing all 50 languages supported by our Batch SaaS transcription API (including Hebrew)
  • Improved Summaries: In addition to overall quality improvements, our Summaries now capture a lot more factual data such as names, emails, phone numbers, addresses, dates & prices
  • Improved accuracy when transcribing audio with periods of silence
  • Fix for profanity tagging in bilingual Spanish & English
  • Fixes for specific transcription accuracy issues in English, German, Swedish and Norwegian

2023-12-20

  • Spanish & English bilingual transcription now available (language='es' with domain='bilingual-en'). Ideal when transcribing Spanish and English in the same media file or stream. Supports all accents and dialects listed under English and Spanish. Requires the domain config to be set.
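
As a minimal sketch, the bilingual settings above could be expressed in a job config like the following. Only language='es' and domain='bilingual-en' come from this release note; the surrounding "type" and "transcription_config" keys and the helper name are assumptions for illustration and should be checked against the API reference.

```python
import json


def bilingual_job_config():
    """Build a job config for Spanish & English bilingual transcription.

    The outer structure ("type", "transcription_config") is an assumption;
    the language and domain values are taken from the release note.
    """
    return {
        "type": "transcription",
        "transcription_config": {
            "language": "es",          # Spanish language pack
            "domain": "bilingual-en",  # required for bilingual transcription
        },
    }


config = bilingual_job_config()
print(json.dumps(config, indent=2))
```

Keeping the domain inside the same helper as the language makes it harder to submit language='es' without the domain, which per the note above would not enable bilingual transcription.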

2023-12-07

  • Improved models for English transcription (Standard and Enhanced operating points):
    • Enhanced transcription of disfluencies in English. The model now more accurately captures common disfluencies like "um" and "uh". This change makes our ASR even more accurate for verbatim transcription, great for use cases such as audio editing, analytics on hesitations for call centers and legal transcription. For details on how to identify disfluencies in output, see the documentation here
    • More accurate transcription of short utterances of the word "I" in English
    • More accurate transcription of acronyms in English
  • Channel Diarization now supports up to 100 separate input channels

2023-11-02

  • Introducing Chapters, the newest addition to our Speech Capabilities. As part of a transcription request you can now request automatic detection of chapters. For each chapter, we publish the start & end times, an auto-generated title & a short summary. Refer to documentation here to get started

2023-10-24

  • New transcription language: Persian (fa)
  • Automatic Language ID support for Persian

2023-10-20

  • Ursa models released for Bashkir (ba), Basque (eu), Belarusian (be), Bulgarian (bg), Croatian (hr), Esperanto (eo), Estonian (et), Galician (gl), Interlingua (ia), Indonesian (id), Latvian (lv), Lithuanian (lt), Marathi (mr), Mongolian (mn), Romanian (ro), Slovakian (sk), Slovenian (sl), Tamil (ta), Turkish (tr), Uyghur (ug), Ukrainian (uk), Welsh (cy). Ursa models are now available for all 48 supported languages, bringing improvements to both Standard and Enhanced operating points:
    • Major transcription accuracy gains
    • Major improvement in Speaker Diarization accuracy
    • Faster transcription

2023-10-19

  • Ursa models for Cantonese (yue), Catalan (ca), Czech (cs), Finnish (fi), Greek (el), Hindi (hi), Hungarian (hu), Malay (ms), Polish (pl), Russian (ru), Thai (th), Vietnamese (vi), bringing improvements to both Standard and Enhanced operating points:
    • Major transcription accuracy gains
    • Major improvement in Speaker Diarization accuracy
    • Faster transcription

2023-10-16

  • Ursa models for Arabic (ar), Danish (da), Dutch (nl), Italian (it), Japanese (ja), Korean (ko), Mandarin (cmn), Norwegian (no), Portuguese (pt), Swedish (sv), bringing improvements to both Standard and Enhanced operating points:
    • Major transcription accuracy gains
    • Major improvement in Speaker Diarization accuracy
    • Faster transcription

2023-10-12

  • Introducing Topics, which extends our API for speech understanding. As part of a transcription request you can now get insights into the topics discussed in your media, or even supply your own list of topics to detect. Refer to documentation here to get started
  • Improved English transcription accuracy around capitalization
  • Transcription accuracy improvements for German (including Swiss and Austrian) and French (including French Canadian)

2023-10-11

  • New configuration options for Automatic Language Identification for more flexible handling of errors. When submitting transcription jobs you can now
    • Use a default language to transcribe with if there is a LOW_CONFIDENCE or NO_SPEECH error
    • Ignore LOW_CONFIDENCE errors and pick the top predicted language to transcribe with
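
The two options above could be selected in a job config roughly as sketched below. The field names ("language_identification_config", "low_confidence_action", "default_language"), the action values, and the use of "auto" as the language are assumptions made for illustration; confirm them against the Language Identification documentation before relying on them.

```python
def language_id_job_config(default_language=None, ignore_low_confidence=False):
    """Build a job config that uses Automatic Language Identification.

    All key names and action values here are assumed, not taken from the
    release note. Exactly one fallback strategy may be chosen.
    """
    if default_language is not None and ignore_low_confidence:
        raise ValueError("choose either a default language or ignore_low_confidence")

    lid_config = {}
    if default_language is not None:
        # Fall back to a fixed language on LOW_CONFIDENCE or NO_SPEECH.
        lid_config["low_confidence_action"] = "use_default_language"
        lid_config["default_language"] = default_language
    elif ignore_low_confidence:
        # Ignore LOW_CONFIDENCE and transcribe with the top predicted language.
        lid_config["low_confidence_action"] = "allow"

    config = {
        "type": "transcription",
        "transcription_config": {"language": "auto"},
    }
    if lid_config:
        config["language_identification_config"] = lid_config
    return config


fallback_config = language_id_job_config(default_language="en")
permissive_config = language_id_job_config(ignore_low_confidence=True)
print(fallback_config)
print(permissive_config)
```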

2023-09-20

  • Improved accuracy for translation of short sentences
  • Updated the translation failure behaviour to enable the transcription job to complete. Refer to documentation here for the new behaviour
  • Resolved an issue with transcription accuracy for a small number of files when using the Enhanced operating point

2023-09-07

  • Improved accuracy for Automatic Language Identification

2023-08-29

  • Major accuracy gains for French and German transcription (Standard and Enhanced operating points)
  • Major improvement in Speaker Diarization accuracy for French and German (Standard and Enhanced operating points)
  • Faster transcription for French and German (Standard and Enhanced operating points)

2023-08-01

  • Introducing Sentiment, which extends our API for speech understanding. As part of a transcription request you can now get insight into the sentiment of what was spoken in your media, providing an understanding of whether the expressed sentiment is positive, negative, or neutral. Refer to documentation here to get started
  • The JSON-v2 output version is now 2.9 for all jobs

2023-07-12

  • Improved speaker diarization accuracy for noisy audio (English and Spanish only, Standard and Enhanced operating points)
  • Faster transcription for audio files with duration over 5 minutes

2023-06-19

  • Fix for transcribed words returned during non-speech audio when custom dictionary is used

2023-06-05

  • Introducing Summaries. As part of your transcription request you can now generate a concise summary of your audio, including for meetings, calls, lectures, interviews, and podcasts. Refer to documentation here to get started

2023-05-26

  • Major improvement in Speaker Diarization accuracy for English and Spanish (Standard and Enhanced operating points)
  • Major accuracy gains for Spanish transcription (Standard and Enhanced operating points)
  • Faster transcription for Spanish (Standard and Enhanced operating points)
  • Improved transcription accuracy for Basque, Belarusian, Estonian, Mongolian, Thai, Vietnamese, and Welsh
  • Improvements to capitalization for English transcription
  • Fix for zero-duration word timings

2023-05-11

  • Resolved an issue where decades from "twenties" to "nineties" could be incorrectly transcribed in some contexts
  • Fix for mismatch between word timings for spoken and written forms when entity metadata is enabled

2023-03-02

  • Major accuracy gains for English transcription (Standard and Enhanced operating points)
  • Improved Speaker Diarization accuracy for English (Standard and Enhanced operating points)
  • Improved numeral formatting in English
    • Improved formatting for common telephone numbers, measurements, websites, email addresses and credit cards
    • Alphanumerics now have upper-case letters
    • Added regional handling for en-AU and en-US output locale to keep 'pounds' as words
    • A number of other improvements and fixes for better readability
  • Faster transcription for English using the Standard operating point
  • Resolved an issue where words would occasionally be fully upper-cased

2023-02-28

  • Introducing Automatic Language Identification, enabling you to automatically transcribe your audio in the correct language. Refer to documentation here to get started

2023-02-07

  • Fix for missing accented characters in Dutch transcription

2023-01-31

  • Introducing the new Translation API, tightly integrated with the Transcription API. Translate your audio to one or more languages through a single API call. Refer to documentation here to get started
  • Translation will be offered at no additional cost until 30th April 2023.
  • Translate speech to and from English for 34 languages
  • Translate from Norwegian Bokmål to Nynorsk
  • The JSON-v2 output version is now 2.9 for jobs with translation enabled

2022 Release notes

2022-11-30

  • New Batch SaaS environment AU1 (au1.asr.api.speechmatics.com) hosted in Australia. Refer to documentation here for details of all supported endpoints and regions
  • New egress IP addresses to allow notifications for new AU1 environment. Refer to documentation here for more details

2022-11-09

  • Language vocabulary improvements for French (fr), Italian (it), Hindi (hi), and Korean (ko)
  • Remodelled German (de) language pack to utilize subwords, separating words into smaller segments to reduce word error rate

2022-09-28

  • Language vocabulary improvements for Latvian (lv), Swedish (sv), Hungarian (hu), Portuguese (pt), Polish (pl), Mandarin Chinese (cmn), Arabic (ar), Dutch (nl), Slovak (sk), Bulgarian (bg), Romanian (ro), Slovenian (sl), Lithuanian (lt), Croatian (hr), Malay (ms), Catalan (ca), Czech (cs), Danish (da), Greek (el), Turkish (tr)
  • Improved formatting of numeric entities such as dates, currencies and large numbers for Swedish (sv), Norwegian (no), and Dutch (nl)
  • The JSON-v2 output version is now 2.8, specific changes are:
    • Additional language pack information has been added to the metadata section of the transcription results. There is now more detailed information about properties of the language being used, such as writing direction and word delimiter.
    • We now also record the correct attachment direction for punctuation (e.g. before or after a space) in a new attaches_to field.
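
To illustrate how an attachment direction and a word delimiter might be used when rebuilding text from transcript items, here is a minimal sketch. The simplified token shape and the attaches_to values ("previous", "next", "both", "none") are assumptions for the example; only the attaches_to field name and the existence of a word delimiter property come from the note above.

```python
def join_tokens(tokens, word_delimiter=" "):
    """Rebuild text from tokens, honouring each token's attachment direction.

    `tokens` is a hypothetical simplified list of dicts with a "content" key
    and an optional "attaches_to" key.
    """
    text = ""
    glue_next = False  # True when the previous token attaches to the next one
    for token in tokens:
        attaches_to = token.get("attaches_to", "none")
        glue_prev = attaches_to in ("previous", "both")
        # Insert the delimiter only when neither neighbour asks to be joined.
        if text and not glue_prev and not glue_next:
            text += word_delimiter
        text += token["content"]
        glue_next = attaches_to in ("next", "both")
    return text


tokens = [
    {"content": "Hello"},
    {"content": ",", "attaches_to": "previous"},
    {"content": "world"},
    {"content": "!", "attaches_to": "previous"},
]
print(join_tokens(tokens))  # Hello, world!
```

Passing the delimiter as a parameter lets the same function serve languages written without spaces, where the delimiter would be an empty string.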

2022-09-05

  • 14 new languages: Bashkir, Basque, Belarusian, Esperanto, Estonian, Galician, Interlingua, Marathi, Mongolian, Tamil, Thai, Uyghur, Vietnamese, and Welsh
  • Resolved an issue where the French word où (where) was recognised as ou (or)

2022-08-11

  • New SaaS environments EU2 (eu2.asr.api.speechmatics.com) and US2 (us2.asr.api.speechmatics.com) available in EU and US regions respectively. Refer to documentation here for more details
  • New egress IP addresses to allow notifications for new EU2 and US2 environments. Refer to documentation here for more details

2022-06-21

  • New English finance domain language pack. Provides accuracy improvements when specific financial jargon is spoken in your audio. Refer to documentation here for more details
  • 16 Languages updated with additional punctuation marks for improved readability
    • The following languages now support (. ? , !): Bulgarian, Catalan, Czech, Greek, Finnish, Croatian, Hungarian, Lithuanian, Latvian, Norwegian, Polish, Romanian, Slovak, Slovenian, Ukrainian, Korean
  • Improved accuracy for French, including more data for Canadian French (fr-ca)
  • Improved accuracy for Portuguese, including more data for Brazilian Portuguese (pt-br)
  • Standard operating point improved accuracy for Romanian, Hungarian, Danish, Slovakian, Croatian, Bulgarian, Finnish, Slovenian, Lithuanian
  • Updated Danish, Norwegian and Swedish to remove undesired character sets
  • Improved accuracy in localised spelling for English output locale feature
  • Fixes for English and Italian written form numeric entities
  • Improved accuracy of percentage symbol recognition in French

2022-05-25

  • New parameter added for controlling Speaker Diarization sensitivity: speaker_sensitivity. Refer to our documentation here for more details
  • New Ukrainian (uk) language pack
  • Resolved an issue where a small number of files with multiple audio channels were mistakenly detected as containing inverted audio, which led to no transcription being returned. The check for inverted audio is now more robust.
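
For the speaker_sensitivity parameter introduced in this release, a job config might look like the sketch below. Only the parameter name comes from the note; its placement under a "speaker_diarization_config" key, the "diarization": "speaker" setting, and the 0 to 1 range check are assumptions to verify against the documentation.

```python
def speaker_sensitivity_config(language, speaker_sensitivity):
    """Build a job config that sets Speaker Diarization sensitivity.

    The nesting and the accepted range (0.0 to 1.0) are assumed here.
    """
    if not 0.0 <= speaker_sensitivity <= 1.0:
        raise ValueError("speaker_sensitivity is assumed to lie between 0 and 1")
    return {
        "type": "transcription",
        "transcription_config": {
            "language": language,
            "diarization": "speaker",
            "speaker_diarization_config": {
                "speaker_sensitivity": speaker_sensitivity,
            },
        },
    }


sensitivity_config = speaker_sensitivity_config("en", 0.7)
print(sensitivity_config)
```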

2022-03-24

  • Resolved an issue where Profanity and Disfluency Tagging were not output when Speaker Diarization was enabled

2022-03-16

  • Improved accuracy for all 31 language packs. Gains will be for both Standard and Enhanced operating points
    • Biggest gains: Danish, Dutch, Norwegian, Lithuanian and Turkish
  • New Cantonese (yue) and Indonesian (id) language packs
  • Improved formatting of numeric entities such as dates, currencies and large numbers for 10 languages (cmn, de, en, es, fr, hi, it, ja, pt, ru, yue). Additional metadata about these entities can be requested by using the new enable_entities config parameter. For more information please see documentation here.
  • Improvements to Speaker Diarization functionality in scenarios where two speakers are labelled when it is only a single speaker
  • Improvements to custom dictionary functionality. Custom dictionary entries should now produce fewer false positives
  • Languages updated with additional punctuation marks
    • Japanese (。 、)
    • Italian (. ? , !)
    • Portuguese (. ? , !)
    • Russian (. ? , !)
    • Mandarin (。 ? ! 、)
    • Hindi (। ? , !)
  • The JSON-v2 output version is now 2.7
  • Non-breaking spaces are now possible in a single word
  • Speaker Diarization sensitivity parameters (previously deprecated in March 2021) are now removed from the API
    • Jobs will now be rejected if these parameters are included in the job config
    • This includes speaker_diarization_params, new_speaker_sensitivity, segment_boundary_sensitivity
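
Because jobs that still include the removed diarization parameters are now rejected, it may help to check a config before submitting it. The sketch below searches a config for the three parameter names listed above; the helper name and the example config layout are hypothetical.

```python
REMOVED_PARAMS = {
    "speaker_diarization_params",
    "new_speaker_sensitivity",
    "segment_boundary_sensitivity",
}


def find_removed_params(node, path=""):
    """Return dotted paths of removed parameters found anywhere in a config."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}" if path else key
            if key in REMOVED_PARAMS:
                found.append(child_path)
            found.extend(find_removed_params(value, child_path))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            found.extend(find_removed_params(item, f"{path}[{index}]"))
    return found


# Hypothetical legacy config; where these keys sat originally is an assumption.
legacy_config = {
    "type": "transcription",
    "transcription_config": {
        "language": "en",
        "speaker_diarization_params": {"new_speaker_sensitivity": 0.5},
    },
}
print(find_removed_params(legacy_config))
```

The search is recursive so that it does not depend on exactly where the legacy keys were nested.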

2021 Release notes

2021-12-13

  • New resource allowing you to retrieve details of your SaaS usage. For more information please see documentation here.
  • Option to cancel and delete a running job
  • Updated IP address allowlist

2021-09-07

  • Enhanced model available for all 31 language packs
    • Please contact your account manager if you would like access to the Enhanced model
  • General improvements in pop culture terms recognition for the English language pack
  • Removal of foreign characters from English and German language packs

2021-08-23

  • New language packs for all 31 language models. By default, a language pack will contain a Standard and Enhanced model for all 31 languages. The Standard model is now available to use, with no user change required. The Enhanced model will be released in September. Please see the API how-to guide for how to request the Enhanced model to prepare your integration in advance
  • Profanity tagging in Italian and Spanish
  • The Chinese Mandarin language pack now supports Traditional as well as Simplified Mandarin. Please see 'Configuring the Job Request' for guidelines on how to do so.

2021-08-10

  • Error information added in API response for Fetch URL and Notification failures

2021-03-24

  • Improved Speaker Diarization
    • Speaker Diarization has been completely re-designed internally and should now be significantly more accurate
    • Instead of gendered speaker labels (M1, F2), speaker labels will now be S1, S2, etc. in the json-v2 and txt output. Speaker gender identification is no longer a supported feature
    • If requesting an output in txt format, and requesting no diarization, there will be no Speaker:UU at the start of a transcript
    • Users may still request Speaker Diarization as before via the configuration object
    • Beta sensitivity parameters will be removed. The parameters will remain within the API but will not have any effect
    • As a result of this update to the Speaker Diarization feature, the turnaround time for your transcript may be longer in some cases
  • Improved Swedish and Arabic language packs, both now have advanced punctuation enabled (Swedish supports . ? , ! and Arabic supports . ؟ ، !)
  • For the English language pack only, a new tag, [disfluency] has been added to a pre-set list of words that imply hesitation or interjection in the JSON-v2 output only. Examples include 'hmm' and 'umm'. Customers may use this tag to carry out their own post-processing
  • The json-v2 API schema has been updated to v2.6
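
As an example of the post-processing that the [disfluency] tag in this release allows, the sketch below drops tagged words from a transcript. The simplified result shape (a list of items whose first alternative carries "content" and an optional "tags" list) is an assumption for illustration and may not match the JSON-v2 schema exactly.

```python
def strip_disfluencies(results):
    """Join word contents, skipping any word tagged as a disfluency.

    `results` is a hypothetical simplified view of JSON-v2 results.
    """
    kept = []
    for item in results:
        best = item["alternatives"][0]  # use the first alternative only
        if "disfluency" in best.get("tags", []):
            continue
        kept.append(best["content"])
    return " ".join(kept)


results = [
    {"alternatives": [{"content": "I"}]},
    {"alternatives": [{"content": "umm", "tags": ["disfluency"]}]},
    {"alternatives": [{"content": "agree"}]},
]
print(strip_disfluencies(results))  # I agree
```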

Known Issues

Issue ID | Summary | Detailed Description and Possible Workarounds
REQ-20261 | The Japanese language pack may output fewer punctuation marks in certain scenarios | In some cases, users may see a decreased output in punctuation marks when transcribing in Japanese. Please report this if this is the case