This page documents production updates to the Speechmatics SaaS, including new or updated features, bug fixes, known issues, and deprecated functionality.
- Enhanced model available for all 31 language packs
- Please contact your account manager if you would like access to the enhanced model
- General improvements to the recognition of pop culture terms in the English language pack
- Removal of foreign characters from English and German language packs
- New language packs for all 31 languages. By default, each language pack will contain a standard and an enhanced model. The standard model is available now, with no user change required. The enhanced model will be released in September. Please see the API how-to guide for how to request the enhanced model, so that you can prepare your integration in advance
- Profanity tagging in Italian and Spanish
- The Chinese Mandarin language pack now supports Traditional as well as Simplified Mandarin. Please see 'Configuring the Job Request' for guidelines on how to do so.
- Error information added in API response for Fetch URL and Notification failures
- Improved speaker diarization
- Speaker diarization has been completely re-designed internally and should now be significantly more accurate
- Instead of gendered speaker labels (M1, F2), speaker labels will now be S1, S2, etc. in the txt output. Speaker gender identification is no longer a supported feature
- If you request output in txt format without diarization, there will be no Speaker:UU at the start of the transcript
- Users may still request speaker diarization as before via the configuration object
- Beta sensitivity parameters are deprecated. The parameters will remain within the API but will not have any effect
- This update to the speaker diarization feature may, in some cases, increase the turnaround time for your transcript
- Improved Swedish and Arabic language packs. Both now have advanced punctuation enabled (Swedish supports . ? , ! and Arabic supports . ؟ ، !)
- For the English language pack only, a new tag, [disfluency], has been added to a pre-set list of words that imply hesitation or interjection, in the JSON-v2 output only. Examples include 'hmm' and 'umm'. Customers may use this tag to carry out their own post-processing
- The json-v2 API schema has been updated to v2.6
- When a user makes a GET request without a job ID, the response will list up to 100 of the most recent jobs submitted in the past 7 days, rather than just the previous 48 hours. This does not include jobs that the user has already deleted.
- When a customer deletes a job via the API using an HTTP DELETE request, the status of the job in the response will now say 'deleted' instead of 'done'
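
The [disfluency] tag mentioned above is intended for customer-side post-processing. The following is a minimal sketch of one way to drop tagged words from a transcript. The field names used here (`results`, `alternatives`, `content`, `tags`) and the sample payload are assumptions made for illustration, not a confirmed description of the json-v2 schema, and `strip_disfluencies` is a hypothetical helper.

```python
import json
from typing import List

# Hypothetical json-v2 style fragment. The structure and field names are
# assumptions for illustration only; check the schema you actually receive.
SAMPLE_JSON = """
{
  "results": [
    {"type": "word", "alternatives": [{"content": "umm", "tags": ["disfluency"]}]},
    {"type": "word", "alternatives": [{"content": "we", "tags": []}]},
    {"type": "word", "alternatives": [{"content": "hmm", "tags": ["disfluency"]}]},
    {"type": "word", "alternatives": [{"content": "agree"}]}
  ]
}
"""


def strip_disfluencies(transcript: dict) -> List[str]:
    """Return the words whose first alternative is not tagged as a disfluency."""
    words = []
    for item in transcript.get("results", []):
        alternatives = item.get("alternatives") or []
        if not alternatives:
            continue  # nothing to read for this item
        best = alternatives[0]
        # A missing "tags" field is treated as "no tags".
        if "disfluency" in best.get("tags", []):
            continue
        words.append(best["content"])
    return words


cleaned = strip_disfluencies(json.loads(SAMPLE_JSON))
print(" ".join(cleaned))  # prints "we agree"
```

Filtering on the tag rather than on a hard-coded word list keeps the post-processing in step with whatever pre-set list the language pack uses.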
| Issue ID | Summary | Detailed Description and Possible Workarounds |
|---|---|---|
| REQ-20261 | The Japanese language pack may output fewer punctuation marks in certain scenarios | In some cases, users may see fewer punctuation marks in the output when transcribing in Japanese. Please report any instances you encounter |
To help customers comply with data protection obligations from GDPR and other regulations, we assume that all media, transcript, and configuration files processed by the Speechmatics SaaS may contain personal data. Media, transcript, and configuration data are only processed to perform automated speech transcription following customer instructions conveyed via the cloud API.
Media, transcript, and configuration data are stored for no longer than 7 days, after which they are deleted automatically, unless a user has already deleted them explicitly through the API. GET and DELETE requests for jobs and/or media files made more than 7 days after submission, or for jobs that have already been deleted, will return a 4xx response.
Beyond the 7-day window, logs are still retained for troubleshooting and support purposes. They identify whether features such as Custom Dictionary have been used, but no information about their contents is available.
Any URLs provided by users within the job config for fetching media or for job notifications are not recorded in logs. However, client IP addresses are recorded.
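
A client that keeps its own record of job submission times can use the 7-day retention window described above to avoid requesting data that has already been deleted. The sketch below assumes this approach. The helper names are hypothetical, and the exact moment at which deletion happens is not specified on this page, so the local check is only an estimate and the API response remains authoritative.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Retention window stated in this section.
RETENTION_WINDOW = timedelta(days=7)


def is_probably_retrievable(submitted_at: datetime,
                            now: Optional[datetime] = None) -> bool:
    """Estimate whether a job is still inside the retention window."""
    now = now or datetime.now(timezone.utc)
    return now - submitted_at < RETENTION_WINDOW


def classify_status(status_code: int) -> str:
    """Map an HTTP status code from a GET or DELETE job request to a coarse outcome."""
    if 200 <= status_code < 300:
        return "available"
    if 400 <= status_code < 500:
        # Expired, already deleted, or otherwise invalid request.
        return "unavailable"
    return "unexpected"


submitted = datetime(2021, 1, 1, 12, 0, tzinfo=timezone.utc)
print(is_probably_retrievable(submitted, now=submitted + timedelta(days=3)))  # True
print(is_probably_retrievable(submitted, now=submitted + timedelta(days=8)))  # False
print(classify_status(404))  # unavailable
```

Treating every 4xx response as "unavailable" matches the wording above, which does not say which specific 4xx code is returned.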