
Batch Virtual Appliance

High-Level Summary

This release provides substantially improved speaker diarization and a new version of the Speech API that is also used in the Speechmatics Cloud Offering. It updates the language packs for Swedish and Arabic, adds metadata tags for English disfluencies in the JSON output, and updates the Ubuntu Linux OS on which the appliance runs.

It is recommended that customers on previous releases upgrade to this version.

Important Notices

Deprecation Note

The legacy V1 API that the Batch Virtual Appliance currently supports will be removed by February 2022. The V2 API as used in the Speechmatics SaaS (https://asr.api.speechmatics.com/v2/docs) is now supported in the Batch Virtual Appliance. We recommend all customers move to using the V2 API. Please see the section How to use the V2 API.
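As a rough illustration of what moving to the V2 API involves, the sketch below builds a job configuration object and the job URLs without sending any requests. The helper names, the host value, and the exact paths (`/v2/jobs`, `/v2/jobs/<id>/log`) are assumptions modelled on the public V2 documentation linked above, and the lowercase `force=true` spelling is likewise an assumption. Check all of them against the Speech API guide for your appliance.

```python
import json
from urllib.parse import quote, urlencode


def build_job_config(language="en", diarization="none"):
    """Return a JSON string for a V2 transcription job config.

    The key names follow the public V2 API docs (assumed to match
    the appliance); "none" switches speaker diarization off.
    """
    config = {
        "type": "transcription",
        "transcription_config": {
            "language": language,
            "diarization": diarization,
        },
    }
    return json.dumps(config)


def jobs_url(host):
    """Base URL for job submission (hypothetical host, assumed path)."""
    return f"http://{host}/v2/jobs"


def job_log_url(host, job_id):
    """URL for retrieving the log of a single job (assumed path)."""
    return f"{jobs_url(host)}/{quote(job_id, safe='')}/log"


def cancel_job_url(host, job_id, force=True):
    """URL for deleting a job; force=true cancels a running job."""
    url = f"{jobs_url(host)}/{quote(job_id, safe='')}"
    if force:
        url += "?" + urlencode({"force": "true"})
    return url


if __name__ == "__main__":
    print(build_job_config("en", "speaker"))
    print(job_log_url("appliance.example", "abc123"))
    print(cancel_job_url("appliance.example", "abc123"))
```

The config string would typically be sent as the `config` form field alongside the audio file in a POST to the jobs URL, and the cancel URL would be used with an HTTP DELETE. Both of these details are also taken from the public V2 documentation rather than from these release notes.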

If you are importing an appliance using VMware, please note that the hardware_version of the appliance has been updated from 9 to 11. This is to automatically take advantage of performance optimisations using Advanced Vector Extensions 2 (AVX2). This should have no effect on the appliance, assuming you are on a version of VMware ESXi supported by Speechmatics (versions 6.5 onwards). If you are importing an appliance through VirtualBox and AVX2 is not automatically enabled, you can also take advantage of the performance benefits of AVX2 by following these guidelines.

It is recommended to run the appliance on processors that support AVX2 in order to take advantage of the latest performance optimisations.

What's New

  • Improved speaker diarization
    • Speaker diarization has been completely re-designed internally and should now be significantly more accurate.
    • Speaker labels are now generic (S1, S2, etc.) instead of gendered (M1, F2) in the json-v2 and txt output. Speaker gender identification is no longer a supported feature.
    • If you request txt output without diarization, the transcript no longer starts with Speaker:UU.
    • Users may still request speaker diarization as before via the configuration object.
    • The beta sensitivity parameters are no longer functional. They remain within the API but do not have any effect.
    • This update to the speaker diarization feature does mean that the turnaround time for your transcript will be longer (see documentation section on "Speaker Diarization" for further details).
      • Users of the V1 API in the Batch Virtual Appliance, where speaker diarization is enabled by default, may notice a slowdown in performance under high load. If diarization is not required, users can set the diarization parameter to none in the configuration object to avoid this slowdown. How to do so is described in the Speech API guide.
  • The Batch Virtual Appliance now supports the V2 REST API used in the Speechmatics Cloud Offering. The Speech API has been updated to reflect this.
    • Customers can retrieve logs for any single successful or unsuccessful transcription job via the V2 API. This is intended for debugging when the transcription fails, or to provide extra details when contacting Speechmatics Support.
    • Customers can cancel a running job they no longer want by using the force=TRUE query parameter.
  • Improved Swedish and Arabic language packs. Both now have advanced punctuation enabled (Swedish supports . ? , ! and Arabic supports . ؟ ، !).
  • Disfluency tagging in English. Certain English words that imply hesitation (e.g. 'hmm') now carry a metadata tag of disfluency in the json-v2 output. This applies to English only.
  • The Host VM of the Appliance now runs on Ubuntu Bionic (OS 18.04).
  • The json-v2 API schema has been updated to v2.6.
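To make the new speaker labels and disfluency tags more concrete, here is a minimal sketch that groups words from a json-v2 style transcript into speaker turns and optionally drops words tagged as disfluencies. The sample data is invented for illustration, and the field layout (`results`, `alternatives`, `speaker`, `tags`) is an assumption based on the public json-v2 schema, so verify it against real output from your appliance. The `UU` fallback label is only a placeholder for words without a speaker.

```python
# Invented sample in the assumed json-v2 layout (not real appliance output).
SAMPLE_TRANSCRIPT = {
    "format": "2.6",
    "results": [
        {"type": "word", "start_time": 0.0, "end_time": 0.3,
         "alternatives": [{"content": "hmm", "speaker": "S1",
                           "tags": ["disfluency"]}]},
        {"type": "word", "start_time": 0.4, "end_time": 0.7,
         "alternatives": [{"content": "hello", "speaker": "S1"}]},
        {"type": "word", "start_time": 0.8, "end_time": 1.1,
         "alternatives": [{"content": "there", "speaker": "S1"}]},
        {"type": "punctuation", "start_time": 1.1, "end_time": 1.1,
         "alternatives": [{"content": ".", "speaker": "S1"}]},
        {"type": "word", "start_time": 1.5, "end_time": 1.8,
         "alternatives": [{"content": "hi", "speaker": "S2"}]},
    ],
}


def speaker_turns(transcript, drop_disfluencies=True):
    """Group consecutive words by speaker label.

    Returns a list of (speaker, text) tuples. Punctuation items are
    skipped, and only the first alternative of each word is used.
    """
    turns = []
    for item in transcript.get("results", []):
        if item.get("type") != "word":
            continue
        alternatives = item.get("alternatives") or []
        if not alternatives:
            continue
        best = alternatives[0]
        if drop_disfluencies and "disfluency" in best.get("tags", []):
            continue
        speaker = best.get("speaker", "UU")  # placeholder when unlabeled
        if turns and turns[-1][0] == speaker:
            turns[-1][1].append(best["content"])
        else:
            turns.append((speaker, [best["content"]]))
    return [(speaker, " ".join(words)) for speaker, words in turns]


if __name__ == "__main__":
    for speaker, text in speaker_turns(SAMPLE_TRANSCRIPT):
        print(f"{speaker}: {text}")
```

Passing `drop_disfluencies=False` keeps hesitation words such as "hmm" in the turn text, which may be preferable when a verbatim transcript is needed.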

Issues Fixed

The following issues have been addressed since the previous release:

Issue ID | Summary | Resolution Description
REQ-11135 | 3.2.0 introduced unwanted hesitations in transcripts for English (en). | Hesitations, disfluencies, and some interjections are tagged in the JSON output with a metadata tag of disfluency.
REQ-15418 | Custom dictionary with splitting characters gets incorrect pronunciation | When using words with splitting characters in a Custom Dictionary (for example covid-19) where a number follows a word, the correct pronunciations are now created. Splitting characters include ["-", "_", "/", "<", ">", ":", " "]. This applies to all languages.
REQ-17771 | Wide-space Unicode characters in Custom Dictionary cause jobs to fail | This is now fixed and wide-spaced characters should be accepted.

Known Limitations

The following are known issues in this release:

Issue ID | Summary | Detailed Description and Possible Workarounds
REQ-10634 | Putting "-" as an item in additional vocab configuration will cause the container to fail | Do not enter just a "-" on its own in Custom Dictionary either as an additional vocab item or in the sounds_like property. Hyphens are still supported when entered as part of phrases or words.
REQ-1409 | Proteus HCL with <unk> causes out of memory error | A custom dictionary list that contains the word '<unk>' causes the worker to crash.
REQ-7549 | Memory leak affecting gRPC | There is a small memory leak in the gRPC Python server https://github.com/grpc/grpc/issues/5913.
REQ-10160 | Advanced punctuation for Spanish (es) does not contain inverted marks. | Inverted marks [ ¿ ¡ ] are not currently available for Spanish advanced punctuation.
REQ-10627 | Double full stops when acronym is at the end of the sentence | If there is an acronym at the end of the sentence, then a double full stop will be output, for example: "team G.B.."
REQ-14402 | When running very large numbers of small jobs (less than 10 seconds) offline, this may cause some of the jobs to be rejected | If you encounter this issue, please ensure licensing is in offline mode when running the appliance offline.
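One way to avoid the lone hyphen limitation (REQ-10634) is to check a Custom Dictionary list before submitting a job. The sketch below is a hypothetical client-side check, not part of the appliance. It assumes each additional vocab entry is a dictionary with a `content` key and an optional `sounds_like` list, as in the public V2 API documentation.

```python
def find_lone_hyphens(additional_vocab):
    """Return (index, field) pairs for entries that are just "-".

    Hyphens inside words or phrases (for example "covid-19") are
    left alone; only a bare "-" is reported.
    """
    problems = []
    for index, entry in enumerate(additional_vocab):
        if entry.get("content", "").strip() == "-":
            problems.append((index, "content"))
        # Report each entry's sounds_like list at most once.
        if any(sound.strip() == "-" for sound in entry.get("sounds_like", [])):
            problems.append((index, "sounds_like"))
    return problems


if __name__ == "__main__":
    vocab = [
        {"content": "covid-19", "sounds_like": ["covid nineteen"]},
        {"content": "-"},
        {"content": "gnocchi", "sounds_like": ["-"]},
    ]
    for index, field in find_lone_hyphens(vocab):
        print(f"entry {index}: remove the lone hyphen in {field}")
```

Running the check before job submission lets a client reject or repair the list locally instead of discovering the failure on the appliance.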

Supported Platforms

Virtual Appliance image (OVA) for installation on:

  • VMware ESXi 6.5+ or VMware Workstation Player.
  • VirtualBox 5.2+
  • Amazon EC2

See the Installation and Admin Guide for details on the minimum specifications for the VM. The maximum number of concurrent jobs (maxworkers) that you can run on a single appliance is 30.

Form Factors

Variant | Image Size | Max. Disk Space | Languages
nano | 12GB | 40GB | en
mini | 19GB | 40GB | en, de, es
midi | 38GB | 60GB | en, de, es, fr, ko, ja, nl, pt
maxi | 64GB | 80GB | en, de, es, fr, ko, ja, nl, pt, it, da, pl, ca, hi, ru, sv
plus | 65GB | 80GB | en, cmn, no, ar, bg, cs, el, fi, hu, hr, lt, lv, ro, sk, sl, tr, ms

Upgrade Path

Remove the license from your old appliance (see the Admin Guide), then import the new OVA and configure networking as per the Installation and Admin Guide. Once the OVA has been imported, you will need to re-apply your license code.

Installation

Upload the OVA to VMware ESXi, VMware Workstation Player, or VirtualBox. See the Installation and Admin Guide for more information.

Performance at Scale

Further notes on IOPS requirements under heavy usage of the appliance are now provided in the System Requirements section of the Installation Guide.