Configuration Parameters
Learn about configuration parameters for the voice SDK.
Basic Parameters
language (str, default: "en") Language code for transcription (e.g., "en", "es", "fr"). See supported languages.
operating_point (OperatingPoint, default: ENHANCED) Balance accuracy vs latency. Options: STANDARD or ENHANCED.
domain (str, default: None) Domain-specific model (e.g., "finance", "medical"). See supported languages and domains.
output_locale (str, default: None) Output locale for formatting (e.g., "en-GB", "en-US"). See supported languages and locales.
enable_diarization (bool, default: False) Enable speaker diarization to identify and label different speakers.
Turn Detection Parameters
end_of_utterance_mode (EndOfUtteranceMode, default: FIXED) Controls how turn endings are detected:
FIXED - Uses a fixed silence threshold. Fast, but may split slow speech.
ADAPTIVE - Adjusts the delay based on speech rate, pauses, and disfluencies. Best for natural conversation.
SMART_TURN - Uses an ML model to detect acoustic turn-taking cues. Requires the [smart] extras.
EXTERNAL - Manual control via client.finalize(). For custom turn logic.
end_of_utterance_silence_trigger (float, default: 0.2) Silence duration in seconds to trigger turn end.
end_of_utterance_max_delay (float, default: 10.0) Maximum delay before forcing turn end.
max_delay (float, default: 0.7) Maximum transcription delay for word emission.
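To make the relationship between the silence trigger and the forced turn end concrete, here is a minimal toy model of FIXED-style turn detection. It is not the SDK's implementation: the TurnTiming class, the should_end_turn function, and the choice of measuring the hard cap from the start of waiting are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TurnTiming:
    """Hypothetical stand-in for the two timing parameters; not an SDK class."""
    silence_trigger: float = 0.2   # mirrors end_of_utterance_silence_trigger
    max_delay: float = 10.0        # mirrors end_of_utterance_max_delay


def should_end_turn(silence_s: float, waiting_s: float, timing: TurnTiming) -> bool:
    """Return True when the silence threshold or the hard cap is reached.

    silence_s: seconds of continuous silence observed so far.
    waiting_s: seconds spent waiting for a turn-end decision (assumed reference point).
    """
    if waiting_s >= timing.max_delay:
        return True  # forced turn end, regardless of silence
    return silence_s >= timing.silence_trigger


timing = TurnTiming()
print(should_end_turn(0.1, 1.0, timing))   # short pause, turn continues
print(should_end_turn(0.25, 1.0, timing))  # pause exceeds the 0.2 s trigger
print(should_end_turn(0.0, 10.0, timing))  # hard cap reached
```

A larger silence trigger in this model tolerates longer pauses at the cost of slower turn endings, which is the trade-off the FIXED description points at.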
Speaker Configuration
speaker_sensitivity (float, default: 0.5) Diarization sensitivity between 0.0 and 1.0. Higher values detect more speakers.
max_speakers (int, default: None) Limit maximum number of speakers to detect.
prefer_current_speaker (bool, default: False) Give extra weight to current speaker for word grouping.
speaker_config (SpeakerFocusConfig, default: SpeakerFocusConfig()) Configure speaker focus/ignore rules.
from speechmatics.voice import SpeakerFocusConfig, SpeakerFocusMode, VoiceAgentConfig

# Focus only on specific speakers
config = VoiceAgentConfig(
    enable_diarization=True,
    speaker_config=SpeakerFocusConfig(
        focus_speakers=["S1", "S2"],
        focus_mode=SpeakerFocusMode.RETAIN
    )
)

# Ignore specific speakers
config = VoiceAgentConfig(
    enable_diarization=True,
    speaker_config=SpeakerFocusConfig(
        ignore_speakers=["S3"],
        focus_mode=SpeakerFocusMode.IGNORE
    )
)
known_speakers (list[SpeakerIdentifier], default: []) Pre-enrolled speaker identifiers for speaker identification.
from speechmatics.voice import SpeakerIdentifier, VoiceAgentConfig

# Map pre-enrolled speaker identifiers to human-readable labels
config = VoiceAgentConfig(
    enable_diarization=True,
    known_speakers=[
        SpeakerIdentifier(label="Alice", speaker_identifiers=["XX...XX"]),
        SpeakerIdentifier(label="Bob", speaker_identifiers=["YY...YY"])
    ]
)
Language & Vocabulary
additional_vocab (list[AdditionalVocabEntry], default: []) Custom vocabulary for domain-specific terms.
from speechmatics.voice import AdditionalVocabEntry, VoiceAgentConfig

config = VoiceAgentConfig(
    language="en",
    additional_vocab=[
        # sounds_like lists alternative pronunciations for the term
        AdditionalVocabEntry(
            content="Speechmatics",
            sounds_like=["speech matters", "speech matics"]
        ),
        AdditionalVocabEntry(content="API"),
    ]
)
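When the vocabulary comes from a glossary or spreadsheet, it can help to clean it before building entries. The sketch below uses a hypothetical VocabEntry dataclass that only mirrors the content and sounds_like fields shown above, so it runs without the SDK; build_vocab is likewise an illustrative helper, not part of the library.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VocabEntry:
    """Hypothetical stand-in mirroring the content/sounds_like fields."""
    content: str
    sounds_like: List[str] = field(default_factory=list)


def build_vocab(terms: Dict[str, List[str]]) -> List[VocabEntry]:
    """Turn {term: [pronunciation hints]} into entries.

    Blank terms are skipped; blank and duplicate hints are dropped,
    keeping the first occurrence of each hint.
    """
    entries = []
    for term, hints in terms.items():
        term = term.strip()
        if not term:
            continue
        cleaned = [h.strip() for h in hints if h.strip()]
        unique_hints = list(dict.fromkeys(cleaned))
        entries.append(VocabEntry(content=term, sounds_like=unique_hints))
    return entries


vocab = build_vocab({
    "Speechmatics": ["speech matters", "speech matics", "speech matters"],
    "API": [],
    "  ": ["ignored"],
})
print(vocab)
```

With the real SDK, the same loop could presumably construct AdditionalVocabEntry objects instead of the stand-in class.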
punctuation_overrides (dict, default: None) Custom punctuation rules.
Audio Parameters
sample_rate (int, default: 16000) Audio sample rate in Hz.
audio_encoding (AudioEncoding, default: PCM_S16LE) Audio encoding format.
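The default audio settings imply a fixed byte rate, which is useful when sizing the chunks sent to the client. The following sketch works through that arithmetic with the standard library only. The helper names are hypothetical, and mono audio is an assumption (the source does not state the channel count).

```python
import struct

SAMPLE_RATE = 16000    # matches the sample_rate default
BYTES_PER_SAMPLE = 2   # PCM_S16LE: signed 16-bit little-endian samples


def chunk_size_bytes(duration_ms: int, channels: int = 1) -> int:
    """Bytes needed for duration_ms of audio (mono is an assumption)."""
    samples = SAMPLE_RATE * duration_ms // 1000
    return samples * BYTES_PER_SAMPLE * channels


def floats_to_pcm_s16le(samples):
    """Convert floats in [-1.0, 1.0] to PCM_S16LE bytes, clipping out-of-range values."""
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)


print(chunk_size_bytes(20))                         # 640
print(floats_to_pcm_s16le([0.0, 1.0, -1.0]).hex())  # 0000ff7f0180
```

At 16000 Hz and 2 bytes per sample, one second of mono audio is 32000 bytes, so a 20 ms chunk is 640 bytes.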
Advanced Parameters
transcription_update_preset (TranscriptionUpdatePreset, default: COMPLETE) Controls when to emit updates: COMPLETE, COMPLETE_PLUS_TIMING, WORDS, WORDS_PLUS_TIMING, or TIMING.
speech_segment_config (SpeechSegmentConfig, default: SpeechSegmentConfig()) Fine-tune segment generation and post-processing.
smart_turn_config (SmartTurnConfig, default: None) Configure SMART_TURN behavior (buffer length, threshold).
include_results (bool, default: False) Include word-level timing data in segments.
include_partials (bool, default: True) Emit partial segments. Set to False for final-only output.
Configuration with Overlays
Use presets as a starting point and customize with overlays:
from speechmatics.voice import VoiceAgentConfigPreset, VoiceAgentConfig

# Use preset with custom overrides
config = VoiceAgentConfigPreset.SCRIBE(
    VoiceAgentConfig(
        language="es",
        max_delay=0.8
    )
)
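Conceptually, an overlay replaces selected preset values and leaves the rest untouched. The sketch below illustrates that idea with a hypothetical DemoConfig dataclass in which None means "not set". How the SDK itself decides which overlay fields count as set is not described here, and the preset values used are made up for the example.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass
class DemoConfig:
    """Hypothetical stand-in; None means 'not set' in this sketch."""
    language: Optional[str] = None
    max_delay: Optional[float] = None
    enable_diarization: Optional[bool] = None


def apply_overlay(base: DemoConfig, overlay: DemoConfig) -> DemoConfig:
    """Return a copy of base in which every non-None overlay field wins."""
    changes = {
        f.name: getattr(overlay, f.name)
        for f in fields(overlay)
        if getattr(overlay, f.name) is not None
    }
    return replace(base, **changes)


# Illustrative preset values only; not the real SCRIBE settings.
preset = DemoConfig(language="en", max_delay=2.0, enable_diarization=True)
custom = apply_overlay(preset, DemoConfig(language="es", max_delay=0.8))
print(custom)
```

In this model the overlay changes language and max_delay, while enable_diarization keeps the preset's value and the preset object itself is left unmodified.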
Available presets
presets = VoiceAgentConfigPreset.list_presets()
# Output: ['low_latency', 'conversation_adaptive', 'conversation_smart_turn', 'scribe', 'captions']
Configuration Serialization
Export and import configurations as JSON:
from speechmatics.voice import VoiceAgentConfigPreset, VoiceAgentConfig
# Export preset to JSON
config_json = VoiceAgentConfigPreset.SCRIBE().to_json()
# Load from JSON
config = VoiceAgentConfig.from_json(config_json)
# Or create from JSON string
config = VoiceAgentConfig.from_json('{"language": "en", "enable_diarization": true}')
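Hand-written JSON strings are easy to get wrong (for example, JSON uses true where Python uses True). A small sketch using only the standard json module shows one way to build the string from a Python dict. The settings dict and its values are illustrative, and the assumption is that its keys match the parameter names documented above.

```python
import json

# Keys are assumed to match the documented parameter names.
settings = {"language": "en", "enable_diarization": True, "max_delay": 0.8}

# json.dumps renders Python True as JSON true and escapes strings correctly.
config_json = json.dumps(settings)
print(config_json)  # {"language": "en", "enable_diarization": true, "max_delay": 0.8}

# Round-trip check: parsing the string gives back the original values.
round_tripped = json.loads(config_json)
assert round_tripped == settings
```

The resulting config_json string could then be passed to VoiceAgentConfig.from_json as in the example above.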
For more information, see the voice agent Python SDK on GitHub.