Speech Releases
1.0 - May 2, 2026
[Feature] ISpeechToTextService interface - platform-native speech recognition with permission management, continuous streaming, and listen-until-silence modes
[Feature] ITextToSpeechService interface - platform-native text-to-speech with voice selection, speech rate, pitch, and volume control
[Feature] IAudioSource interface - raw PCM audio capture from the device microphone (16 kHz, 16-bit, mono)
[Feature] IAudioPlayer interface - MP3 audio stream playback with play/stop control
[Feature] SpeechRecognitionOptions - configurable culture, silence timeout, and on-device preference for STT
[Feature] TextToSpeechOptions - configurable culture, voice, speech rate, pitch, and volume for TTS
[Feature] ContinuousRecognize() - streaming recognition results via IAsyncEnumerable<SpeechRecognitionResult> with partial and final results
[Feature] ListenUntilSilence() - simple dictation mode that returns the final transcription after silence is detected
[Feature] GetVoicesAsync() - enumerate available TTS voices with optional culture filtering
[Feature] AddSpeechServices() - single extension method to register all core services (STT, TTS, AudioSource, AudioPlayer)
[Feature] [Android] Android STT implementation using SpeechRecognizer with streaming partial results
[Feature] [Android] Android TTS implementation using Android.Speech.Tts.TextToSpeech
[Feature] [Android] Android audio capture via AudioRecord with 16 kHz PCM streaming
[Feature] [Android] Android audio playback via MediaPlayer
[Feature] [iOS] iOS STT implementation using SFSpeechRecognizer with SFSpeechAudioBufferRecognitionRequest
[Feature] [iOS] iOS TTS implementation using AVSpeechSynthesizer
[Feature] [iOS] iOS audio capture via AVAudioEngine with PCM tap
[Feature] [iOS] iOS audio playback via AVAudioPlayer
[Feature] Cloud provider abstraction - ISpeechToTextProvider and ITextToSpeechProvider interfaces for pluggable cloud backends
[Feature] CloudSpeechToText and CloudTextToSpeech - bridge classes that combine platform audio with cloud provider APIs
[Feature] AddCloudSpeechToText<T>() and AddCloudTextToSpeech<T>() - generic DI registration for custom cloud providers
[Feature] Azure AI Speech provider - AddAzureSpeech() registers Azure STT and/or TTS with subscription key and region
[Feature] Azure TTS with SSML prosody control - speech rate, pitch, and volume mapped to SSML elements
[Feature] ElevenLabs TTS provider - AddElevenLabsTextToSpeech() registers ElevenLabs cloud TTS with configurable voice and model
[Feature] PipeStream utility - thread-safe producer-consumer stream using System.IO.Pipelines for bridging audio capture with cloud providers
[Feature] [WASM] Browser/WebAssembly support - STT and TTS via Web Speech API, auto-detected at runtime via OperatingSystem.IsBrowser()
[Feature] [WASM] Browser STT implementation using SpeechRecognition API with streaming partial and final results
[Feature] [WASM] Browser TTS implementation using SpeechSynthesis API with voice selection, rate, pitch, and volume control
[Feature] [WASM] Browser audio playback via HTML5 Audio element with base64 data URL conversion
[Feature] Blazor WebAssembly sample app demonstrating STT, TTS, and voice listing
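
The browser audio playback entry above mentions base64 data URL conversion. As a rough illustration of that general technique (not the library's actual code, which is .NET), the TypeScript sketch below encodes raw audio bytes as base64 and wraps them in a data URL. The function names `base64Encode` and `toAudioDataUrl` and the default `audio/mpeg` MIME type are assumptions made for this example only.

```typescript
// Standard base64 alphabet (RFC 4648).
const B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper: encode raw bytes as padded base64.
// Written by hand so the sketch does not depend on btoa or Buffer.
function base64Encode(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const has1 = i + 1 < bytes.length;
    const has2 = i + 2 < bytes.length;
    const b0 = bytes[i];
    const b1 = has1 ? bytes[i + 1] : 0;
    const b2 = has2 ? bytes[i + 2] : 0;
    // Split each 3-byte group into four 6-bit indices.
    out += B64[b0 >> 2];
    out += B64[((b0 & 0x03) << 4) | (b1 >> 4)];
    out += has1 ? B64[((b1 & 0x0f) << 2) | (b2 >> 6)] : "=";
    out += has2 ? B64[b2 & 0x3f] : "=";
  }
  return out;
}

// Hypothetical helper: wrap audio bytes in a data URL that an
// HTML5 Audio element can take as its source.
function toAudioDataUrl(bytes: Uint8Array, mimeType: string = "audio/mpeg"): string {
  return `data:${mimeType};base64,${base64Encode(bytes)}`;
}
```

In a browser, the resulting string could be passed to the `Audio` constructor, for example `new Audio(toAudioDataUrl(mp3Bytes)).play()`, where `mp3Bytes` stands for whatever byte array the caller already holds. Data URLs keep the whole clip in memory as text, so this approach suits short utterances better than long recordings.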