Speech Releases

[Feature] ISpeechToTextService interface: platform-native speech recognition with permission management, continuous streaming, and listen-until-silence modes
[Feature] ITextToSpeechService interface: platform-native text-to-speech with voice selection, speech rate, pitch, and volume control
[Feature] IAudioSource interface: raw PCM audio capture from the device microphone (16 kHz, 16-bit, mono)
[Feature] IAudioPlayer interface: MP3 audio stream playback with play/stop control
[Feature] SpeechRecognitionOptions: configurable culture, silence timeout, and on-device preference for STT
[Feature] TextToSpeechOptions: configurable culture, voice, speech rate, pitch, and volume for TTS
[Feature] ContinuousRecognize(): streaming recognition results via IAsyncEnumerable<SpeechRecognitionResult> with partial and final results
[Feature] ListenUntilSilence(): simple dictation mode that returns the final transcription after silence is detected
[Feature] GetVoicesAsync(): enumerate available TTS voices with optional culture filtering
[Feature] AddSpeechServices(): single extension method to register all core services (STT, TTS, AudioSource, AudioPlayer)
[Feature, Android] Android STT implementation using SpeechRecognizer with streaming partial results
[Feature, Android] Android TTS implementation using Android.Speech.Tts.TextToSpeech
[Feature, Android] Android audio capture via AudioRecord with 16 kHz PCM streaming
[Feature, Android] Android audio playback via MediaPlayer
[Feature, iOS] iOS STT implementation using SFSpeechRecognizer with SFSpeechAudioBufferRecognitionRequest
[Feature, iOS] iOS TTS implementation using AVSpeechSynthesizer
[Feature, iOS] iOS audio capture via AVAudioEngine with PCM tap
[Feature, iOS] iOS audio playback via AVAudioPlayer
[Feature] Cloud provider abstraction: ISpeechToTextProvider and ITextToSpeechProvider interfaces for pluggable cloud backends
[Feature] CloudSpeechToText and CloudTextToSpeech: bridge classes that combine platform audio with cloud provider APIs
[Feature] AddCloudSpeechToText<T>() and AddCloudTextToSpeech<T>(): generic DI registration for custom cloud providers
[Feature] Azure AI Speech provider: AddAzureSpeech() registers Azure STT and/or TTS with subscription key and region
[Feature] Azure TTS with SSML prosody control: speech rate, pitch, and volume mapped to SSML elements
[Feature] ElevenLabs TTS provider: AddElevenLabsTextToSpeech() registers ElevenLabs cloud TTS with configurable voice and model
[Feature] PipeStream utility: thread-safe producer-consumer stream using System.IO.Pipelines for bridging audio capture with cloud providers
[Feature, WASM] Browser/WebAssembly support: STT and TTS via Web Speech API, auto-detected at runtime via OperatingSystem.IsBrowser()
[Feature, WASM] Browser STT implementation using SpeechRecognition API with streaming partial and final results
[Feature, WASM] Browser TTS implementation using SpeechSynthesis API with voice selection, rate, pitch, and volume control
[Feature, WASM] Browser audio playback via HTML5 Audio element with base64 data URL conversion
[Feature] Blazor WebAssembly sample app demonstrating STT, TTS, and voice listing
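
To make the "SSML prosody control" entry above more concrete, here is a minimal sketch of how rate, pitch, and volume values could be turned into an SSML `<prosody>` element. This is not the library's implementation: the `SsmlSketch` class, its `Build` method, and the multiplier-to-percentage mapping are all hypothetical and chosen only for illustration. The attribute formats (signed percentages for rate and pitch, a 0 to 100 number for volume) are the ones Azure's SSML documentation describes, and the voice name in the usage note is just an example value.

```csharp
using System;
using System.Globalization;
using System.Xml.Linq;

// Hypothetical helper, not part of the Speech library.
public static class SsmlSketch
{
    // Standard SSML namespace defined by the W3C.
    static readonly XNamespace Ssml = "http://www.w3.org/2001/10/synthesis";

    // Assumption: rate and pitch are multipliers where 1.0 means "default",
    // and volume is a fraction between 0.0 and 1.0.
    public static string Build(string text, string culture, string voice,
                               double rate = 1.0, double pitch = 1.0, double volume = 1.0)
    {
        var volumePercent = Math.Round(Math.Clamp(volume, 0.0, 1.0) * 100)
            .ToString(CultureInfo.InvariantCulture);

        // XElement escapes the text content, so characters like '<' or '&' are safe.
        var prosody = new XElement(Ssml + "prosody",
            new XAttribute("rate", RelativePercent(rate)),
            new XAttribute("pitch", RelativePercent(pitch)),
            new XAttribute("volume", volumePercent),
            text);

        var speak = new XElement(Ssml + "speak",
            new XAttribute("version", "1.0"),
            new XAttribute(XNamespace.Xml + "lang", culture),
            new XElement(Ssml + "voice", new XAttribute("name", voice), prosody));

        return speak.ToString(SaveOptions.DisableFormatting);
    }

    // Converts a multiplier such as 1.2 into a signed relative percentage such as "+20%".
    static string RelativePercent(double multiplier)
    {
        var delta = (int)Math.Round((multiplier - 1.0) * 100);
        return delta.ToString("+0;-0", CultureInfo.InvariantCulture) + "%";
    }
}
```

A call such as `SsmlSketch.Build("Hello", "en-US", "en-US-JennyNeural", rate: 1.2, volume: 0.8)` would produce a `<speak>` document whose `<prosody>` element carries the three attributes. How the actual Azure provider maps TextToSpeechOptions values to SSML may differ.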