VibeVoice-Realtime TTS Demo

Streaming Input Text
This area will display the streaming input text in real time.
This demo requires the full text to be provided upfront. During synthesis, the text is then fed to the model as streaming input.
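As a rough illustration of feeding text that is already fully known to a model piece by piece, the sketch below splits a string into small word groups and yields them one at a time. The function name `stream_words` and the chunk size are hypothetical choices for this example; they are not taken from the demo's actual implementation.

```python
from typing import Iterator


def stream_words(text: str, words_per_chunk: int = 3) -> Iterator[str]:
    """Yield the text in small word groups (hypothetical helper, not the demo's API)."""
    words = text.split()
    for start in range(0, len(words), words_per_chunk):
        # Each yielded chunk stands in for one piece of streaming input.
        yield " ".join(words[start:start + words_per_chunk])


full_text = "The full text is known before synthesis starts."
chunks = list(stream_words(full_text, words_per_chunk=3))
```

A consumer would iterate over the generator and pass each chunk on as it arrives, rather than collecting the chunks into a list as done here for inspection.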
If the text contains special characters other than punctuation, applying text normalization before processing often yields better results.
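One possible form of such normalization is sketched below, using only the Python standard library. The replacement table and the set of characters kept are assumptions chosen for illustration; the demo does not specify which normalization rules work best, so treat this as a starting point rather than the recommended procedure.

```python
import re
import unicodedata

# Assumed spoken forms for a few common symbols; extend or change as needed.
REPLACEMENTS = {
    "&": " and ",
    "%": " percent ",
    "@": " at ",
    "+": " plus ",
    "=": " equals ",
}


def normalize_text(text: str) -> str:
    """Replace or drop non-punctuation special characters before synthesis."""
    # Fold compatibility characters (e.g. full-width forms) into standard ones.
    text = unicodedata.normalize("NFKC", text)
    for symbol, spoken in REPLACEMENTS.items():
        text = text.replace(symbol, spoken)
    # Replace anything that is not a word character, whitespace,
    # or common punctuation with a space.
    text = re.sub(r"[^\w\s.,!?;:'\"()-]", " ", text)
    # Collapse the extra whitespace introduced above.
    return re.sub(r"\s+", " ", text).strip()


cleaned = normalize_text("Sales rose 5% & costs fell #today!")
```

The normalized string can then be pasted into the input box in place of the raw text.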
Speaker
Model Generated Audio: 0.00s
Audio Played: 0.00s
Runtime Logs