Introduction
Speech transcription (converting speech to text) and speech synthesis (converting text to speech) are useful in many scenarios, including:
- Documenting spoken conversations in calls and meetings.
- Generating captions for videos or presentations.
- Creating audible user interfaces to improve application accessibility.
- Developing hands-free AI assistants that read text messages or emails aloud.
In this module, we'll explore how to use speech-capable generative AI models in Microsoft Foundry to transcribe speech to text and synthesize speech from text.