Introduction

Speech transcription and synthesis are useful capabilities in many scenarios, including:

  • Documenting spoken conversations in calls and meetings.
  • Generating captions for videos or presentations.
  • Creating audible user interfaces to improve application accessibility.
  • Developing hands-free AI assistants that read text messages or emails aloud.

In this module, we'll explore how to use speech-capable generative AI models in Microsoft Foundry to convert speech to text and text to speech.