Overview of Fast Transcription API
The Fast Transcription API, part of the Azure AI Speech service, transcribes audio files synchronously and returns results faster than real time. It suits scenarios that need immediate results with predictable latency, such as quick audio or video transcription, subtitling, editing, and video translation. Unlike the batch transcription API, the fast transcription API returns transcriptions only in display form, which includes punctuation and capitalization and is therefore easier to read.
Prerequisites for Using Fast Transcription API
Before using the fast transcription API, you need an Azure AI Speech resource in one of the supported regions and an audio file in a format and codec the API accepts. Supported regions include Australia East, Brazil South, East US, West Europe, and others. The audio file must be shorter than 2 hours and smaller than 200 MB, in a format such as WAV, MP3, OPUS/OGG, or FLAC, among others.
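The size and format limits above can be checked locally before uploading. The following is a minimal sketch, not part of any Azure SDK: the function names are hypothetical, the extension set covers only the formats named above, and treating "200 MB" as 200 * 1024 * 1024 bytes is an assumption. The 2-hour duration limit is not checked because that requires decoding the audio.

```python
import os

# Assumption: "200 MB" is interpreted as 200 * 1024 * 1024 bytes.
MAX_SIZE_BYTES = 200 * 1024 * 1024

# Only the formats named in the text; the service accepts others as well,
# so an unknown extension is a warning, not proof the file is unsupported.
KNOWN_EXTENSIONS = {".wav", ".mp3", ".opus", ".ogg", ".flac"}


def find_problems(filename, size_bytes):
    """Return a list of human-readable problems for a candidate audio file."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in KNOWN_EXTENSIONS:
        problems.append(f"unrecognized extension {extension!r}")
    if size_bytes >= MAX_SIZE_BYTES:
        problems.append("file is 200 MB or larger")
    return problems


def check_file(path):
    """Convenience wrapper that reads the size from disk."""
    return find_problems(path, os.path.getsize(path))
```

An empty list means the file passed both checks; anything else is worth resolving before sending the request.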
How to Utilize the Fast Transcription API
The fast transcription API is exposed through the Transcriptions endpoint. A request is a multipart/form-data POST to that endpoint containing the audio file and a set of body properties. The main property is locales, which specifies the expected locale of the audio data. Depending on the scenario, you can specify a single known locale, enable language identification, enable diarization to separate speakers, or request multi-channel transcription.
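The body properties for the scenarios above can be sketched as a small helper that builds the JSON sent alongside the audio. This is an illustration under assumptions: the helper name and its parameters are hypothetical, and the property names diarization, maxSpeakers, and channels reflect the Azure documentation as best understood here and should be verified against the API version you target.

```python
import json


def build_definition(locales, max_speakers=None, channels=None):
    """Build the JSON string of body properties for a transcription request.

    locales:      one locale for a known language, or several candidates
                  when the service should identify the language.
    max_speakers: if given, enables diarization (assumed property names).
    channels:     if given, requests multi-channel output, e.g. [0, 1].
    """
    definition = {"locales": list(locales)}
    if max_speakers is not None:
        definition["diarization"] = {"enabled": True, "maxSpeakers": max_speakers}
    if channels is not None:
        definition["channels"] = list(channels)
    return json.dumps(definition)


# One helper covers the four scenarios:
known_locale = build_definition(["en-US"])
language_id = build_definition(["en-US", "ja-JP"])
diarized = build_definition(["en-US"], max_speakers=2)
multi_channel = build_definition(["en-US"], channels=[0, 1])
```

Keeping the properties in one dictionary makes it easy to see which options a request actually enables before it is serialized.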
Example of Transcribing with Specified Locale
To transcribe an audio file with a specified locale, replace the placeholders SubscriptionKey, ServiceRegion, and AudioFile in the cURL command with your Speech resource key, your resource region, and the path to your audio file. In the form definition, set the locales property to the expected locale, such as en-US; the locale should match the language actually spoken in the audio. The API response includes the duration of the audio, the offset and duration of each phrase, and combined phrases that contain the full transcription for all speakers.
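The same request can be sketched in Python using only the standard library. Treat this as a hedged illustration rather than the official client: the function names are hypothetical, the api-version value is an assumption that may need updating, and the form field names audio and definition and the Ocp-Apim-Subscription-Key header follow the Azure documentation as best understood here. Only combinedPhrases and its text field are read from the response, since other field names can differ between API versions.

```python
import json
import urllib.request
import uuid


def build_request(subscription_key, service_region, audio_bytes, audio_name,
                  locale="en-US", api_version="2024-11-15"):
    """Build (but do not send) the multipart/form-data POST request."""
    # api_version default is an assumption; check the current documentation.
    url = (f"https://{service_region}.api.cognitive.microsoft.com"
           f"/speechtotext/transcriptions:transcribe?api-version={api_version}")
    boundary = uuid.uuid4().hex
    definition = json.dumps({"locales": [locale]})
    # Two form parts: the audio file and the JSON definition.
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="audio"; '
        f'filename="{audio_name}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        audio_bytes, b"\r\n",
        f"--{boundary}\r\n".encode(),
        b'Content-Disposition: form-data; name="definition"\r\n\r\n',
        definition.encode(), b"\r\n",
        f"--{boundary}--\r\n".encode(),
    ])
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def full_text(response):
    """Join the combined phrases into one transcript string."""
    return " ".join(p["text"] for p in response.get("combinedPhrases", []))


def transcribe(subscription_key, service_region, audio_path, locale="en-US"):
    """Send the request and return the parsed JSON response (network call)."""
    with open(audio_path, "rb") as f:
        request = build_request(subscription_key, service_region,
                                f.read(), audio_path, locale)
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)
```

Calling transcribe with your key, region, and file path and passing the result to full_text yields the complete transcript as a single string.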