Named entities extraction uses natural language processing (NLP) to find locations, people, and brands in the audio and images of media files. It relies on transcription and optical character recognition (OCR).
Named entities use cases
- Contextual advertising, for example, placing an ad for a pizza chain following footage of Italy.
- Deep searching media archives for insights on people or locations to create feature stories for the news.
- Creating a verbal description of footage using optical character recognition (OCR) processing to enhance accessibility for the visually impaired, for example, a background storyteller in movies.
- Extracting insights on brand names.
View the insight JSON with the web portal
After you upload and index a video, download insights in JSON format from the web portal.
- Select the Library tab.
- Select the media you want.
- Select Download, and then select Insights (JSON). The JSON file opens in a new browser tab.
- Find the key pairs described in the example response.
Use the API
- Use a Get Video Index request. Pass `&includeSummarizedInsights=false`.
- Find the key pairs described in the example response.
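The request above can be sketched as a small URL-building helper. This is a minimal sketch, not an official client: the account ID, video ID, and access token below are hypothetical placeholders, and the endpoint shape follows the public Video Indexer Get Video Index REST route.

```python
from urllib.parse import urlencode

# Hypothetical placeholders; substitute your own account details.
LOCATION = "trial"
ACCOUNT_ID = "00000000-0000-0000-0000-000000000000"
VIDEO_ID = "abc123"
ACCESS_TOKEN = "my-access-token"

def build_index_url(location, account_id, video_id, access_token):
    """Build a Get Video Index request URL with summarized insights disabled."""
    base = (f"https://api.videoindexer.ai/{location}/Accounts/{account_id}"
            f"/Videos/{video_id}/Index")
    query = urlencode({
        "accessToken": access_token,
        # Request the full insight JSON rather than the summarized form.
        "includeSummarizedInsights": "false",
    })
    return f"{base}?{query}"

url = build_index_url(LOCATION, ACCOUNT_ID, VIDEO_ID, ACCESS_TOKEN)
print(url)
```

Issuing a GET against the resulting URL returns the insight JSON shown in the example response below.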
Example response
```json
"namedPeople": [
  {
    "referenceId": "Satya_Nadella",
    "referenceUrl": "https://en.wikipedia.org/wiki/Satya_Nadella",
    "confidence": 1,
    "description": "CEO of Microsoft Corporation",
    "seenDuration": 33.2,
    "id": 2,
    "name": "Satya Nadella",
    "appearances": [
      {
        "startTime": "0:01:11.04",
        "endTime": "0:01:17.36",
        "startSeconds": 71,
        "endSeconds": 77.4
      },
      {
        "startTime": "0:01:31.83",
        "endTime": "0:01:37.1303666",
        "startSeconds": 91.8,
        "endSeconds": 97.1
      }
    ]
  }
]
```
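The key pairs in the example response can be consumed programmatically. Below is a minimal sketch in Python that parses the `namedPeople` fragment (wrapped here into a complete JSON object for parsing) and lists each person's appearance spans; the helper name is illustrative, not part of any API.

```python
import json

# The example response fragment, wrapped as a complete JSON object.
RESPONSE = """
{
  "namedPeople": [
    {
      "referenceId": "Satya_Nadella",
      "referenceUrl": "https://en.wikipedia.org/wiki/Satya_Nadella",
      "confidence": 1,
      "description": "CEO of Microsoft Corporation",
      "seenDuration": 33.2,
      "id": 2,
      "name": "Satya Nadella",
      "appearances": [
        {"startTime": "0:01:11.04", "endTime": "0:01:17.36",
         "startSeconds": 71, "endSeconds": 77.4},
        {"startTime": "0:01:31.83", "endTime": "0:01:37.1303666",
         "startSeconds": 91.8, "endSeconds": 97.1}
      ]
    }
  ]
}
"""

def list_appearances(index_json):
    """Return (name, startSeconds, endSeconds) tuples for each named person."""
    results = []
    for person in index_json.get("namedPeople", []):
        for span in person.get("appearances", []):
            results.append((person["name"],
                            span["startSeconds"],
                            span["endSeconds"]))
    return results

spans = list_appearances(json.loads(RESPONSE))
for name, start, end in spans:
    print(f"{name}: {start}s-{end}s")
# → Satya Nadella: 71s-77.4s
# → Satya Nadella: 91.8s-97.1s
```

The `startSeconds`/`endSeconds` fields are convenient for seeking in a player, while `startTime`/`endTime` give the same positions as timestamps.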
Important
Read the transparency note overview for VI features.