End to End Lineage not visible between Azure Databricks and Power BI in MS Purview

Question

SudhakarReddy Marepalli 0

Issue Description

We are implementing Microsoft Purview for enterprise data governance for a client.

The client’s data platform consists of:

Azure Data Factory for orchestration
Azure Databricks with Unity Catalog for data storage and transformations
Power BI for analytics and reporting

We have completed the following successfully:

Scanned Azure Databricks Unity Catalog and captured table-level and column-level lineage across Databricks layers
Scanned Power BI and can view lineage for datasets, reports, and dashboards

However, the lineage is currently disconnected:

Databricks lineage appears independently
Power BI lineage appears independently
There is no end-to-end lineage connection from Databricks source tables through transformations and into Power BI datasets, reports, and dashboards
No transformation logic is shown

Requirement

The client requires a single, end-to-end lineage view in Microsoft Purview that:

Starts from Databricks source tables
Flows through all Databricks transformation layers
Shows the transformation logic, or the details of what happens inside each transformation
Extends to downstream Power BI datasets, reports, and dashboards consuming those tables

Questions

Could you please confirm:

What is the supported method in Microsoft Purview to enable cross-platform lineage from Azure Databricks Unity Catalog tables to Power BI datasets, reports, and dashboards?
Are there any specific scanner configurations, permissions, or prerequisites required on either the Databricks or Power BI side to allow Purview to associate upstream Databricks assets with downstream Power BI artifacts?
Does Microsoft Purview rely on Azure Data Factory, direct dataset connections, or metadata correlation to establish lineage between Databricks and Power BI?
Are there any known constraints that prevent Databricks-to-Power BI lineage from appearing as a single end-to-end flow?
Are transformation-level or cross-service lineage enhancements planned or on the roadmap?
If not supported, is this considered a product limitation or a configuration gap?

Business Impact

Without connected lineage across Databricks and Power BI, the client cannot achieve full transparency for data governance, impact analysis, and audit requirements.

Manoj Kumar Boyini 9,180 Reputation points Microsoft External Staff Moderator

2026-02-13T18:10:29.5433333+00:00

Hi SudhakarReddy Marepalli,

I hope you had a chance to review the information shared earlier, and I hope this information has been helpful! If you still have questions, please let us know what is needed in the comments so the question can be answered.

Answer 1

Manoj Kumar Boyini 9,180 Microsoft External Staff Moderator

Hi SudhakarReddy Marepalli,

Based on the current capabilities of Microsoft Purview and the architecture you are using (Azure Databricks with Unity Catalog feeding Power BI), the behavior you are observing is expected. Purview is able to extract lineage inside Azure Databricks—both table-level and column-level from Unity Catalog, and it can also extract dataset, model, report, and dashboard lineage inside Power BI. However, the platform does not automatically connect these two lineage graphs. Even if Power BI is sourcing data directly from Databricks, Purview does not receive the metadata needed to infer that upstream relationship, so Databricks lineage and Power BI lineage appear as two disconnected graphs.
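
The disconnected-graph behavior described above can be pictured with a small sketch. The asset names are hypothetical and the dictionaries only stand in for scanned lineage; this is not how Purview stores lineage internally. It shows that the two scans alone never reach Power BI from a Databricks table, while a single linking edge does.

```python
from collections import deque

# Hypothetical lineage edges as they would appear after both scans.
# Keys are upstream assets, values are their direct downstream assets.
databricks_lineage = {
    "bronze.orders": ["silver.orders"],
    "silver.orders": ["gold.sales_summary"],
}
powerbi_lineage = {
    "pbi.dataset.Sales": ["pbi.report.Sales Overview"],
    "pbi.report.Sales Overview": ["pbi.dashboard.Executive"],
}


def downstream(edges, start):
    """Return every asset reachable from `start` by following edges."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Both scans together still form two separate graphs.
combined = {**databricks_lineage, **powerbi_lineage}
before = downstream(combined, "bronze.orders")

# One added edge (the missing link) joins the two graphs.
stitched = dict(combined)
stitched["gold.sales_summary"] = ["pbi.dataset.Sales"]
after = downstream(stitched, "bronze.orders")

print("pbi.dashboard.Executive" in before)  # False
print("pbi.dashboard.Executive" in after)   # True
```

Impact analysis from `bronze.orders` only reaches the dashboard once the edge between `gold.sales_summary` and the Power BI dataset exists, which is the link Purview does not infer on its own.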

Regarding configuration, your scanning setup on both sides is correct. Databricks requires the workspace and catalogs to be registered and scanned with a scanning identity that has Metadata Reader or equivalent permissions in Unity Catalog. Power BI requires tenant registration, admin consent for metadata scanning, and full dataset/report scanning. Even with all prerequisites met, Purview still cannot stitch the lineage, because it connects services only when it detects a metadata-visible data movement or transformation activity. These activities are usually produced by orchestrators like Azure Data Factory, Synapse pipelines, or supported Spark lineage connectors. A direct Power BI → Databricks connection does not expose an activity or data-movement event that Purview can correlate.

This leads to the core limitation: Purview currently cannot produce an automatic end-to-end lineage from a Databricks Unity Catalog table through transformations and into Power BI datasets, even though it successfully captures lineage inside each platform. Additionally, transformation‑level logic (SQL/Python from Databricks notebooks) is not extracted today, so Purview cannot display the internal transformation steps between layers.

If end-to-end lineage is required, there are supported ways to approximate it. Some customers introduce an orchestrator such as Azure Data Factory between Databricks and the downstream consumption layer, because ADF activities generate lineage events that Purview can use to link upstream and downstream assets. Another option is to use the Databricks → Purview lineage connector (OpenLineage), which improves dataflow visibility inside Databricks, although it still does not create automatic stitching to Power BI. For cases where a complete governance view is mandatory, Purview allows manual lineage creation through the UI or through the Apache Atlas API, enabling you to draw the missing link between Databricks assets and Power BI datasets.
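
As a rough illustration of the manual option, the sketch below builds the JSON body of an Apache Atlas `Process` entity whose input is a Databricks table and whose output is a Power BI dataset. The helper name `build_lineage_process`, the type names `databricks_table` and `powerbi_dataset`, and every qualified name are assumptions for illustration; check the actual type and qualified name of each asset in your Purview account. The endpoint URL and authentication are deliberately left out because the thread does not specify them.

```python
import json


def build_lineage_process(name, qualified_name, inputs, outputs, description=""):
    """Build an Apache Atlas `Process` entity that links inputs to outputs.

    `inputs` and `outputs` are lists of (type_name, qualified_name) tuples
    that identify assets already present in the catalog.
    """
    def ref(type_name, asset_qualified_name):
        # Atlas object reference resolved by unique attribute rather than GUID.
        return {
            "typeName": type_name,
            "uniqueAttributes": {"qualifiedName": asset_qualified_name},
        }

    return {
        "entity": {
            "typeName": "Process",
            "attributes": {
                "name": name,
                "qualifiedName": qualified_name,
                "description": description,
                "inputs": [ref(t, q) for t, q in inputs],
                "outputs": [ref(t, q) for t, q in outputs],
            },
        }
    }


# Assumed type names and placeholder qualified names: copy the real values
# from each asset's details in Purview before sending anything.
payload = build_lineage_process(
    name="gold.sales_summary to Sales dataset",
    qualified_name="manual://lineage/gold.sales_summary/to/pbi-sales",
    inputs=[("databricks_table", "<qualifiedName of the Databricks table>")],
    outputs=[("powerbi_dataset", "<qualifiedName of the Power BI dataset>")],
    description="Manual link: the Power BI dataset reads the gold table directly.",
)

print(json.dumps(payload, indent=2))
```

A body of this shape would be sent to the Atlas v2 entity endpoint of the Purview account. Because the link is maintained by hand, it has to be updated whenever the Power BI dataset changes its source.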

The scenario you described is a current product limitation rather than a configuration issue. Purview supports Databricks lineage and Power BI lineage individually, but the platform does not yet support native, automatic stitching between these systems. ADF-based orchestration or manual lineage creation are presently the only ways to form a complete end-to-end picture.

Hope this helps. Please let us know if you have any questions or concerns.

Manoj Kumar Boyini 9,180 Reputation points Microsoft External Staff Moderator

2026-02-16T20:14:15.61+00:00

Hi SudhakarReddy Marepalli,

Thank you for the follow-up question.

At present, Microsoft Purview does not display the actual SQL, Python, or PySpark transformation logic executed inside Azure Databricks notebooks. Purview captures structural lineage (input and output tables, and column-level lineage when Unity Catalog lineage is enabled), but it does not parse or render the notebook source code itself in the lineage graph.

You will be able to see how data flows from source tables to target tables across Databricks layers, but you will not see the detailed transformation logic such as joins, filters, aggregations, or business rules implemented inside notebooks. This is a current platform capability limitation rather than a configuration issue.

If additional visibility is required, there are a few supported approaches:

You can implement the Azure Databricks → Purview OpenLineage connector to enhance job-level lineage visibility. This improves traceability of notebook executions and data flow but still does not expose transformation code.

If Databricks jobs are orchestrated through Azure Data Factory or Synapse pipelines, Purview can capture pipeline-level lineage across services, which helps create more complete end-to-end visibility.

For governance and audit requirements, you can supplement lineage by adding custom lineage relationships or business metadata in Purview via the UI or REST APIs to document transformation intent and logic.

Helpful references:
https://learn.microsoft.com/en-us/samples/microsoft/purview-adb-lineage-solution-accelerator/azure-databricks-to-purview-lineage-connector/
https://learn.microsoft.com/en-us/azure/data-factory/tutorial-push-lineage-to-purview
https://learn.microsoft.com/en-us/purview/data-gov-api-create-lineage-relationships

Hope this helps. Please let us know if you have any questions or concerns.
SudhakarReddy Marepalli 0 Reputation points

2026-02-17T13:27:58.5333333+00:00

Hi Manoj,

Thank you for your detailed reply. I wanted to check one thing.

Will I be able to see the transformation logic executed within Azure Databricks notebooks? If not fully supported, are there any recommended workarounds or alternative approaches to enable visibility of these transformation details?

Thanks,

Sudhakar
Manoj Kumar Boyini 9,180 Reputation points Microsoft External Staff Moderator

2026-02-17T22:35:05.35+00:00

Hi SudhakarReddy Marepalli,

To answer your question: no, Microsoft Purview cannot show the actual transformation logic inside Azure Databricks notebooks. This means Purview does not display the SQL, PySpark, or Python code executed inside the notebook cells.

Purview only captures structural lineage, such as the input and output tables that a notebook reads from or writes to.

If you need additional visibility into transformations, there are supported alternatives:

• Databricks → Purview OpenLineage connector: captures job runs, inputs, outputs, and execution metadata, but not the internal code.

• Orchestrate notebooks using ADF or Synapse: Purview will show pipeline‑level lineage across services, even though the code itself is not shown.

• Document transformation logic within Purview: use business metadata, custom lineage, or glossary terms to record notebook logic for governance and audit needs.

Hope this helps. Please let us know if you have any questions or concerns.
Manoj Kumar Boyini 9,180 Reputation points Microsoft External Staff Moderator

2026-02-18T17:54:47.7233333+00:00

Hi SudhakarReddy Marepalli,

If this solution helped resolve your issue, please consider clicking ‘Accept Answer’ or giving it an upvote to help others find it easily.
SudhakarReddy Marepalli 0 Reputation points

2026-02-18T18:29:48.9366667+00:00

Hi Manoj,

Could you please also tell me whether any enhancements or improvements for this issue are planned in future releases?

Thanks,

Sudhakar
Manoj Kumar Boyini 9,180 Reputation points Microsoft External Staff Moderator

2026-02-18T20:18:13.8166667+00:00
Hi SudhakarReddy Marepalli,

At this time, Microsoft has not announced any roadmap or upcoming release that will enable:

Automatic Databricks → Power BI end‑to‑end lineage stitching, or

Extraction of transformation logic from Databricks notebooks (SQL/Python/PySpark)

Both of these are known platform limitations, and there is no public ETA for when these capabilities may be added.

Microsoft is continuing to improve cross‑service lineage and OpenLineage integration, but nothing has been committed or published regarding:

Power BI having visibility into Databricks upstream lineage

Purview parsing notebook code

Full transformation‑level lineage coming from Databricks workloads

For now, the supported approaches remain:

ADF/Synapse orchestration for stitched lineage

Databricks OpenLineage connector for enhanced internal Databricks lineage

Manual lineage or custom metadata for documenting transformation logic

If Microsoft publishes an update in the future, it will appear in the official Purview release notes.

Hope this helps. Please let us know if you have any questions or concerns.

If the response helped, do "Accept Answer". All community members with similar questions will benefit by doing so. Your contribution is highly appreciated.
SudhakarReddy Marepalli 0 Reputation points

2026-02-18T21:37:27.67+00:00
Hi Vijay,

Thank you for sharing more details and for the prompt reply. Based on the above answer, here are my follow-up questions:

If I enable the OpenLineage connector, what enhanced Databricks lineage will be captured from Databricks notebooks?

How can I create manual or custom lineage to document transformation logic within Databricks notebooks?
Manoj Kumar Boyini 9,180 Reputation points Microsoft External Staff Moderator

2026-02-19T20:17:47.6566667+00:00

Hi SudhakarReddy Marepalli,

Based on the current Microsoft Purview and Databricks capabilities,

• Purview cannot extract Databricks notebook transformation logic; this is a current product limitation.
• OpenLineage improves dataflow lineage (inputs/outputs, process metadata, column lineage) but not notebook code.
• Transformation logic can only be represented today through manual lineage, custom lineage APIs, or metadata documentation.
• There are no announced roadmap items that change this behavior at this time.
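
To make the second point concrete, here is a simplified sketch of the kind of OpenLineage run event that carries this information. It follows the general shape of an OpenLineage event (event type, run, job, inputs, outputs, and a column lineage facet on the output), but it is trimmed for readability: the schema URL and the bookkeeping fields that real facets carry are omitted, and the namespace, job, table, and column names are all hypothetical. It is not actual connector output.

```python
import json
import uuid
from datetime import datetime, timezone


def dataset(namespace, name, facets=None):
    """Minimal OpenLineage dataset reference, optionally with facets."""
    entry = {"namespace": namespace, "name": name}
    if facets:
        entry["facets"] = facets
    return entry


def run_event(event_type, job_namespace, job_name, inputs, outputs):
    """Build a simplified OpenLineage run event for one job run."""
    return {
        "eventType": event_type,  # e.g. START, COMPLETE, FAIL
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": "https://example.com/hypothetical-producer",
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": inputs,
        "outputs": outputs,
    }


SOURCE_NS = "databricks://hypothetical-workspace"

# Column lineage: which input column feeds which output column.
column_lineage = {
    "fields": {
        "total_sales": {
            "inputFields": [
                {"namespace": SOURCE_NS, "name": "main.silver.orders", "field": "amount"}
            ]
        }
    }
}

event = run_event(
    "COMPLETE",
    job_namespace="hypothetical-workspace",
    job_name="build_sales_summary_notebook",
    inputs=[dataset(SOURCE_NS, "main.silver.orders")],
    outputs=[dataset(SOURCE_NS, "main.gold.sales_summary",
                     facets={"columnLineage": column_lineage})],
)

print(json.dumps(event, indent=2))
```

The event records which job ran, what it read, what it wrote, and how columns map, but nothing in this sketch holds the notebook's SQL or PySpark source.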

Thank you again for the collaboration.
SudhakarReddy Marepalli 0 Reputation points

2026-02-19T20:29:45.9933333+00:00

"Transformation logic can only be represented today through manual lineage, custom lineage APIs, or metadata documentation." Can you explain this? What does it mean?
SudhakarReddy Marepalli 0 Reputation points

2026-02-23T20:22:48.61+00:00

Requesting your update on this. We have been discussing it for the past week, so I would appreciate prompt replies to close the thread.
Manoj Kumar Boyini 9,180 Reputation points Microsoft External Staff Moderator

2026-02-23T20:31:32.9933333+00:00

Hi SudhakarReddy Marepalli,

Thank you for the follow-up question.

At present, Microsoft Purview does not display the actual SQL, Python, or PySpark transformation logic executed inside Azure Databricks notebooks. Purview captures structural lineage (input and output tables, and column-level lineage when Unity Catalog lineage is enabled), but it does not parse or render the notebook source code itself in the lineage graph.

You will be able to see how data flows from source tables to target tables across Databricks layers, but you will not see the detailed transformation logic such as joins, filters, aggregations, or business rules implemented inside notebooks. This is a current platform capability limitation rather than a configuration issue.

If additional visibility is required, there are a few supported approaches:

You can implement the Azure Databricks → Purview OpenLineage connector to enhance job-level lineage visibility. This improves traceability of notebook executions and data flow but still does not expose transformation code.

If Databricks jobs are orchestrated through Azure Data Factory or Synapse pipelines, Purview can capture pipeline-level lineage across services, which helps provide more complete end-to-end visibility.

For governance and audit requirements, you can supplement lineage by adding custom lineage relationships or business metadata in Purview via the UI or REST APIs to document transformation intent and logic.

Helpful references:
https://learn.microsoft.com/en-us/purview/data-gov-api-create-lineage-relationships
https://learn.microsoft.com/en-us/azure/data-factory/tutorial-push-lineage-to-purview
https://learn.microsoft.com/en-us/samples/microsoft/purview-adb-lineage-solution-accelerator/azure-databricks-to-purview-lineage-connector/

Hope this helps. Please let us know if you have any questions or concerns.
SudhakarReddy Marepalli 0 Reputation points

2026-02-23T21:42:31.2666667+00:00
Hi Manoj,

I am still not clear on these two points. Could you please explain them in detail?

"You can supplement lineage by adding custom lineage relationships or business metadata in Purview via the UI or REST APIs to document transformation intent and logic."

"Transformation logic can only be represented today through manual lineage, custom lineage APIs, or metadata documentation."

Thanks,

Sudhakar
Smaran Thoomu 33,920 Reputation points Microsoft External Staff Moderator

2026-02-25T04:49:44.87+00:00
Hi SudhakarReddy Marepalli

Thanks for your patience. Let me explain this more clearly.

When we say “manual lineage or metadata documentation”, it means the following:

Today, Microsoft Purview cannot automatically read or display the SQL/PySpark code inside Azure Databricks notebooks. So it cannot show joins, filters, aggregations, or business rules in the lineage view.

Because of that, if you want transformation details visible in Purview, they must be documented manually.

There are two ways to do this:

Manual lineage relationships (UI or API)

You can manually create lineage links between:

• Source table → Target table
• Databricks table → Power BI dataset

This is done either:

• In the Purview UI (by editing lineage), or
• Using the Purview REST / Apache Atlas APIs

This does not expose notebook code, but it allows you to visually represent the transformation step in the lineage graph.

Business metadata documentation

You can add custom attributes or descriptions to assets in Purview, such as:

• “This table is generated by joining Customer and Orders tables”
• “Aggregation logic: SUM(Sales) grouped by Region”
• A reference to the notebook name or job ID

This way, governance users can see transformation intent even though Purview does not parse the code.
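
The business metadata option could look roughly like the sketch below. It only builds the request body, in the shape the Apache Atlas v2 "add or update business metadata" operation expects as far as I know: a map from a business metadata name to its attribute values. The definition name `TransformationDoc`, its attribute names, the helper `transformation_notes`, and the notebook path and job ID are all hypothetical; the definition would have to exist in the account before values can be attached to an asset.

```python
import json


def transformation_notes(summary, logic, notebook, job_id):
    """Body for attaching transformation notes to one catalog asset.

    Shape: {<business metadata name>: {<attribute name>: <value>}}.
    """
    return {
        "TransformationDoc": {  # hypothetical business metadata definition
            "summary": summary,
            "logic": logic,
            "notebook": notebook,
            "jobId": job_id,
        }
    }


body = transformation_notes(
    summary="This table is generated by joining Customer and Orders tables",
    logic="SUM(Sales) grouped by Region",
    notebook="/Repos/hypothetical/build_sales_summary",  # placeholder path
    job_id="hypothetical-job-id",
)

print(json.dumps(body, indent=2))
```

Keeping the notebook path and job ID next to the description gives auditors a pointer to the actual code, which stays in Databricks.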

This is a product limitation today, not a configuration issue.

Hope this clarifies it better.
SudhakarReddy Marepalli 0 Reputation points

2026-02-25T13:54:54.2133333+00:00

@Manoj Kumar Boyini Can you confirm the above points please.
Smaran Thoomu 33,920 Reputation points Microsoft External Staff Moderator

2026-02-26T10:41:32.91+00:00

SudhakarReddy Marepalli I have shared the clarification regarding manual lineage and metadata documentation based on the current Purview capabilities. Please let me know if that answers your question, or if there is any specific part you would like me to explain further.
SudhakarReddy Marepalli 0 Reputation points

2026-02-26T19:53:17.3466667+00:00

@Smaran Thoomu Thank you for the update
Smaran Thoomu 33,920 Reputation points Microsoft External Staff Moderator

2026-03-02T07:55:50.08+00:00

SudhakarReddy Marepalli Following up to see if the above answer was helpful. If it answers your query, please click Accept Answer and select Yes for "Was this answer helpful". If you have any further questions, do let us know.

Answer 2

SudhakarReddy Marepalli 0

Hi Manoj,

Thank you for your detailed reply. I wanted to check one thing.

Will I be able to see the transformation logic executed within Azure Databricks notebooks? If not fully supported, are there any recommended workarounds or alternative approaches to enable visibility of these transformation details?

Thanks,

Sudhakar

Answer 3

To enable cross-platform lineage from Azure Databricks Unity Catalog to Power BI in Microsoft Purview, consider the following points based on the current capabilities:

Supported Method for Cross-Platform Lineage: Microsoft Purview captures lineage for assets scanned through it. For end-to-end lineage to be visible, ensure that both Azure Databricks and Power BI assets are included in the scans. This means that lineage is only shown for objects that have been scanned and are connected through the data flow.
Scanner Configurations and Permissions: Ensure that all relevant Databricks workspaces and Power BI datasets are scanned by Microsoft Purview. You may need to check that the necessary permissions are granted for Microsoft Purview to access and scan the assets in both Databricks and Power BI. Specific privileges may be required to link external metadata and lineage relationships.
Lineage Establishment: Microsoft Purview does not inherently rely on Azure Data Factory for lineage between Databricks and Power BI. Instead, it captures lineage based on the assets scanned and their relationships. If Azure Data Factory is used for orchestration, it can facilitate lineage if it is also scanned and integrated into the lineage view.
Known Constraints: There are limitations that may prevent Databricks-to-Power BI lineage from appearing as a single flow. For instance, if not all objects involved in the data flow are scanned, the lineage will be incomplete. Additionally, lineage may not be preserved for renamed objects or if certain transformation logic is not captured in the lineage system tables.
Transformation-Level Enhancements: As of now, there is no specific information available regarding planned enhancements for transformation-level lineage or cross-service lineage in Microsoft Purview. It is advisable to keep an eye on updates from Microsoft regarding future capabilities.
Product Limitation vs. Configuration Gap: If the desired lineage is not supported, it may be considered a product limitation rather than a configuration gap, as the current capabilities of Microsoft Purview dictate how lineage is captured and displayed.

In summary, achieving a connected lineage view requires careful configuration of scans and permissions across both Azure Databricks and Power BI, along with an understanding of the limitations in the current lineage capabilities of Microsoft Purview.
