Data governance and security baselines with Microsoft Purview

With the foundational strategy and architecture in place, establish strong governance and security practices from the very beginning. This approach makes sure that as data from all corners of the organization flows into your unified platform, it remains well managed, compliant, and secure. Recommendation: Set Microsoft Purview as the system of record for data governance and security so your organization can apply consistent policy, accountability, and compliance across the full data estate (see Figure 1). To apply this recommendation, use this article as a checklist:

Figure 1. Microsoft Purview's role in data governance and security baselines.

1. Data visibility baseline

Data governance starts with a clear view of all your data assets across the entire estate. Recommendation: Establish a consistent visibility baseline with Microsoft Purview before data enters OneLake. This approach helps governance decisions remain consistent across platforms and teams. To apply this recommendation, use the following checklist:

1.1. Create a data catalog

A centralized data catalog provides decision makers with a single system of record for understanding what data exists, where it resides, and who owns it. Recommendation: Use Microsoft Purview's Unified Catalog as the authoritative inventory for all enterprise data assets. To apply this recommendation, use the following checklist:

Use governance domains. Organize the Unified Catalog by using Purview governance domains that align to existing data domains. Collections, roles, and permissions should reflect these boundaries, so each data domain has authority in its area. The Unified Catalog begins empty, but as data sources are registered and scanned, it grows into an accurate map of the estate. Setting domains early brings clarity as the catalog expands.
Define a data glossary. With structure in place, create a shared business glossary that anchors the organization's most important concepts and metrics. Terms such as Customer, Product, Employee, and Location, and measures like Revenue or Headcount, often vary subtly across departments unless intentionally harmonized. Document these definitions as Glossary Terms in the Unified Catalog and communicate them broadly to help eliminate ambiguity. This clarity supports consistent AI, analytics, and reporting practices across teams.

1.2. Map data sources

The Purview Data Map provides visibility into data assets without copying data or replacing source‑level security controls. Recommendation: Register and scan all relevant data systems to populate the Data Map. To apply this recommendation, use the following checklist:

Create Purview architecture. Use Purview collections and Data Map domains to align permissions and governance with the needs of your data domains. These constructs define boundaries for access control, policy management, and operational responsibility. Follow Purview domain and collection architecture best practices.
Set up Purview for Microsoft 365 data. Purview has native integration with Microsoft 365 data (such as SharePoint, OneDrive, Exchange, Teams). Make sure that you also govern content in Microsoft 365. When you bring Microsoft 365 documents and messages into OneLake or otherwise use them in analytics, any labels or classifications from Microsoft 365 carry over. In Purview, you can see sensitivity labels and retention labels that you applied in the Microsoft 365 environment. To learn more, see Microsoft Purview setup guides.
Scan Microsoft Fabric OneLake. Fabric OneLake isn't automatically included in the Purview Data Map. You must explicitly register and scan it. Scanning OneLake enables metadata discovery, lineage tracking, and cataloging of Fabric assets in Purview. To learn more, see Register and scan Microsoft Fabric.

1.3. Scan cloud, SaaS, and on-premises data

Connector-based scanning is required for data stored in Azure services, Microsoft Dataverse, on-premises systems, and other clouds (AWS, Google Cloud, Oracle). You need to register and scan supported data sources in Data Map. Recommendation: Follow Purview's scanning best practices. Choose whether to scan source systems or only scan the Microsoft Fabric OneLake layer. This decision depends on visibility needs, compliance requirements, operational overhead, and the role each system plays in analytics and reporting. To apply this recommendation, review the following options:

Option 1: Scan source systems. Scanning operational systems such as Azure databases or AWS S3 provides end-to-end lineage from the system of record. This information is important for regulated or business-critical data where full provenance supports audits and compliance. It provides the clearest visibility into upstream changes but introduces complexity. Connector configuration, credentials, and scheduling require attention, and some sources require you to choose the right integration runtime.

Option 2: Scan only the Fabric layer. If deep visibility into upstream systems isn't required, scanning only the Fabric layer simplifies the governance model. Lineage begins when data enters Fabric. This approach reduces integration work. It's best for data where upstream systems are well governed or where compliance obligations don't require full lineage. To learn more, see Fabric in Microsoft Purview.
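The choice between the two options can be sketched as a small decision helper. This is a minimal illustration, not a Purview feature: the `DataSource` type, its fields, and the returned strings are hypothetical names introduced here, and the rule only encodes the criteria described above (regulated or business-critical data favors scanning the source system).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    """Hypothetical description of one source system; not a Purview object."""
    name: str
    regulated: bool          # audits or regulations require full provenance
    business_critical: bool  # feeds key reporting or decisions


def choose_scan_scope(source: DataSource) -> str:
    """Return 'source-system' (Option 1) or 'fabric-only' (Option 2)."""
    # Regulated or business-critical data benefits from end-to-end lineage.
    if source.regulated or source.business_critical:
        return "source-system"
    # Otherwise accept lineage that begins when data enters Fabric.
    return "fabric-only"


# Example inventory with made-up source names.
sources = [
    DataSource("finance-sql", regulated=True, business_critical=True),
    DataSource("marketing-clickstream", regulated=False, business_critical=False),
]
scopes = {s.name: choose_scan_scope(s) for s in sources}
```

In practice, operational overhead and the quality of upstream governance also weigh on the decision, so treat the output as a starting point for review rather than a final answer.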

1.4. Apply sensitivity labels

Sensitivity labels are a fundamental tool for classifying and protecting data. In Microsoft Purview, there are typically two kinds of sensitivity labels:

Metadata-only labels: These labels are metadata tags in the Purview catalog. For example, you can label an Azure SQL table as "Confidential" in the catalog without affecting the source system directly. These labels help track and manage data sensitivity in Purview for assets outside of Microsoft 365/Fabric.
Protective labels: These labels not only mark data with a classification, like "Confidential," but also enforce protection. They can enforce encryption or restricted access on files and emails. These labels are used heavily in Microsoft 365 and now extend into Fabric as well.

Both contribute to a consistent governance model, and clarity about their purpose helps align the right label type to the right scenario. Recommendation: Apply a unified sensitivity labeling strategy across Microsoft 365, Fabric, and the Purview Data Map. To apply this recommendation, use the following checklist:

Define a sensitivity label taxonomy. Define an organizational labeling schema. A common schema is Non-business, Public, General, Confidential, and Highly Confidential. Make sure everyone understands what each label means and what types of data fall under each category. Get buy-in on this taxonomy from compliance and business stakeholders. See Get started with sensitivity labels.
Label Microsoft 365 data (protective label). Make sure Microsoft 365 information is labeled by using Microsoft Purview Information Protection. These labels impose security controls (encryption, restricted access). They persist with the data when brought into OneLake or shared via other tools. Best practices: Many organizations set up auto-labeling policies to detect sensitive info, like credit card numbers or personal data, and automatically apply a label. See automatically apply sensitivity labels to Microsoft 365 data.
Label Microsoft Fabric data (protective label). Microsoft Fabric supports protective sensitivity labels on its own assets, like tables in a Lakehouse, and Power BI datasets. Configure default label policies in Fabric so that new data in OneLake is labeled appropriately from creation. For instance, you could specify that any new dataset in certain workspaces is by default labeled as internal or confidential unless changed. This approach makes sure no data enters the lake without classification. Adjust these defaults for areas that handle sensitive data. See Govern Fabric.
Label entries in Purview Data Map (metadata-only label). For data sources that are scanned into Purview (like an AWS S3 bucket or an on-premises database), apply metadata labels in the Purview Data Map. While these labels don't encrypt or restrict the data at the source, they do inform users and systems that the data is, say, confidential. They can also trigger other governance workflows. Everything in your catalog should have a sensitivity designation. Use autolabeling policies to detect data assets and apply these metadata-only sensitivity labels automatically.
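To make the taxonomy concrete, the following sketch models the example schema from this checklist as an ordered enumeration and flags catalog entries that still lack a sensitivity designation. It is an illustration only: the `Sensitivity` enum, the `unlabeled_assets` helper, and the asset names are hypothetical and don't correspond to any Purview API.

```python
from enum import IntEnum
from typing import Dict, List, Optional


class Sensitivity(IntEnum):
    """Example taxonomy from the text, ordered least to most restrictive."""
    NON_BUSINESS = 0
    PUBLIC = 1
    GENERAL = 2
    CONFIDENTIAL = 3
    HIGHLY_CONFIDENTIAL = 4


def unlabeled_assets(catalog: Dict[str, Optional[Sensitivity]]) -> List[str]:
    """Return the names of assets that have no sensitivity label yet."""
    return sorted(name for name, label in catalog.items() if label is None)


# Hypothetical catalog export: asset name -> label (None means unlabeled).
catalog = {
    "aws-s3/raw-exports": None,
    "onprem-sql/customers": Sensitivity.CONFIDENTIAL,
    "lakehouse/sales_summary": Sensitivity.GENERAL,
}
missing = unlabeled_assets(catalog)
```

Ordering the labels lets downstream checks compare them directly, for example `label >= Sensitivity.CONFIDENTIAL`, which keeps any policy logic tied to the agreed taxonomy rather than to label names scattered across scripts.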

1.5. Capture data lineage

Data lineage provides visibility into how data moves and changes across systems. Recommendation: Enable automated lineage where available and close gaps manually where required. Best practices: Use Microsoft Purview to automate lineage capture for many assets. Where automation isn't available, add lineage manually in Purview to fill gaps. See Manual lineage setup. Configure data lineage in Purview for Fabric and, as needed, for Azure Databricks.
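One way to find where manual lineage is needed is to look for assets that have no upstream link and aren't known systems of record. The sketch below assumes you have exported a list of assets and automatically captured lineage edges; the `lineage_gaps` function, its inputs, and the asset names are hypothetical and not part of Purview.

```python
from typing import Iterable, List, Set, Tuple


def lineage_gaps(
    assets: Iterable[str],
    edges: Iterable[Tuple[str, str]],
    known_origins: Set[str],
) -> List[str]:
    """Return assets with no upstream edge that aren't declared origins.

    Each edge is a (source, target) pair captured by automated lineage.
    """
    has_upstream = {target for _source, target in edges}
    return sorted(
        asset
        for asset in assets
        if asset not in has_upstream and asset not in known_origins
    )


# Hypothetical export of assets and automated lineage edges.
assets = ["crm.orders", "lakehouse.orders_clean", "report.sales"]
edges = [("crm.orders", "lakehouse.orders_clean")]
gaps = lineage_gaps(assets, edges, known_origins={"crm.orders"})
```

Here `report.sales` is reported as a gap because nothing links it to an upstream asset, which makes it a candidate for manual lineage.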

2. Data estate compliance baseline

Compliance defines the legal, regulatory, and internal obligations that apply to data across platforms. Decision makers must understand these obligations before enforcing controls. This approach makes sure governance actions remain defensible and auditable across Microsoft Fabric and the broader data estate. Recommendation: Define compliance requirements centrally and monitor alignment continuously by using Microsoft Purview. Data governance decisions should remain consistent across Microsoft 365, Azure, other clouds, and on‑premises systems. To apply this recommendation, use the following checklist:

Define your compliance requirements. Compliance requirements vary by industry, region, and workload. The governance model must reflect them before you apply any protective or technical controls. Identify which regulatory frameworks or industry standards are relevant to your organization. Best practices: Use Microsoft Purview regulatory templates and assessments. They represent standards such as GDPR, HIPAA and HITECH, PCI DSS v4.0, Sarbanes-Oxley Act, and ISO 27001. Review these templates to understand which rules apply across Microsoft 365, Azure, AWS, and Google Cloud. These templates provide a checklist of controls and practices you should have in place. Use the Compliance Manager's scores and recommendations to gauge your current posture, identify gaps, and prioritize what to address first.
Monitor data compliance. After you set up Purview's compliance templates, regularly review the compliance score and reports. Purview automatically assesses aspects of compliance for Microsoft 365 data and data across Azure, AWS, and Google Cloud. It highlights issues and suggested actions to improve compliance. Set up alerts for any critical compliance policy violations. Best practices: If Purview finds sensitive data in an unapproved location or if a retention policy is violated, notify responsible teams immediately so they can take action. Steady monitoring and incremental improvements make sure you're not caught off guard by audits or incidents.
Configure Microsoft 365 data retention. Decide how long to keep different types of data and when to delete or archive it. Ambiguity in retention can either lead to keeping data too long (risking compliance breaches or unnecessary storage costs) or deleting too soon (losing valuable history). Best practices: Use Microsoft Purview's data lifecycle management for Microsoft 365 data to set retention or deletion policies on emails, documents, and Teams messages.
Configure Azure data retention. Azure services require service‑specific retention and backup configuration. Best practices: Configure backup retention for services, such as Azure SQL, Cosmos DB, and MySQL. Use Azure Storage lifecycle management rules to archive or delete data. Reference service guidance for Azure SQL Database, Cosmos DB, and MySQL.
Configure on-premises and other clouds data retention. Data outside Microsoft platforms still requires compliant lifecycle controls. To avoid unmanaged compliance risk, apply intentional retention strategies to all non‑Azure and on‑premises data sources. Best practices: Use Azure Backup or third‑party solutions to retain on‑premises, AWS, and GCP data. Follow the guidance in Back up cloud and on-premises workloads to cloud. Where needed, manually upload data to Azure Storage and archive the blob.
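For the Azure Storage lifecycle management rules mentioned in this checklist, a policy is expressed as a JSON document. The sketch below builds one such document in Python. The rule name, prefix, and day counts are placeholder assumptions, and you should verify the structure against the current Azure Storage lifecycle management documentation before applying it.

```python
import json


def build_lifecycle_policy(prefix: str, archive_after_days: int, delete_after_days: int) -> dict:
    """Build a lifecycle policy that archives, then deletes, block blobs."""
    if not 0 <= archive_after_days < delete_after_days:
        raise ValueError("archive_after_days must be non-negative and less than delete_after_days")
    return {
        "rules": [
            {
                "enabled": True,
                "name": "retain-then-delete",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    "filters": {"blobTypes": ["blockBlob"], "prefixMatch": [prefix]},
                    "actions": {
                        "baseBlob": {
                            "tierToArchive": {"daysAfterModificationGreaterThan": archive_after_days},
                            "delete": {"daysAfterModificationGreaterThan": delete_after_days},
                        }
                    },
                },
            }
        ]
    }


# Placeholder values: archive after 90 days, delete after roughly 7 years.
policy = build_lifecycle_policy("retained-data/exports", 90, 2555)
print(json.dumps(policy, indent=2))
```

Generating the policy from code keeps retention periods in one reviewable place, so compliance stakeholders can confirm the numbers before the rule is applied to a storage account.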

3. Data estate security baseline

A consistent security baseline makes sure that sensitive data remains protected across Microsoft 365, Microsoft Fabric, Azure services, and AI workloads. Recommendation: Define and enforce a single security posture across the entire data estate so that classification, protection, and enforcement operate uniformly at scale. To apply this recommendation, use the following checklist:

Enable sensitive data classifiers. Sensitive data classifiers in Microsoft Purview identify regulated and business‑critical data so that protection controls can act automatically and consistently across services. Best practices: Enable the relevant built‑in sensitive information types that align to regulatory and business risk. Create custom classifiers for domain‑specific data, such as trade secrets or proprietary research. Use these classifiers as the foundation for labeling, data loss prevention, and audit controls. Decision guidance: Rely on default classifiers when regulatory alignment is the primary goal and the data types match standard definitions. Choose custom classifiers when business‑specific data carries risk that standard classifiers don't detect. Custom classifiers increase coverage but require governance oversight to remain accurate over time.
Apply data loss prevention (DLP). Configure DLP policies in Microsoft Purview to prevent leaks of sensitive data through everyday actions. Set up DLP for both Microsoft 365 (to cover emails, Office documents, and Teams chats) and Microsoft Fabric. A DLP policy can block a user from sharing a file externally if it contains confidential data like customer personal data or health information. In Fabric, DLP can prevent analysts from sharing a dataset or report that includes sensitive data. For detailed configuration, refer to the guidance for Microsoft 365 apps, Copilot, and Microsoft Fabric. Decision guidance:
- Monitor-only (audit mode): Start with DLP in monitor-only mode when you're concerned about disrupting work. It lets you observe and fine-tune policy behavior, but data might still be exposed because enforcement isn't active.
- Blocking or restricting: Move to blocking or restricting mode when data leakage would have severe effects and your detection rules are reliable. Some legitimate actions might be blocked initially and require exception handling and policy adjustments.
Protect data in Azure services. Microsoft Purview catalogs and labels data in Azure services but doesn't replace native security controls for those services. Best practices: Apply service‑level security controls, such as network isolation and Microsoft Defender coverage for Azure SQL Database, Azure Cosmos DB, and Azure Storage. Align these controls with Purview classifications so that monitoring and alerting reflect data sensitivity and business risk.
Protect data used by AI apps. AI applications introduce new data exposure paths that require explicit alignment with enterprise data governance and security policies. Microsoft Purview integrates with Microsoft Foundry and other AI platforms to provide this control. Best practices: Use Purview APIs and AI‑specific security features to pass sensitivity context into AI workflows so applications can apply masking or response restrictions when required. Establish review checkpoints for high‑impact AI scenarios to confirm that data access aligns with enterprise standards. To learn more, see Microsoft Purview and AI and Purview with Foundry.
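The DLP decision guidance earlier in this checklist (start in monitor-only mode, then move to blocking when leakage would be severe and detection is reliable) can be sketched as a simple rule. This is an illustration under stated assumptions: the `AuditStats` type, the `recommend_dlp_mode` function, and the 5% false-positive threshold are hypothetical and aren't Purview settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditStats:
    """Hypothetical summary of a policy's results in monitor-only mode."""
    matches: int          # policy matches observed during the audit period
    false_positives: int  # matches that reviewers judged legitimate


def recommend_dlp_mode(
    stats: AuditStats,
    leak_impact_severe: bool,
    max_false_positive_rate: float = 0.05,  # assumed threshold, tune per policy
) -> str:
    """Return 'block' or 'audit' for a DLP policy."""
    if stats.matches == 0:
        # No evidence yet about detection quality; keep observing.
        return "audit"
    false_positive_rate = stats.false_positives / stats.matches
    if leak_impact_severe and false_positive_rate <= max_false_positive_rate:
        return "block"
    return "audit"


mode = recommend_dlp_mode(AuditStats(matches=200, false_positives=4), leak_impact_severe=True)
```

With 4 false positives in 200 matches, the observed rate is 2%, so the sketch recommends blocking for a policy whose leaks would be severe. A policy with a higher rate stays in monitor-only mode until its detection rules are tuned.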

Next step

Fabric governance and security baselines

Last updated on 2026-03-10