Introduction to Azure Purview – Data Governance

The word “purview” is loosely defined as range of vision, scope, operations, and/or insight. In the vast cloud technology landscape in which most organizations operate today, data governance is becoming overwhelmingly challenging to maintain. Data sprawl and multi-tenancy make these challenges even more difficult. It is nearly impossible to know your data with absolute certainty, let alone to protect it.

Azure Purview provides a data governance service that helps you control and manage data sources spread throughout your environment. The service works in conjunction with existing solutions such as Microsoft Information Protection (MIP), which scans and protects M365 data sources, and the Unified Labeling Scanner App, which scans and protects on-premises SharePoint and File Share resources.

Along with MIP and the unified labeling scanner, Azure Purview fills a major gap in data discovery, classification, and protection. Properly planned and deployed, Azure Purview can provide efficient data discovery, better analytics on what data exists, and the ability to protect that data with the familiar technology of Sensitivity Labels.

Prerequisites 

  • Access to Microsoft Azure with a development or production subscription and the following resource providers enabled: 
    • Microsoft.Purview 
    • Microsoft.Storage 
    • Microsoft.EventHub 
  • Ability to create Azure resources including Purview accounts 
    • Access to data sources that you intend to register for Azure Security Center integration 
  • Access to Azure Security Center or ability to collaborate with Security Center Admin for data labeling 
  • For Sensitivity Label integration: 
    • Microsoft Entra ID (formerly Azure AD) Tenant 
    • M365 E5 licenses 
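The resource provider prerequisite above can be checked with a small script. The following is a minimal sketch, assuming you have already collected each provider's registration state into a dictionary by whatever tooling you use; the function name and the input dictionary are hypothetical, and only the three provider namespaces come from the list above.

```python
# Resource providers named in the prerequisites list.
REQUIRED_PROVIDERS = ("Microsoft.Purview", "Microsoft.Storage", "Microsoft.EventHub")


def missing_providers(registration_states):
    """Return the required providers that are not in the 'Registered' state.

    registration_states is a hypothetical input: a dict mapping a provider
    namespace to its registration state string for one subscription.
    """
    # Compare namespaces case-insensitively.
    normalized = {ns.lower(): state for ns, state in registration_states.items()}
    return [
        ns
        for ns in REQUIRED_PROVIDERS
        if normalized.get(ns.lower(), "NotRegistered") != "Registered"
    ]


if __name__ == "__main__":
    # Illustrative states only, not output from a real subscription.
    example = {
        "Microsoft.Purview": "Registered",
        "Microsoft.Storage": "Registered",
        "Microsoft.EventHub": "NotRegistered",
    }
    print(missing_providers(example))
```

Any namespace returned by the function would need to be registered on the subscription before you create the Purview account.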

Pricing

Azure Purview uses capacity units to determine its size. Currently, you can specify either 4 or 16 capacity units. One capacity unit supports a throughput of 1 API call per second.
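To make the sizing arithmetic concrete, here is a minimal sketch that turns a capacity unit count into a throughput ceiling. It assumes throughput scales linearly with capacity units, which the text implies but does not state outright; the function names are hypothetical.

```python
ALLOWED_CAPACITY_UNITS = (4, 16)   # sizes named in the text
API_CALLS_PER_SECOND_PER_UNIT = 1  # from the text: 1 capacity unit, 1 API call/sec


def max_api_throughput(capacity_units):
    """Return the API calls per second for a given platform size.

    Assumes linear scaling with the number of capacity units.
    """
    if capacity_units not in ALLOWED_CAPACITY_UNITS:
        raise ValueError(f"capacity_units must be one of {ALLOWED_CAPACITY_UNITS}")
    return capacity_units * API_CALLS_PER_SECOND_PER_UNIT


def seconds_to_serve(api_calls, capacity_units):
    """Lower bound on the time needed to serve api_calls at full throughput."""
    return api_calls / max_api_throughput(capacity_units)


if __name__ == "__main__":
    for size in ALLOWED_CAPACITY_UNITS:
        print(size, "capacity units:", max_api_throughput(size), "API calls/sec")
```

Under that assumption, a workload of 3,200 API calls would need at least 800 seconds on 4 capacity units and at least 200 seconds on 16.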

While Azure Purview is in preview, several items are provided at no cost, and estimated pricing has not been advertised. However, the following categories will incur charges once the service is generally available:

Azure Purview Data Map

This is the foundation of Azure Purview and the core cost. 

Scanning and Classification

Charges are incurred only for the duration of the scans. For example, if a scan runs for 1 hour a month, you are charged only 1 unit, based on the data source.
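The duration-based charging described above can be sketched as follows. This is an illustration only: it assumes one billable unit per scan hour with no rounding or minimum charge, and the per-unit rate is a hypothetical placeholder because pricing had not been published.

```python
def billable_scan_units(scan_durations_hours):
    """Sum scan durations (in hours) into billable units.

    Assumption: one unit per scan hour, with no rounding or minimum charge.
    """
    if any(hours < 0 for hours in scan_durations_hours):
        raise ValueError("scan durations cannot be negative")
    return sum(scan_durations_hours)


def estimated_scan_cost(scan_durations_hours, rate_per_unit):
    """Multiply billable units by a rate.

    rate_per_unit is a hypothetical placeholder that would vary by data source.
    """
    return billable_scan_units(scan_durations_hours) * rate_per_unit


if __name__ == "__main__":
    # One scan of one hour in a month, as in the example in the text.
    print(billable_scan_units([1]))
```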

Azure Purview Data Catalog

There are 3 tiers of the data catalog, each providing specific insights to your environment.

Supported Sources

The following data sources can be added to Azure Purview:  

  • Azure Blob storage 
  • Azure Cosmos DB 
  • Azure SQL Database 
  • Azure SQL Managed Instance 
  • Azure Data Explorer 
  • Azure Data Lake Storage Gen1 and Gen2 
  • Azure Synapse Analytics (SQL DW) 
  • Power BI 
  • SQL Server on-premises 

The following file types are supported for scanning and, where applicable, for schema extraction and classification:

Structured file formats supported by extension:

AVRO, ORC, PARQUET, CSV, JSON, PSV, SSV, TSV, TXT, XML 

Document file formats supported by extension:  

DOC, DOCM, DOCX, DOT, ODP, ODS, ODT, PDF, POT, PPS, PPSX, PPT, PPTM, PPTX, XLC, XLS, XLSB, XLSM, XLSX, XLT 
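Before registering a source, it can help to check which of your files fall into the two extension lists above. The sketch below is a hypothetical helper, not part of Azure Purview; it only encodes the extensions listed here and matches on the final extension of each file name.

```python
import os

# Extension lists copied from the text above.
STRUCTURED_EXTENSIONS = frozenset(
    {"AVRO", "ORC", "PARQUET", "CSV", "JSON", "PSV", "SSV", "TSV", "TXT", "XML"}
)
DOCUMENT_EXTENSIONS = frozenset(
    {"DOC", "DOCM", "DOCX", "DOT", "ODP", "ODS", "ODT", "PDF", "POT", "PPS",
     "PPSX", "PPT", "PPTM", "PPTX", "XLC", "XLS", "XLSB", "XLSM", "XLSX", "XLT"}
)


def scan_category(filename):
    """Classify a file name as 'structured', 'document', or 'unsupported'."""
    # splitext returns the last extension including the dot, or '' if none.
    extension = os.path.splitext(filename)[1].lstrip(".").upper()
    if extension in STRUCTURED_EXTENSIONS:
        return "structured"
    if extension in DOCUMENT_EXTENSIONS:
        return "document"
    return "unsupported"


if __name__ == "__main__":
    for name in ("sales.parquet", "Report.DOCX", "photo.png"):
        print(name, "->", scan_category(name))
```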

Purview Account

An Azure Purview account is required to get started. You can create one in the Azure Portal or with PowerShell. For the Basic configuration, choose the subscription, resource group, and location, and provide a name. Next, choose the platform size and any additional options. When finished, click Create to complete the creation of your Purview account.
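The Basic settings above (subscription, resource group, name) are also what identify the account once it exists. As a minimal sketch, the helper below assembles the account's resource ID, assuming the general Azure Resource Manager ID pattern and the Microsoft.Purview provider namespace from the prerequisites; the function name and all sample values are hypothetical.

```python
def purview_account_resource_id(subscription_id, resource_group, account_name):
    """Build the resource ID for a Purview account.

    Assumes the general ARM pattern:
    /subscriptions/<id>/resourceGroups/<rg>/providers/<namespace>/<type>/<name>
    """
    parts = (
        ("subscription_id", subscription_id),
        ("resource_group", resource_group),
        ("account_name", account_name),
    )
    for label, value in parts:
        # Reject empty values and values that would break the path structure.
        if not value or "/" in value:
            raise ValueError(f"{label} must be a non-empty string without '/'")
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Purview/accounts/{account_name}"
    )


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(
        purview_account_resource_id(
            "00000000-0000-0000-0000-000000000000", "rg-governance", "contoso-purview"
        )
    )
```

An ID in this form is useful when scripting role assignments or referencing the account from other tooling.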

The following new roles exist in Azure to delegate permissions to the Azure Purview account:

Purview Studio

Purview Studio is the central management site to view and manage most aspects of Azure Purview.  Purview Studio can be accessed directly at https://web.purview.azure.com//  

There are currently 5 areas within Purview Studio.

Sensitivity Labels

Azure Purview integrates with Microsoft Information Protection Sensitivity Labels. This integration requires a Microsoft 365 E5 license, and Information Protection labels must already be created in order to perform auto-labeling. The following shows the supported sources for labeling.

To configure Microsoft Information Protection, you first need to opt in or turn on the ability to extend labeling to Azure Purview.  This is a one-time task. 

Once enabled, you can then create a new label, or modify an existing one, and scope it to Azure Purview assets as well as configure Auto-labeling conditions. 

Conclusion

Azure Purview can help unify a variety of data sources, both within Azure and outside it. The range of potential data source types is broad, and we anticipate that Azure Purview will support many additional data sources over time. Combined with MIP and the Unified Labeling Scanner App, it lets you start to say with confidence that you know your data and can accurately classify and securely protect each data source in your environment.


Last updated on August 3rd, 2023 at 01:32 pm