How Data Archival Brings Stronger Data Governance

Tom Papahronis

Strategic Advisor - eGroup Enabling Technologies

The Challenge

Quite often I work with clients that are trying to untie the Gordian knot that is data retention and stale data deletion. In most cases, these organizations have adopted some form of “keep everything forever” because it is difficult to get internal legal or risk teams to define a realistic retention policy, or, if a policy exists, no one has the time and span of authority to implement it. As a result, the IT and security teams are faced with a risk they can’t realistically mitigate. Organizational hesitancy to delete data may not be something that can be overcome, so archival focused on reducing risk may be the next best option.

In this blog post, we will discuss the considerations and processes of data governance that need to be implemented for piloting Copilot for Microsoft 365. The suggestions are based on experiences from a growing number of live customer engagements. For a more comprehensive approach to rolling out Generative AI in the M365 stack, check out our Copilot for Microsoft 365 Launch Protocols.

The Risks of Inactive or Stale Data

The more data you have, the more that can be exfiltrated or misused:

  • Often, stale data has no real business value, but it does present risk. As an example, organizations may retain former employee identity, health, or financial data longer than is required or useful. In the case of a breach, the organization would be liable, including for credit monitoring costs, not only for active or recently active employees but also for employees from years ago.
  • Active data is often broadly accessible so that it is available to staff during normal business processes. When that data becomes inactive over time, the broad access often remains because the inactive data is stored alongside the active data. If an employee account is compromised, all the file data that the employee can access is at risk of exfiltration, including both active and inactive files. This risk is often overlooked. A compromised account exposes only one employee’s email data, but it can potentially expose the file data of an entire workgroup or company because employees have such broad access.
  • Better indexing and automated tools, like Enterprise Search and Copilot for Microsoft 365, can inadvertently cause internal data overexposure and misuse. Search indexes and tools like Copilot for Microsoft 365, while invaluable from an efficiency perspective, may also present stale or outdated data alongside recent, more accurate files and give it equal weight.
  • Compliance requirements that mandate retention can inadvertently put data at risk. As with the broad-access risk described above, compliance mandates require the organization to keep specific data for long periods, but this retained data is often not separated from the active files. Data that is held only for compliance reasons often sits alongside active data for years, even when it is an easy category of data to separate into a less-accessible archive location.

The Approach

Rather than pushing for the deletion of stale data, which can be a scary proposition and is often impossible to get agreement on, consider data archival as a less complete but more achievable path. Relocating inactive data into archival locations can significantly reduce user access to data they don’t need, making that data harder to exfiltrate and less likely to be mistakenly referenced by automated tools.

If you combine archival with other data governance controls like sensitivity labels, data loss prevention policies, and data access reviews, you can achieve significant risk reduction without being hamstrung by the inability to delete things. Again, this may not be the ideal solution, but it is far better than doing nothing, which is often what effectively happens otherwise. Archival can also be a stepping stone to a data deletion policy in the future.

  • The archive location should be accessible to as few users as possible so that the compromise of a typical user account would not grant access to the archive.
  • Archived files remain discoverable and can still be searchable, even if most employees do not have direct access.
  • Automation can be built to ease the process of moving archived data back to active status if needed. Power Automate applications are a great way to provide this, and approval workflows can be included so that any potential misuse or exfiltration can be detected.
  • Encourage users to get business record attachments out of email and into the appropriate file locations or systems of record. This will enable you to have a shorter email retention policy that would further reduce email data exposure in the event of a compromised account.
  • SharePoint document libraries can be configured to apply default retention labels to all files they contain so that the organization does not need to rely on employees manually labeling files or auto-labeling based on document contents. These library labels can still be used to help automate the movement of inactive data to the archive after a prescribed length of time.
  • Archival processes should complement (not replace) the roles of sensitivity labels, retention policies, and DLP as parts of your data governance strategy as a whole. Archival is just one more tool in that toolbox provided by Purview.
  • If the organization does decide to take on deletion decisions down the road, the archive is the best place to start that work, since you know the data is inactive and not in use.
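
To make the "inactive after a prescribed length of time" idea concrete, here is a minimal Python sketch of how archive candidates might be selected from a file inventory. It is an illustration under stated assumptions, not a Purview or SharePoint API: the FileRecord type, the select_for_archive function, the label names, and the two-year default threshold are all hypothetical, and the inventory is assumed to come from whatever export or report your environment already provides.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    """One row of a hypothetical file inventory export."""
    path: str
    last_modified: datetime
    retention_label: str  # hypothetical label name, e.g. a library default label


def select_for_archive(files, now, inactive_after_days=730, labels=None):
    """Return files not modified within the threshold, oldest first.

    If `labels` is given, only files carrying one of those retention
    labels are considered. The 730-day default is an assumption, not a
    recommendation; use whatever period your policy prescribes.
    """
    cutoff = now - timedelta(days=inactive_after_days)
    candidates = [
        f for f in files
        if f.last_modified < cutoff
        and (labels is None or f.retention_label in labels)
    ]
    return sorted(candidates, key=lambda f: f.last_modified)


if __name__ == "__main__":
    inventory = [
        FileRecord("hr/old-benefits.xlsx", datetime(2019, 1, 1), "HR-Records"),
        FileRecord("hr/current-policy.docx", datetime(2024, 5, 1), "HR-Records"),
        FileRecord("finance/audit-2017.pdf", datetime(2018, 1, 1), "Finance-Records"),
    ]
    for record in select_for_archive(inventory, now=datetime(2024, 6, 1)):
        print(record.path, record.last_modified.date())
```

A list like this would only be the input to the archival step. The actual move, along with any approval workflow, would still run through the tooling described above.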

Your existing E3 or E5 licensing already includes many or all of the capabilities needed to leverage archival as an additional option to improve data governance. As with many other applications of Microsoft 365 features, use a crawl-walk-run strategy to get an archival initiative started. Begin by partnering with amenable groups such as HR, Finance, or Legal, and start by identifying where the high-risk but inactive files live in their groups. You can even position this effort as a prerequisite to widespread Copilot for Microsoft 365 use, as it will help mitigate any overexposure risks that may exist.

Learn More About Data Governance

Contact our team of experts who can help ensure you’re properly protecting and managing your data!