Office 365 Retention Labels – can it work?

Microsoft continues to evolve the Office 365 service to meet governance, compliance and regulatory requirements across the globe. The Microsoft compliance page is filled with lists of compliance statements and evidence organised by geographic location, regulation, policy and standard. One of the compliance areas that Microsoft is improving is Records Management, in particular the ability to automatically apply a retention policy to documents.

There are some key shortcomings in the Office 365 functionality, and they are mainly around the ability to apply a Records Management Policy to documents. This isn’t necessarily a problem with how the Office 365 Records Management functionality works. It is simply written for a different type of Records Management: document-by-document management, rather than the context-focused approach of the ISO 15489 and ISO 16175 standards.

Here is an example – in Office 365, Records Management is implemented by applying a Retention Label to a document. The Microsoft documentation on Retention Labels is very clear:

“It’s important to know that content like an email or document can have only a single retention label assigned to it at a time…”

A retention label can be anything you want: an invoice, a corporate policy, a project plan, or approval email for publishing a website. You just can’t have more than one label. A project plan to build a new building can’t have two labels: one for “a project plan” and another for “asset management”. With only one label allowed, this means you can only have one retention policy applied to the document as well.

Applying retention labels is performed either manually or automatically. The manual application of a Retention Label follows the pattern of requiring the end-user to pick the “correct” label. How many labels do we show to our end user? How many is too many? Who checks that the label is correct? Can we check every label that is applied to make sure it is appropriate?

Office 365 also allows an Administrator to configure automatic labels to be applied to documents and email, and again this can cause problems when assigning the correct label. Our Records Authorities and Disposal Authorities often have hundreds of different schedules to apply. Automating the application of these labels is time-consuming and can result in a large and often complex “rules engine” to maintain the Labels. The Microsoft Machine Learning tools still require up to 10,000 curated documents per rule to train the automation, which can make any project to establish automation (and the ongoing cost to add new rules) quite expensive.

At this stage, the model of applying a single label falls short of the requirements of the international standard: to manage at the aggregation level, and to sentence for the highest-retention rule that applies to the content. For example, a contract looks like Routine Procurement, 7 Years. However, if you read the detail of that contract, you may find a section on escalations to the Minister in the event of a breach. That would make the document Routine Government Relations, 10 Years. And what about freezes, holds, and other obligations from Acts and Regulations?
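
To make the “highest-retention rule” idea concrete, here is a minimal sketch in Python. It is not how Office 365 works; the RetentionClass type and the sentence function are hypothetical names invented for illustration, and the two classes are simply the ones from the contract example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetentionClass:
    """A hypothetical retention class: a name plus a retention period in years."""
    name: str
    years: int


def sentence(applicable):
    """Return the class with the longest retention among all classes that apply."""
    if not applicable:
        raise ValueError("no retention class applies; content needs manual review")
    return max(applicable, key=lambda c: c.years)


# The contract from the example: it matches two classes, not one.
contract_classes = [
    RetentionClass("Routine Procurement", 7),
    RetentionClass("Routine Government Relations", 10),
]

governing = sentence(contract_classes)
print(governing.name, governing.years)
```

The point of the sketch is that the input is a list: sentencing needs to see every class that applies before it can pick the governing one, which a single label slot cannot record.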

Users, and rules engines, are not well-versed enough in all the rules that could apply, and don’t have the ability to read and interpret thousands of words of content for each document in order to map them. A single label will never be enough for accurate sentencing, and without true automation, it’s not possible to apply multiple labels to comprehensively address the content.

Additionally, this is only a point solution. It addresses documents and emails within Office 365. But under the standards, and for good governance overall, we need to manage all parts of a record in context. We can’t just manage the documents about the project that are in SharePoint; we also need to relate them to the project invoices in the finance system, the project communications on social media, and the project specialist outputs in line-of-business systems. Remember, you cannot dispose of a document by itself: the whole aggregation (the ‘meaningful record’) has to be disposed of as a unit. Knowing the retention policy for a single document does not tell you how long it needs to be kept; you can only know how long to keep each item if you know how long to keep every item. When we sentence content, it has to be across systems and formats, so that the whole record is included.
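
A rough sketch of aggregation-level disposal, again in Python, might look like the following. The Item type, the systems listed, and every year and retention period are made-up assumptions for illustration only; nothing here comes from a real Records Authority or from an Office 365 API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One part of a record, held in some system (all values are hypothetical)."""
    system: str
    description: str
    closed_year: int
    retention_years: int

    @property
    def earliest_disposal_year(self):
        return self.closed_year + self.retention_years


def aggregation_disposal_year(items):
    """The aggregation can only be disposed of once every item in it is eligible."""
    if not items:
        raise ValueError("an aggregation needs at least one item")
    return max(item.earliest_disposal_year for item in items)


project_record = [
    Item("SharePoint", "project plan", 2020, 7),
    Item("Finance system", "project invoices", 2021, 7),
    Item("Line-of-business system", "specialist outputs", 2019, 10),
]

print(project_record[0].earliest_disposal_year)   # the SharePoint item on its own
print(aggregation_disposal_year(project_record))  # the whole record as a unit
```

Looked at alone, the SharePoint item appears eligible for disposal two years before the record as a whole, which is exactly the gap a per-document label cannot see.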

There are three other main issues:

1 – Documents change, and so do rules. Sentencing in particular is supposed to be reassessed each time the document changes. Who will track those changes, determine whether a new label is required, and apply it?

2 – Sentencing is based on function and activity pairs. In the example above, you may have a project plan for a building. This will have a different retention to a project plan for a runway, for example. So, you need a rule for every possible combination. AFDA v2 has been streamlined to cover administrative records, but there are still 86 function/activity pairs, and 257 subclasses within those classes. That’s just for one Records Authority. By the time you add your core business RAs, and other GRAs, there are likely to be thousands of rules, which could mean thousands of labels. Asking users to choose their class from the full BCS never worked in EDRMS and won’t work now.

3 – Sentencing is only one type of obligation for a record. Continuing the runway example, under the Airports Act, if you have incorrect information about a runway on your network, you can go to jail for 6 months. So, presumably, you would want to tag all runway information to keep close track of it! But do you use your one and only compliance label for this regulatory rule, or for the sentencing rule? How do you choose? There are many types of information compliance obligations that pertain to how you retain, not just how long you retain. Can a single retention label manage this nuance?
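
The third issue can be sketched as a data-modelling problem. In the Python below, the Obligation type, the collapse_to_single_label function and the 10-year figure are all hypothetical; the tie-break rule (keep the longest retention) is just one possible assumption, chosen to show what any such rule leaves behind.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Obligation:
    """A hypothetical compliance obligation attached to a piece of content."""
    kind: str                        # e.g. "sentencing" or "regulatory"
    source: str
    min_years: Optional[int] = None  # None when it governs handling, not duration


def collapse_to_single_label(obligations):
    """Keep one obligation, as a one-label model forces; return (kept, dropped)."""
    if not obligations:
        raise ValueError("nothing to label")
    kept = max(obligations, key=lambda o: o.min_years or 0)
    dropped = [o for o in obligations if o is not kept]
    return kept, dropped


runway_plan = [
    Obligation("sentencing", "Records Authority class (hypothetical)", 10),
    Obligation("regulatory", "Airports Act accuracy requirement"),
]

kept, dropped = collapse_to_single_label(runway_plan)
print("label records:", kept.kind)
print("label cannot record:", [o.kind for o in dropped])
```

Whatever rule is used to choose, something ends up in the dropped list, and here it is the obligation about how the information is kept rather than how long.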

So, is it promising? Yes! Microsoft has traditionally focused on an archiving model, which does not align with the international standard, so any move towards applying continuum and in-place management is good news. But there are limitations, the same as with any solution that attempts to apply ‘automation’ with manually created and curated rules engines, and/or user intervention. It will be good to see some outcomes of case studies with this new approach to see if the challenges can be overcome.