This morning, I was reading Lawrence’s blog titled “Does Records Management Give Content Management a Bad Name?”, which picks on one of the points in Cheryl’s article “It’s a Digital-First World: Five Trends Reshaping Records Management As You Know It”, with some very insightful comments added by Christian. I started leaving a comment under Lawrence’s blog (which I will still do, pointing back to this) but there are too many points I wanted to add to the debate and it was becoming too long…
So, here is my take:
First of all, I want to move away from the myth that RM is a single requirement. Organisations look to RM tools as the digital equivalent to a Swiss Army Knife, to address multiple requirements:
- Classification – Often, the RM repository is the only definitive Information Management taxonomy managed by the organisation. Ironically, it mostly reflects the taxonomy needed by retention management, not by the operational side of the business. Trying to design a taxonomy that serves both masters leads to the huge granularity issues that Lawrence refers to.
- Declaration – A conscious decision to determine what is a business record and what is not. This is where both the workflow integration and the auto-classification have a role to play, and where in an ideal world we should try to remove the onus of that decision from the hands of the end-user. More on that point later…
- Retention management – This is the information governance side of the house. The need to preserve the records for the duration that they must legally be retained, move them to the most cost-effective storage medium based on their business value, and actively dispose of them when there is no regulatory or legal reason to retain them any longer.
- Security & auditability – RM systems are expected to be a “safe pair of hands”. In the old world of paper records management, once you entrusted your important and valuable documents to the records department, you knew that they were safe. They would be preserved and looked after until you asked for them. Digital RM is no different: it needs to provide a safe haven for important information, guaranteeing its integrity, security, authenticity and availability, supported by a full audit trail that can withstand legal scrutiny.
Auto-categorisation, or auto-classification, relates to both the first and the second of these requirements: Classification (using linguistic, lexical and semantic analysis to identify what type of document it is, and where it should fit into the taxonomy) and Declaration (deciding if this is a business document worthy of declaration as a record). Auto-classification is not new; it’s been available both as a standalone product and integrated within email and records capture systems for several years. But its adoption has been slow, not for technological reasons, but because culturally both compliance and legal departments are reluctant to accept that a machine can be good enough to be allowed to make this type of decision. And even though numerous studies have proven that machine-based classification can be far more accurate and consistent than a room full of paralegals reading each document, it will take a while before the cultural barriers are lifted. Ironically, much of the recent resurgence and acceptance of auto-classification is coming from the legal field itself, where the “assisted review” or “predictive coding” (just a form of auto-classification to you and me) wars between eDiscovery vendors have brought the technology to the fore, with judges finally endorsing its credibility [Magistrate Judge Peck in Moore v. Publicis Groupe & MSL Group, 287 F.R.D. 182 (S.D.N.Y. 2012), approving use of predictive coding in a case involving over 3 million e-mails].
The point that Christian Walker is making in his comments however is very important: Auto-classification can help but it is not the only, or even the primary, mechanism available for Auto-Declaration. They are not the same thing. To take the records declaration process away from the end-user, requires more than understanding the type of document and its place in a hierarchical taxonomy. It needs the business context around the document, and that comes from the process. A simple example to illustrate this would be a document with a pricing quotation. Auto-classification can identify what it is, but not if it has been sent to a client or formed part of a contract negotiation. It’s that latter contextual fact that makes it a business record. Auto-Declaration from within a line-of-business application, or a process management system is easy: You already know what the document is (whether it has been received externally, or created as part of the process), you know who it relates to (client id, case, process) and you know what stage in its lifecycle it is at (draft, approved, negotiated, signed, etc.). These give enough definitive context to be able to accurately identify and declare a record, without the need to involve the users or resort to auto-classification or any other heuristic decision. That’s assuming, of course, that there is an integration between the LoB/process and the RM system, to allow that declaration to take place automatically.
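To make the pricing quotation example concrete, here is a minimal sketch of declaration driven purely by process context. Everything in it is an assumption for illustration: the `ProcessDocument` fields, the `RECORD_STAGES` set and the `should_declare` rule are hypothetical, not taken from any particular RM or process management product.

```python
from dataclasses import dataclass
from typing import Optional

# Lifecycle stages at which a document is assumed to become a business record.
# Illustrative only; a real policy would be set by the organisation.
RECORD_STAGES = {"approved", "negotiated", "signed"}


@dataclass
class ProcessDocument:
    doc_type: str              # known from the process, e.g. "pricing_quotation"
    client_id: Optional[str]   # who the document relates to, if anyone
    stage: str                 # draft, approved, negotiated, signed, ...
    sent_externally: bool      # has it left the organisation?


def should_declare(doc: ProcessDocument) -> bool:
    """Decide declaration from process context alone, with no heuristics."""
    if doc.client_id is None:
        return False  # nothing ties the document to a client or case
    return doc.sent_externally or doc.stage in RECORD_STAGES


draft = ProcessDocument("pricing_quotation", "C-1001", "draft", False)
sent = ProcessDocument("pricing_quotation", "C-1001", "draft", True)

print(should_declare(draft))  # same document type, still an internal draft
print(should_declare(sent))   # same document type, but it went to a client
```

The two documents have the same type, so a classifier alone could not tell them apart. It is the context supplied by the process (sent or not, which stage) that changes the outcome.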
The next point I want to pick up is the issue of Cloud. I think cloud is a red herring in this conversation. Cloud should be an architecture/infrastructure and procurement/licensing decision, not a functional one. Most large ECM/RM vendors can offer similar functionality hosted on- and off-premises, and offer SaaS payment terms rather than perpetual licensing. The cloud conversation around RM, however, becomes a sticky mess of its own when you start looking at guaranteeing location-specific storage (a critical issue for a lot of European data protection and privacy regulation) and when you start looking at the integration between on-premises and off-premises systems (as in the examples of auto-declaration above). I don’t believe that auto-classification is a significant factor in the cloud decision-making process.
Finally, I wanted to bring another element to this discussion. There is another RM disruptive trend that is not explicit in Cheryl’s article (but it fits under point #1) and it addresses the third RM requirement above: “In-place” Retention Management. If you extract the retention schedule management from the RM tool and architect it at a higher logical level, then retention and disposition can be orchestrated across multiple RM repositories, applications, collaboration environments and even file systems, without the need to relocate the content into a dedicated traditional RM environment. It’s early days (and probably a step too far, culturally, for most RM practitioners) but the huge volumes of currently unmanaged information are becoming a key driver for this approach. We had some interesting discussions at the IRMS conference this year (triggered partly because of IBM’s recent acquisition of StoredIQ, into their Information Lifecycle Governance portfolio) and James Lappin (@JamesLappin) covered the concept in his recent blog here: The Mechanics on Manage-In-Place Records Management Tools. Well worth a read…
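As a rough illustration of the “in-place” idea, the sketch below keeps a single retention schedule outside any repository and evaluates it against items that stay where they are. The record classes, retention periods, repository names and the `Item` structure are all invented for the example; a real tool would also need connectors, audit logging and approval steps.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List

# Central retention schedule: record class -> retention period in years.
# Classes and periods are made up for illustration.
RETENTION_YEARS: Dict[str, int] = {"invoice": 7, "hr_file": 6, "marketing": 2}


@dataclass
class Item:
    repository: str      # where the content lives, and stays
    item_id: str
    record_class: str
    trigger_date: date   # date the retention period starts from
    on_legal_hold: bool = False


def add_years(d: date, years: int) -> date:
    """Add whole years, falling back to 28 February for leap-day dates."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def due_for_disposal(items: List[Item], today: date) -> List[Item]:
    """Return items whose retention has expired, whatever repository holds them."""
    due = []
    for item in items:
        years = RETENTION_YEARS.get(item.record_class)
        if years is None or item.on_legal_hold:
            continue  # unclassified or held content is never proposed for disposal
        if today >= add_years(item.trigger_date, years):
            due.append(item)
    return due


items = [
    Item("collaboration_site", "a1", "invoice", date(2010, 1, 15)),
    Item("file_share", "b2", "marketing", date(2023, 6, 1)),
    Item("email_archive", "c3", "hr_file", date(2009, 3, 1), on_legal_hold=True),
]
due = due_for_disposal(items, today=date(2024, 1, 1))
print([(i.repository, i.item_id) for i in due])
```

The point of the design is that the schedule and the disposal decision sit above the repositories, so nothing has to be moved into a dedicated RM store before it can be governed.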
So to summarise my points: RM is a composite requirement; Auto-Categorisation is useful and is starting to become legitimate. But even though it can participate, it should not be confused with Auto-Declaration of records; “Cloud” is not a functional decision, it’s an architectural and commercial one.
Devin Krugly published a very interesting blog/article, describing “The 7 Deadly Sins of Information Governance”. I enjoyed the article, and I can’t find anything to disagree with, but I have to admit that it left me wanting… The 7 sins presented by Devin are well known and very common problems that plague most Enterprise scale projects, as he points out within the article itself. They could equally apply to HR, supply chain, claims processing or any other major IT implementation. Devin has done a great job of projecting these pitfalls to an Information Governance program.
For me, however, what is really missing from the article is a list of “sins” that are unique to Information Governance projects. So let me try and add some specific Information Governance colour to the picture… Here is my list of seven even deadlier sins:
Governance needs a government. Information governance touches the whole of the organisation. It touches every system, every employee and every process. Decisions that govern information must therefore be taken by a well-defined governance body that accurately represents the business, compliance, legal, audit and IT, at the very least. You cannot solve the Information Governance problem by throwing technology at it. Sure, technology plays a key part as an enabler, a catalyst and as an automation framework. But technology cannot determine policy, priorities, responsibility and accountability. Nor can it decide the organisation’s appetite for risk, or changes in strategic direction. For that, you need a governing body that defines and drives the implementation of governance.
Information does not mean data. I have talked about this in an earlier blog (Data Governance is not about Data). We often see Information Governance projects that focus primarily (or even exclusively) on transactional data, or data warehousing, or records management, or archiving, etc. Information Governance should be unified and consistent. There isn’t a different regulator for data, for documents, for emails or for Twitter messages. ANY information that enters, leaves or stays in the organisation should be subject to a common set of Governance policies and guidelines. The technical implementation may be different but the governance should be consistent.
It is a marathon not a sprint. You can never run an “Information Governance Project”. That would imply a defined set of deliverables and a completion point at some specific date. As long as your business changes (new products, new suppliers, new customers, new employees, new markets, new regulations, new infrastructure, etc.) your Information Governance needs will also change. Policies will need revising, responsibilities will need adjusting, information sources will need adding and processes re-evaluating. Constantly! If your Information Governance project is “finished”, frankly, so is your business.
Keep it lean and clean. Information governance is the only cure for Content Obesity. Organisations today are plagued by information ROT (information that is Redundant, Outdated or Trivial). A core outcome of any Information Governance initiative should be the regular disposal of redundant information which has to be done consistently, defensibly and with the right level of controls around it. It is a key deliverable and it requires both the tools and the commitment of the governing body.
Remember: Not who or how, but why… Information Governance projects often get tangled up in the details. Tools, formats, systems, volumes, stakeholders, stewards, regulators, litigators, etc., become the focus of the project and, more often than not, people forget the main driver: Businesses need good, clean and accessible information to operate. The primary role of Information Governance is to deliver accurate, timely and reliable information to the business, for making decisions, for creating products and for delivering services. Every other issue must come second in priority.
The ministry of foreign affairs. The same way that a country cannot be governed without due consideration to the relationship with its neighbours, Information Governance does not stop at the company’s firewall. Your organisation continuously trades information with suppliers, customers, partners, competitors and the wider community. Each of these exchanges has value and carries risks. Monitoring and managing the quality, the trustworthiness, the volume and the frequency of the information exchanged, is a core part of Information Governance and should be clearly articulated in the relevant policies and implemented in the relevant systems.
This is not a democracy, it’s a revolution. Implementing Information Governance is not an IT project, it is a business transformation project. Not only because of its scope and the potential benefit and risk that it represents, but also because of the level of commitment and engagement it requires from every part of the organisation. Ultimately, Information Governance has a role in enforcing information quality, regulatory and legal controls, and it contributes to the organisation’s accountability. The purpose of an Information Governance implementation is not to ensure that everyone is happy and has an equal voice at the table. The purpose is to ensure that the organisation does the right thing and behaves responsibly. And that may require significant cultural change and a few ruffled feathers…
If you don’t already have an Information Governance initiative in your organisation, now is the time to raise the issue to the board. If you do, then you should carefully consider if the common pitfalls presented here are addressed by your program, or if you are in danger of committing one or more of these sins.
Content Obesity: An organisational condition in which excess redundant information has accumulated to the extent that it may have an adverse effect on business efficiency, leading to depleted budgets, reduced business agility and/or increased legal and compliance risks.
First of all, let me apologise to all the people who are currently suffering from obesity, or who are supporting friends and family that do. I have no intention of making fun of obese people and I have great sympathy and respect for the pain they are going through. I lost my best friend to a heart attack. He was obese.
In a recent conversation with a colleague, about Information Lifecycle Governance and Defensible Disposal, I made a casual remark about an organisation suffering from Content Obesity. I have to admit that it was an off-the-cuff remark, but it conveyed very succinctly the picture I was trying to paint. Since then, the more I think about this analogy the more sense it makes.
People are not born obese, they become obese. And they don’t become obese overnight, it’s a slow, steady process. Unless it’s addressed early, the problem grows in very predictable stages: gaining weight, being overweight, being obese, being morbidly obese, dying. Most people, however, do not want to acknowledge the problem until it is too late. They live in denial, they make excuses, they make jokes. Until it’s often too late to reverse the process.
Organisations consume and generate content at an incredible rate: IDC’s Digital Universe study (2011) predicts an information growth factor of 50x between 2010 and 2020. Just to give that figure some context: If an average grown-up person were to grow at the same rate, they would weigh 3.5 tons by 2020! Studies we conducted with our own customers put the annual growth rate at a slightly more conservative figure of 35-40% per year, which is still significant.
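For anyone who wants to compare the two sets of figures, a few lines of arithmetic put them on the same footing. The only number below that is not quoted above is the 70 kg starting weight, which is my assumption for the body-weight comparison.

```python
growth_factor = 50   # IDC figure quoted above, 2010 to 2020
years = 10

# Compound annual growth rate implied by a 50x increase over ten years.
implied_rate = growth_factor ** (1 / years) - 1
print(f"Implied annual growth: {implied_rate:.1%}")

# Ten-year growth factors implied by the 35-40% annual range.
ten_year_factors = {rate: (1 + rate) ** years for rate in (0.35, 0.40)}
for rate, factor in ten_year_factors.items():
    print(f"{rate:.0%} per year -> {factor:.1f}x over {years} years")

# Body-weight comparison, assuming a 70 kg adult (not stated in the post).
weight_tonnes = 70 * growth_factor / 1000
print(f"{weight_tonnes:.1f} tonnes")
```

A 50x factor over a decade works out to a little under 48% a year, so the 35-40% customer figure really is the more conservative one: compounded over ten years it gives roughly 20x to 29x.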
We love our digital content these days, we can’t get enough!
We all create office files and our presentations are growing larger, our email rate is not slowing down (we have several accounts each), we communicate with our customers electronically more than ever before, we collaborate inside and outside the firewall, we engage in social media, we text, we document life with our mobile phones’ cameras and we use YouTube videos extensively for marketing and education. We collect and analyse blogs and conferences and twitter streams. We analyse historical transactional data and we create new predictive ones. And if collecting our own streams is not enough, we also collect those of our competitors so that we can analyse them too. Our electricity meter collects data, our car collects data, our traffic sensors collect data, our mobile phones collect data, our supermarkets collect data. We have an average of two game consoles per family (all of which connect to the internet), we watch high-definition TV, from every fixed or portable device that has a screen, our kids have mobile phones, and PSPs and DSs and laptops. We have our home computer, our work laptop, our BYOD tablet and our smart phones. Our average holiday yields over 500 pictures, all of which are 12 Megapixel. And the kids take another 500 with their camera… In fact we generate so much digital data, that we now have special ways of handling it with Big Machines that manage Big Data to give Big Insights. And that is all wonderful, and it all exploded in the last five years.
I’ll say it again: We love digital content.
Going back to my health analogy, you could say that we gorge on content. The problem is, we are now overweight with content, since most of that content has been accumulated without any particular thought of organisation or governance. So today, we can’t lose weight, we can’t clean it up because IT doesn’t know what it is, where it is, who owns it or if it’s of any use to anyone. And, frankly, because it’s far too much hassle and we have better things to do. It’s all digital so… “storage is cheap, we’ll just buy some more storage”: A staggering 78% of respondents to another recent study, stated that their strategy for dealing with data growth was to “buy more storage”!
Newsflash: Storage is not cheap! By the time you create your high-availability, tier-1 storage with 3 generations of backup tapes and put it in a data centre, pay for electricity and air-conditioning, and pay people to manage it, it’s no longer cheap. Even if storage prices go down by 20% per year, if your data grows at 40%, you are still about 12% worse off every year (1.4 × 0.8 = 1.12)… Simple maths!
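A quick check of that arithmetic, as a sketch: spend is volume times unit price, so the two rates multiply rather than subtract. That puts the net increase at about 12% per year, not the simple 20-point difference between the two rates, and it compounds. The starting index of 100 is arbitrary.

```python
price_change = -0.20   # unit storage price falls 20% per year
data_growth = 0.40     # stored volume grows 40% per year

# Spend scales with volume times unit price, so the two rates multiply.
yearly_factor = (1 + data_growth) * (1 + price_change)
print(f"Spend changes by {yearly_factor - 1:+.0%} per year")

spend = 100.0  # arbitrary index: year 0 = 100
for year in range(1, 6):
    spend *= yearly_factor
    print(f"Year {year}: {spend:.0f}")
```

Even with prices falling steadily, the bill after five years is roughly three quarters higher than where it started.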
Most organisations are still in denial about the problem. The usual answer to the question “How much storage do you currently have and how much does it grow each year?” is “We don’t really know, we never measured it that way”. Well, I would argue that whoever is writing the cheque to the storage vendors every year, ought to know.
Fortunately, for large multinational organisations (banks, pharmaceuticals, energy, etc.), the penny has finally dropped. Growth rates of 40%, on a storage estate of 20 Petabytes, translate to an increase of tens of millions in storage costs per year. In an economy where IT budgets are shrinking, this is not a pleasant conversation to have with your CFO. These organisations are now self-diagnosed as Content Obese, and are desperately looking for ways to curb the growth, before they become Morbidly Obese.
And, similarly to the human disease, Content Obesity has side effects. Even if you could somehow overcome (or overlook, or sweep under the carpet…) the cost implications, it creates huge health risks for the organisation.
Firstly, it creates risks for the Business. Unruly, high volumes of content clog up processes, the arteries of the business. Content that is lost in the bulk, uncategorised and not readily available to support decision making, is slowing down the flow of information across the organisation. Content that is obsolete or outdated can create confusion and lead to incorrect decisions. Unmanaged content volumes do not lend themselves to fast changing business models, marketing innovation, shared services or better customer support. And by consuming huge amounts of IT capital, they also stifle investment and innovation into new business services.
Secondly, it creates a huge Legal risk. All electronic content in the organisation, is potentially discoverable. The legal group has a duty to preserve information that is relevant to litigation. When information is abundant and not governed, the only method that the legal group has to identify and preserve it, is by notifying all people that may have access to it – custodians – asking them to protect it. This approach is inaccurate, expensive and time consuming. And when it comes to delivering that information to opposing parties or the courts, the organisation has to sift through these huge volumes of content to identify what is actually relevant, often incurring huge legal fees in the process. (Unashamed plug: If you are interested to find out more about the role of Information governance in UK civil litigation, I recommend this excellent IBM paper authored by Chris Dale, respected author of the eDisclosure Information Project)
Finally, Content Obesity creates a huge Compliance risk. Different regulations dictate that records are kept for defined periods of time. Privacy and data protection regulations dictate that certain types of content are disposed of after defined periods of time. Record Managers often have to comply with multiple (and often conflicting) regulations, from multiple jurisdictions, affecting hundreds of systems and millions of records. An ever-growing volume of unclassified content means that records cannot be correctly identified, disposition schedules cannot be executed consistently and policies remain in a binder on the shelf (or in a PDF file somewhere on the intranet). Regulatory audits become impossible, wasting valuable resources and often leading to significant fines (As the regulator put it in one of many examples: “These failings were made worse by their inability to determine the areas in which the breakdown in its record keeping systems had occurred”).
So, how much of that content do organisations actually need to keep? And who has the responsibility and the right to get rid of it?
I must confess: I am not a legal expert and my closest encounter with a courtroom is through the safety of a television screen.
I realised recently however, that inside my brain I have multiple and conflicting views of “The law”.
I grew up in Athens and, even though my grandfather was a lawyer (or maybe echoing his cynicism), I have grown up with an inherent mistrust of all things ‘legal’: Legalese language that seeks to confuse and befuddle the average mortal; vulture lawyers that procrastinate in order to maximise their hourly fees; legal cases that run for years and years because scheduled court hearings get postponed on technicalities; the list goes on…
In another compartment of my brain lives the virtuous, almost glamorous, world of TV courtroom drama with a very diverse portrayal of reality, ranging from Rumpole of the Bailey and Kavanagh QC to Ally McBeal and Law and Order. Where young and old conscientious lawyers are burning the midnight oil, over endless stacks of case law books, looking for the one nugget that will exonerate their ill-accused client and where honour, ethics and the omnipotent sage of the presiding Judge, prevail to save the day.
Many many years ago, I was involved in the delivery of early, bespoke Document Management systems to large law firms, such as Clifford Chance, Linklaters, Cameron Markby Hewitt (as it was then…), and others, which gave me yet a different perspective: One where Law firm partners are considered akin to deity, hordes of hopeful legal students and young lawyers work through endless hours of menial tasks in order to establish themselves on a career ladder, where information is king but information systems are a foe and where laborious, manual processes represent the status quo. Admittedly that experience was over ten years ago, but it was a cut-throat business then and I doubt that much has changed since.
More recently, I have been marginally involved with the world of electronic discovery and reading about legal proceedings on both sides of the Atlantic, often through the excellent commentary of Chris Dale’s insightful blog. Through this, I have seen a more earthy view of litigation, where monetary considerations, negotiations, common sense (if such a thing exists…), judgments written in plain English, project management, geopolitical variances and the general admission that nobody, not even a judge, is immune to the complexities of technological innovation, paint a picture of a legal environment that looks, well… almost business-like! Commercial reality (and the associated astronomical costs of litigation) often dictates that cases are assessed, negotiated and settled on the merits of cost and objectives, not just “fairness” and “justice”.
Which of the views in my brain is more realistic? I don’t know. I find all of them fascinating: I am intrigued, watching an industry which is thousands of years old, constantly evolving and seeking to learn new tricks, acknowledging its own shortcomings and fighting to keep up with technological innovation – just like the rest of us!