
ECM is dead. Long live ECM…

December 2, 2013

It’s Autumn. The trees are losing their leaves, the nights are getting longer, it’s getting cold and grey and generally miserable. It’s also the time for the annual lament of the Enterprise Content Management industry and ECM… the name that refuses to die!

At least once a year, ECM industry pundits go all depressed and introspect and predict, once again, that our industry is too wide, too narrow, too complex, too simplified, too diverse or too boring and dying or not dying or dead and buried. Once again this year, Laurence Hart (aka Pie), Marko Sillanpää, Daniel Antion, John Mancini and, undoubtedly, several other esteemed colleagues, with a collective experience of several hundred years of ECM on their backs, will try (and fail) to reconcile and rationalize the semantics of one of the most diverse sectors in the software industry.

You will find many interesting points and universal truths about ECM if you follow the links to these articles above. Some I agree with wholeheartedly, some I would take with a pinch of salt.

But let me assure you, concerned reader, that the ECM industry is not going anywhere, the name will not change and we will again be lamenting its demise, next Autumn!

There is a fundamental reason why this industry is so robust and so perplexing: This is not a single industry, or even a single coherent portfolio of products. It’s a complex amalgamation of technologies that co-exist and complement each other, with the only common denominator being an affinity for managing “stuff” that does not fit in a traditional relational database. And every time one of these technologies falls out of favour, another new discipline joins the fold: Documents and emails and archives and repositories and processes and cases and records and images and retention and search and analytics and ETL and media and social and collaboration and folksonomies and cloud, and, and, and… The list, and its history, is long. The reason this whole hotchpotch will continue to be called Enterprise Content Management is that we don’t have a better collective noun that even vaguely begins to describe what these functions do for the business. And finally, more and more of the market (you know, the real people out there, not us ECM petrolheads…) are starting to recognise the term, however vague, inappropriate and irrational it may be to the purists among us.

And there is one more reason: Content Management is not a technology, it’s an operational discipline. Organisations will manage content with or without ECM products. It’s just faster, cheaper and more consistent if they use tools.

As I said, if you have an academic interest in this ECM industry, the articles above are definitely worth reading. For my part, I would like to add one more thought into that mix:

The word “Enterprise” in “ECM” has been the source of much debate. And whilst I agree with Laurence that originally some of the vendors attempted to promote the idea of a single centralised ECM repository for the whole enterprise, that idea was quickly abandoned in the early ’00s as a generally bad one. Anyone who has tried to deploy this approach in a real-world environment can give you a dozen reasons why it’s really, really a very naïve idea.

Nevertheless, Content Management has always been, and will always be “Enterprise”, in the sense that it very rarely works as a simple departmental solution. There is very little value in doing that, especially when you combine it with process management, which adds the most value when crossing inter-departmental boundaries. It is also “Enterprise” in the sense that as a platform it can support both vertical and horizontal applications across most parts of an organisation. Finally, there are certain applications of ECM that can only be deployed as “Enterprise” tools: It would be madness to design Records Management, eMail archiving, eDiscovery or Social collaboration solutions on a department-by-department basis. There is no point!

That’s why, in my opinion at least, the term ECM will live for a long time yet… Long Live ECM!


Do you organise your fridge like your information??

September 24, 2013

It’s not often that I describe a refrigerator as a taxonomy, so bear with me here… So, you loaded up the car with your grocery shopping, you brought it all into the kitchen from the car, and you are about to load up the fridge. Do you organise your fridge layout based on the “Use By” date of the products? No, nobody does. You put the vegetables in the vegetable drawer, you put the raw meats on a shelf of their own, the yoghurts and the dessert puddings on a separate shelf. The eggs go in the door. You may consider the use-by date as you stack things of the same category, e.g. the fresh chicken will have to be eaten before the sausages, which will still last until next week, but that’s incidental, it’s not the primary organisational structure. Your fridge has a taxonomy, a classification scheme, and it is organised functionally, by product class, not by date.

Where am I going with this? Records and retention management (where else?). It’s over four years ago that I wrote an article called “Is it a record? Who cares!”, which created quite a bit of animosity in the RM community, and I quickly had to follow it up with a Part 2 to explain that my original title was quite literal, not sarcastic.

Four years later, I find myself still having very similar conversations with clients and colleagues. The more we move into an era of Information Governance, the more the distinction between records and non-records becomes irrelevant. And the more we move from the world of paper documents to the multi-faceted world of electronic content, the more we need to move away from the “traditional” records management organisational models of retention-based fileplans: The physical management of paper records necessitated their organisation in clusters of documents with similar retention requirements in order to dispose of them, so classification taxonomies (fileplans) were organised around that requirement.

In the digital world, this is no longer a requirement. Retention period is just another logical attribute (metadata) applied to each individual piece of content, not an organisational structure. With the right tools in place, a retention model can be associated with each piece of content individually, and collections of content with the same retention and, more importantly, disposition periods can be assembled dynamically as and when required.

For me, there are only two logical questions that drive the classification of digital content: “What is it?” (the type of content, or class) and “What is it for?” (the context under which it has been, or will be, used). To use an example: an application form for opening a new account is a certain type of content, which will determine its initial retention period while it’s being processed. Whether that application is approved or rejected is context that will further affect its retention period. If the client raises a dispute about his new account, that may further affect the retention period of that application form. This context-driven variance cannot be supported in a traditional fileplan-based records management system, which permanently fixes the record – fileplan – retention relationship.
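To make the idea concrete, here is a minimal sketch (in Python, with entirely invented content classes, retention rules and context events, not any particular product’s model) of retention as recalculated metadata: the content class sets a baseline, context events extend it, and the disposal collection is assembled dynamically rather than read off a fileplan.

    from dataclasses import dataclass, field
    from datetime import date, timedelta

    # Hypothetical model: retention is a metadata attribute recalculated
    # from context, not a position in a fileplan hierarchy.
    @dataclass
    class ContentItem:
        item_id: str
        content_class: str                            # "What is it?"
        created: date
        context: list = field(default_factory=list)   # "What is it for?"

    # Invented rules: the class sets a baseline, context events extend it.
    BASE_RETENTION = {"account_application": timedelta(days=365)}
    CONTEXT_EXTENSIONS = {"approved": timedelta(days=6 * 365),
                          "disputed": timedelta(days=10 * 365)}

    def retention_expiry(item: ContentItem) -> date:
        periods = [BASE_RETENTION.get(item.content_class, timedelta(days=90))]
        periods += [CONTEXT_EXTENSIONS[e] for e in item.context
                    if e in CONTEXT_EXTENSIONS]
        return item.created + max(periods)            # longest obligation wins

    def disposition_candidates(items, today=None):
        """Assemble the disposal collection dynamically, as and when required."""
        today = today or date.today()
        return [i for i in items if retention_expiry(i) < today]

Note that when the dispute event is added to an item’s context, its expiry date simply moves; nothing needs to be re-filed.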

The classification (organisation, taxonomy, use any term you like…) of that content is not even relevant to this fileplan/retention discussion. The application form in the previous example will need to be associated with the customer, the account type, and the approval process or the dispute process. That is the context under which the organisation will need to organise and find that particular application form. You will not look for it by its retention period, unless you are specifically looking to dispose of it.

To go back to my original fridge metaphor: You will not start cooking dinner by picking up the item in the fridge that will expire first – that’s probably the pudding. You will look on the relevant shelf for the food you are trying to cook: meat or vegetables or eggs. Only after that will you double-check the date, to see whether it is still valid or has expired.

So… I remain convinced that:
(a) there is no point in distinguishing between records and non-records any more, non-records are just records with zero shelf-life
(b) the concept of a “fileplan” as a classification structure is outdated and unnecessary for digital records, and
(c) it’s time we start managing content “in context”, based on its usage history and not as an isolated self-defining entity.

As always, I’m keen to hear your thoughts on this.

P.S. I read some blogs to learn, some for their amusing content, and some because (even if their content sometimes irritates me) they force me to re-think. I read Chris Walker’s blog because it generally makes me nod my head in violent agreement 🙂. He often expresses very similar views to mine and I find his approach to Information Governance (which he is now consolidating into a book) extremely down to earth. The reason for this shameless plug for his blog is that, as I was writing the thoughts expressed above, I caught up with his article from last week, Big Buckets of Stuff, which covers very similar ground… Well worth a read.

A clouded view of Records and Auto-Classification

When you see Laurence Hart (@piewords), Christian Walker (@chris_p_walker) and Cheryl McKinnon (@CherylMcKinnon) involved in a debate on Records Management, you know it’s time to pay attention! 🙂

This morning, I was reading Laurence’s blog titled “Does Records Management Give Content Management a Bad Name?”, which picks up on one of the points in Cheryl’s article “It’s a Digital-First World: Five Trends Reshaping Records Management As You Know It”, with some very insightful comments added by Christian. I started leaving a comment under Laurence’s blog (which I will still do, pointing back to this) but there were too many points I wanted to add to the debate and it was becoming too long…

So, here is my take:

First of all, I want to move away from the myth that RM is a single requirement. Organisations look to RM tools as the digital equivalent to a Swiss Army Knife, to address multiple requirements:

  • Classification – Often, the RM repository is the only definitive Information Management taxonomy managed by the organisation. Ironically, it mostly reflects the taxonomy needed by retention management, not by the operational side of the business. Trying to design a taxonomy that serves both masters leads to the huge granularity issues that Laurence refers to.
  • Declaration – A conscious decision to determine what is a business record and what is not. This is where both the workflow integration and the auto-classification have a role to play, and where in an ideal world we should try to remove the onus of that decision from the hands of the end-user. More on that point later…
  • Retention management – This is the information governance side of the house. The need to preserve the records for the duration that they must legally be retained, move them to the most cost-effective storage medium based on their business value, and actively dispose of them when there is no regulatory or legal reason to retain them any longer.
  • Security & auditability – RM systems are expected to be a “safe pair of hands”. In the old world of paper records management, once you entrusted your important and valuable documents to the records department, you knew that they were safe. They would be preserved and looked after until you asked for them. Digital RM is no different: It needs to provide a safe haven for important information, guaranteeing its integrity, security, authenticity and availability. Supported by a full audit trail that can withstand legal scrutiny.

Auto-categorisation or auto-classification relates to both the first and the second of these requirements: Classification (using linguistic, lexical and semantic analysis to identify what type of document it is, and where it should fit into the taxonomy) and Declaration (deciding if this is a business document worthy of declaration as a record). Auto-classification is not new; it’s been available both as a standalone product and integrated within email and records capture systems for several years. But its adoption has been slow, not for technological reasons, but because culturally both compliance and legal departments are reluctant to accept that a machine can be good enough to be allowed to make these types of decisions. And even though numerous studies have proven that machine-based classification can be far more accurate and consistent than a room full of paralegals reading each document, it will take a while before the cultural barriers are lifted. Ironically, much of the recent resurgence and acceptance of auto-classification is coming from the legal field itself, where the “assisted review” or “predictive coding” (just a form of auto-classification to you and me) wars between eDiscovery vendors have brought the technology to the fore, with judges finally endorsing its credibility [Magistrate Judge Peck in Moore v. Publicis Groupe & MSL Group, 287 F.R.D. 182 (S.D.N.Y. 2012), approving use of predictive coding in a case involving over 3 million e-mails].
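As a toy illustration of what that kind of statistical text classification can look like under the hood, here is a minimal sketch (assuming Python with scikit-learn; the sample documents, classes and the 0.8 confidence threshold are all invented). Real products layer lexicons, taxonomies and much richer models on top, but the shape is the same: score a document against known classes, and only auto-file it when confidence is high enough.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Invented training set: documents whose class is already known.
    docs = ["please find attached the signed contract for your review",
            "quarterly invoice for services rendered, payment due in 30 days",
            "minutes of the board meeting held on the 3rd of the month"]
    labels = ["contract", "invoice", "minutes"]

    classifier = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                               LogisticRegression())
    classifier.fit(docs, labels)

    def classify(text, threshold=0.8):
        """Return the predicted class, or None when confidence is too low
        and the document should be routed to a human reviewer instead."""
        probs = classifier.predict_proba([text])[0]
        best = probs.argmax()
        return classifier.classes_[best] if probs[best] >= threshold else None

With a realistic training corpus (thousands of labelled documents, not three), the confidence threshold is what lets the machine defer the marginal cases to a human, which goes some way towards addressing the cultural objection.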

The point that Christian Walker is making in his comments, however, is very important: Auto-classification can help, but it is not the only, or even the primary, mechanism available for Auto-Declaration. They are not the same thing. Taking the records declaration process away from the end-user requires more than understanding the type of document and its place in a hierarchical taxonomy. It needs the business context around the document, and that comes from the process. A simple example to illustrate this would be a document with a pricing quotation. Auto-classification can identify what it is, but not whether it has been sent to a client or formed part of a contract negotiation. It’s that latter contextual fact that makes it a business record. Auto-Declaration from within a line-of-business application or a process management system is easy: You already know what the document is (whether it has been received externally, or created as part of the process), you know who it relates to (client id, case, process) and you know what stage in its lifecycle it is at (draft, approved, negotiated, signed, etc.). These give enough definitive context to be able to accurately identify and declare a record, without the need to involve the users or resort to auto-classification or any other heuristic decision. That’s assuming, of course, that there is an integration between the LoB/process and the RM system, to allow that declaration to take place automatically.
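A sketch of the distinction: the fragment below (with hypothetical event names and an assumed `rm_client` API, not any specific vendor’s) declares the record from process context alone, with no classifier involved, because the process engine already knows what the document is and what stage it is at.

    # Hypothetical integration: the process engine, not the end-user (and
    # not a classifier), declares the record, because it has the context.
    def on_process_event(doc_id, process, client_id, stage, rm_client):
        # Example rule: a quotation only becomes a business record once it
        # has actually been sent to the client.
        if process == "quotation" and stage == "sent_to_client":
            rm_client.declare_record(      # rm_client is an assumed RM API
                doc_id=doc_id,
                record_class="client_quotation",
                context={"client": client_id,
                         "process": process,
                         "stage": stage})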

The next point I want to pick up is the issue of Cloud. I think cloud is a red herring in this conversation. Cloud should be an architecture/infrastructure and procurement/licensing decision, not a functional one. Most large ECM/RM vendors can offer similar functionality hosted on- and off-premises, and offer SaaS payment terms rather than perpetual licensing. The cloud conversation around RM, however, gets into its own sticky mess when you start looking at guaranteeing location-specific storage (a critical issue for a lot of European data protection and privacy regulation) and at the integration between on-premises and off-premises systems (as in the auto-declaration examples above). I don’t believe that auto-classification is a significant factor in the cloud decision-making process.

Finally, I wanted to bring another element to this discussion. There is another disruptive RM trend that is not explicit in Cheryl’s article (but it fits under point #1) and it addresses the third RM requirement above: “In-place” Retention Management. If you extract the retention schedule management from the RM tool and architect it at a higher logical level, then retention and disposition can be orchestrated across multiple RM repositories, applications, collaboration environments and even file systems, without the need to relocate the content into a dedicated traditional RM environment. It’s early days (and probably a step too far, culturally, for most RM practitioners) but the huge volumes of currently unmanaged information are becoming a key driver for this approach. We had some interesting discussions at the IRMS conference this year (triggered partly by IBM’s recent acquisition of StoredIQ into their Information Lifecycle Governance portfolio) and James Lappin (@JamesLappin) covered the concept in his recent blog here: The Mechanics on Manage-In-Place Records Management Tools. Well worth a read…

So to summarise my points: RM is a composite requirement; Auto-Categorisation is useful and is starting to become legitimate. But even though it can participate, it should not be confused with Auto-Declaration of records;  “Cloud” is not a functional decision, it’s an architectural and commercial one.

I buy, sell, market, service… When did ECM become a Monte Carlo celeb?

I am writing this at 40,000 feet, on a morning flight to Nice, final destination Monte-Carlo, for what promises to be a very busy 4-day event. The European leg of IBM’s Smarter Commerce Global Summit runs from 17-20 June at the Grimaldi Forum in Monaco, and in a strange twist of fate I am neither a speaker nor an attendee. I am staff!

The whole event is structured around the four commerce pillars of IBM’s Smarter Commerce cycle: Buy, Sell, Market and Service. Each pillar represents a separate logical track at the event, covering the software, services and customer stories.

Enough with the corporate promo already, I hear you say: where does Enterprise Content Management come into this? Surely Smarter Commerce is all about retail, transactional systems, procurement, supply chain, CRM and marketing campaign tools?

Yes and no. It’s true that in the fast moving, high volume commercial transaction world, these tools share the limelight. But behind every new promotion, there is a marketing campaign review; behind every supplier and distributor channel, there is a contract negotiation; behind every financial transaction there is compliance; behind every customer complaint there is a call centre; and behind every customer loyalty scheme, there is an application form: ECM underpins every aspect of Commerce. From the first approach to a new supplier to the friendly resolution of a loyal customer’s problem, there is a trail of communication and interaction that needs to be controlled, managed, secured and preserved. Sometimes paper-based, but mostly electronic.

ECM participates in all commerce cycles: Buy (think procurement contracts and supplier purchase orders and correspondence), Sell (invoices, catalogues, receipts, product packaging, etc.), Market (collateral review & approval, promotion compliance, market analysis, etc.).

But the Service cycle is where ECM makes its strongest contribution, and its role goes well beyond providing a secure repository for archiving invoices and compliance documents: The quality, speed and efficiency of customer service rely on understanding your customer. That means knowing what communication you have previously had with your customer or supplier (regardless of the channel they chose), understanding their sentiment about your products, and anticipating and quickly resolving their requests and their problems.

As a long-standing ECM advocate, I have had the privilege of leading the Service track content at this year’s IBM Smarter Commerce Global Summit in Monaco. A roller-coaster two-month process, during which we assembled over 250 breakout sessions for the event, covering all topics related to commerce cycles and, in particular, customer service: Advanced Case Management for handling complaints and fraud investigations; Content Analytics for sentiment analysis on social media; mobile interaction monitoring, to optimise the user’s experience; channel-independent 360-degree view of customer interaction; digitising patient records to minimise hospital waiting times; paperless, on-line billing; collaboration tools to maximise the responsiveness of support staff; and many more.

A global panel of speakers, with a common goal: putting the customer at the very centre of the commercial process and offering the best possible experience with the most efficient tools.

More comments after the event…

Seven even deadlier sins of Information Governance

October 7, 2012

Devin Krugly published a very interesting blog/article, describing the “The 7 Deadly Sins of Information Governance“. I enjoyed the article, and I can’t find anything to disagree with, but I have to admit that it left me wanting… The 7 sins presented by Devin are well known and very common problems that plague most Enterprise scale projects, as he points out within the article itself. They could equally apply to HR, supply chain, claims processing or any other major IT implementation. Devin has done a great job of projecting these pitfalls to an Information Governance program.

For me, however, what is really missing from the article is a list of “sins” that are unique to Information Governance projects. So let me try and add some specific Information Governance colour to the picture… Here is my list of seven even deadlier sins:

Governance needs a government. Information governance touches the whole of the organisation. It touches every system, every employee and every process. Decisions that govern information must therefore be taken by a well-defined governance body that accurately represents the business, compliance, legal, audit and IT, at the very least. You cannot solve the Information Governance problem by throwing technology at it. Sure, technology plays a key part as an enabler, a catalyst and as an automation framework. But technology cannot determine policy, priorities, responsibility and accountability. Nor can it decide the organisation’s appetite for risk, or changes in strategic direction. For that, you need a governing body that defines and drives the implementation of governance.

Information does not mean data. I have talked about this in an earlier blog (Data Governance is not about Data). We often see Information Governance projects that focus primarily (or even exclusively) on transactional data, or data warehousing, or records management, or archiving, etc. Information Governance should be unified and consistent. There isn’t a different regulator for data, for documents, for emails or for Twitter messages. ANY information that enters, leaves or stays in the organisation should be subject to a common set of Governance policies and guidelines. The technical implementation may be different but the governance should be consistent.

It is a marathon, not a sprint. You can never run an “Information Governance Project”. That would imply a defined set of deliverables and a completion point at some specific date. As long as your business changes (new products, new suppliers, new customers, new employees, new markets, new regulations, new infrastructure, etc.) your Information Governance needs will also change. Policies will need revising, responsibilities will need adjusting, information sources will need adding and processes re-evaluating. Constantly! If your Information Governance project is “finished”, frankly, so is your business.

Keep it lean and clean. Information governance is the only cure for Content Obesity. Organisations today are plagued by information ROT (information that is Redundant, Outdated or Trivial). A core outcome of any Information Governance initiative should be the regular disposal of redundant information, which has to be done consistently, defensibly and with the right level of controls around it. It is a key deliverable and it requires both the tools and the commitment of the governing body.

Remember: Not who or how, but why. Information Governance projects often get tangled up in the details. Tools, formats, systems, volumes, stakeholders, stewards, regulators, litigators, etc., become the focus of the project and, more often than not, people forget the main driver: Businesses need good, clean and accessible information to operate. The primary role of Information Governance is to deliver accurate, timely and reliable information to the business, for making decisions, for creating products and for delivering services. Every other issue must come second in priority.

The ministry of foreign affairs. The same way that a country cannot be governed without due consideration to the relationship with its neighbours, Information Governance does not stop at the company’s firewall. Your organisation continuously trades information with suppliers, customers, partners, competitors and the wider community. Each of these exchanges has value and carries risks. Monitoring and managing the quality, the trustworthiness, the volume and the frequency of the information exchanged, is a core part of Information Governance and should be clearly articulated in the relevant policies and implemented in the relevant systems.

This is not a democracy, it’s a revolution. Implementing Information Governance is not an IT project, it is a business transformation project. Not only because of its scope and the potential benefit and risk that it represents, but also because of the level of commitment and engagement it requires from every part of the organisation. Ultimately, Information Governance has a role in enforcing information quality, regulatory and legal controls, and it contributes to the organisation’s accountability. The purpose of an Information Governance implementation is not to ensure that everyone is happy and has an equal voice at the table. The purpose is to ensure that the organisation does the right thing and behaves responsibly. And that may require significant cultural change and a few ruffled feathers…

If you don’t already have an Information Governance initiative in your organisation, now is the time to raise the issue to the board. If you do, then you should carefully consider if the common pitfalls presented here are addressed by your program, or if you are in danger of committing one or more of these sins.

Looking for Mr. Right – Revisited

I was reading a recent article by Chris Dale, where he gave an overview of Debra Logan‘s “Why Information Governance fails and how to make it succeed” keynote speech. It’s difficult to disagree with most points made in the session, but one point in particular caught my attention. Chris transcribes Debra’s thoughts as:

“…we are at the birth of a new profession, with hybrid players who have multiple strands of skills and experience. You need people with domain expertise, not just about apps and servers but data and information. The usual approach is to take people who already have jobs and give them something else to do on top or instead. You need to find people who understand the subject and teach them to attach metadata to their material, to understand document retention, perhaps even send them to law school to turn them into a legal/IT/subject matter expert hybrid.”

In parallel, I have also had several conversations, recently, relating to AIIM‘s new “Certified Information Professional” accreditation (which I am proud to possess, having passed their stringent exam). It is a valiant attempt to recognise individuals who have enough breadth of skills in Information Management, to cover most of the requirements of Debra’s “new profession“.

These two – relatively unrelated – events, prompted me to go and re-discover an article that I wrote for AIIM’s eDoc online magazine, published sometime around June 2005. Unfortunately the article is no longer online, so apologies for  embedding it here, in its entirety:

Looking for Mr. Right

Why advances in ECM technology have generated a serious skills gap in the market.

ECM technologies have advanced significantly in the last ten years. The convergence of Document/Content Management, Workflow, Searching, web technologies, records management, email capture, imaging and intelligent forms processing, has created a new information management environment that is much more aware of the value of information assets.

Most analysts agree that we are entering a new phase in ECM, where medium and large size organizations are looking to invest in ECM as a strategic enterprise deployment in order to leverage their investment in multiple business areas – especially where improving operational efficiencies and compliance are the key drivers, as these tend to have a more horizontal appeal across the organization.

But as ECM technologies are starting to become pervasive, there is a lot of confusion about the operational management of these systems. Technically, the IT department is responsible for ensuring the systems are up and running as optimally as the technology permits. But whose responsibility is it to make sure that these systems are configured appropriately and that the information held within them is managed correctly as a valuable asset?

Think about your own company: Who decides how information is managed across your organization? With ECM, you are generating a virtual library of information that should be used and leveraged consistently across departments, geographical boundaries, organizational structures and individual responsibility areas. And if you include Business Process Management in the picture, you are also looking for common, accountable and integrated business practices across the same boundaries. Does this responsibility sit within the business community, the IT department or as a separate internal service function? And what skills would be required to support this?

There is a new role requirement emerging, which is not very well defined or understood at the moment. There is a need for an individual or a group, depending on the size of the organization, who can combine the following capabilities:

  • identify what information should be managed and how, based on its intrinsic value and legal status
  • implement mechanisms for filtering and purging redundant information
  • design and maintain information structures
  • define metadata and classification schemes and policies
  • design folder structures and record management file plans
  • define indexing topologies, thesauri and search strategies
  • implement policies and timelines for content lifecycle management
  • devise and implement record retention and disposition strategies
  • define security models, access controls and auditing requirements
  • devise schemes for the most efficient location of information across distributed architectures
  • devise content and media refresh strategies for long-term archiving
  • consolidate information management practices across multiple communication channels: e.g. email, web, fax, instant messaging, SMS, VoIP
  • consolidate taxonomies, indexing schemes and policies across organizational structures
  • etc.

And all of this, for different business environments and different vertical needs, with a good understanding of both business requirements and the capabilities offered by the technology – someone who can comfortably bridge the gap between the business and the IT department.

People who can effectively combine the skills of librarian, administrator, business analyst, strategist and enterprise architect are extremely rare to find. If you can find one, hire them today!

The closest title one can use for this role today is “Information Architect” although job descriptions with that title differ significantly. More importantly, people with this collective skill set are very difficult to find today and even more difficult to train since a lot of “best practices” in this area are not established or documented.

This is a wake-up call for universities, training agencies, consultants and people wanting to re-skill: While the ECM technology itself is being commoditised, more and more application areas are opening up which will require these specialist skills. Companies need more people with these capabilities and they need them today. Without them, successful ECM deployments will remain difficult and expensive to achieve.

The more pervasive ECM becomes as an infrastructure discipline, the bigger the skill gap will become, unless we start addressing this today.

Apart from feeling slightly proud that I highlighted in June 2005 something that Gartner is raising as an issue today, this doesn’t reassure me at all: 7 years have passed and Debra Logan is (and organisations are…) still looking for Mr. Right!

I am happy that Information Governance has finally come to the forefront as an issue, and that AIIM’s CIP certification is making some strides in helping the match-making process.

But I really hoped we would have come a bit further by now…

Content Obesity – Part 2: Treatment

(…continued from Content Obesity – Part 1: Diagnosis)

You can’t, and don’t want to, stop data growth.

The growth of digital volume has been instrumental in driving major operational and cultural change in today’s business. Better, more personalised customer interaction; insight from Big Data business analytics; social media and collaboration; effective training and multi-media marketing: all rely on the flow of much higher volumes of information through the organisation. Not taking advantage of this would make your organisation less competitive.

So, if reducing the volume of data being consumed is not an option, how else can you manage Content Obesity? There are two approaches to this:

Managing the symptoms

There are some key technologies that help alleviate some of the symptoms of content obesity. These, in our human analogy, are the equivalent of liposuction and nip-and-tuck.

  • De-duplication can identify and remove multiple copies of identical documents. It is only effective if you can apply it across all your document stores (ECM systems, records management, shared file drives, personal file drives, SharePoint, email servers, etc.). This rarely happens, and when it does, it is usually restricted to one or two of these sources and focuses only on files, not structured data (see the hash-based sketch after this list).
  • Archiving and tiered storage. Being able to select the most appropriate storage type for archived data can have a positive impact on reducing storage costs. Not everything needs to be stored on expensive high-availability devices. A lot of the organisation’s data can sit on lower-cost equipment that can be restored from backups in hours, or days, rather than instantly. But how do you decide which information goes where? Most organisations will use the expensive high-availability storage for core systems, regardless of the age or significance of the data stored by these systems, as there is no easy way to apply policies at a granular level. There is certainly no way to map those logical “shared” network drives, where the majority of documents are stored, to tiered storage.
  • Compression. There are storage systems that use very sophisticated algorithms to reduce the physical space required, by compressing the data when stored and de-compressing it when it needs to be used. These are also expensive, and require additional computing power to maintain reasonable speeds in the compression and de-compression process.
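As an illustration of the first of these techniques, here is a minimal hash-based de-duplication sketch (Python; the store paths are hypothetical). It only finds byte-identical files, which is exactly why the relief is marginal unless it is applied across all stores and driven by a unified policy.

    import hashlib
    from pathlib import Path

    def duplicate_report(store_roots):
        """Group files from several stores by content hash; any group with
        more than one entry is a candidate set of redundant copies."""
        by_digest = {}
        for root in store_roots:
            for path in Path(root).rglob("*"):
                if path.is_file():
                    # Production code would hash in chunks rather than
                    # reading whole files into memory.
                    digest = hashlib.sha256(path.read_bytes()).hexdigest()
                    by_digest.setdefault(digest, []).append(path)
        return {d: paths for d, paths in by_digest.items() if len(paths) > 1}

    # e.g. duplicate_report(["/mnt/shared_drive", "/mnt/project_archive"])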

All of these techniques offer some relief, but the relief is marginal, if it’s not driven by a unified policy, and they do not address the fundamental issue: Whilst they temporarily reduce the impact of storage cost, they do not curb the information growth rate.

They also do not address any of the compliance or legal risks associated with content obesity: The same logical volume of data needs to be preserved, analysed and delivered to litigation and the same effort is required to manually manage the multiple retention policies and respond to regulatory challenges.

Treating the disease

In order to properly resolve content obesity, we need to consider the organisation’s metabolism: How quickly information is digested, which nutrients (value) can be extracted from content and how the organisation disposes of the waste.

The key question to ask is: “How much of this content do organisations actually need to keep?” Discussions with our customers indicate that, on average, 70% of all retained data is obsolete (the actual number will vary somewhat by organisation, but I’ll use the 70%/30% split for the purposes of this article). This represents information that is duplicated, outdated, irrelevant or of no business value. Or it is content that can be readily obtained or reproduced from other sources.

The problem, however, is that nobody within the organisation knows which 70% of the data is obsolete. So nobody has the knowledge, or the authority, to allow that content to be deleted. The criteria for defining or identifying which information that 70% represents are virtually impossible to determine systemically.

A more drastic and more realistic approach is required, to provide a permanent solution to the problem.

The concept behind treating Content Obesity is simple: If, and only if, the organisation were able to identify the 30% of information which it needs to keep, then, by definition, any information that falls outside that 30% could be legitimately deleted.

If this level of content metabolism could be controlled automatically, regularly and effectively, it would free up critical IT storage resources and the corresponding budget, which could be invested in growth projects instead.

What organisations need is the equivalent of a thyroid gland: A centralised Information Lifecycle Governance mechanism that monitors all the different retention requirements, regulates the content metabolism and drives a digestive system that extracts the value from the content and disposes of all the waste. Most organisations do not have such a regulating organ, or function, at all.

Sounds simple enough, but how can you create a centralised policy that determines precisely which 30% of the content needs to be preserved?

Studies conducted by the CGOC (Compliance, Governance and Oversight Council), have shown that there are only three key reasons why companies need to preserve data for any length of time:

  • Regulatory obligation – controlled by Records Managers
  • Litigation – controlled by the Legal department
  • Business Utility – controlled by each business function or department.

These are the three groups in the organisation that are responsible for the metabolic rate of content. Yet these groups rarely connect with each other, do not use the same terminology and have certainly never had common policies and control mechanisms that they can communicate to IT. The legal group issues data preservation orders (legal holds) to custodians. Records Managers define taxonomies, fileplans and retention schedules, and task the business to abide by them. Business functions have more important things to do (like… keeping the business running) and, frankly, don’t have much appetite for understanding, let alone complying with, either legal hold orders or retention schedules. Business functions need the correct information to be available to them, at the right time, to make decisions on and to service their customers.

And who has the responsibility to physically protect, or to destroy, digital information? The IT group, which is not usually part of any of the conversations above.

At the heart of an Information Lifecycle Governance function is a unified policy engine: A common logical repository where Records Managers can document, manage and communicate their multiple retention schedules and produce consolidated fileplans; where the Legal group can manage its ongoing legal matters, issue legal hold and preservation orders and communicate with custodians and the other parts of the business; and where IT and the business functions can identify and document which information is stored in each device and each application, and the business requirements for information preservation. A place where all of these disparate groups can determine the value that each information asset brings to the business – for both structured and unstructured information.

Once this thyroid function is established to control the content metabolism, it is key to connect it to the mechanisms that physically manage information – the “organs”. Connecting this policy engine to the document collection tools and repositories, records management systems, structured data archives, eDiscovery tools, tiered storage archives, etc., provides the instrumentation needed to monitor data growth, execute the policies and provide the auditability and defensibility required to justify regular content purging.
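A minimal sketch of what such a unified policy decision might look like (Python, with invented record classes and retention periods, standing in for the three CGOC inputs above): preservation is the union of the regulatory, legal and business obligations, and anything left over is, by definition, defensibly disposable.

    from datetime import date, timedelta
    from typing import Optional

    # Invented schedule, standing in for the Records Managers' input.
    RETENTION_SCHEDULE = {"invoice": timedelta(days=7 * 365),
                          "marketing_draft": timedelta(days=180)}

    def must_preserve(record_class: str, created: date, on_legal_hold: bool,
                      business_value_until: Optional[date],
                      today: Optional[date] = None) -> bool:
        today = today or date.today()
        if on_legal_hold:                    # Legal: holds trump everything
            return True
        retention = RETENTION_SCHEDULE.get(record_class, timedelta(0))
        if created + retention > today:      # Compliance: schedule still running
            return True
        if business_value_until and business_value_until > today:
            return True                      # Business: still has utility
        return False                         # defensibly disposable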

Conclusion

There is no quick fix for Content Obesity and, like medical obesity, it requires a fundamental change in behaviour. But it is achievable. Organisations need to design a governance model that transparently joins the dots: The business needs to describe the information entities, based on their value and utility, mapping them to the asset, system and application descriptions that IT understands. Legal can then manage their legal holds and eDiscovery based on knowing what information exists, what part of the business it relates to, and where that information lives, not only by custodian. Compliance groups can then consolidate their records management directives and apply a unified taxonomy and disposition schedule, relevant to the territory and business function. When all of these policies are systematically connected to the data sources, IT can accurately identify what information should be preserved and, by definition then, what information can be justifiably disposed of. (IBM calls this process Defensible Disposal.)
