Every so often, an idea comes along that stops you in your tracks.
Innovation is happening at the speed of light all around us, but most of the time it consists only of incremental, evolutionary thinking, which takes us a little bit further in the same direction we were already going. We have become fairly blasé about innovation.
And then you spot something that makes you sit up, pay attention, change direction, and re-think everything. I had one of these moments a few weeks back.
The name “EpyDoc” will probably mean nothing to most of you. Even looking at their existing website, I would have dismissed it as a second- or third-rate Document Management wannabe. Yet EpyDoc is launching a new concept in April that potentially re-defines the whole Data / Content / Information / Process Management industry as we know it today. You know what happens when you mix comets and dinosaurs? It is that revolutionary.
I have lost track of the number of times over the years that I’ve moaned about the constraints that our current infrastructure is imposing on us:
- The arbitrary segregation of structured and unstructured information [here]
- The inherent synergy of Content and Process management [here]
- The content granularity that stops at the file level [here]
- The security models that protect the container rather than the information [here]
- The lack of governance and lifecycle management of all information, not just records [here]
- The impossibility of defining and predicting information value [here]
…etc. The list goes on. EpyDoc’s “Information Operating System” (a grand, but totally appropriate title), seeks to remove all of these barriers by re-thinking the way we manage information today. Not in small incremental steps, but in a giant leap.
Their approach is so fundamentally different that I would not do it justice by trying to summarise it here. And if I’m honest, I am still discovering more of the details behind it. But if you are interested in a taste of what the future of information management might look like in 5-10 years, I would urge you to read this 10-segment blog set, which sets the scene, and let me know your thoughts.
And if, while you are reading through, you are, like me, sceptical about the applicability or commercial viability of this approach, I will leave you with a quote that I saw this morning on the tube:
“The horse is here to stay but the automobile is only a novelty – a fad”
(President of the Michigan Savings Bank, 1903)
P.S. Before my pedant friends start correcting me: I know that dinosaurs became extinct at the end of the Cretaceous period, not the Jurassic… 😉
I’ve been wanting to write this article for a while, but I thought it would be best to wait for the deluge of 2014 New Year predictions to settle down before I try to look a little further over the horizon.
The six predictions I discuss here are personal, have no specific timescale, and are certainly not based on any scientific method. What they are based on is a strong gut feel and thirty years of observing change in the Information Management industry.
Some of these predictions are more fundamental than others. Some will have immediate impact (1-3 years), some will have longer term repercussions (10+ years). In the past, I have been very good at predicting what is going to happen, but really bad at estimating when it’s going to happen. I tend to overestimate the speed at which our market moves. So here goes…
Behaviour is the new currency
Forget what you’ve heard about “information being the new currency”: that is old hat. We have been trading in information, in its raw form, for years. Extracting meaningful value from that information, however, has always been hard, repetitive, expensive and, most often, a hit-or-miss operation. I predict that with the advance of analytics capabilities (see Watson Cognitive), raw information will have little trading value. Information will be traded already analysed, and nowhere more so than in the area of customer behaviour. Understanding of lifestyle models, spending patterns and decision-making behaviour will become the new currency exchanged between suppliers. Not the basic, high-level, over-simplified demographic segmentation that we use today, but a deep behavioural understanding of individual consumers that will allow real-time, predictive and personal targeting. Most of the information is already being captured today, so it’s a question of refining the psychological, sociological and commercial models around it. Think of it this way: how come Google and Amazon know (instantly!) more about my online interactions with a particular retailer than the retailer’s own customer service call centre does? Does the frequency of logging into online banking indicate that I am very diligent in managing my finances, or that I am in financial trouble? Does my Facebook status reflect my frustration with my job, or my euphoric pride in my daughter’s achievement? How will that determine whether I decide to buy that other lens I have been looking at for my camera? Scary as the prospect may be from a personal privacy perspective, most of that information is in the public domain already. What is the digested form of that information worth to a retailer?
Security models will turn inside out
Today most security systems, algorithms and analysis are focused on the device and its environment. Be it the network, the laptop, the smartphone or the ECM system, security models are there to protect the container, not the content. This has not only become a cat-and-mouse game between fraudsters and security vendors, but it is also becoming virtually impossible to enforce at enterprise IT level. With BYOD, a proliferation of passwords and authentication systems, cloud file-sharing, and social media, users are opening up security holes faster than the IT department can close them. Information leakage is an inevitable consequence. I can foresee the whole information security model turning on its head: if the appropriate security becomes deeply embedded inside the information (down to the file, paragraph or even individual word level), we will start seeing self-describing and self-protecting granular information that is only accessible to an authenticated individual, regardless of whether that information is in a repository, on a file system, in the cloud, at rest or in transit. Security protection will become device-agnostic and infrastructure-agnostic. It will become a negotiating handshake between the information itself and the individual accessing it, at a particular point in time.
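To make the idea concrete, here is a toy Python sketch of information that carries its own access policy. Everything in it is invented for illustration, and the XOR construction is not real cryptography (a production system would use something like AES-GCM envelope encryption); the point is only that the decision to decrypt sits with the information itself, not with any device or repository.

```python
import hashlib
import secrets

def _keystream(key: bytes, length: int) -> bytes:
    # Derive a pseudo-random byte stream from the key (toy construction,
    # NOT real cryptography; stands in for a proper cipher here).
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

class ProtectedSegment:
    """A self-describing unit of information: it carries its own
    access policy and only yields plaintext to an authorised identity."""

    def __init__(self, plaintext: str, allowed_users: set):
        self._key = secrets.token_bytes(32)
        data = plaintext.encode()
        stream = _keystream(self._key, len(data))
        self._ciphertext = bytes(a ^ b for a, b in zip(data, stream))
        self.allowed_users = allowed_users  # the policy travels with the content

    def read(self, user: str) -> str:
        # The "handshake": the segment itself decides whether to decrypt,
        # regardless of which device or repository it currently sits on.
        if user not in self.allowed_users:
            raise PermissionError(f"{user} is not authorised for this segment")
        stream = _keystream(self._key, len(self._ciphertext))
        return bytes(a ^ b for a, b in zip(self._ciphertext, stream)).decode()

seg = ProtectedSegment("Q3 forecast: confidential", {"alice"})
print(seg.read("alice"))   # authorised: plaintext is released
# seg.read("mallory")      # unauthorised: raises PermissionError
```

Note that nothing outside the segment enforces the policy; the container (file system, repository, cloud) never sees the plaintext at all.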
Oh, and while we are assigning security at this granular self-contained level, we might as well transfer retention and classification to the same level as well.
The File is dead
In a way, this prediction follows on from the previous one, and it’s also a prerequisite for it. It is also a topic I have discussed before [Is it a record, who cares?]. Information Management, and in particular Content Management, has long been constrained by the notion of the digital file. The file has always been the singular granular entity at which security, classification, version control, transportation, retention and all other governance stop. Even relational databases ultimately live in files, because that’s what Operating Systems have to manage. However, information granularity does not stop at the file level. There is structure within files, and a lot of information lives outside the realm of files (particularly in social media and streams). If Information Management is a living organism (and I believe it is), then files are its organs. But each organ has cells, each cell has molecules, and there are atoms within those molecules. I believe that innovation in Information Management will grow exponentially the moment we stop looking at managing files and start looking at elementary information entities or segments at a much more granular level. That will allow security to be embedded at a logical information level; value to grow exponentially through intelligent re-use; storage costs to be reduced dramatically through entity-level de-duplication; and analytics to explode through much faster and more intelligent classification. The file is an arbitrary container that creates bottlenecks, unnecessary restrictions and a very coarse level of granularity. Death to the file!
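As a rough illustration of what entity-level de-duplication beneath the file could look like, here is a hypothetical Python sketch: content is split into sub-file segments, and each unique segment is stored only once, however many documents contain it. All names are invented, and real systems would use content-defined chunking rather than fixed-size chunks.

```python
import hashlib

CHUNK_SIZE = 64  # bytes; purely illustrative

store = {}   # segment hash -> unique segment, shared across ALL documents

def save(data: bytes) -> list:
    """Split content into sub-file entities and store each only once.
    Returns the 'recipe' (list of hashes) needed to rebuild the original."""
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        h = hashlib.sha256(chunk).hexdigest()
        store.setdefault(h, chunk)   # a duplicate entity costs nothing extra
        recipe.append(h)
    return recipe

def load(recipe: list) -> bytes:
    # Rebuild a document from its shared segments
    return b"".join(store[h] for h in recipe)

# Two "contracts" sharing the same boilerplate text
boilerplate = b"Standard contract clause. " * 10
doc_a = save(boilerplate + b"Terms specific to supplier A.")
doc_b = save(boilerplate + b"Terms specific to supplier B.")
# The shared segments are physically stored once, not twice
assert len(store) < len(doc_a) + len(doc_b)
```

Governance could then attach to each segment hash (security, retention, classification) instead of to the file that happens to contain it.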
BYOD is just a temporary aberration
BYOD is just a transitional phase we’re going through today. The notion of bringing ANY device to work is already becoming outdated. “Bring Work to Your Device” would have been a more appropriate phrase, but then BWYD is a really terrible acronym. Today, I can access most of the information I need for my work through mobile apps and web browsers. That means I can potentially use smartphones, tablets, the browser on my smart television, the Wii console at home, or my son’s PSP to access work information. As soon as I buy a new camera with Android on it, I will also be able to access work on my camera. Or my car’s GPS screen. Or my fridge. Are IT organisations going to provide BYOD policies for all these devices, where I will have to commit, for example, that “if I am using that device for work I shall not allow any other person, including family members, to access that device”? I don’t think so. The notion of BYOD is already becoming irrelevant. It is time to accept that work is no longer tied to ANY device and that work could potentially be accessed on EVERY device. And that is another reason why information security and governance should be applied to the information, not to the device. The form of the device is irrelevant, and there will never be a 1:1 relationship between work and devices again.
It’s not your cloud, it’s everyone’s cloud
Cloud storage is a reality, but sharing cloud-level resources is yet to come. All we have achieved is to move information storage outside the data centre. Think of this very simple example: let’s say I subscribe to Gartner or AIIM, and I have just downloaded a new report or white paper to read. I find it interesting and I share it with some colleagues, and (if I have the right to) with some customers, through email. There is every probability that I have created a dozen instances of that report, most of which will end up being stored or backed up in a cloud service somewhere, quite likely on the same infrastructure from which I downloaded the original paper. And so will many others who have downloaded the same paper. This is madness! Yes, it’s true that I should have been sending everyone a link to the paper, but frankly that would force everyone to create accounts, etc., and it’s so much easier to attach it to an email, and I’m too busy. Now, turn this scenario on its head: what if the cloud infrastructure itself could recognise that the original of that white paper is already available on the cloud, and transparently maintain the referential integrity, security and audit trail of a link to the original? This is effectively cloud-level, internet-wide de-duplication. Resource sharing. Combine this with the information granularity mentioned above, and you have massive storage reduction, increased cloud capacity, simpler big-data analytics and an enormous amount of statistical audit-trail material available for analysing user behaviour and information value.
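The white-paper scenario above can be sketched in a few lines of Python. This is purely illustrative (a real cloud service would add authentication, access rights and expiry, and the class and field names here are invented): every re-upload of an identical document collapses into an audited reference to the single stored original.

```python
import hashlib
from datetime import datetime, timezone

class CloudStore:
    """Toy cloud layer that keeps one physical copy per unique document
    and turns every re-upload into an audited reference to the original."""

    def __init__(self):
        self.blobs = {}   # content hash -> the single physical copy
        self.links = []   # audit trail: who references which document, and when

    def put(self, owner: str, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blobs:
            self.blobs[digest] = data          # first upload: store for real
        # Every upload, first or not, becomes a tracked reference
        self.links.append({
            "owner": owner,
            "doc": digest,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        return digest

cloud = CloudStore()
report = b"An AIIM white paper, a couple of MB in real life"
ref1 = cloud.put("analyst@aiim", report)     # the original upload
ref2 = cloud.put("me@example.com", report)   # my "copy": just a link
assert ref1 == ref2                          # same content, same document
assert len(cloud.blobs) == 1                 # stored once, referenced twice
```

The audit trail in `links` is exactly the "statistical audit-trail material" mentioned above: it records who touched which document without ever duplicating it.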
The IT organisation becomes irrelevant
The IT organisation as we know it today is arguably the most critical function and the single largest investment drain in most organisations. You don’t have to go far to see examples of the criticality of the IT function and the dependency of an organisation on IT service levels. Just look at the recent impact that simple IT malfunctions have had on banking operations in the UK [Lloyds Group apologises for IT glitch]. My prediction, however, is that this mega-critical organisation called IT will collapse in the next few years. A large IT group, as a function, whether it’s outsourced or not, is becoming an irrelevant anachronism, and here’s why: 1) IT no longer controls the end-user infrastructure; that battle is already lost to BYOD. The procurement, deployment and disposition of user assets is no longer an IT function; it has moved to individual users, who have become a lot more tech-savvy and self-reliant than they were 10 or 20 years ago. 2) IT no longer controls the server infrastructure: with the move to cloud and SaaS (or its many variants: IaaS, PaaS, etc.), keeping the lights on, the servers cool, the backups running and the cables networked will soon cease to be a function of the IT organisation too. 3) IT no longer controls the application infrastructure: business functions are buying capabilities directly at the solution level, often as apps, and these departments are maintaining their own relationships with IT vendors. CMOs, CHROs, CSOs, etc. are the new IT buyers. So, what’s left for the traditional IT organisation to do? Very little else. I can foresee IT becoming an ancillary coordinating function and a governance body. Its role will be to advise the business and define policy, and maybe manage some of the vendor relationships: very much like the role that the Compliance department or Procurement has today, and certainly not wielding the power and the budget that it currently holds. That is actually good news for Information Management!
Not because IT is an inhibitor today, but because the responsibility for Information Management will finally move to the business, where it always belonged. That move, in turn, will fuel new IT innovation that is driven directly by business need, without the interim “filter” that IT groups inevitably create today. It will also have a significant impact on the operational side of the business, since groups will have more immediate and agile access to new IT capabilities, enabling them to service new business models much faster than they can today.
Personally, I would like all of these predictions to come true today. I don’t have a magic wand, so they won’t. But I do believe that some, if not all, of them are inevitable, and it’s only a question of time and priority before the landscape of Information Management as we know it today is fundamentally transformed. And I believe that this inevitable transformation will help accelerate both innovation and value.
I’m curious to know your views on this. Do you think these predictions are reasonable, or not? Or, perhaps they are a lot of wishful thinking. If you agree with me, how soon do you think they can become a reality? What would stop them? And, what other fundamental changes could be triggered, as a result of these?
I’m looking forward to the debate!
I am writing this at 40,000 feet, on a morning flight to Nice, final destination Monte-Carlo, for what promises to be a very busy 4-day event. The European leg of IBM’s Smarter Commerce Global Summit runs from 17-20 June at the Grimaldi Forum in Monaco, and in a strange twist of fate I am neither a speaker nor an attendee. I am staff!
The whole event is structured around the four commerce pillars of IBM’s Smarter Commerce cycle: Buy, Sell, Market and Service. Each pillar represents a separate logical track at the event, covering the software, services and customer stories.
Enough with the corporate promo already, I hear you say, where does Enterprise Content Management come into this? Surely, SmarterCommerce is all about retail, transactional systems, procurement, supply chain, CRM and marketing campaign tools?
Yes and no. It’s true that in the fast moving, high volume commercial transaction world, these tools share the limelight. But behind every new promotion, there is a marketing campaign review; behind every supplier and distributor channel, there is a contract negotiation; behind every financial transaction there is compliance; behind every customer complaint there is a call centre; and behind every customer loyalty scheme, there is an application form: ECM underpins every aspect of Commerce. From the first approach to a new supplier to the friendly resolution of a loyal customer’s problem, there is a trail of communication and interaction, that needs to be controlled, managed, secured and preserved. Sometimes paper-based, but mostly electronic.
ECM participates in all commerce cycles: Buy (think procurement contracts and supplier purchase orders and correspondence), Sell (invoices, catalogues, receipts, product packaging, etc.), Market (collateral review & approval, promotion compliance, market analysis, etc.).
But the Service cycle is where ECM makes its strongest contribution, and its role goes well beyond providing a secure repository for archiving invoices and compliance documents. The quality, speed and efficiency of customer service rely on understanding your customer: on knowing what communication you have previously had with your customer or supplier (regardless of the channel they chose), on understanding their sentiment about your products, and on anticipating and quickly resolving their requests and problems.
As a long-standing ECM advocate, I have had the privilege of leading the Service track content at this year’s IBM Smarter Commerce Global Summit in Monaco. It was a roller-coaster two-month process, during which we assembled over 250 breakout sessions for the event, covering all topics related to the commerce cycles, and in particular customer service: Advanced Case Management for handling complaints and fraud investigations; Content Analytics for sentiment analysis on social media; mobile interaction monitoring to optimise the user’s experience; a channel-independent 360-degree view of customer interaction; digitising patient records to minimise hospital waiting times; paperless online billing; collaboration tools to maximise the responsiveness of support staff; and many more.
A global panel of speakers, with a common goal: putting the customer at the very centre of the commercial process and offering the best possible experience with the most efficient tools.
More comments after the event…
Those that have been reading my blogs for a while, know that I have great objections to the term “unstructured” and the way it has been used to describe all information that is text-based, image-based or any other format that does not tend to fit directly into the rows and columns of a relational database. None of that “unstructured” content exists without structure inside and around it, and databases have long moved on from storing just “rows and columns”.
A conversation last night with IDC Analyst @AlysWoodward, (at the excellent IDC EMEA Software Summit in London), prompted me to think about another problem that distinction has created:
Calling that content “unstructured” is a convention invented by analysts and vendors, to distinguish between the set of tools required to manage that content and the tools that service the world of databases and BI tools. The technologies used to manage text-based content and digital media need to be different, as they have a lot of different issues to address.
It has also been a great way of alerting business users that, while they are painstakingly taking care of their precious transactional data, that data only represents about 20% of their IT estate, while all this other “stuff” keeps accumulating uncontrolled and unmanaged on servers, C: drives, email servers, etc.
These artificial distinctions, however, are only relevant when you consider HOW you manage that information: the tools and the technologies. They are not relevant when you are trying to understand WHAT business information you hold and need as an organisation, WHY you are holding it and what policies need to be applied to it, or WHO is responsible for it. The scanned image of an invoice is subject to the same retention requirements as the row-level data extracted from it; the Data Protection Act does not prescribe different privacy rules for emails and for client records kept in your CRM system; a regulatory audit scrutinising executive decisions will not care whether the decisions are backed by a policy document or a BI query; and you can’t have one group of people deciding security policies for confidential information in your ERP system and another group for the product manufacturing instructions held in a document library.
“Data Governance” (or “Information Governance”, or “Content Governance”, I’ve seen all of these terms used) is not an IT discipline, it’s a business requirement. It does not only apply to the data held in databases and data warehouses, it applies to all information you manage as an organisation, regardless of location, format, origin or medium. As a business, you need to understand what information you hold about your customers, your suppliers, your products, your employees. You need to understand where that information lives and where you would go to find it. You need to understand who is responsible for managing it, making sure it’s secure and who has the right to decide that you can get rid of it. Regardless if that information lives in a “structured” or “unstructured” medium, and regardless of the tools or technologies that are needed to implement these governance policies.
The Data Governance Council has developed an excellent maturity model for understanding how far your organisation has moved in understanding and implementing Data Governance. It covers areas such as “Stewardship”, “Policy”, “Data Risk Management”, “Value Creation”, “Information Lifecycle Management”, “Security” and “Metadata”. All of these disciplines are just as relevant to taking control of the data in your databases as they are to managing the files on your shared drives, your content repositories and the emails on your servers.
I seriously believe that by propagating this artificial divide between “data” and “content”, we are creating policy silos that not only minimise the opportunity to get value out of our information, but also introduce further risks through gaps and inconsistencies. We may have to use different tools to implement these governance controls on different mediums, but the business should have ONE consistent governance scheme for all its information.
Open to your thoughts and suggestions, as always!
I love the technology behind “IBM Watson”. I think it’s been a long time coming, and I don’t doubt that in a matter of only a few years we will see phenomenal applications for it.
Craig Rhinehart explored some of the possibilities of using Watson to analyse social media in his blog “Watson and the future of ECM”. He also set out a great comparison of “Humans vs. Watson”, in the context of a trivia quiz. However, I believe that there is a lot more to it…
Watson is a knowledgeable fool. A 6-year old kid, that can’t tell fact from fiction.
When Watson played Jeopardy!, it ranked its possible answers against each other and the confidence that it understood the questions correctly. Watson did not for a moment question the trustworthiness of its knowledge domain.
Watson is excellent at analysing a finite, trusted knowledge base. But the internet and social media are neither finite, nor trusted.
What if Watson’s knowledge base is not factual?
Primary school children are taught to use Wikipedia for research, but not to trust it, as it’s not always right. They have to cross-reference multiple sources before they accept the most likely answer. Can Watson distinguish facts from opinions, hearsay and rumours? Can it detect irony and sarcasm? Can it distinguish factual news from political propaganda and tabloid hype?
If we want to make Watson’s intelligence as “human-like” and reliable as possible, and to use it to drive decisions based on internet or social media content, its “engine” requires at least one more dimension: source reliability ranking. It has to learn when to trust a source and when to discredit it. It has to have a “learning” mechanism that re-evaluates the reliability of its sources, as well as its own decision-making process, based on the accuracy of its outcomes. And since its knowledge base will be constantly growing, it also needs to re-assess previous decisions in the light of new evidence (i.e. a “belief revision” system).
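A minimal sketch of what such source-reliability ranking might look like, in Python. The smoothing and voting scheme here is my own invention, not how Watson actually works; it only illustrates trust being revised from outcomes and then used to weigh conflicting claims.

```python
class SourceTrust:
    """Toy reliability ranker: each source starts neutral, and its trust
    is revised every time one of its claims is verified or refuted."""

    def __init__(self):
        self.scores = {}  # source -> (claims verified, claims checked)

    def record(self, source: str, was_correct: bool):
        # Belief revision: every checked claim updates the source's record
        correct, total = self.scores.get(source, (0, 0))
        self.scores[source] = (correct + int(was_correct), total + 1)

    def trust(self, source: str) -> float:
        # Laplace-smoothed accuracy: an unknown source gets 0.5, not 0 or 1
        correct, total = self.scores.get(source, (0, 0))
        return (correct + 1) / (total + 2)

    def weigh_answer(self, claims: list) -> str:
        # Pick the answer backed by the most cumulative source trust,
        # rather than the answer repeated most often
        votes = {}
        for source, answer in claims:
            votes[answer] = votes.get(answer, 0.0) + self.trust(source)
        return max(votes, key=votes.get)

ranker = SourceTrust()
for _ in range(8):
    ranker.record("encyclopaedia", True)   # consistently verified
ranker.record("tabloid", False)
ranker.record("tabloid", False)            # repeatedly refuted
answer = ranker.weigh_answer([("tabloid", "no"), ("encyclopaedia", "yes")])
```

A fuller belief-revision system would also re-run old conclusions whenever a source's trust changes, as the paragraph above suggests.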
Today, Watson is a knowledge-regurgitating engine (albeit a very fast and sophisticated one). The full potential of Watson will only be realised when it becomes a learning engine. Only then can we start talking about real decision intelligence.
The points Jeff makes about the risk of not capturing information appropriately are, of course, valid and often quoted in the world of document management. But it got me thinking about a more fundamental issue: how do you determine the real value of a document?
Clearly not all documents have the same value: losing a grocery receipt is fundamentally different to losing your driver’s license or your passport. Misplacing an expenses claim receipt might be worth £20, misplacing a vital piece of evidence in a litigation case might be worth £20 million. Today’s Financial Times is vital for making business decisions tomorrow, but worthless recycling paper next week.
Which makes generic numbers like the PriceWaterhouse ones quoted in Jeff’s blog seem about as relevant as ordering clothes for 2.75 children.
So how can we measure the value of a document? What is it worth? None of the Content Management systems I’m aware of today have provisions for assigning individual value to stored content, let alone managing its lifecycle differently based on that value.
Is it even possible to determine the value of a document? (And for the purposes of this discussion “a document” could be anything from a 140-character tweet message, to a 300,000-page drug application…) Where does the value come from?
- The cost and effort of preparing or acquiring it?
- The cost of storing it and managing it?
- The context in which it has been used in the past, or may be used in the future?
- Its rarity or brevity or accuracy?
- Its relevance now? Its potential relevance in the future?
- How often it has been accessed and referenced and by whom?
- Who it is relevant to?
- The length of time that it retains its value?
- At what point does its value peak and when does it wane?
- The risk it carries, by its existence or by its absence?
- etc., etc. …
The list goes on! And this is before we even start thinking about assigning metrics or actual monetary value to any of the above.
Common sense says it’s probably some combination of all of the above. But do we measure any of this today? Should we?
Imagine the potential scenarios, if every document in a Content Management system carried a continually adjusted “Relative Content Value” property (You’ve heard it here first: a document’s RCV! 🙂 ). We could easily foresee…
- A system that automatically discards a document, because it’s readily and securely available online, storing a reference instead
- A system that automatically archives and protects an email that has been used in contract negotiations
- A system that automatically hides a document that contains personal or confidential information
- A system that automatically discards or hides documents that have repeatedly appeared in search results but that nobody chooses to read
- A system that automatically relocates content to different risk mediums based on its value
- A system that automatically calculates premiums for insuring against the loss of its content, based on the total value of that content to the organisation
- A system that can determine the likely life expectancy of a document, based on the history of how similar documents have been accessed in the past.
- A system that would notify you, the author, when facts in your original research sources have been disputed or have changed, rendering your document misleading.
- etc., etc. …
Actually, we have the technologies today to implement most of these things, if only we knew what that “relative content value” was. What we are missing is a coherent way of calculating and storing that value on an ongoing basis.
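For illustration only, here is one naive way such an RCV calculation might combine a few of the factors listed above. Every weight and formula here is invented; the point is simply that the inputs are all measurable today.

```python
import math

def rcv(age_days: float, accesses_90d: int, risk_weight: float,
        acquisition_cost: float) -> float:
    """Hypothetical Relative Content Value. None of these weights come
    from any real product; they just combine the factors listed above:
    cost to acquire, recency, usage, and the risk the document carries."""
    recency = math.exp(-age_days / 365)   # value decays over roughly a year
    usage = math.log1p(accesses_90d)      # diminishing returns on repeat reads
    return acquisition_cost * recency + 10 * usage + risk_weight

# The grocery receipt vs. litigation evidence examples from above:
receipt = rcv(age_days=400, accesses_90d=0, risk_weight=0.0,
              acquisition_cost=1)
evidence = rcv(age_days=400, accesses_90d=12, risk_weight=500.0,
               acquisition_cost=50)
assert evidence > receipt   # same age, wildly different value
```

A real system would recompute this continually from access logs and classification metadata, which is exactly the "ongoing basis" that is missing today.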
Which brings me back to my original conundrum: Is it ever possible to determine what IS the value of a document? How?