Archive for the ‘Business Intelligence’ Category

“Hey, Watson! Is Santa real?” – Why IBM Watson is an innocent 6-year-old…

March 25, 2011

I love the technology behind “IBM Watson”. I think it’s been a long time coming and I don’t doubt that, in a matter of only a few years, we will see phenomenal applications for it.

Craig Rhinehart explored some of the possibilities of using Watson to analyse social media in his blog “Watson and the future of ECM”. He also set out a great comparison of “Humans vs. Watson”, in the context of a trivia quiz. However, I believe that there is a lot more to it…

Watson is a knowledgeable fool. A 6-year-old kid who can’t tell fact from fiction.

When Watson played Jeopardy!, it ranked its candidate answers against each other and weighed its confidence that it had understood the question correctly. Watson did not for a moment question the trustworthiness of its knowledge domain.

Watson is excellent at analysing a finite, trusted knowledge base. But the internet and social media are neither finite, nor trusted.

What if Watson’s knowledge base is not factual?

Primary school children are taught to use Wikipedia for research, but not to trust it, as it’s not always right. They have to cross-reference multiple research sources before they accept the most likely answer. Can Watson distinguish facts from opinions, hearsay and rumours? Can it detect irony and sarcasm? Can it distinguish factual news from political propaganda and tabloid hype?

If we want to make Watson’s intelligence as “human-like” and reliable as possible, and to use it to drive decisions based on internet or social media content, its “engine” requires at least one more dimension: source reliability ranking. It has to learn when to trust a source and when to discredit it. It has to have a “learning” mechanism that re-evaluates the reliability of its sources, as well as its own decision-making process, based on the accuracy of its outcomes. And since its knowledge base will be constantly growing, it also needs to re-assess previous decisions in light of new evidence (i.e. a “belief revision” system).
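To make the idea a little more concrete, here is a minimal Python sketch of source reliability ranking with a crude form of belief revision. It is an illustration only and says nothing about how Watson actually works. The class and function names, the source labels and the neutral starting trust of 0.5 are all assumptions made up for this example.

```python
from collections import defaultdict


class SourceReliability:
    """Tracks how often each source has turned out to be right (hypothetical helper)."""

    def __init__(self):
        # Assumption: every source starts at 1 right / 1 wrong, i.e. a neutral trust of 0.5.
        self.right = defaultdict(lambda: 1)
        self.wrong = defaultdict(lambda: 1)

    def trust(self, source):
        """Fraction of this source's past claims that were confirmed."""
        return self.right[source] / (self.right[source] + self.wrong[source])

    def record(self, source, was_correct):
        """Update the source's track record once the real outcome is known."""
        if was_correct:
            self.right[source] += 1
        else:
            self.wrong[source] += 1


def best_answer(claims, reliability):
    """Pick the answer whose supporting sources carry the most combined trust."""
    scores = defaultdict(float)
    for source, answer in claims:
        scores[answer] += reliability.trust(source)
    return max(scores, key=scores.get)


def learn_from_outcome(claims, truth, reliability):
    """Reward sources that agreed with the confirmed outcome, penalise the rest."""
    for source, answer in claims:
        reliability.record(source, answer == truth)


# Made-up claims: two unreliable sources outvote one reliable source.
claims = [("tabloid", "yes"), ("rumour_blog", "yes"), ("encyclopedia", "no")]
reliability = SourceReliability()

first = best_answer(claims, reliability)      # all sources trusted equally

# The outcome is later confirmed as "no" on five separate occasions.
for _ in range(5):
    learn_from_outcome(claims, "no", reliability)

revised = best_answer(claims, reliability)    # same claims, re-assessed
print(first, "->", revised)
```

With equal trust, the two “yes” sources simply outvote the one “no” source. Once the track records diverge, the same claims are re-ranked and the earlier answer is overturned, which is the belief revision step in miniature. A real system would also need to handle the harder questions raised above, such as irony, opinion and propaganda, which simple counting like this does not touch.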

Today, Watson is a knowledge-regurgitating engine (albeit a very fast and sophisticated one). The full potential of Watson will only be realised when it becomes a learning engine. Only then can we start talking about real decision intelligence.

Are Content Analytics turning the grubby ECM worm into a butterfly?

March 11, 2010

Colleagues who have known me for a while have all heard me bemoaning the use of the term “unstructured” to describe text-based content. Without boring you to tears again, my main issue is that the ECM industry has largely been treating content files as amorphous “unstructured” blobs, ignoring the rich value that is locked inside these content objects.

For the last twenty years or so, ECM systems have been providing a cocoon, where documents and media files have been stored, preserved, secured, archived and generally left to their own devices. But we have been focusing on protecting the whole container, the box, based on the label on the outside, and only looking inside one box at a time.

There is change afoot! 2010 looks set to be the year of Content Analytics, which promises to finally unlock the value that is locked inside our gigantic festering ECM repositories. And if the early signs of success of IBM’s new Content Analytics software are anything to go by, we are starting to witness a fundamental transformation in the way content is leveraged in large organisations.

Much in the same way that Data Warehousing and Business Intelligence transformed the bland data storage provided by databases in the mid-90s, Content Analytics is today bringing natural language processing, trends analysis, contextual discovery and predictive analytics to the “unstructured” world.

Purists will argue that these algorithms are not new and, to a certain extent, that is true. However, this is the first time that we are seeing these technologies applied easily (i.e. with off-the-shelf products, without the need for a PhD statistician or linguist by your side…) in real commercial applications, to solve real business problems: car manufacturers avoiding recalls with early fault trend analysis; pharmaceutical companies recognising equipment failure trends much earlier; large multi-nationals saving millions in litigation fees, etc.

The ECM industry may still be thriving, but in terms of innovation it has reached a plateau that makes most of us uncomfortable (or complacent… depending on your point of view). Basic content management functionality is being commoditised, with CMIS, open source and SharePoint leading the charge. There’s nothing wrong with that; it’s the natural maturity curve for any 20-year-old technology sector. We’ve created a very big ECM cocoon and we’ve filled it to the brim with content worms. It’s time to innovate again!

Making no apologies for the crass analogy (it is March after all and, allegedly, spring is coming…), Content Analytics is finally starting to poke the cocoon, making the value of content slowly emerge, transformed from archived fodder into real business insight.