Monday 15 December 2008

Content Refinement Pipeline

How long does it take to find a typical document with the current system?

Is there a meta-data policy in place? How diligent are employees in adding meta-data? What is the cost of this?

How much time is spent adding/managing meta-data?

What percentage of information, would you estimate, is duplicated throughout the organisation?

Is there a means of identifying those duplicates at present?

Are you aware that this impedes search and increases OPEX?

Unlike, Rather Than:

Poor-quality content without good meta-data is like storing data in electronic shoeboxes without labels – it takes a lot of searching.

No longer "I have lots of information, how do I best store it?" That was the DB and subsequently the CMS/DMS approach – we flip it 180 degrees.

There is a lot of information – how do you want to consume it?

Where can I find: expense forms, policy documents, the annual report?

What do we know about: a customer, a prospect, a project?

Business Objectives, Business Challenges:

Increase the accuracy and speed of classifying content.

Remove the need for a costly CMS/DMS.

Would it benefit you if you could:

Increase the “findability” of content.

Auto-generate tags without human intervention (a minimal sketch follows this list).

Develop a grass-roots classification model – a folksonomy, as such models are termed.

Identify duplicate documents and data across the organisation and provide a master representation of an entity – Data Cleansing.
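By auto-generating tags I mean something along these lines – a minimal sketch in Python, assuming plain-text documents, a placeholder stop-word list and simple TF-IDF scoring, not any particular product's tagging engine:

# Minimal sketch: suggest tags for documents using TF-IDF keyword extraction.
# The stop-word list and sample documents are illustrative placeholders.
import math
import re
from collections import Counter

STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "for", "is", "on", "we", "it"}

def tokenize(text):
    # Lower-case the text and keep alphabetic words that are not stop words.
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

def auto_tags(documents, top_n=5):
    # Return the top_n TF-IDF-weighted terms per document as suggested tags.
    tokenized = [tokenize(doc) for doc in documents]
    doc_count = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    tags = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {
            term: count * math.log((1 + doc_count) / (1 + df[term]))
            for term, count in tf.items()
        }
        ranked = sorted(scores, key=scores.get, reverse=True)
        tags.append(ranked[:top_n])
    return tags

if __name__ == "__main__":
    docs = [
        "Quarterly expense policy and travel expense forms for employees.",
        "Project plan for the customer data cleansing project, draft version.",
    ]
    for doc_tags in auto_tags(docs):
        print(doc_tags)

In practice you would feed in the text extracted from Office documents and PDFs, but the principle – let the content suggest its own tags – is the same.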

I can give you an example of the duplication problem:

Search your desktop for your project-plan Excel sheet and you will find several copies of the same document with different dates and titles. Multiply that by the number of documents you have and by the number of people in your organisation, and the problem quickly spreads across the whole business.

If we can identify and remove duplicates, we not only improve the search experience but also reduce OPEX.
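To make that concrete, here is a minimal sketch in Python that groups byte-identical files by a SHA-256 content hash. The directory is a placeholder, and near-duplicates (the same plan saved with different dates in the body) would need fuzzier matching such as shingling or MinHash – this only catches exact copies:

# Minimal sketch: find exact duplicate files under a directory by content hash.
import hashlib
from collections import defaultdict
from pathlib import Path

def file_digest(path, chunk_size=1 << 20):
    # Return the SHA-256 digest of a file, read in chunks to spare memory.
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def find_duplicates(root):
    # Group files under root by content hash; keep groups with more than one member.
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]

if __name__ == "__main__":
    for group in find_duplicates("."):  # point this at a document share in practice
        print("Duplicate set:")
        for path in group:
            print("  ", path)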
