By David Vellante and Michael McCreary
Over the next five years, chief information officers (CIOs) must strike a delicate balance between implementing technologies and practices that limit information risk and providing services that increase business productivity. Sound straightforward? Think again.
It's a new day in the courtroom: Plaintiffs' attorneys are finding that e-discovery of unstructured information, e-mail in particular, represents a new source of litigation leverage. In a number of lawsuits, electronic evidence has been a key factor in swinging a case. In others, e-discovery missteps have resulted in fines and sanctions. As a consequence, e-mail archiving and document and records management capabilities are no longer focused solely on narrow IT interests, such as keeping storage costs low, but are much more closely aligned with the legal and executive functions of organizations.
Information As Liability
Three main drivers are contributing to the increased importance of information liability management:
- Explosion in unstructured content.
- Organizations' increased agility and globalization.
- Rapid evolution of information risk and value.
For several years, unstructured content has grown substantially faster than structured information. According to market research firm International Data Corporation, more than 80 percent of all information in organizations today is unstructured. Because they contain highly structured data and metadata, corporate systems such as enterprise resource planning (ERP), financials, customer relationship management (CRM), and supply chain management (SCM) can be credibly used to recreate a sequence of events: who placed an order, for what, when, for how much, and on what terms. However, unstructured information such as e-mails, documents, spreadsheets, voice mails, and images is problematic: It presents a morass of difficulty when trying to determine information relevance and replicate decision flow in a manner that can be proven with any degree of certainty. This represents a huge liability for organizations as information volumes grow exponentially.
The desire to form global networks in which information is a fundamental currency highlights a tension at the heart of electronic information management: Countervailing forces tug at information as both liability and asset. Risk, the exposure to financial loss directly tied to information, spans the spectrum of litigation, security, privacy, and regulatory concerns. Reward, the economic value of information measured in revenue or productivity gains, results from the ability to easily search, find, aggregate, analyze, and share information.
Consequently, CIOs face a conundrum: As organizations move to mitigate information risk, they necessarily introduce technologies and processes that constrain the value of information by limiting flexibility and context. Moreover, as systems and organizations become more diffuse, capturing all information in a central repository becomes impractical if not impossible, leaving many executives asking the question: "Are we gaining or losing ground on this problem?"
Information As Asset
Today's organizations create an almost unimaginable amount of electronic information, both structured and unstructured. Every project, idea, plan, and communication is now created electronically, resulting in massive stores of documents and messages spread throughout the organization on desktops, file shares, collaboration tools, e-mail systems, websites, portals, and wikis.
This glut in electronic information is both a blessing and a curse as IT struggles to store and secure it, the business struggles to find and share it, and the general counsel prefers that it all just go away.
But Where Should Organizations Start?
In the past, process efficiencies were largely gained by reengineering inefficient approaches and taking advantage of automation. A byproduct of this method was a vast amount of well-structured electronic information to support decision making, much of which is now being stored in data warehouses.
To harness latent value within unstructured information, organizations need to build robust taxonomic views of their business, as they have with structured data. One place to find such a view is within the dimensional tables of data warehouses. This data may be used as a baseline for establishing the taxonomic views necessary to begin to consistently understand an organization's unstructured data. Specifically, information within data warehouses, such as accounting periods, geographies, and hierarchies, can be leveraged to weigh the outputs of, and organize, unstructured content.
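As a rough illustration of the idea, the sketch below tags a piece of unstructured text with members of warehouse-style dimensions. It is a minimal sketch under stated assumptions, not a description of any particular product: the `DIMENSIONS` table, its member names, and the matching terms are all hypothetical stand-ins for values an organization would export from its own dimensional tables.

```python
import re
from collections import defaultdict

# Hypothetical dimension members and the terms that signal them.
# In practice these would be exported from the warehouse's dimension tables.
DIMENSIONS = {
    "geography": {
        "EMEA": ["emea", "europe", "germany"],
        "APAC": ["apac", "japan", "singapore"],
    },
    "accounting_period": {
        "Q1": ["q1", "first quarter"],
        "Q2": ["q2", "second quarter"],
    },
}


def tag_document(text, dimensions=DIMENSIONS):
    """Return {dimension: sorted members whose terms appear in the text}."""
    normalized = re.sub(r"\s+", " ", text.lower())
    tags = defaultdict(set)
    for dimension, members in dimensions.items():
        for member, terms in members.items():
            for term in terms:
                # Match whole words or phrases only, not fragments of longer words.
                if re.search(r"\b" + re.escape(term) + r"\b", normalized):
                    tags[dimension].add(member)
    return {dimension: sorted(found) for dimension, found in tags.items()}


email = "Revenue in Germany slipped in the first quarter; Japan was flat."
print(tag_document(email))
```

Simple term matching is a deliberately crude choice here. The point is only that the same geographies and periods used to slice structured data can also serve as a shared vocabulary for organizing e-mail and documents.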
Ultimately, the goal is to establish a single lens through which an organization sees itself and its business. Such a view, applied to both structured and unstructured data, will go a long way toward helping better manage escalating unstructured information risk and mining incremental value from data assets.
Balancing Information Assets And Liabilities
Effectively managing information liabilities scattered across the enterprise and harnessing information for business intelligence without limiting agility will require a true information lifecycle management (ILM) perspective. This includes views of what constitutes an organizational record (i.e., an information asset you want to keep) and the implementation of a new set of tools supporting these records.
In the past, comprehensive enterprise ILM efforts have failed. The truth is that no large organization has solved this problem, and most records management functions and legal departments have little idea how to tackle it. One hurdle is the mindset that this is a "change management" problem. We often hear, "If users would just take the time to put files in the right folder, add the right metadata, retain everything Legal says needs to be held, and not leave data on their laptops, the problem would go away." While this assumption may be partially true, it is completely unrealistic. Today's employee is highly mobile and works daily with hundreds of messages, files, database records, and websites, all accessed through various computers and mobile devices.
To manage and secure these decentralized volumes of data, new tools must do a defensible job of automatically understanding document content and context so that they can make lifecycle decisions. Moreover, it is critical that documents are managed at the point of creation and that encryption is built in.
Auto-classification is not a solved problem, but algorithms such as Probabilistic Latent Semantic Indexing (PLSI) and Support Vector Machines (SVM) can produce results comparable in accuracy to those generated by humans. The goal is legal defensibility, not perfection.
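To make the SVM idea concrete, here is a minimal sketch of a linear SVM written from scratch with sub-gradient descent on the hinge loss. It is an illustration under loud assumptions rather than a production approach: the six training documents, the "record" and "transient" labels, the hyperparameters, and the `review_band` threshold are all invented for this example, and a real deployment would use an established library and far more data. The "needs_review" outcome reflects the defensibility point: documents the model is unsure about go to a person instead of being decided automatically.

```python
import re
from collections import Counter

# Toy training set, invented for illustration.
# +1 = "record" (keep under retention policy), -1 = "transient".
DOCS = [
    "signed supply contract attached",
    "invoice and purchase order for audit",
    "board approval of merger contract",
    "lunch on friday anyone",
    "running late see you tonight",
    "happy birthday see you at lunch",
]
LABELS = [1, 1, 1, -1, -1, -1]


def features(text):
    """Bag-of-words term counts for one document."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def score(weights, feats):
    """Linear score: positive leans 'record', negative leans 'transient'."""
    return sum(weights.get(term, 0.0) * n for term, n in feats.items())


def train_linear_svm(docs, labels, epochs=50, lam=0.01, lr=0.1):
    """Sub-gradient descent on the L2-regularized hinge loss (no bias term)."""
    weights = {}
    data = [(features(d), y) for d, y in zip(docs, labels)]
    for _ in range(epochs):
        for feats, y in data:
            margin = y * score(weights, feats)
            # The regularizer shrinks every weight slightly on each step.
            for term in weights:
                weights[term] *= 1.0 - lr * lam
            if margin < 1.0:
                # Inside the margin or misclassified: nudge toward this example.
                for term, n in feats.items():
                    weights[term] = weights.get(term, 0.0) + lr * y * n
    return weights


def classify(weights, text, review_band=0.5):
    """Label a document, routing low-confidence scores to a human reviewer."""
    s = score(weights, features(text))
    if abs(s) < review_band:
        return "needs_review"
    return "record" if s > 0 else "transient"


weights = train_linear_svm(DOCS, LABELS)
for text in ["signed supply contract attached",
             "lunch on friday anyone",
             "quarterly sync notes"]:
    print(text, "->", classify(weights, text))
```

The third message shares no vocabulary with the training set, so its score is zero and it is routed to review instead of being forced into a category.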
An Information Management Maturity Model
Such a vision will require IT organizations to accommodate the policy edicts of many parts of an organization. This means balancing maturing processes with other document and ILM activities, and integrating siloed security, compliance, retention, and other practices. Developing cross-organizational standards and eliminating redundancy will dramatically accelerate efforts to exploit new systems.
We are just beginning to understand this vision in terms of business requirements, key metrics, technologies, and challenges. Nonetheless, mid- and long-term plans must begin to incorporate the notion that information value can be viewed and measured using a balance sheet metaphor, where information assets and liabilities, while always evolving, can be observed as snapshots in time. The composition of that information balance sheet can be measured, albeit somewhat subjectively, and affected by specific strategies and actions that, like a financial balance sheet, can become an indicator of health, viability, and opportunity.
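One way to picture the balance sheet metaphor is as a small data structure that is scored and compared over time. The sketch below is purely illustrative: the `Holding` and `Snapshot` names, the 0-to-10 scoring scale, the dates, and every number are assumptions made for the example, and the scores themselves would be subjective judgments, as noted above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Holding:
    """One class of information with subjective scores (hypothetical 0-10 scale)."""
    name: str
    asset_value: float  # estimated revenue or productivity value
    liability: float    # estimated litigation, privacy, or regulatory exposure


@dataclass(frozen=True)
class Snapshot:
    """The information balance sheet at one point in time."""
    as_of: date
    holdings: tuple

    def total_assets(self):
        return sum(h.asset_value for h in self.holdings)

    def total_liabilities(self):
        return sum(h.liability for h in self.holdings)

    def net_position(self):
        return self.total_assets() - self.total_liabilities()


# Placeholder dates and numbers; they are not drawn from any real assessment.
before = Snapshot(date(2020, 1, 1), (
    Holding("e-mail archive", 2.0, 8.0),
    Holding("data warehouse", 8.0, 2.0),
    Holding("file shares", 3.0, 5.0),
))
after = Snapshot(date(2021, 1, 1), (
    Holding("e-mail archive", 3.0, 5.0),
    Holding("data warehouse", 8.0, 2.0),
    Holding("file shares", 4.0, 3.0),
))

for snap in (before, after):
    print(snap.as_of, snap.total_assets(), snap.total_liabilities(), snap.net_position())
```

Comparing two snapshots shows whether specific actions, such as a retention policy for e-mail, shifted the balance toward assets or merely moved liability around.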
The Wikibon Project proposes an information management maturity model that articulates how organizations are evolving to meet these challenges. The basic premise is that as organizations become more aware of and take action to mitigate risks posed by the growth of unstructured information, they will naturally begin to constrict the value of the very information they are trying to protect by placing restrictions on its use. This will lead to organizational tension, but ultimately will facilitate the resolution of conflicts between risk and value by bridging the information management gaps between structured and unstructured information.
Each stage of this model is presented in the context of risk and value. Based on our research, which includes case studies of large and midsized organizations, most companies are either in Stage 2 of the maturity model, where they are "plugging the holes" to immediately reduce information risk, or in Stage 1, where they are realizing that information, especially e-mail, creates vulnerability.
Action Items
Bridging the information management challenges of both structured and unstructured data is key to a successful IT strategy. Organizationally, the IT function must be prepared to serve many masters to achieve this, including audit, legal, records management, and business units.
Information asset and liability management is not just about storing and securing instant messages (IMs), e-mails, voice, files, IP, collaborative tools, and the like. It's about providing flexibility to the business and understanding value, specifically what is actually stored and how it can be exploited in the context of how an organization views itself. Auto-classification of unstructured information and strategies to deliberately evolve and apply structured dimensions to unstructured content are key components of this approach. CIOs must begin to investigate technologies and strategies to accommodate this level of automation.
David Vellante is a cofounder of The Wikibon Project (www.wikibon.org), a community of practitioners dedicated to improving the adoption of technology through an open source sharing of free advisory knowledge.
Michael McCreary is a Wikibon member and formerly head of Pfizer's legal IT group. He recently joined Rational Retention, an information management and records retention software startup.









