Archive for August, 2009

“we had the data, but we did not have any information”
– CIO to Boris Evelson (Forrester), on the global financial crisis.

Vendor marketing messages have been said to contend that only 20% of employees in BI-using organisations are actually consuming BI technologies (“and we’re going to help you break through that barrier”).

Why is the adoption of BI so low?

That was my original question, brought about by a statistic from this year’s BI Survey (8).  As discussed in a TDWI report, in any given organisation that uses business intelligence, only 8% of employees are using BI tools.

But does it matter?  Why should we pump up the numbers?  It should not be simply because we have a vested interest.

That raises the questions:

What is BI, and why is it important?
BI is more than the query, analysis and reporting from a database:

“Business intelligence (BI) refers to skills, technologies, applications and practices used to help a business acquire a better understanding of its commercial context” – Wikipedia

It’s a very broad definition.  A rather more technical one from Forrester:

“Business intelligence is a set of methodologies, processes, architectures, and technologies that transform raw data into meaningful and useful information used to enable more effective strategic, tactical, and operational insight and decision-making. . . .”

But it can be explained more simply as:

data -> information -> knowledge -> insight -> wisdom

Data can be assembled into information.  Information provides knowledge.  Knowledge can lead to insights (deeper knowledge), which can beget wisdom.  Is there any part of an organisation that would not benefit from that process?  If there are any roles sufficiently mundane that insights won’t help them improve the job, improve their service delivery, then I guess those roles would not benefit from BI.  Yet I would suggest they are few and far between, and they should be automated as soon as possible, because you can bet that employees filling those roles won’t feel fulfilled, won’t feel motivated.

Business intelligence has a part to play in that whole process above.  At the lowest level, it can provide data for others to analyse.  But at every step of the process of generating wisdom from data, BI has a part to play.  In that sense, it is both intrinsic to an organisation’s aims, and everyone has a part to play in it.

I started into this subject aiming to canvass the reasons behind poor BI takeup.  After some research and reflection on my own experiences, though, I found a whole book’s worth of material in that simple question.  So it’s not something I can lay out simply, in one take.

First, let’s see an example of good use of data – one, in fact, that demonstrates both the adding of value to the data, and the presentation and imparting of insight.

That wonderful organisation TED (“ideas worth spreading”) has a presentation by Hans Rosling, a Swedish professor of International Health.  Start with Rosling’s entry at TED, and look at any one of the presentations there.  The first has the most oomph, but they are all good.  Why?  Meaningful data, good presentation tools and a Subject Matter Expert.  (Thanks to Mike Urbonas for the reference).

Rosling’s presentations are a prime example of business intelligence done right.  The data was gathered from multiple sources, its quality was assessed, and it was assembled and presented in a fashion that gave its audience insights.  In fact, the presentation tool he uses, Trendalyzer, although later bought by Google, was originally developed by his own foundation, Gapminder.org.  (There are similar tools, such as Epic System’s Trend Compass; MicroStrategy also has a similar tool.)

Much as it might look like it, I wouldn’t say the job began and ended with Rosling.  Whatever other parts he played, here his role is SME.  Yet his presentations clearly demonstrate the involvement of other roles, from data analyst to system integrator to vendor/software developer.

Barriers to BI takeup

So where to start?  Everyone has an opinion.

Rosling: “people put prices on [the data], stupid passwords, and boring statistics”.  In other words, he wanted data to be free, searchable, and presentable.  Integration and system issues aside, he found his barriers to be data availability and the expressiveness of his tools.

Pendse:  he gave a number of barriers, including “security limitations, user scalability, and slow query performance… internal politics and internal power struggles (sites with both administrative and political issues reported the narrowest overall deployments)… hardware cost is the most common problem in sites with wide deployments; data availability and software cost;… software [that] was too hard to use…”

In grouping together the issues, I found the opportunity to apportion the responsibility widely.  All roles are important to the successful dissemination of a business’ intelligence: CEO, CIO, CFO, IT Director, IT staff, BI manager, BI professional (of whatever ilk), implementation consultant, vendor, SME (too often under- or not rated!), all the way down to the information consumer.

Comments welcome.  See part two for some discussion about gaps that exist in the delivery of BI.


What is Data Governance?  How does it relate to data stewardship? Meta-data management?

“Data Governance is a quality control discipline for managing, using, improving and protecting organizational information.

…It is an outcome oriented approach to treating data as a balance sheet:
asset (value)  –  liability (risk)”
Steve Adler

Although the terms at top are related, Data Governance presents an overarching philosophy for enterprise-level management of a business’ data resources.

It was, I believe, initiated by IBM’s Steven Adler who, responding to unaddressed management issues he encountered, in 2004 set up a Data Governance Council.  This includes a number of IBM customers (including top financial organisations such as Citigroup, Amex, Bank of America) and partners, plus some academic representatives, so yes, it was originally an IBM initiative.  But as a concept it has broken out of the box, so to speak, and there are initiatives all over, such as The Data Governance Institute, a fully vendor-neutral organisation.

My first encounter with the concept was a presentation by Adler at a 2008 IBM conference*, in which he gave his articulation of the various strands and mechanisms inherent in a ‘data governance organisation’.

Adler’s presentation started with a talk on the concept of toxic data (largely reproduced here, although he also discussed its role in the Global Financial Crisis), and its potential impact on an organisation and its customers and public.

Data Governance certainly appeals to those of us whose work intersects business and data issues.  It is concerned with managing the quality of an organisation’s data, mitigating the risks attached to data issues.  A Data Governance Committee constitutes a level of administration below executive level, but overseeing Data Stewards, who in turn mediate with the data consumers.

For my money, a DG committee should include a C-level sponsor, ideally both business and technology focused, such as CIO/CTO and CFO.  It should also include representatives of the business data owners, and data stewards, as well as, I believe, representatives at a data user level.  Obviously these voices would have differential weight on such a committee, but all those voices would contribute to the requisite quality outcome.

Data Governance is a business issue: data is an inherent part of a business and its processes.  There is no firm boundary between business and data – they flow into each other; they should reflect each other accurately.

DG is about identifying risks, implementing business policy, providing an auditable framework for management of data resources, and overseeing the actual management.  This is not as simple as managing bad data (although a committee can develop policy and accountabilities, and act as an escalation point).  Yet importantly, as Adler says, it should be a nexus for maintaining confidence – trust – in an organisation’s data resources.

But a DG committee can also be a forum for mediating competing claims on data (who owns it, how it should be represented).  It can define metrics and processes, including tolerance thresholds.  Data issues covered should include accuracy, completeness, currency, reasonability, consistency, and identifiability/uniqueness – although in practice, detail work can be devolved to other roles reporting to the committee, such as stewards or owners.  The important thing is to have a forum in which potentially competing interests (such as I.T., finance, and different business units) can come to agreement in a structured, auditable way.  Not only can competing interests be mediated, but potential gaps in responsibilities can be identified and covered.
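To make the idea of metrics and tolerance thresholds concrete, here is a minimal sketch in Python.  It is only an illustration: the field names, the sample records and the threshold values are all invented, and a real committee would agree its own metrics and tolerances.  It scores two of the issues listed above (completeness and uniqueness) and checks each score against an agreed minimum.

```python
def completeness(rows, field):
    """Share of rows where the field is present and non-empty (0.0 to 1.0)."""
    if not rows:
        return 1.0
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)


def uniqueness(rows, field):
    """Share of non-empty values that are distinct (0.0 to 1.0)."""
    values = [r[field] for r in rows if r.get(field) not in (None, "")]
    if not values:
        return 1.0
    return len(set(values)) / len(values)


# Hypothetical tolerance thresholds of the kind a committee might agree on.
thresholds = {
    ("customer_id", "completeness"): 1.00,
    ("customer_id", "uniqueness"): 1.00,
    ("email", "completeness"): 0.75,
}

# Invented sample data: one duplicate customer_id and one blank email.
customers = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": ""},
    {"customer_id": 2, "email": "c@example.com"},
    {"customer_id": 4, "email": "d@example.com"},
]

metrics = {"completeness": completeness, "uniqueness": uniqueness}

# Score every (field, metric) pair and record whether it meets its threshold.
scorecard = {}
for (field, metric), minimum in thresholds.items():
    score = metrics[metric](customers, field)
    scorecard[(field, metric)] = (score, score >= minimum)

for key, (score, ok) in scorecard.items():
    print(key, f"{score:.2f}", "ok" if ok else "below tolerance")
```

The point of keeping the thresholds in one agreed table is that the tolerances become a matter of recorded policy rather than of each analyst's judgement.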

According to Dave Loshin, best practices include:
– clarifying semantics
– evaluating business impacts of failure (putting a value on data assets – this can also protect data quality/governance initiatives)
– formalising quality rules
– managing SLAs
– defining roles
– deploying tools such as data profiling, mapping, etc
– [overseeing management of] metadata repositories
– scorecarding.

That last point is something both Adler and Loshin place importance on.  Scorecarding is a way of encapsulating initiatives and achievements in a way that can be socialised from C-level down.

Of the resources listed below, I most strongly recommend Adler’s presentation: it has copious detail.
Useful resources

*In a conversation afterwards, Steve proved to be a really nice guy.  As well as discussing business, we found we shared the burden of a famous namesake in the music industry (as you can see via Wikipedia or Google). He also had met Obama, and expressed appreciation for him at a time he was yet to prove himself properly in the US primaries.

Interestingly, as I write, his latest blog entry includes a cryptically brief comment about the effect of frameworks on innovation, and that he’s now “working on new ideas”.


For my money, Gartner’s and Forrester’s depiction of tools has broad equivalence. Their x-axes are Completeness of Vision and Strategy respectively; their y-axes are Ability to Execute and [strength of] Current Offering. Additionally, Forrester’s Wave helpfully spells out equivalence (of a sort), and sizes out market presence.

To compare, I looked at Gartner BI Q1 2008 and Forrester Q2 08 (which periods should not exhibit marked change, to my knowledge). One should expect their analyses to have congruencies, but they do differ, sometimes significantly. They both accorded leadership to IBM [Cognos], SAP [Business Objects], Oracle [Hyperion/Siebel products], and SAS, but Gartner included Microsoft, which Forrester downgraded to second tier, along with MicroStrategy, Information Builders, and SAP Netweaver. Forrester had them rather clustered, whereas Gartner differentiated more strongly between current execution and vision, interestingly ranging them from current to future order as Microsoft, Cognos, BO, Oracle, SAS.

Gartner's BI magic quadrant, Q1 2009

Gartner’s Q1 2009 (summary and better quality image here) has them more clustered, yet differentiated. Cognos, Oracle, and SAS are ahead, with Microsoft and SAP back. On this take, Cognos has the best current offering, while SAS has better vision. (I note here that MicroStrategy, placed in second tier, tends to perform particularly well with The BI Survey [OLAP Report] on customer satisfaction, which must count for something.)

Gartner observes, inter alia, a flattening of the market in terms of ROI and offering: bigger spends don’t yield greater satisfaction, and BI is becoming more accessible through open source, SaaS, and Microsoft.  But there’s a split in the market, between those going for a middleware solution (it will fit here) versus those seeking a vendor capable of providing a fully integrated product set – which puts a context on those market consolidations.

The overall impression I get is that one cannot guarantee a clear leader as the manufacturers attempt to leapfrog each other with each new release. It’s also worth mentioning the change in the BI market over the past five years or so, from straightforward query/analysis/report to a plethora of tools (dashboards, scorecards, etc) for conveying that intelligence to the right people.

Also of interest for BI are database, data integration, data quality, and collaboration tools.

In Data Warehousing, current analysis has few surprises. Forrester puts Teradata, IBM, and Oracle at the forefront, with Teradata slightly ahead due to strength of current offering. Standing back is Microsoft, still depicted as a leader due to their strategy more than their current offering. Which takes us to collaborative tools, which go some of the way to explaining Microsoft’s strength, particularly due to their Sharepoint products. This week, someone from a Microsoft-focused shop told me he was not selling on the basis of SQL Server products so much as Sharepoint – because of its presentation presence, albeit back-ended by SQL Server (an analysis can be read with the graph here at Intelligent Enterprise).

Data quality tools? Gartner’s view has rather changed in the past six months, now putting Dataflux clearly ahead, with Informatica and IBM [DataStage] bunched behind (viewed here).

Data Integration? As of Sept-08, IBM/DataStage were at front, with SAP/BO trailing, followed by SAS, with Microsoft and Oracle surprisingly far back in the field, due both to current offering and vision. Simple picture here.

Gartner’s BI-specific page has a lot of information to absorb; Forrester’s page is mostly just links to reports.

Accompanying analyses are intrinsic, but as Get Elastic points out, there’s no “one size fits all”.  You have to assess tools on their ability to meet business needs; choosing on the basis of leadership alone can burn fingers.

My experience is that every tool has its pain points, in terms of both capability and usability. The quality of implementation is probably a bigger determinant of success than toolset (amongst the general leaders, at least; niche players such as Qlikview do not have the broad capabilities called for in a comprehensive tool). Like data mining, successful implementation is more likely with experienced implementers – that is, consultants. Yet there are traps there, too. I’ve seen consultant installs that have shown insufficient business insight, and/or have left behind insufficient documentation or transference of skills – either of which can deflate an initiative.

Moreover, it has to be accepted that BI is an ongoing project. If a consultant sets and an enterprise forgets, a couple of years down the track there will be significant atrophy of relevance. Business needs, expectations, and technological possibilities are constantly evolving. That latter is where product leadership has the most significance.


I was once tasked with increasing revenue through data analysis.  Have all our sales resulted in ongoing service contracts?  Catch the opportunity for selling service contracts when the opportunity first arises, and identify any past sales that are not currently covered.

Sound easy?  Well the first part was.  It was a matter of writing reports on impending service contract (or warranty) expiries.
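For illustration, the core of such an expiry report can be sketched in a few lines of Python.  This is not the report I wrote at the time: the record layout, the customer names and the 60-day window are all made up for the example.

```python
from datetime import date, timedelta

# Hypothetical contract records; the field names are assumptions for illustration.
contracts = [
    {"customer": "Acme", "equipment": "Scanner A", "end_date": date(2009, 9, 10)},
    {"customer": "Birch", "equipment": "Scanner B", "end_date": date(2010, 3, 1)},
    {"customer": "Cole", "equipment": "Printer C", "end_date": None},  # date never entered
]


def impending_expiries(records, today, window_days=60):
    """Return records whose end date falls within the window after today, soonest first."""
    cutoff = today + timedelta(days=window_days)
    due = [
        r for r in records
        if r["end_date"] is not None and today <= r["end_date"] <= cutoff
    ]
    return sorted(due, key=lambda r: r["end_date"])


report = impending_expiries(contracts, today=date(2009, 8, 15))
for row in report:
    print(row["customer"], row["equipment"], row["end_date"].isoformat())
```

Note that the record with no end date silently drops out of the report, which is exactly the kind of gap that made the second part of the task so much harder.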

The second part became a hair-puller because it uncovered a number of different general data quality issues.  We’re talking about historical data here.  It’s easy to say that we now enter all service contract information, but since when?  Why were there masses of blanks for warranty or contract end dates?

Data quality issues are business issues.  Not just because business stakeholders care about them (often they don’t, if the issues don’t touch them), but because they reveal issues with business processes.  If we expect a field to have a value in it, then why doesn’t it?

The warranty end date is a good illustration.  It turned out that a) we sold some equipment second-hand, with no warranty; b) some minor parts did not come with warranties.  But if we already know data quality is patchy, then we don’t have a way of telling whether the equipment was sold without warranty or whether the data wasn’t entered.  I eventually traced this to source.  I got agreement from the person who entered the sales information; she found it simplest to enter ‘NONE’ if equipment was sold without warranty.  (Albeit process changes should be finalised – if not initiated – in a structured way with the business unit’s manager, to ensure retention as a permanent procedure.)

Okay, reports written, business process updated, what’s left?  Trawling through historical data for service contract opportunities.  This proved more elusive than it sounded – because of data quality issues.  We didn’t know whether a customer still had the equipment – although they should be covered by a Service Account Manager, some SAMs were better than others, and customers were under no obligation to notify.  Nor was it easy to identify high-value opportunities – we didn’t have access to a master list of equipment and their value – only one that covered currently sold items.  The list of hurdles went on… but finally we got to a point where we felt opportunity returns would be too low for further pursuit.

Some time after this, I attended a presentation – under the aegis of an earlier incarnation of the Sydney branch of TDWI – on data profiling.  It illustrated how a quick pass with data profiling tools could replace months of manual analysis.  You beauty!, I thought.  I wish I’d had such a tool.  Saved labour costs would far exceed expenditure, and I can’t see that spending that time down and dirty with the data gave me enough extra insight that a tool wouldn’t provide.

Some of the lessons in this:
– understand the quality of the data before embarking on an extended journey in data analysis;
– data profiling should be a first step in a data quality exercise;
– data profiling tools rock!
– attempt an ROI for such an exercise, and try to quantify the end point (albeit sometimes there is a “just do it” command; for example in the above case, the business unit needed to increase revenue);
– poor data quality can generally be traced back to an originating business process; yet bad data sometimes reflects only historical practices that no longer happen;
– poor data quality often only surfaces when a business stakeholder finds a new use (or old use with renewed vigour!) for the data in question.
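
To give a flavour of what a first profiling pass reports, here is a minimal sketch in Python.  It is a toy, not a substitute for a real profiling tool, and the sample sales records and the warranty_end field name are invented for the example.  It counts the blanks in one column and lists the most common values, which is enough to show both the missing end dates and the ‘NONE’ convention at a glance.

```python
from collections import Counter


def profile_column(rows, column):
    """Summarise one column: row count, blank count, distinct values, most common values."""
    values = [row.get(column) for row in rows]
    # Treat None and whitespace-only strings as blank.
    blanks = sum(1 for v in values if v is None or str(v).strip() == "")
    filled = [str(v).strip() for v in values if v is not None and str(v).strip() != ""]
    counts = Counter(filled)
    return {
        "rows": len(values),
        "blank": blanks,
        "blank_pct": round(100 * blanks / len(values), 1) if values else 0.0,
        "distinct": len(counts),
        "top_values": counts.most_common(3),
    }


# Invented sample: two blanks and two explicit 'NONE' entries.
sales = [
    {"item": "Scanner A", "warranty_end": "2010-06-30"},
    {"item": "Scanner B", "warranty_end": ""},
    {"item": "Cable", "warranty_end": "NONE"},
    {"item": "Printer C", "warranty_end": None},
    {"item": "Toner", "warranty_end": "NONE"},
]

profile = profile_column(sales, "warranty_end")
for name, value in profile.items():
    print(name, value)
```

Even this much separates the two cases that were indistinguishable in the historical data: an explicit ‘NONE’ shows up as a value, while a blank shows up in the blank count.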

Data quality issues are business issues – unless the technical people are goofs, quality issues originate with business processes. This is great: identifying root cause is most of the battle, and the solutions are usually the easiest part.  However that doesn’t make the investigation mission critical; that represents cost the business must be willing to bear.

Of course, it should be the business stakeholders rather than the technical analyst who decide the scope or magnitude (priority) of a data quality issue.  That doesn’t make it so, unfortunately.  The flipside of “just do it” is “don’t bother me” – then when the data proves to be bad, it’s possible to take just as much flak for not doing anything (based on business direction) as for inappropriately prioritising tasks.  Still, the technical analyst needs to remain mindful of getting caught up in the “last mile” of quality assurance when it takes an inordinate effort or there are potentially higher priorities.

I recommend a blog entry on cleansing priority: see Cleanse Prioritisation for Data Migration Projects – Easy as ABC?.  While aimed at data migration projects, it gives some good suggestions for placing qualitative priorities on data cleansing tasks, especially where a deadline is involved.  It’s a matter of attributing a business impact to not doing each cleansing task, and inter alia it flags the trap of spending time on the easy/obvious over the task with greater business impact.  The Importance Of Language, too: if you couch the priorities in sufficiently clear business impact terms, it’s easier to avoid the other great trap of rehashing old ground.  “Target Of Opportunity”, for example, accords no business impact, but it’s not like distracting the business with “I won’t ever bother addressing this”.  Then again, there are pitfalls if too much bad data falls into a lower priority bucket; there’s little worse than a stakeholder’s loss of confidence in the data.
