Archive for September, 2009

What does Business Process Improvement have to do with BI?

Not much, if you read Wikipedia.  But I’m beginning to suspect that a large number of the Wikipedia articles that stand at the confluence of business and technology are written by management ‘experts’ who are practising for their next book, based on their part-time management studies.  Certainly most of those articles are arcane enough to be of little use to those of us steeped in the practical application of technology.

Yet there are other ways to bring about improvement in business processes than by following a rigorous methodology imposed from on high.

On the one hand, it’s often possible to just walk into a business and identify candidates for improvement, if not processes that are thoroughly broken.  That’s not just because a fresh set of eyes can help, but because experience – enough experiences in different workplaces – can help to quickly identify both the work practices that are worth repeating and those that are plainly broken.

But that’s not what I’m talking about either.

It is this: a business’ data is a model of the business and its practices.  In turn, business intelligence is a process of accurately reflecting that business and its processes.  And in endeavouring to do so, a good amount of business analysis is called for, to understand the business as you reflect it.  That holistic engagement process has a habit of uncovering what is not working as expected in business processes, both in practice (when analysing what people are doing) and in the virtual model – because when the data is shown to be incorrect and/or not as expected, that mismatch tends to reflect business processes that are awry.

That is not necessarily a part of the brief of a business intelligence professional.  Yet with forward-thinking management, it can be.

But at the very least, business intelligence professionals are ideally placed to gain insight into both a business and the model of the business and, in identifying mismatches, to foster improvements in business processes.  It would be negligent to waste such opportunities.



Why the buzz over columnar databases recently?  They’ve been around since at least the 1970s.  At the moment they remain the realm of niche players, although Sybase has had such a product for years, in Sybase IQ.

Gartner has been giving them a big tick since as far back as 2007.

Yet for some reason, I’ve been assailed by the concept from several disparate sources over the past month or so, some of which are heavy on the blah blah blah advantages but light on specifics such as reasons for those advantages.

I don’t pretend to have done the research, so I’ll just present a quick overview and links.  (At the very least, you can refer to Wikipedia’s article, linked above.)

In a nutshell, it is as it says: data is stored by column rather than by row (though retrieval implementations seem to vary, with both row- and column-based querying variously supported).  Typically more meaningful in an OLAP than an OLTP context, it is said to be particularly beneficial when frequently working with a small subset of columns in a table that has a large number of them.  And, you guessed it, aggregation particularly thrives.
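To make the row-versus-column contrast concrete, here’s a toy sketch in Python – not how any vendor actually implements it, just the two layouts of the same table, and why a single-column aggregation favours the columnar one:

```python
# Row store: each record kept together -- convenient for fetching
# whole rows, as in OLTP workloads.
rows = [
    {"id": 1, "region": "NSW", "sales": 100.0},
    {"id": 2, "region": "VIC", "sales": 250.0},
    {"id": 3, "region": "NSW", "sales": 175.0},
]

# Column store: each column kept together -- convenient when a query
# touches only a few of many columns, as in OLAP workloads.
columns = {
    "id":     [1, 2, 3],
    "region": ["NSW", "VIC", "NSW"],
    "sales":  [100.0, 250.0, 175.0],
}

# Aggregating one column reads a single contiguous sequence...
total = sum(columns["sales"])

# ...whereas the row store must walk every record to reach that field.
total_rows = sum(r["sales"] for r in rows)

assert total == total_rows == 525.0
```

In a real engine the columnar layout also compresses far better (long runs of similar values), which is part of why aggregation over a wide table with few queried columns thrives.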


  • There’s a simple overview in a blog by Paul Nielsen, who puts his hand up for an implementation in SQL Server;
  • There’s a small simulation in Oracle, in a blog by Hemant Chitale (with a caveat in a comment).


[Part one of this discussion looked at different definitions of BI, and a very salient example of how it can be done well.]

When I’ve presented to people on the opportunities inherent in business intelligence, they marvel when they see information that is directly relevant to their work, in a new and meaningful light: summarised, for example, or detailed, or with direct visual impact that promotes new insights.

That’s the easy part.  Delivery is harder.

When I need to take a step back and assess what I am doing, I ask:

What does business want out of business intelligence?

This is particularly pertinent if a BI implementation is less than successful – and I’ve never seen an implementation that really, I mean really, delivers.  I’m not talking about simply analysing business requirements, but about understanding what is needed to deliver effectively.

There are many different ways of answering this question.

1) The anecdotal

My experience is probably not too different from many others.  In general, the feedback I’ve had from business stakeholders is:

  • They don’t know what they want; and/or
  • They want you to do it all for them

That’s a bit glib, but later I’ll extract some value from it.  In fact, as long as you’re delivering tangible value, I’ve found the business information consumers are reasonably happy.  It’s easy enough to rest on that, but as a professional it pays to think ahead.  Unfortunately, there remains a need for a level of business commitment to information issues – and I’m not talking about getting training in the tools or the data qua data, more about adopting an information-as-strategic-resource mindset.

2) The statistical

In a recent survey run by BeyeNetwork, the top two desires of business for BI are:

  • Better access to data
  • More informed decision-making

Axiomatic, no?  These effectively say the same thing, but there is nuance in each.

On the one hand, can business get whatever information they can possibly envisage, and in a format (whether presentation or analytical input) they can use effectively?  Clearly not – that’s a moving target.  But it’s also a goal to constantly strive for.

On the other hand, for business decisions to be made, it needs to be asked: what would support them in that process?  That’s too high-level for an immediate answer from most people.  Drilling into the detail of the processes is business analysis.  Maintaining such an understanding of business processes should rightly belong with the business, who should be fully on top of what they do and how they do it.  In practice, it’s often only when prompted by necessity – such as analysing information needs – that that exercise is done with much rigour.

3) The ideal

In an ideal world we would provide the knowledge base for a worker to be truly effective – which includes not just the passive support information, but the active intelligence that can generate useful new insights.  There’s a lot that can go into this, but the wishlist includes fuller realisation of:

  • Data integration: of information from disparate sources (not just databases)
  • Transformation: from data to business meaning
  • Presentation: insightful representation of information (the current buzzword being visualisations)
  • Discovery: the opportunity to explore the information
  • Timeliness: information when they need it, where they need it, no delays
  • Control: the ability to save (and share) meaning that they encounter
  • Tools: a good, intuitive user experience – no learning hurdle, no techy barrier
  • Technical integration: seamless integration with the software and hardware environment (applications and devices respectively)
  • Autonomy: the ability to do it themselves

That last one is an interesting one: it’s the exact opposite of what I said I’d experienced.  But the gap there is in the toolset, the environment in which the information is presented.  If it’s something they can intuitively explore for themselves, extracting meaning without a painful learning curve, they will want to do it themselves.

This can’t be achieved by the data professional in isolation.  To achieve the above needs collaborative efforts: with business stakeholders, other IT professionals, and software vendors.

I don’t think there’s any BI implementation out there that delivers to the ideal.  Better business engagement, better business commitment, more resources for BI, better software tools, better integration: these would help.

We will get a lot closer to the delivery ideal.  But by then, BI will look rather different from today’s experience.

The dangling question: are new paradigms needed for BI to be fully realised?  If it is so hard to properly achieve the potential of BI today, there must be ways of working better.


Bill Inmon is one of the two gurus of data warehousing.  He claims to have invented the modern concept of the data warehouse, and he favours the top-down approach to design.

[Ralph Kimball is the other modern guru, who is credited with dimensional modelling – facts and dimensions.  He favours bottom-up design, first building data marts.]

Inmon is associated with the BeyeNetwork, maintaining his own “channel” there, on Data Warehousing.

Recently discussing data quality, he canvassed the issue of whether to correct data in a warehouse when it’s known to be wrong.

One approach is that it is better to correct data – where known to be in error – before it reaches the warehouse (Inmon credits Larry English for this perspective).

In contrast, there’s the notion that data should be left in the warehouse as it stands, incorrect but accurately representing the production databases. Inmon attributes this approach to Geoff Holloway.

Of course, Inmon easily demonstrates cases for both perspectives.  This is understandable because both versions of the data – corrected or incorrect – provide information.  On the one hand, business data consumers would want correct information, no mucking around.

But on the other hand, incorrect data is an accurate reflection of production values – and it can be misleading to represent it otherwise.  In particular, bad data highlights the business process issues that led to the entry of the errors, and that in itself is valuable business information.

And here’s where I branch beyond Inmon.  I would argue the case for preserving both versions of the data, in one form or another.

We have all experienced the exasperation of being faced with poor quality data flowing into business reports/information.  On a day-to-day basis, the information consumer doesn’t want to know about errors – they just want to use the information as it should rightly be, as a business input.  They may well be aware of the issues, but prefer to put them to one side, and deal with BAU* as it stands.

What this is saying is that the approach to data quality fixes should really be a business decision.  At the very least, the relevant business stakeholders should be aware of the errors – especially when systemic – and make the call on how to approach them.  In fact, ideally this is a case for… a Data Governance board – to delegate as they see fit.  But unless the issues are fully trivial, errors should not be fully masked from the business stakeholders.

So if the stakeholders are aware of the data issues, but the fix is not done and they don’t want to see the errors in day-to-day reportage, how do we deal with the need to fix – at least as the data is represented?

I see four options here, and I think the answer just pops out.

Option 1: correct the data in the reports
Option 2: correct the DW’s representation of the data with a view
Option 3: correct the data itself in the DW
Option 4: correct it in ETL processing

Option 1 is fully fraught.  I have done this on occasion when it has been demanded of me, but it is a poor contingency.  You’re not representing the data as it exists in the DW, but more importantly, if you have to run a transform in one report, you may well have to reproduce that transform.  Over and over.

Option 2: creating a view is adding a layer of complexity to the DW that is just not warranted.  It makes the schema much harder to maintain, and it slows down all processing – both ETL and reporting.

Fixing the data directly in the DW (option 3) is certainly done.  But again, it may have to be done over and over, if ETL overwrites it with the bad data again.  And there is a very sensible dictum I read recently, paraphrased thus: any time you touch the data, you can introduce more errors.  Tricky.  Who can say with certainty that they have never done that?

Of course, I would favour handling it in ETL.  More specifically, I would like to see production data brought to rest in a staging area that is preserved, then transformed into the DW.  That way, you have not touched the data directly, but you have executed a repeatable, documentable process that performs the necessary cleansing.
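The shape of that staging-then-cleanse approach can be sketched as follows – a hypothetical Python example, with the cleansing rules themselves invented purely for illustration (in practice they’d be the corrections the business or DG board has signed off on):

```python
def cleanse(record):
    """Apply the documented, business-approved corrections.

    Both rules below are hypothetical examples of systemic errors.
    """
    fixed = dict(record)
    # e.g. a blank country from a known entry-screen fault, defaulted
    if not fixed.get("country"):
        fixed["country"] = "AU"
    # e.g. negative quantities from a broken form, agreed to be zeroed
    if fixed.get("qty", 0) < 0:
        fixed["qty"] = 0
    return fixed

def load_to_dw(staging_records):
    # Staging is never modified: the warehouse receives a cleansed copy,
    # while the raw production values remain on hand for process analysis.
    return [cleanse(r) for r in staging_records]

staging = [
    {"id": 1, "country": "", "qty": 5},
    {"id": 2, "country": "NZ", "qty": -3},
]
warehouse = load_to_dw(staging)

assert warehouse[0]["country"] == "AU" and warehouse[1]["qty"] == 0
assert staging[0]["country"] == ""  # the staged original is untouched
```

The point of the sketch: the transform is a repeatable, documentable function, the raw data is preserved for anyone analysing the underlying process issues, and nobody has hand-edited warehouse rows.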

Not always possible, with resource limitations.  Storage space is one problem, but it may be more likely (as I have experienced) that the ETL processing window is sufficiently narrow that an extra step of ETL processing is just not possible.  Oh well.  There’s no perfect answer; the solution always has to fit the circumstance.  Again, of course, it’s a matter of collaboration with the business (as appropriate via the data steward or DG board).

Oh, and most importantly: get back to the business data owner, and get them working (or work with them) on the process issue that led to the bad data.

*BAU = Business As Usual – at risk of spelling out the obvious.  I find acronyms anathema, but spelling them out can interrupt the flow of ideas.  So I will endeavour to spell them out in footnotes, where they don’t have to get in the way.

