Although data quality exercises have a variety of business paybacks, they are often low on the radar until a particular business need (or failure!) arises.  That can be short-sighted: often enough they can be justified by a cost-benefit analysis of the existing data quality alone.  Business decision-makers who don’t want to allocate budget should be made properly aware of the ramifications of saying no – all too often, the case is simply not presented clearly enough in a business context.  And once undertaken, a data quality project should eventually transform into ongoing data quality processes, including audits and governance, which are far less costly than revisiting the same issues later when they resurface through different failures.

Data quality projects can emerge from many issues affecting the business.  General examples are problems with:

  • accuracy
  • completeness
  • timeliness
  • consistency

which can especially derail change in an organisation, whether new business development or new IT functionality.
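
To make those dimensions a little more concrete – this isn’t from the talk, just a minimal sketch assuming a hypothetical customer table with email, country and updated_at fields, every name in it invented for the example – each can be expressed as a measurable check:

    from datetime import datetime, timedelta, timezone
    import re

    # Hypothetical customer records; field names are illustrative only.
    customers = [
        {"email": "a@example.com", "country": "AU",
         "updated_at": datetime(2024, 1, 5, tzinfo=timezone.utc)},
        {"email": "", "country": "Oz",
         "updated_at": datetime(2019, 6, 1, tzinfo=timezone.utc)},
    ]

    def completeness(records, field):
        # Proportion of records with a non-empty value for the field.
        return sum(1 for r in records if r.get(field)) / len(records)

    def accuracy_email(records):
        # Proportion of records whose email at least matches a basic pattern.
        pattern = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
        return sum(1 for r in records if pattern.match(r.get("email", ""))) / len(records)

    def timeliness(records, max_age_days=365):
        # Proportion of records updated within the last max_age_days.
        cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
        return sum(1 for r in records if r["updated_at"] >= cutoff) / len(records)

    def consistency_country(records, valid_codes=frozenset({"AU", "NZ", "US", "GB"})):
        # Proportion of records using an agreed country-code list.
        return sum(1 for r in records if r.get("country") in valid_codes) / len(records)

    print("completeness(email):  ", completeness(customers, "email"))
    print("accuracy(email):      ", accuracy_email(customers))
    print("timeliness:           ", timeliness(customers))
    print("consistency(country): ", consistency_country(customers))

None of these checks is sophisticated, but even scores this crude give the business something concrete to weigh in a cost-benefit discussion.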

A talk given through TDWI by Chris King illustrated some typical experiences with a data quality project, at the regional level of an international hotel chain.  There were 8,000 employees in the region, half of whom could directly impact the data.

The starting point was the ‘single customer view’ objective.  This is a very common confluence of business need with IT strategy: as often as we hear that a company’s customers are its most significant business resource, it is almost axiomatic that the biggest data quality issues are to be found in customer data.  Customer data tends to come from a variety of sources of variable quality – too often, the customer can enter their own information without any human mediation.
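
To make the matching problem concrete, here is a hedged sketch – not from the talk, with all sources and field names invented – of the kind of logic a single customer view ultimately rests on: records from several sources are reduced to a normalised key and grouped:

    from collections import defaultdict

    # Illustrative records from three sources; all names and fields are invented.
    records = [
        {"source": "central_reservations", "name": "Joan Smith", "email": "Joan.Smith@example.com"},
        {"source": "travel_site",          "name": "J. Smith",   "email": " joan.smith@example.com"},
        {"source": "email_marketing",      "name": "Joan Smyth", "email": "joan.smyth@example.com"},
    ]

    def match_key(record):
        # Crude matching key: lower-cased, trimmed email address.
        return record["email"].strip().lower()

    clusters = defaultdict(list)
    for r in records:
        clusters[match_key(r)].append(r)

    for key, members in clusters.items():
        print(key, "->", [m["source"] for m in members])

Real matching uses far richer rules (names, addresses, loyalty numbers), but the principle is the same: the poorer the source data, the more such keys fragment or collide.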

Yet the relationship between a customer-centric business and its data quality strategy is variable.  Jim Harris at OCDQ has a tragicomic tale to relate about an MDM/EDW* project with 20 customer sources.  He characterised the company as having a business need to identify its most valuable customers, yet they “just wanted to get the data loaded” (sounds familiar) and intended to rely on MDM and “real-time data quality” via the ETL processing.
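
For contrast, ‘relying on data quality in the ETL’ often amounts to something like the following – a rough sketch with invented names, shown only to make the point: rows that fail in-flight validation are diverted to a reject pile, and nothing upstream gets examined or fixed:

    def validate(row):
        # Minimal in-flight checks; real rules would be far richer.
        problems = []
        if not row.get("customer_id"):
            problems.append("missing customer_id")
        if "@" not in row.get("email", ""):
            problems.append("bad email")
        return problems

    def etl_load(rows, target, rejects):
        # Load rows that pass; divert the rest to a reject pile.
        # Nothing here explains *why* the source data is bad, or fixes it at source.
        for row in rows:
            problems = validate(row)
            if problems:
                rejects.append((row, problems))
            else:
                target.append(row)

    target, rejects = [], []
    etl_load(
        [{"customer_id": "C1", "email": "a@example.com"},
         {"customer_id": "", "email": "not-an-email"}],
        target, rejects,
    )
    print(len(target), "loaded,", len(rejects), "rejected")

With 20 customer sources, that reject pile grows quietly while the ‘most valuable customer’ question stays unanswered.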

How valid is that approach?  It should be decided by the key business stakeholders, with input from the technical analysts on current data quality (and project constraints).  From the sound of it, that’s not how the decision-making was done – yet even if it had been, how confident were the key business stakeholders that they had a good handle on the issues (and weren’t lost in the technical details)?

In the case of the hotel chain, 40% of bookings arrived centrally, while 60% came from people booking directly.  Generally the former contained better quality data, as it tended to involve repeat customers with an established history – which had resulted in some informal cleansing in the past.

Issues were sourced to the variety of collection points, such as:

  • call centre: cost containment requirements had crunched call time, with an attendant reduction in data capture;
  • third-party collectors of information, such as travel websites: they may have their own data capture requirements, but they’re just as likely to regard the customer as their own, and forward minimal details;
  • email marketing: less focus on eliciting the full gamut of customer details.

Making fields mandatory presents a typical quandary: you want to capture as much as possible, but people will always find a reason – and a way – to circumvent mandatory fields.  And what’s worse than no data?  Bad data – especially when shuffled in with good data.  Among the ideas tested were simply highlighting some fields rather than mandating them, and a trial of requesting driver’s licences.
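
The difference between mandating and highlighting is easy to sketch – again purely illustrative, with assumed field names: hard rules block submission, soft rules only flag a field for follow-up, so there is less temptation to type junk just to get past the form:

    def validate_booking(form):
        # Split checks into blocking errors (mandated) and non-blocking warnings (highlighted).
        errors, warnings = [], []
        if not form.get("surname"):
            errors.append("surname is required")             # hard: block submission
        if not form.get("email"):
            warnings.append("email missing - highlight it")  # soft: flag, don't force
        if not form.get("phone"):
            warnings.append("phone missing - highlight it")
        return errors, warnings

    errors, warnings = validate_booking({"surname": "Smith", "email": ""})
    if errors:
        print("block submission:", errors)
    else:
        print("accept booking; follow up on:", warnings)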

They separated information from I.T., as Information Services.  This was to better deliver information management, champion data quality, and support decision-making.  In contrast to Jim Harris’ example above, they worked on data quality before data integration projects – which can significantly reduce the cost of such projects when their turn comes.  In fact, Chris commented that once the objectives of the data quality project were well understood, it was far easier to introduce the changes, and the exercise softened up the stakeholders for other objectives such as integration.

Data Stewardship is an important part of the ongoing process.  Once you’ve brought people together initially, it’s easier to set up a structure to manage data continuously – not just as a centralised dictionary, but as a necessary and useful dialogue with affected stakeholders.  This can prevent situations like those Chris uncovered, such as finding that one person’s VIP code had been set up by someone else to flag inclusion in a blacklist.
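
The VIP-code clash is exactly the kind of thing a shared, stewarded dictionary of codes surfaces early.  A toy sketch, with everything in it assumed for illustration:

    class CodeRegistry:
        # Toy shared dictionary: each code value carries one agreed meaning and owner.
        def __init__(self):
            self._codes = {}

        def register(self, code, meaning, steward):
            existing = self._codes.get(code)
            if existing and existing["meaning"] != meaning:
                raise ValueError(
                    "code %r already means %r (steward: %s); proposed %r by %s"
                    % (code, existing["meaning"], existing["steward"], meaning, steward)
                )
            self._codes[code] = {"meaning": meaning, "steward": steward}

    registry = CodeRegistry()
    registry.register("V1", "VIP guest", steward="loyalty team")
    try:
        registry.register("V1", "blacklisted guest", steward="security team")
    except ValueError as clash:
        print(clash)  # the clash surfaces at registration, not months later in a report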

Data quality thresholds were addressed by incentives ranging from ice cream in call centres through to General Manager bonuses.

Chris commented that there remained some wider business issues to resolve, such as tracking business versus leisure travel, and upselling into different brands [of hotel].  But as I said, further developments are less likely to be stymied by poor data once a cleansing exercise is under one’s belt and a quality structure is in place.

* Master Data Management, Enterprise Data Warehouse