Having read a variety of thoughts on data dictionaries, I have come to the conclusion that there is no ready consensus on exactly what they look like and what they are for.
The latter sounds easy: a data dictionary is meant to communicate (socialise) the contents (or rather, metadata) of one or more data sources.
That’s laudable: we all want to be on the same page.
Yet most of the descriptions I’ve seen amount to little more than sucking out the tables, views and columns from a database, and publishing them as they stand. The result must be both incomprehensible to, and unusable by, business stakeholders.
I also see quite a dichotomy between starting a data dictionary as an aid to building a (greenfields) data source, and using one to document an already-existing system. The immediate goal is quite different in each case, although the end-point would be the same. The starting point differs too. For a greenfields build, business aims lead, and the exercise would begin from business definitions. For an existing system, documenting the here-and-now of the data sources is what matters, and business sense is built up from the atomic data definitions.
In documenting existing data sources, the hardest task seems to be to translate database definitions to business terminology, and I’ve never seen that done well, far less done automatically.
My vision of a data dictionary to document an existing system is as follows.
Suck out the atomic data definitions. From these, create a Wiki that enables Subject Matter Experts from both business and IT disciplines to build up a common, agreed understanding. From changes to this Wiki, generate update code for the data sources.
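To make those three steps concrete, here is a minimal sketch in Python. It is not a reference implementation, and everything specific in it is my own assumption: it reads an SQLite database through the standard sqlite3 module, the function names (extract_definitions, render_wiki_stub, generate_comment_sql) are hypothetical, the Wiki stub uses MediaWiki-style markup, and "update code" is interpreted narrowly as COMMENT ON COLUMN statements of the kind PostgreSQL and Oracle accept (SQLite itself would not run them).

```python
import sqlite3


def extract_definitions(conn):
    """Step 1: read tables, views and their columns from SQLite's catalog.

    Returns {object_name: [(column, declared_type, not_null), ...]}.
    """
    definitions = {}
    objects = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type IN ('table', 'view') AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    for (name,) in objects:
        # Quote the identifier so unusual table names do not break the PRAGMA.
        quoted = '"' + name.replace('"', '""') + '"'
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
        definitions[name] = [(c[1], c[2], bool(c[3])) for c in columns]
    return definitions


def render_wiki_stub(table, columns):
    """Step 2: produce a Wiki page stub for SMEs to fill in (assumed MediaWiki markup)."""
    lines = [f"== {table} ==", "", "Business definition: TODO", ""]
    for column, declared_type, not_null in columns:
        required = "required" if not_null else "optional"
        lines.append(f"* {column} ({declared_type or 'untyped'}, {required}): TODO")
    return "\n".join(lines)


def generate_comment_sql(table, agreed):
    """Step 3: turn agreed descriptions into COMMENT ON COLUMN statements.

    `agreed` maps column name to the description settled on in the Wiki.
    """
    statements = []
    for column, description in sorted(agreed.items()):
        escaped = description.replace("'", "''")  # escape quotes for a SQL literal
        statements.append(f"COMMENT ON COLUMN {table}.{column} IS '{escaped}';")
    return statements


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cust (cust_id INTEGER NOT NULL, nm TEXT)")
    for table, columns in extract_definitions(conn).items():
        print(render_wiki_stub(table, columns))
    print(generate_comment_sql("cust", {"nm": "Customer's trading name"}))
```

The hard part, translating cust.nm into something a business stakeholder recognises, stays with the people editing the Wiki. The code only fetches the raw material and carries the agreed wording back to the database.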
Since documenting this vision, I’ve only found one organisation that has made a stab at anything like this: Metcash, and they apparently use it to document their data but not to generate updates. I’ve not seen their Wiki in detail, but they have a couple of thoughts to feed into the mix.
One: Updating of the Wiki needs to be restricted. It’s unwieldy to permit everyone to have a stab at improving definitions. Keep the update community small and manageable.
Two: it’s an evolving process. As with all development work around data, the project never reaches an end point, if for no other reason than that the business changes and so the data keeps changing.