Friday, July 27, 2007

Data consistency and accuracy

Earlier this week, a private contractor working for the City of Burnaby (a suburb of Vancouver) struck a high-pressure oil pipeline with a backhoe (a video of the incident), sending about 1,400 barrels of crude oil high into the air. The oil covered the neighbourhood and eventually leaked into Burrard Inlet. It took half an hour before the pipeline was shut off and two hours before oil booms were deployed to contain the slick.

Now the clean-up begins, and the finger of blame is circulating, looking for a person or organization to land on. The city says the oil company didn't give it accurate information about where the pipeline was located (apparently it was 9 metres from where the map showed it). The clean-up is going to take months, and there will obviously be health issues for the residents as well as for the wildlife in the immediate area.

I just wanted to underscore the importance of data accuracy and consistency. I know that many organizations face similar problems, but you would think that with today's technologies (GPS, thermal mapping, etc.), it would be a no-brainer to verify and map the exact location of a pipeline.

In my previous life as a data management professional, the mantra was a single source of data, but I have since recognized that it is okay to have multiple copies of the same data as long as those copies come from the official source, which remains the single point of truth for the organization. Can you imagine the nightmare if you have multiple copies of the same data, each with a different value, and no one is sure which value is the correct one?
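
To make that concrete, here is a minimal Python sketch of checking copies against the official source. Everything in it is made up for illustration: the function name `find_divergent_copies`, the map names, the segment keys, and the offset values are all hypothetical and have nothing to do with the actual pipeline records.

```python
# Sentinel so that "key is absent" is distinguishable from "value is None".
_MISSING = object()


def find_divergent_copies(official, copies):
    """Compare each copy against the official source.

    Returns {copy_name: {key: (official_value, copy_value)}} for every
    key whose value differs from, or is missing in, either side.
    """
    report = {}
    for name, copy in copies.items():
        diffs = {}
        for key in official.keys() | copy.keys():
            expected = official.get(key, _MISSING)
            actual = copy.get(key, _MISSING)
            if expected != actual:
                diffs[key] = (expected, actual)
        if diffs:
            report[name] = diffs
    return report


# Hypothetical data: pipeline offsets in metres from a reference line.
official_source = {"segment_a": 120.0, "segment_b": 250.0}
copies = {
    "city_map": {"segment_a": 129.0, "segment_b": 250.0},        # drifted copy
    "contractor_map": {"segment_a": 120.0, "segment_b": 250.0},  # faithful copy
}

report = find_divergent_copies(official_source, copies)
for copy_name, diffs in report.items():
    print(copy_name, diffs)
```

A check like this only works if the organization has agreed on which data set is the official one. Without that agreement, the report tells you the copies differ but not which one is right.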

My group is currently dealing with data inconsistency in a particular project that went live earlier this year. The problem is that someone decided to use the same field/column for two separate attributes. This is never, never okay in the world of data management and data modelling, but worse is the fact that the decision was never fully shared with, or documented for, the rest of the project team. You can see why certain things are no longer consistent when the meaning of said field changes depending on other attributes.
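
For readers who have not run into this, here is a small Python sketch of what an overloaded column looks like and one way to untangle it. The schema is entirely invented (it is not our project's): a hypothetical `date_value` field means "ship date" when `record_type` is `"order"` and "received date" when it is `"return"`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LegacyRow:
    record_type: str  # hypothetical discriminator: "order" or "return"
    date_value: str   # overloaded: ship date for orders, received date for returns


@dataclass
class CleanRow:
    record_type: str
    ship_date: Optional[str] = None      # only set for orders
    received_date: Optional[str] = None  # only set for returns


def split_overloaded(row: LegacyRow) -> CleanRow:
    """Move the overloaded value into the attribute it actually represents."""
    if row.record_type == "order":
        return CleanRow(row.record_type, ship_date=row.date_value)
    if row.record_type == "return":
        return CleanRow(row.record_type, received_date=row.date_value)
    # Fail loudly instead of guessing what an undocumented type means.
    raise ValueError(f"undocumented record_type: {row.record_type!r}")


legacy_rows = [
    LegacyRow("order", "2007-07-20"),
    LegacyRow("return", "2007-07-25"),
]
clean_rows = [split_overloaded(row) for row in legacy_rows]
for row in clean_rows:
    print(row)
```

The point of the split is that each attribute gets its own column with one meaning, so nobody needs to know an unwritten rule to read the data correctly.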
