Zach bought a new Ferrari 458. High-performing even by elite sports car standards, this vehicle boasted a V8 engine capable of producing 562 horsepower. It zoomed from 0-62 MPH in under three breathtaking seconds, reaching top speeds of a very illegal (in most places) 210 MPH. Every morning, Zach meticulously washed and polished his jewel of an automobile. But he never used the high-octane fuel recommended by Ferrari, skipped all of the regular maintenance appointments, and never rotated the tires, checked the brakes, or changed the oil. As you can see, we refer to Zach's lovely vehicle in the past tense.
This is exactly what companies do when they invest megabucks in data warehousing and maintenance, yet never check or cleanse their data. Like Zach's Ferrari, the engine soon clogs, and it starts spitting a nasty-smelling funk. That's when you begin to blame your analytical tools and your data scientists -- when the real problem is the data quality.
There's an old acronym database administrators used to toss around: GIGO. Garbage in, garbage out. It's true about putting inferior motor oil and low-octane fuel in a Ferrari, it's true about filling up on doughnuts and sugary drinks before bedtime, and it's true about your data. If you put bad data into good analytical tools, you'll never wind up with reliable results. In fact, the answers you get could be so skewed that you'd be better off without your expensive data warehouse and analysis.
Data quality can become inferior in a number of ways, including:
Certain types of data have a notoriously short shelf life. They begin to go out of date as soon as you collect them. These types of data (like marketing databases) need to be checked and cleansed more regularly than data that holds its own better.
If one sales rep calls a lead, the lead may or may not buy. If two or three sales reps contact the same lead because they're working with duplicated data, it looks like your right hand doesn't know what your left is up to. Similarly, data can contain bad information that makes your whole company look like it doesn't know what it's doing. The same is true with data used for operational intelligence, business intelligence analytics, machine learning apps, or any other data analytics endeavor. Garbage in, garbage out.
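To make those two problems concrete, here is a minimal sketch of what a basic check for stale and duplicated leads might look like. Everything in it is an assumption made for illustration: the sample records, the field names (`email`, `last_verified`), the one-year staleness threshold, and the idea of matching duplicates on a normalized email address. Real cleansing jobs usually need fuzzier matching than this.

```python
from datetime import date

# Hypothetical lead records; names and fields are illustrative only.
leads = [
    {"name": "Ann Lee", "email": "Ann.Lee@Example.com ", "last_verified": date(2024, 1, 10)},
    {"name": "Ann Lee", "email": "ann.lee@example.com", "last_verified": date(2023, 3, 2)},
    {"name": "Bo Chen", "email": "bo@example.com", "last_verified": date(2021, 6, 30)},
]


def normalize_email(email):
    """Trim whitespace and lowercase so trivially different spellings match."""
    return email.strip().lower()


def find_duplicates(records):
    """Group records by normalized email; keep only groups with more than one record."""
    groups = {}
    for record in records:
        groups.setdefault(normalize_email(record["email"]), []).append(record)
    return {key: group for key, group in groups.items() if len(group) > 1}


def find_stale(records, today, max_age_days=365):
    """Return records last verified more than max_age_days before today."""
    return [r for r in records if (today - r["last_verified"]).days > max_age_days]


# A fixed "today" keeps the example reproducible.
duplicates = find_duplicates(leads)
stale = find_stale(leads, today=date(2024, 6, 1))

print("Duplicate groups:", list(duplicates))
print("Stale leads:", [r["name"] for r in stale])
```

Run on a schedule, even a check this simple would flag the lead that two reps are about to call and the contacts nobody has verified in over a year.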
What does it take to achieve a high quality of data?