In this age of increased regulatory reporting, many institutions have taken the clear stance that data quality initiatives need to be prioritized based on the data required by regulatory reports and models. While this has helped focus data quality initiatives, it also means that many data quality problems outside that scope are simply deprioritized and left behind.
All of this makes me think of the Broken Windows theory, often used in sociology to explain how small, informal actions can change behavior. As it’s described in the 1982 article by James Q. Wilson and George L. Kelling…
Consider a building with a few broken windows. If the windows are not repaired, the tendency is for vandals to break a few more windows. Eventually, they may even break into the building, and if it's unoccupied, perhaps become squatters or light fires inside. Or consider a pavement. Some litter accumulates. Soon, more litter accumulates. Eventually, people even start leaving bags of refuse from take-out restaurants there or even break into cars.
I think the same is true when it comes to data quality. If users “see bad data” or know that some data simply isn’t as important because it’s not related to today’s regulatory reporting initiative, an implicit acceptance of poorer-quality data can develop. They see the broken windows or the litter on the sidewalk and think, “Maybe data quality isn’t that important after all.” Let’s face it: data quality work is hard, and the last thing any Chief Data Officer, Compliance Officer, or Head of Operations needs is an attitude of complacency developing throughout the organization.
This really argues for a comprehensive data quality program. However, as I’ve seen many times over my career, given the size and complexity of a global company, a comprehensive data quality program can be expensive. This is where data diagnostics can come in. Instead of boiling the ocean, we’re now able to use technology to quickly identify and prioritize data quality issues across very large data sets. Effectively, technology delivers scalable quick wins, which over time reduce data quality issues, especially when in the hands of a seasoned data operations team. Instead of just walking up and down the street picking up any litter we see, we now also run the street sweeping machine.
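To make the "street sweeping machine" idea a little more concrete, here is a minimal sketch of what a data diagnostic might look like. It is only an illustration, not a description of any particular tool: the `diagnose` function, the field names, and the sample records are all hypothetical, and the only check it runs is for missing values. It scans every record, counts how often each field is empty, and ranks the fields so the worst offenders surface first.

```python
from collections import Counter


def diagnose(rows, fields):
    """Count missing values per field and rank fields by rows affected.

    Hypothetical helper for illustration: a value counts as missing
    if it is None or an empty/whitespace-only string.
    """
    missing = Counter()
    for row in rows:
        for field in fields:
            value = row.get(field)
            if value is None or str(value).strip() == "":
                missing[field] += 1

    total = len(rows)
    report = [
        {
            "field": field,
            "missing": missing[field],
            "rate": missing[field] / total if total else 0.0,
        }
        for field in fields
    ]
    # Prioritize: fields with the most affected rows come first.
    report.sort(key=lambda entry: entry["missing"], reverse=True)
    return report


# Made-up sample records, purely for demonstration.
sample = [
    {"trade_id": "T1", "counterparty": "ACME", "currency": "USD"},
    {"trade_id": "T2", "counterparty": "", "currency": "USD"},
    {"trade_id": "T3", "counterparty": None, "currency": ""},
    {"trade_id": "T4", "counterparty": "Globex", "currency": "EUR"},
]

report = diagnose(sample, ["trade_id", "counterparty", "currency"])
for entry in report:
    print(f"{entry['field']}: {entry['missing']} missing ({entry['rate']:.0%})")
```

The point of the sketch is that the scan covers every field, not only the ones feeding today's regulatory report, and the ranking gives the data operations team a place to start.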
Data volumes are growing significantly throughout the industry and with this growth, data quality issues will grow as well. Beware of those broken windows…