Cost of poor data quality
The cost of poor data quality can be evaluated using the methodology outlined in the Data Quality Fundamentals publication (1), as follows:
1) Identify downtime related to data incidents
Downtime can be calculated based on the following parameters:
Average time to detect a data incident (TTD)
Average time to resolve a data incident (TTR)
Number of data incidents (N)
Note
DDT = N(TTD + TTR)
Data downtime (DDT) equals the number of incidents (N) multiplied by the sum of the average time to detect (TTD) and the average time to resolve (TTR).
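The following is a minimal sketch of this calculation; the figures used in the example (number of incidents, detection and resolution times) are hypothetical and only illustrate the arithmetic.

```python
def data_downtime_hours(n_incidents: int, ttd_hours: float, ttr_hours: float) -> float:
    """Return DDT = N * (TTD + TTR), the total data downtime in hours."""
    return n_incidents * (ttd_hours + ttr_hours)

# Hypothetical example: 70 incidents per year, 4 hours to detect and
# 9 hours to resolve on average.
print(data_downtime_hours(70, 4, 9))  # 910 hours of data downtime per year
```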
2) Calculate labor cost related to data downtime
The cost of poor data quality can be estimated from the labor cost of detecting and resolving data incidents, based on the following parameters:
Number of data engineers (NBDE)
Number of worked hours per year (YWH)
Average hourly cost of a data engineer (HCDE)
Average percentage of time spent on data incidents (AVGT)
Average data quality downtime cost per year (YCOST)
Note
YCOST = NBDE(AVGT * HCDE * YWH)
The yearly average cost of poor data quality (YCOST) can be calculated by multiplying the number of data engineers (NBDE) by the average percentage of their time spent on data incidents (AVGT), the average hourly cost of a data engineer (HCDE), and the number of worked hours per year (YWH).
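Below is a minimal sketch of the yearly labor-cost formula; the team size, hourly rate, time share, and worked hours are hypothetical placeholders.

```python
def yearly_downtime_cost(nb_de: int, avg_t: float, hc_de: float, ywh: float) -> float:
    """Return YCOST = NBDE * (AVGT * HCDE * YWH), the yearly labor cost of data downtime."""
    return nb_de * (avg_t * hc_de * ywh)

# Hypothetical example: 5 data engineers spending 30% of their time on
# data incidents, at $60/hour and 1,800 worked hours per year.
print(yearly_downtime_cost(5, 0.30, 60, 1800))  # 162000.0 dollars per year
```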
References