Six Steps to Improve Data Quality in 2022
80% of an organization's data is tainted with some form of incorrect information at any one time. This figure is staggering, especially as organizations spend millions collecting and storing this data for decision-making.
If 80% of this data is potentially incorrect, not only is this a colossal waste, but it can significantly impact the organization's ability to adapt and evolve. Poor data translates to poor decisions.
According to IBM and the Harvard Business Review, bad data costs the US alone $3 trillion per year.
So how do you solve the data quality problem? We've identified six steps you can take to start turning the tide on bad data within your organization.
Recognize the problem - listen; look for the misuse of spreadsheets
Start at the roots - aim to understand context; develop your insight using a graph
Build the right perspective - develop the necessary dashboards and reports
Implement a crowd-sourced approach - collect feedback from users as easily as possible
Develop automated workflows - send alerts, capture feedback, track remediation efforts
Continuously improve - implement programs to track and fix root causes
1) Recognize the problem
Business leaders often speak openly about their mistrust of their organization's data, and this rings especially true at lower management levels. The organization must listen to these individuals and understand their needs, as their data issues will often cascade and become more prominent if left unchecked. One clue to look for is an over-dependence on spreadsheets and similar technologies. Data stored in these tools is subject to errors and aging, which makes it untrustworthy. If your organization is overly reliant on spreadsheets, you will almost certainly have data quality issues.
2) Start at the roots
Data quality issues don't just exist in the form of typos or data formatting issues. They also appear when the context surrounding data is wrongly applied. Take the instance of a specific term used within an organization that may not be interchangeable between departments.
In this example, the Product Owner in one department is different from the Product Owner in another. When looking at any data holding the term "Product Owner," the user must understand the specific context behind that term and apply the correct Product Owner to their reporting.
To ensure accuracy and solve the contextual data problem, you need to start at the root of the data. From there, you can work your way up from the source and apply the identifying qualities that provide the context.
The best approach to documenting this information from beginning to end is to use a graph. Graphs describe data lineage and consumption, answering critical questions such as "Where did this data come from?" In doing so, they give users vitally important context.
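As a rough illustration of how a graph answers "Where did this data come from?", the sketch below stores lineage as a list of edges and walks upstream from a report to every source that feeds it. The dataset names (crm.accounts, warehouse.customers, and so on) and the function names are hypothetical, chosen only for this example; a real deployment would more likely use a graph database.

```python
from collections import defaultdict

# Hypothetical lineage edges: (upstream source, downstream consumer).
EDGES = [
    ("crm.accounts", "warehouse.customers"),
    ("erp.orders", "warehouse.sales"),
    ("warehouse.customers", "report.revenue_by_owner"),
    ("warehouse.sales", "report.revenue_by_owner"),
]


def build_upstream_index(edges):
    """Map each node to the set of nodes that feed it directly."""
    upstream = defaultdict(set)
    for source, target in edges:
        upstream[target].add(source)
    return upstream


def trace_origins(node, upstream):
    """Return every node that `node` depends on, directly or indirectly."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in upstream.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


upstream = build_upstream_index(EDGES)
origins = trace_origins("report.revenue_by_owner", upstream)
print(sorted(origins))
```

Walking the edges in the opposite direction would answer the companion question of who consumes a given dataset.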
3) Build the right perspective
Dashboards and reports are the perfect means to address data imperfections. Proper dashboarding and reporting give stakeholders the insight needed to address data quality problems.
Several dashboards are required to improve data quality:
Summary Dashboards: Summary Dashboards provide executives with a comprehensive understanding of the current state of affairs. What information is missing from this view, and what information currently reported is suspect?
Detail Dashboards: Detail Dashboards provide data owners with a view into the data they manage. Examples include list-style reports that link data elements to the proper owner.
Troubleshooting Dashboards: Correcting a data quality issue requires a great deal of investigation. There is an increased dependency on "data sleuths" to investigate and report these data quality issues. Through a platform approach, teams can utilize an army of dashboards that help isolate data quality issues and improve the productivity levels of these "data sleuthing" teams.
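To make the Summary Dashboard question "what information is missing?" more concrete, here is a minimal sketch of one metric such a dashboard might show: the share of records in which each field is empty. The sample records, field names, and function names are assumptions made up for illustration.

```python
# Hypothetical application inventory with some gaps in it.
RECORDS = [
    {"application": "Billing", "owner": "A. Rivera", "department": "Finance"},
    {"application": "Payroll", "owner": "", "department": "HR"},
    {"application": "Intranet", "owner": None, "department": ""},
    {"application": "CRM", "owner": "J. Chen", "department": "Sales"},
]


def is_missing(value):
    """Treat None and blank strings as missing."""
    return value is None or str(value).strip() == ""


def missing_rate_by_field(records):
    """Return {field: fraction of records where the field is missing}."""
    if not records:
        return {}
    fields = sorted({key for record in records for key in record})
    total = len(records)
    return {
        field: sum(is_missing(r.get(field)) for r in records) / total
        for field in fields
    }


summary = missing_rate_by_field(RECORDS)
for field, rate in summary.items():
    print(f"{field}: {rate:.0%} missing")
```

A Detail Dashboard could reuse the same check to list the individual records with gaps instead of the aggregate rate.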
4) Implement a crowd-sourced approach
Fixing data quality issues requires a crowd-sourced approach. In other words, data owners need a mechanism to see problems clearly and give immediate feedback on what is wrong or needed to fix such issues. If the effort to provide this feedback is deemed overly arduous, the feedback will not come.
Instead, imagine a list-style report periodically sent to application owners. The report asks the owners to attest to whether or not they indeed own the applications listed. Imagine the users can correct the problem directly from the list by simply clicking on a button and providing feedback.
To make this a reality, your dashboard or report provider will need to integrate a mechanism that allows both read and write access to a database. Since many reporting and dashboard vendors do not support the ability to write back, users turn to spreadsheets.
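The read and write paths described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the application_owner table, its columns, and the status codes (0 for pending, 1 for confirmed, -1 for disputed) are hypothetical, and an in-memory database stands in for whatever store the reporting platform actually uses.

```python
import sqlite3


def create_store():
    """Create an in-memory table of applications awaiting attestation."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE application_owner ("
        " application TEXT PRIMARY KEY,"
        " owner TEXT NOT NULL,"
        " attested INTEGER NOT NULL DEFAULT 0,"  # 0 pending, 1 confirmed, -1 disputed
        " feedback TEXT)"
    )
    conn.executemany(
        "INSERT INTO application_owner (application, owner) VALUES (?, ?)",
        [("Billing", "owner_a"), ("Payroll", "owner_b")],
    )
    conn.commit()
    return conn


def list_for_owner(conn, owner):
    """Read path: the rows an owner would see in their list-style report."""
    rows = conn.execute(
        "SELECT application, attested FROM application_owner"
        " WHERE owner = ? ORDER BY application",
        (owner,),
    )
    return rows.fetchall()


def record_attestation(conn, application, owns_it, feedback=None):
    """Write path: store the owner's one-click answer and optional comment."""
    cursor = conn.execute(
        "UPDATE application_owner SET attested = ?, feedback = ?"
        " WHERE application = ?",
        (1 if owns_it else -1, feedback, application),
    )
    conn.commit()
    return cursor.rowcount == 1  # False when the application is unknown


conn = create_store()
record_attestation(conn, "Billing", True)
record_attestation(conn, "Payroll", False, "Moved to the HR systems team")
print(list_for_owner(conn, "owner_b"))
```

Keeping the write path to a single function call per click is what keeps the effort low for the person giving feedback.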
5) Develop automated workflows
An automated workflow can orchestrate the remediation of data quality issues. The workflow can create tasks and work queues to identify and track data quality issues. In addition, it can apply logic in the form of rules to determine the appropriate queue for each work item that teams should resolve.
In many cases, teams use services or platforms such as Jira (or similar helpdesk systems). In these cases, the rules evaluate which queue the support ticket should move to. The problem is that the data owner often lacks the appropriate access to the underlying support system.
Appropriate technology can resolve this problem. Similar to the crowd-sourcing of data quality, the automated workflows require visibility, alerting, logic, and a form capability.
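The rule logic in this step can be pictured as an ordered list of conditions, each paired with a target queue; the first matching rule wins. The sketch below is a simplified assumption of how such routing might look. The Issue fields, the 1-to-5 severity scale, and the queue names are hypothetical and do not reflect Jira or any specific product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Issue:
    dataset: str
    kind: str      # hypothetical categories: "missing_owner", "stale", "format"
    severity: int  # assumed scale: 1 (low) to 5 (high)


Rule = Tuple[Callable[[Issue], bool], str]

# Rules are checked in order; the first match decides the queue.
RULES: List[Rule] = [
    (lambda issue: issue.severity >= 4, "urgent"),
    (lambda issue: issue.kind == "missing_owner", "governance"),
    (lambda issue: issue.kind == "stale", "data-engineering"),
]
DEFAULT_QUEUE = "triage"


def route(issue: Issue, rules: List[Rule] = RULES) -> str:
    """Return the name of the queue that should handle this issue."""
    for predicate, queue in rules:
        if predicate(issue):
            return queue
    return DEFAULT_QUEUE


def build_queues(issues: List[Issue]) -> Dict[str, List[Issue]]:
    """Group issues into work queues according to the routing rules."""
    queues: Dict[str, List[Issue]] = {}
    for issue in issues:
        queues.setdefault(route(issue), []).append(issue)
    return queues


issues = [
    Issue("hr.employees", "missing_owner", 2),
    Issue("sales.orders", "stale", 5),
    Issue("crm.contacts", "format", 1),
]
for queue, items in build_queues(issues).items():
    print(queue, [item.dataset for item in items])
```

Because the rules are plain data, they can be reordered or extended without touching the routing function.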
6) Continuously improve
With these approaches in place, the next step is to build a program around them. Confidence in the organization's data is critically important and therefore should not be left to chance. Tracking incidents and root causes should be codified as an ongoing process and not attempted piecemeal. The organization should build services around these efforts to stay ahead of the issues and work proactively rather than reactively.
A simple example would be to send periodic surveys asking participants to review the data they own or understand well.
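As one small illustration of codifying root-cause tracking, the sketch below tallies logged incidents by root cause and reports the share still open. The incident records and the root-cause labels are invented for this example; in practice they would come from whatever system logs the remediation work.

```python
from collections import Counter

# Hypothetical incident log; labels are illustrative only.
INCIDENTS = [
    {"id": 1, "root_cause": "manual spreadsheet entry", "resolved": True},
    {"id": 2, "root_cause": "manual spreadsheet entry", "resolved": False},
    {"id": 3, "root_cause": "ambiguous term definition", "resolved": True},
    {"id": 4, "root_cause": "manual spreadsheet entry", "resolved": True},
    {"id": 5, "root_cause": "ambiguous term definition", "resolved": False},
    {"id": 6, "root_cause": "missing owner", "resolved": True},
]


def top_root_causes(incidents, limit=3):
    """Return the most frequent root causes as (cause, count) pairs."""
    counts = Counter(incident["root_cause"] for incident in incidents)
    return counts.most_common(limit)


def open_incident_rate(incidents):
    """Return the fraction of incidents that are still unresolved."""
    if not incidents:
        return 0.0
    still_open = sum(1 for incident in incidents if not incident["resolved"])
    return still_open / len(incidents)


for cause, count in top_root_causes(INCIDENTS):
    print(f"{cause}: {count}")
print(f"open: {open_incident_rate(INCIDENTS):.0%}")
```

Reviewing these two numbers on a regular schedule is one way to see whether fixes are addressing causes or only symptoms.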
While there is no easy fix for improving data, the right technology can significantly impact the success of your strategy. Make sure the approach you implement - and the platform you choose to work with - can support the six steps needed to enhance the quality of your data.
Learn more about how Process Tempo can support these six steps by speaking with one of our experts today.