Data quality is your moat. This is your guide.

Fortify your data, fortify your business: Why high-quality data is your ultimate defense.

What's the difference between data observability and data quality?

Published May 28, 2024

They’re not the same thing.

You might have already noticed that “data quality” and “data observability” are frequently used interchangeably, or with definitions so loose that they’re hard to tell apart. Many vendors offer features related to both, which further blurs the line between the two. They might profile your data, test your pipeline, or let you set an alert when things go sideways, and maybe all three at once. So what makes a tool more focused on data quality versus data observability?

Data quality and observability are both concerned with data reliability and integrity, but they’re different concepts with distinct purposes. 

A strong moat requires both data testing and data observability

The traditional data quality paradigm has created confusion over these two concepts, as shown by how many self-described data quality vendors also provide monitoring services. They ping you on Slack when an anomaly is detected in your production data and list possible root causes so you can investigate. In this common scenario, data quality excellence is defined by how fast you can correct data after it hits your production tables. The goalposts for data quality have been moved from “Our team is 100% confident that the data you see in production can be trusted” to “Well, the data might be bad, but don’t worry, we’ll know if we get an alert and we’ll fix it afterwards.”

In other words, most frameworks are really talking about data observability. Used this way, data quality practice is incorrectly defined as something reactive: what kicks in after the pipeline breaks and starts dumping bad data into production.

Data observability is a subset of data quality

Asking about the differences between data quality and data observability is the wrong question, albeit one asked with the right intentions. Think of data quality as a big umbrella term covering everything about keeping data accurate and reliable. Underneath that umbrella sit different tools and methods. Data observability is one of those tools: it provides real-time visibility into and monitoring of your data processes in production, helping you keep an eye on your data.

The twin banners of data testing and data observability protect data quality

So, when we say data observability is a subset of data quality, we mean it's one part of the bigger goal of keeping data accurate and reliable. And just as keeping a castle secure takes more than one line of defense, you'll need multiple strategies, namely data testing and data observability.

To achieve great data quality, you need data testing and data observability

Data observability is an important tool for preventing poor data quality, and it goes hand-in-hand with another tool: data testing. Each has its own role and purpose.

Data testing is proactive: it is the gatekeeper against data quality issues in pre-production. It involves running rigorous data quality checks and validations on data pipelines, transformations, and analyses before they reach the production environment. By catching and resolving issues early in the data lifecycle, data testing helps prevent errors from propagating downstream.
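To make this concrete, here's a minimal sketch of a pre-production gate written in Python. Everything in it is illustrative: the `load_staging_orders()` loader and the `orders_staging` table are hypothetical stand-ins for your warehouse client and staging model, and in practice you'd likely reach for dbt tests or a dedicated testing framework instead.

```python
# A minimal pre-production data test, meant to run as a CI step
# before a deployment promotes staging data to production.
# load_staging_orders() is a hypothetical placeholder; in a real
# pipeline it would query your staging table via your warehouse client.

def load_staging_orders() -> list[dict]:
    # Placeholder rows standing in for a staging-table query.
    return [
        {"order_id": 1, "amount": 25.00, "currency": "USD"},
        {"order_id": 2, "amount": 13.50, "currency": "EUR"},
    ]

def test_orders_staging() -> None:
    rows = load_staging_orders()

    # The table should never be empty after a load.
    assert rows, "orders_staging is empty"

    # The primary key must be non-null and unique.
    ids = [r["order_id"] for r in rows]
    assert all(i is not None for i in ids), "null order_id found"
    assert len(ids) == len(set(ids)), "duplicate order_id found"

    # Business-rule checks: no negative amounts, known currencies only.
    assert all(r["amount"] >= 0 for r in rows), "negative amount found"
    assert {r["currency"] for r in rows} <= {"USD", "EUR", "GBP"}, "unexpected currency"

if __name__ == "__main__":
    test_orders_staging()
    print("orders_staging passed all pre-production checks")
```

Because a failed assertion exits nonzero, wiring this into a CI/CD check blocks the deployment automatically, which is exactly the gatekeeping behavior described above.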

Data observability is reactive and responsible for spotting data quality problems in production. It allows organizations to detect and address issues that arise during data processing and analysis in real time. Data observability tools provide visibility into the health and performance of data pipelines, enabling timely identification and resolution of anomalies.
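Here's an equally stripped-down sketch of the kind of check an observability tool runs continuously: compare today's row count against its recent history and alert when it drifts too far. The z-score threshold and the `send_alert` hook are assumptions made for illustration; commercial tools build far more sophisticated, ML-driven baselines.

```python
import statistics

def is_anomalous(history: list[int], today: int, z_threshold: float = 3.0) -> bool:
    """Flag today's count if it sits more than z_threshold standard
    deviations away from the mean of the recent history."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return today != mean
    return abs(today - mean) / stdev > z_threshold

def send_alert(message: str) -> None:
    # Placeholder: a real tool would post to Slack, PagerDuty, etc.
    print(f"ALERT: {message}")

# Daily row counts for the last two weeks, followed by a suspicious drop today.
history = [10_120, 10_340, 9_980, 10_410, 10_050, 10_290, 10_180,
           10_220, 10_390, 10_070, 10_310, 10_150, 10_260, 10_330]
today = 4_200

if is_anomalous(history, today):
    send_alert(f"orders row count {today} deviates sharply from its 14-day baseline")
```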

Within this framework, data testing and data observability emerge as complementary strategies, each addressing a different stage of the data lifecycle. Framing the three concepts this way is more holistic: you can leverage both proactive and reactive strategies to safeguard all your data assets.

| Aspect | Data Testing Tools | Data Observability Tools |
| --- | --- | --- |
| Designed to | Prevent incorrect data deployments in production | Detect issues in data and pipelines once they happen |
| Methods used | Automated regression testing through CI/CD checks, data lineage and impact analysis, data diffing (see the sketch below), data replication testing, dbt tests and packages | Machine-learning-powered anomaly detection, monitoring, alerting, and insights into overall data health and status |
| Detects issues | Proactively, addressing issues before they hit production | Reactively, identifying problems after they appear in production or at the source level |
| Areas of complexity | Requires an understanding of expected, or unexpected, data changes | Requires an understanding of data health and pipeline status |
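Of the testing methods above, data diffing deserves a concrete illustration: compare the same table across two environments, say production and a dev build, and surface exactly which rows changed. The sketch below keys rows by a hypothetical `order_id` primary key; real diffing tools run this comparison against full warehouse tables at scale.

```python
def diff_tables(prod: list[dict], dev: list[dict], key: str = "order_id") -> dict:
    """Compare two row sets by primary key and report added, removed,
    and modified rows. A toy version of data diffing."""
    prod_by_key = {row[key]: row for row in prod}
    dev_by_key = {row[key]: row for row in dev}

    return {
        "added": [k for k in dev_by_key if k not in prod_by_key],
        "removed": [k for k in prod_by_key if k not in dev_by_key],
        "modified": [k for k in prod_by_key
                     if k in dev_by_key and prod_by_key[k] != dev_by_key[k]],
    }

prod = [{"order_id": 1, "amount": 25.0}, {"order_id": 2, "amount": 13.5}]
dev = [{"order_id": 1, "amount": 25.0}, {"order_id": 2, "amount": 14.0},
       {"order_id": 3, "amount": 9.9}]

print(diff_tables(prod, dev))
# {'added': [3], 'removed': [], 'modified': [2]}
```

In a CI workflow, a non-empty diff on a critical table can block the merge until someone reviews whether the change was intended.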

What success and failure look like

Let’s recap: you need both a proactive and a reactive approach to maintaining data integrity. Data testing catches bad data before anything reaches production, and data observability alerts you to anomalies once data is in production. Achieving better data quality standards requires tools that come in at different stages. With their lines of responsibility clear, it becomes much easier to figure out whether a tool, framework, or process meets your needs.

Whether you’re a data engineer or the manager of a data team, what you ultimately care about is whether your data infrastructure succeeds or fails against measurable benchmarks. We think the following should hang as a checklist over every data engineer’s desk:

| | Data Testing Tools | Data Observability Tools |
| --- | --- | --- |
| What failure looks like | Increased number of data deployment errors; inconsistent or inaccurate data reported by end-users; persistent data discrepancies between production and test/dev environments; regularly reverted PRs | High frequency of undetected anomalies in production data; data pipeline failures or delays affecting downstream processes; lack of timely alerts or notifications for data anomalies; alert fatigue caused by too many alerts |
| What success looks like | Decrease in data deployment errors and issues, leading to smoother operations; consistent and accurate data across all environments, improving decision-making; positive feedback from end-users regarding data reliability and consistency | Timely detection and resolution of anomalies, minimizing downtime and disruptions; improved performance and reliability of data pipelines; increased trust and confidence in data assets among stakeholders |

Modern data quality platforms need both data testing and data observability tools

You might see how this is coming together now. If data testing catches problems before production, then data observability watches over the production environment itself. Because, sometimes, intruders do get past your moat and into the castle. Edge cases always exist, and it’s hard to create an alert for unknown unknowns.

So data engineers will always need both data testing and data observability tools. That said, it’s much easier to solve data quality problems before they hit production: problematic data is easiest to find and correct when it’s closest to the originating source. Bad data left to its own devices, traversing ETL/ELT, transformation, and then analysis, creates a host of other problems you might already be familiar with. Think: stakeholders frustrated at having already made decisions based on faulty dashboards, or marketing emails sent to the wrong cohort of clients.

A new proactive data quality standard: Shift-left testing

Hence, we believe data quality practice should be laser-focused on finding issues before they hit production, and any data quality platform worth its salt should provide both data testing and observability tools. Modern data quality platforms are constantly innovating and adding new features, but it’s rare to find one that offers both proactive and reactive capabilities, so we built one (more on this later!). We call this shifting data to the left, a phrase borrowed from shift-left testing in software development.

As you'll see in the next section, when bad data hits critical data assets, it'll often take an entire data engineering team to untangle the mess and restore everyone's trust in the data.