Data Quality Issues
Nextbite deals with a lot of complex data from many varying sources. Each new integration with an online ordering provider calls for new modeling and increases the risk of incorrect data. Existing integrations are not static, so managing those changes over time without unexpected effects downstream is challenging. In addition to data complexity, Nextbite deals with the delivery of that data.
Nextbite exports actionable business intelligence data to various platforms. For pushing out this data, Nextbite relies on Hightouch. Some of this data goes back out to customers in the form of metrics. Other data goes to the Nextbite sales and operations team so they can take action.
The challenge for Nextbite was to gain confidence that they could change their models, and do so rapidly, without negatively impacting the quality of data sent to customers and stakeholders.
While Nextbite’s data quality process followed self-developed CI best practices, it was manual and could vary by developer. Ultimately, this led to heavy dependence on a bespoke process with manual checks, making it difficult to ensure complete data quality in a timely manner.
Nextbite needed a tool to ensure consistency and thoroughness in testing before model changes were merged. They needed a tool that reduced the time to get to a high level of confidence and they needed it to integrate into their existing PR workflow.
Nextbite needed confidence that the data delivered to their stakeholders and customers was accurate, and they needed to be alerted to data quality issues before delivering any problem data.
To deliver high-quality data faster, Nextbite chose Data Diff from Datafold for two primary reasons: workflow integration and depth of impact insight.
"With Datafold, we're not just adding trust to our Snowflake instance, we're adding trust to our most important data that is getting activated via Hightouch. Data that's directly driving decisions, sending emails, and interfacing with our customers."
Director of Data Engineering at Nextbite
Datafold integrated seamlessly with dbt, GitHub, and Hightouch, allowing Nextbite to add the tool to their existing workflow within a day.
This integration puts an impact report into every pull request, highlighting which tables, columns, and rows are affected by model changes, as well as which Hightouch syncs and models are affected. This increased the level of testing and made it consistent across all engineers.
If an unexpected change occurs in the table being worked on, Datafold provides a rich UI for investigating it through distributions, specific field-level changes, and column-level lineage. This ability to quickly dive into and explore impact details makes it easy to be confident about a code change before merging it.
Previously, reaching this same level of confidence could take hours of querying and visualizing data. Now it takes a few minutes.
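To make the idea of a row- and column-level impact report concrete, here is a minimal sketch of comparing two snapshots of a table by primary key. It is not Datafold's implementation or API. The function name `diff_tables`, the column names, and the sample rows are all hypothetical and chosen only for illustration.

```python
from typing import Any

Row = dict[str, Any]


def diff_tables(before: list[Row], after: list[Row], key: str) -> dict[str, Any]:
    """Compare two snapshots of a table, matching rows on the `key` column.

    Hypothetical helper for illustration only; not part of any real tool.
    """
    before_by_key = {row[key]: row for row in before}
    after_by_key = {row[key]: row for row in after}

    # Rows present in only one of the two snapshots.
    removed = sorted(before_by_key.keys() - after_by_key.keys())
    added = sorted(after_by_key.keys() - before_by_key.keys())

    changed_rows = []
    changed_columns: dict[str, int] = {}  # column name -> number of rows that differ
    for k in sorted(before_by_key.keys() & after_by_key.keys()):
        old, new = before_by_key[k], after_by_key[k]
        differing = [c for c in sorted(old.keys() | new.keys()) if old.get(c) != new.get(c)]
        if differing:
            changed_rows.append(k)
            for column in differing:
                changed_columns[column] = changed_columns.get(column, 0) + 1

    return {
        "added": added,
        "removed": removed,
        "changed_rows": changed_rows,
        "changed_columns": changed_columns,
    }


# Made-up sample data: the production table versus the same table built from a PR branch.
production = [
    {"order_id": 1, "provider_order_ref": "A-100", "total": 25.0},
    {"order_id": 2, "provider_order_ref": "B-200", "total": 12.5},
    {"order_id": 3, "provider_order_ref": "C-300", "total": 8.0},
]
pull_request = [
    {"order_id": 1, "provider_order_ref": "A-100", "total": 25.0},
    {"order_id": 2, "provider_order_ref": "200", "total": 12.5},
    {"order_id": 4, "provider_order_ref": "D-400", "total": 30.0},
]

report = diff_tables(production, pull_request, key="order_id")
print(report)
```

In this made-up data, the report flags one added row, one removed row, and one changed value in `provider_order_ref`, which is the kind of signal a reviewer would want to see before merging.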
Since adding Data Diff to Nextbite’s pull requests, they’ve caught several instances where changes to the data models would have negatively impacted their Hightouch models. Left unaddressed, these changes would have resulted in incorrect data being synced to customers and to Nextbite's internal teams.
As an example, Nextbite was making a simple update when they noticed within the Data Diff report a change in a field related to order identification. At first, they assumed the change was benign. However, after further analysis in Datafold’s UI, Nextbite determined that this change would have a huge unintended effect on downstream tables.
Their conclusion: If the change hadn't been caught, then it would have modified the way that orders were linked between data providers. This, of course, would have had a major impact on a critical part of Nextbite’s data.
With Data Diff, data engineers at Nextbite can confidently work on their data models without fear of impacting customer-facing data being surfaced through Hightouch.
Knowing in advance what will happen to downstream data means breaking changes are caught before they are merged. This allows data engineers, analysts, and customers to get high-quality data without interruption.
"...the stakes are high and Datafold gives us peace of mind on data quality and accuracy."
Director of Data Engineering at Nextbite