Automated Testing For Data Engineers

We empower data teams to build reliable data products faster

At Datafold, we build tools for data practitioners to automate the most error-prone and time-consuming parts of the data engineering workflow: testing data to guarantee its quality. While data quality (just like software quality) is a complex and multifaceted problem, we draw from decades of our team’s combined experience in the data domain to build opinionated tools our users love. Specifically, we believe that:

Data quality is a byproduct of a great data engineering workflow

That means that, rather than building yet another app for data practitioners to switch to and from, we embed our tools in existing workflows: for example, in CI/CD for deployment testing and in IDEs for testing during development.

Data quality issues should be addressed before deploying the code

Most data quality issues are bugs in the code that processes data, so a proactive, shift-left approach that catches them before deployment is the most effective way to achieve both high shipping velocity and high data quality.

Lack of metadata (data about data) is the biggest gap in the data engineering workflow

We bring powerful tools such as data diffing and column-level lineage to every data engineer’s workflow to help them validate the code and underlying data and fully understand the dependencies in complex data pipelines.

We're an all-remote global team

Founder & CEO
Gleb Mezhanskiy
Co-founder & CTO
Alex Morozov

Our investors

Backed by world-class partners

Datafold is used by data teams at Patreon, Thumbtack, Substack, and AngelList, among others, and has raised $22M from YC, NEA, and Amplify Partners.

In the media

Strategies For A Successful Data Platform Migration
Build Better Tests For Your dbt Projects With Datafold And data-diff
Datafold Raises $20M Series A To Help Data Teams Deliver Reliable Products Faster
Datafold Raises $20M in Series A Funding
Datafold Raises $20 Million in Series A
Data reliability platform Datafold raises $20M
Datafold Secures $20 Million Series A
NEA Invests in $20 Million Series A for Datafold
Datafold raises $20M for data reliability engineering
Chamath Palihapitiya’s Metromile SPAC fails to live up to its promise
How To Effectively Reduce Data Quality Incidents 10x with Datafold
Episode 72: Folding Data with Gleb Mezhanskiy
Strategies For Proactive Data Quality Management - Episode 205
Analytics startups to watch in the coming year
Datafold raises seed from NEA to keep improving the lives of data engineers
Launch HN: Datafold (YC S20) – Diff Tool for SQL Databases
Datafold is solving the chaos of data engineering