September 17, 2021

Folding Data #12

Gleb Mezhanskiy
CEO of Datafold
When I started this newsletter months ago, I wasn't sure I'd be able to find worthy and interesting tools and stories to share with you every week. But luckily, the data space is evolving increasingly fast, and the deeper you dive into what seems like a small domain, the more you find. At some point, we should talk about data tech singularity, but I'll stop here for now. 🙂

An Interesting Read: Timeseries Anomaly Detection at Scale with Thirdeye

Today perhaps no one would argue that having people monitor metrics for anomalies is a good (or even attainable) idea. While multiple proprietary solutions have emerged, for a while there wasn't much in the open-source space besides the good old Prophet. Thirdeye is interesting in that it provides an end-to-end flow for detecting anomalies in time series, breaking them down by dimensions for root cause analysis, and even has some basic collaboration features for issue triage. Although the velocity of the project hasn't been high lately, the code is well structured and easy to study for anyone looking to adopt it or build something internally. But before you dive into the code, check out an awesome article about ABTasty's journey integrating Thirdeye into their BigQuery-based data pipeline.

Go on ABTasty's Data Quality Journey
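
To make the idea of automated metric monitoring concrete, here is a minimal sketch of one of the simplest detection techniques, a trailing-window z-score. It is not how Thirdeye or Prophet work internally; the function name, the window size, the threshold, and the sample data are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def rolling_zscore_anomalies(values, window=7, threshold=3.0):
    """Return indices of points that deviate from the mean of the
    preceding `window` points by more than `threshold` standard deviations.

    Hypothetical helper for illustration only.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: treat any change at all as a deviation.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up daily metric with one obvious spike at index 8.
daily_orders = [100, 102, 98, 101, 99, 103, 100, 97, 250, 101, 99]
print(rolling_zscore_anomalies(daily_orders))  # [8]
```

A real system would also need to handle seasonality and trend, and would run the same check per dimension value (for example per country or per platform) to support root cause analysis.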

Tool of the Week: Made with ML

What is better than a tool that helps you learn a new field? Made with ML is among the top ML repos on GitHub, offering introductory courses for anyone looking to get into machine learning or uplevel their skills. The field is evolving so fast that it never hurts to catch up on the latest trends in the space.

Give Made with ML a GitHub star ✨

Data Quality Management According to Lyft, Shopify, and Thumbtack

Managing a two-sided market is no joke, especially when trying to grow one in a sustainable (i.e. not burning billions in cash a year) way. It is no coincidence that marketplace tech companies are among the most invested in data and its quality: a decision based on incorrect data can easily throw the market off balance and cause a poor experience for a large number of users. For example, if Lyft's driver ETA prediction model drifts toward the higher end, pricing algorithms can start setting higher prices for passengers, which can result in fewer rides requested and lower earnings for drivers. We got a chance to learn from these three data-driven companies how they approach data quality management. What's interesting: each has a unique approach, but it all comes down to reliable change management and proactive testing of data.

Show me what Lyft, Shopify, and Thumbtack are doing
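
As a rough illustration of what a proactive test for the ETA example could look like, the sketch below measures the mean signed error between predicted and actual ETAs and flags the model when predictions skew too high. This is an assumption-laden toy, not a description of what Lyft, Shopify, or Thumbtack actually run; the function names, the one-minute tolerance, and the sample values are all hypothetical.

```python
from statistics import mean


def eta_bias_minutes(predicted, actual):
    """Mean signed error in minutes: positive means ETAs are overestimated."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    return mean(p - a for p, a in zip(predicted, actual))


def check_eta_drift(predicted, actual, max_bias_minutes=1.0):
    """Return (passed, bias). The check fails when the upward bias
    exceeds `max_bias_minutes` (a hypothetical tolerance)."""
    bias = eta_bias_minutes(predicted, actual)
    return bias <= max_bias_minutes, bias


# Made-up sample: every prediction is higher than the observed ETA.
predicted = [6.0, 8.5, 7.0, 9.0]
actual = [4.0, 6.0, 5.5, 6.5]

ok, bias = check_eta_drift(predicted, actual)
print(ok, bias)  # False 2.125
```

Running a check like this on every model or pipeline change, before the output reaches pricing, is one way to catch the kind of drift described above early.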

Before You Go

Yup, it's time for a meme about everyone's favorite thing - documentation!

[Image: meme about documentation]
