Thumbtack
Patreon
Dutchie
Nutrafol
Truebill
Faire
Snapcommerce
See how Datafold is used by leading data-driven organizations
"Datafold encourages good behavior. We value moving fast over doing things absolutely perfectly the first time. So we would just not do the right thing. Datafold helps us do the right thing."
"Datafold's column-level lineage gives confidence in the whole system. If my stakeholders ask “why is this dashboard out of date?” I can answer in 25 seconds instead of digging through pull requests for hours. As a product owner, I can understand how the rest of the company makes decisions based on the data we produce. It brings data confidence and visibility to the company."
"Datafold's Data Diff is the missing piece of the puzzle for data quality assurance. When I first heard about Datafold, all I could think was “Finally!” - it’s an unspoken problem that we all know about and no one wants to talk about. Datafold gives us full confidence when we ship changes into production, which historically I couldn’t say about PRs."
"The proactive features reduce the need for alerts, but some changes are out of our control and alerts can give a ping. Anomaly detection lets me know instantly that there was no Facebook spend the day before, instead of having to look at 20+ Tableau dashboards, to immediately see and know that something is going on."
"You can see right off the bat whether your data quality is what you were expecting, and reviewers can see it, too. Now we’re at the rate where we’re automating code reviews, or close to it, on 100 pull requests per month. And this is just the start."
"We front-loaded the validation to be on the model level; when those things are reliable, everything downstream becomes trustworthy. This made it much easier to debug if a report looks weird. It means that the problem is most certainly on the Mode side or in the code generating the report instead of the datasets."
"Datafold is like a booster shot… you get extra protection and security when you make changes."
A guide on how to build data systems that do not break.
Gleb and Simon discuss the Data Diff backstory including key design decisions, open sourcing, and how Data Diff works in practice.
Datafold now seamlessly integrates with Hightouch to show how changes to a dbt model will impact Hightouch models and syncs.
A guide on how to onboard Analytics Engineers to your company and data stack.
Data diffing is the process of comparing two datasets. See various ways to compare data at different levels of complexity.
Open source data-diff automates data quality checks for data replication and migration.
Learn best practices for how to write and manage dbt tests in your organization.
Datafold has launched new pricing to make data quality more accessible for analytics engineers and data engineers.
Data lineage tools provide visibility into how data is connected upstream and downstream within a database.
It's official: Datafold is now SOC 2 Type II compliant. We follow a security-by-design approach to our software development process and are focused on keeping our customers' data safe.
Datafold has partnered with dbt Labs and has launched an integration with dbt to deliver column-level lineage, data diff, and shareable impact reports for analytics engineers.
2021 was a big year for Datafold. We reflect on top feature updates, blogs, and major company announcements from the past year.
Get an overview of the Data Quality Meetup #6. With speakers from Yelp, Patreon, Convoy, and Lightdash, the event included lightning rounds on data quality best practices and approaches from leading data-driven companies.
Datafold Founder and CEO, Gleb Mezhanskiy, shares what prompted Datafold's creation, how it has grown, and plans for the future.
What should you be looking for when doing data QA with Data Diff? There are three core checks that can help prevent surprises in production dashboards, and this blog walks you through what you're looking for in each step.
There are plenty of rules around PII, but you can stay on top of where your sensitive data is flowing in your pipelines with column-level lineage.
Bad data cost Samsung and Uber ridiculous sums of money through issues that could have been averted if they had invested in data quality management. Read about their mistakes, and see how you could avoid doing the same.
If you want column-level lineage but you prefer tools like Amundsen or DataHub, Datafold's GraphQL API lets you bring your metadata with you.
Without proactive data quality management, mistakes will happen. What you do afterward can help improve your data quality in the future. Data quality post-mortems are a valuable tool for building improved processes and systems, as well as for rebuilding stakeholder trust.
It can be hard to even answer the question "is our data in good shape?" but these teams have gone on a journey towards improved data quality management. Here's how.
DoorDash, Truebill, AppFolio, Evidently.ai, and Narrator share valuable insights at the fifth Data Quality Meetup hosted by Datafold.
SOC 2 compliance is a major step on our security journey. Here are some lessons we learned, as well as what Datafold's compliance means for your business.
In July 2021, Datafold co-founder and CEO Gleb Mezhanskiy went on the Data Engineering Podcast to share his thoughts about a proactive approach to data quality management.
If you're looking to build the ideal modern data stack for analytics using only open-source options, this is the blog for you. Find all the best open-source alternatives to your favorite paid tools.
Data quality is increasingly a top KPI for data teams, even as multiple sources of data are making it harder to maintain data quality and reliability. These tools can facilitate quality data at every step.
Lightdash is an open-source alternative to Looker that natively integrates with dbt. It may not be as mature as other open-source products like Metabase, Querybook, or Superset, but it is different in a few essential ways.
Learn what steps your team needs to take to improve data quality and get the most out of your data.
Data quality is always evolving, so where is it in 2021? We asked and you answered - here are the results.
Lyft vs. Shopify in testing ETL at scale, using fake data to align your stakeholders, and how to avoid nuclear meltdowns in your data platform.
Good Data: How Spotify, Shopify & Lyft approach data quality
Why implement regression testing for ETL code changes, how to align data producers and consumers, and what Data teams at Carta, Thumbtack, Shopify & Clari do to solve data quality.
Take your ETL workflow to the next level with the Datafold and dbt integration, which automates data testing and provides column-level data lineage.
The more people that are looking at the data, and the more apps that are using the data, the faster data quality issues will be identified and resolved.
At the second Data Quality Meetup, we discussed three types of data testing and when to apply them, new-generation ETL frameworks, and the ROI of open-source data catalogs.
Over the past 10 years, we've seen great advances in technologies and tools for analytics and machine learning: with today’s modern analytics stack, we have fast and scalable data warehouses, dirt-cheap data storage, capable ETL orchestrators, and powerful BI tools.
Unlocking the next level with the most popular ETL orchestrator
Put a comma in the right place
Objective criteria and subjective advice when choosing a data warehouse for analytics.
Picking tools for every step in the data flow
To get Datafold to integrate seamlessly with your data stack, we need a quick onboarding call to get everything configured properly.