The best of Datafold in 2021
As we head into the final weeks of 2021, it’s an ideal time to look back on the year and reflect on everything that happened, which is probably why everyone does it around now. Whether you’ve been on this journey with us since January or only heard about Datafold in the past month, take a walk down memory lane with us and see what we did this year.
Datafold launches column-level lineage
We were super proud to add column-level lineage to the Datafold platform in the first half of the year. It offers a big-picture view of the data, with options to zoom in for increased granularity.
We have heard from customers who use it to improve new-hire onboarding and supercharge their data discovery. Others have talked about using it to trace PII through their data pipelines. Whether it’s checking at a field level if a change will impact a dashboard or troubleshooting faster, column-level lineage became a core feature for Datafold in 2021.
Integration with dbt
Whether you’re using dbt Cloud or running open-source dbt on-premises, dbt is often foundational to a team’s modern data stack. Beyond that, dbt has become something of a hub for the data community. Pretty much everyone who is interested in the modern data stack eventually ends up joining the dbt Slack, so it’s no wonder the community has grown to many thousands of members.
We have a range of integrations at Datafold, but our seamless integration with dbt was a high point in 2021. With just a couple of clicks, data engineers can connect dbt with Datafold and instantly augment any dbt tests and improve data quality.
Sharing success stories
In 2021, we made a concerted effort to showcase our customer stories, sharing case studies of how leading data-driven organizations are benefitting from their relationships with Datafold. We like to say that Datafold is like a Swiss Army Knife: it can be used for a range of purposes. Our case studies showed just how diverse these organizations are, and how many different ways they are finding success with Datafold.
Dutchie, Patreon, and Thumbtack all shared how Datafold has improved data quality and reliability for their teams. We can’t wait to tell even more success stories as we head into 2022.
Great speakers at the Data Quality Meetup
Every quarter, we hosted incredible speakers from across the data industry, open-source community, and data-related vendors. These speakers offered best practices, shared horror stories, and helped the community learn together. You can watch all the videos and read all the digests from the meetups here. These are some lightning talks you definitely should watch:
- Setting Sisyphus Free: Data Discoverability in a Scaling Company
Maura Church, Director of Data Science @ Patreon
- Fake It Till You Make It: A Backward Approach to Data Products
Alex Viana, VP of Data @ HealthJoy
- Automating Data Quality at Thumbtack
John Lee, Director, Product Analytics @ Thumbtack
A range of interesting articles
Not to get too meta about things, but this blog was one of the things we were proudest of in 2021. We published articles on a wide variety of topics, covering what was happening within the business, cool open-source projects, and even key findings about the data community in general. These were some of our top blogs in 2021:
- The State of Data Quality in 2021 - Based on a survey, we found just how important data quality is for the data community, and what some of the biggest challenges facing data teams are. Spoiler: data quality is increasingly a top KPI for data teams, who are struggling with mounting manual work.
- The Modern Data Stack: Open-source Edition - With so many great open-source projects out there across the data pipeline, this blog started as a thought experiment to see if you could build a modern data stack with only open-source software. We received a ton of feedback on this one, and are planning an update soon, but this is definitely a cool jumping-off point if you’re planning to build an open-source stack.
- Datafold: From Breaking Data to Series A - From a company perspective, one of our biggest pieces of news in 2021 was raising our Series A. More than touting how much money we raised, this blog shares the founder’s journey, going from breaking data pipelines to building the tools every data team can use to thrive.
Needless to say, 2021 was an eventful year at Datafold. We shipped a ton of new features, secured some incredible customers, and built a stellar global team. We are confident that 2022 will be even better. To stay up to date with all the latest Datafold news throughout the year, be sure to subscribe to our weekly newsletter, Folding Data. You’ll get cool articles, the latest blogs, and always a fun meme. Enjoy the last of 2021 and we will see you all again in 2022.
Datafold is the fastest way to validate dbt model changes during development, deployment & migrations. Datafold allows data engineers to audit their work in minutes without writing tests or custom queries. Integrated into CI, Datafold enables data teams to deploy with full confidence, ship faster, and leave tedious QA and firefighting behind.