Automated testing for data engineers

We empower data teams to build reliable data products faster

At Datafold, we build tools for data practitioners to automate the most error-prone and time-consuming parts of the data engineering workflow: testing data to guarantee its quality. While data quality (just like software quality) is a complex and multifaceted problem, we draw from decades of our team’s combined experience in the data domain to build opinionated tools our users love. Specifically, we believe that:

Data quality is a byproduct of a great data engineering workflow

That means that, rather than building yet another app for data practitioners to switch to and from, we insert our tools into existing workflows: for example, into CI/CD for deployment testing and into IDEs for testing during development.

Data quality issues should be addressed before deploying the code

Most data quality issues are bugs in the code that processes data, and applying a proactive, shift-left approach is the most effective way to achieve high shipping velocity and data quality simultaneously.

Lack of metadata (data about data) is the biggest gap in the data engineering workflow

We bring powerful tools such as data diffing and column-level lineage to every data engineer’s workflow to help them validate the code and underlying data and fully understand the dependencies in complex data pipelines.
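
To make the idea of data diffing concrete, here is a minimal sketch in plain Python. It is an illustration only, not Datafold's implementation: the `diff_tables` function, the `prod` and `dev` sample rows, and the `id` key are all hypothetical, and it assumes both versions of the table fit in memory and share a unique primary key.

```python
from typing import Any

Row = dict[str, Any]


def diff_tables(before: list[Row], after: list[Row], key: str) -> dict[str, list]:
    """Compare two versions of a table, matching rows on a primary key."""
    before_by_key = {row[key]: row for row in before}
    after_by_key = {row[key]: row for row in after}

    # Keys present in only one version are removed or added rows.
    removed = sorted(before_by_key.keys() - after_by_key.keys())
    added = sorted(after_by_key.keys() - before_by_key.keys())

    # For keys present in both versions, list the columns whose values differ.
    changed = []
    for k in sorted(before_by_key.keys() & after_by_key.keys()):
        old, new = before_by_key[k], after_by_key[k]
        columns = sorted(c for c in old.keys() | new.keys() if old.get(c) != new.get(c))
        if columns:
            changed.append((k, columns))

    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical sample data: the table as built by production code
# and by the code under review.
prod = [
    {"id": 1, "plan": "free", "mrr": 0},
    {"id": 2, "plan": "pro", "mrr": 49},
    {"id": 3, "plan": "pro", "mrr": 49},
]
dev = [
    {"id": 1, "plan": "free", "mrr": 0},
    {"id": 2, "plan": "pro", "mrr": 59},
    {"id": 4, "plan": "team", "mrr": 199},
]

report = diff_tables(prod, dev, key="id")
print(report)
# {'added': [4], 'removed': [3], 'changed': [(2, ['mrr'])]}
```

Run before deployment, a report like this shows which rows and columns a code change would alter, which is the kind of validation the shift-left approach above calls for.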

We're an all-remote global team

Join us

Datafold Values

Mission Statement

We are here to make a positive dent in the world by doing the best work of our careers, sharing a great upside at some point, and enjoying the journey throughout. So far, all the stars have aligned in our favor, and the only reason we wouldn’t succeed is if we don’t execute well enough or fast enough.

1
Contribution to Success

Your contribution to our shared success (your impact on our users, the team, the product, and the company) is ultimately what matters.

2
Asynchronous Work Culture

We work across time zones and cultures and rely primarily on asynchronous interactions. This requires intention in how we communicate and make decisions. Don’t expect things to happen naturally or information to flow organically. Act with intent.

3
Proactive Approach

When you have a plan or idea or spot something broken, bias toward action. Build it, improve it, fix it. Most decisions are reversible. Taking two steps forward and one back is better than not stepping forward at all.

4
Shipping Strategy

Ship early and often. Quality ≠ perfection. We strive for high-quality impact, but there is no impact (and often, no learnings) until someone is using it.

5
Seeking Help

If you are stuck, reach out for help. Be the primary driver for solving your problem and involve others.

6
Impact over Titles

It’s ok to step on toes. We value impact, not fitness to the job title. If you, however, do happen to step on someone’s toes, be respectful, considerate and kind.

7
Extreme Ownership

Take extreme ownership: when things go wrong, bias toward taking responsibility. When it seems that it’s someone else’s fault, ask yourself: what could you have done to help them do the right thing? Then do it.

8
Recruitment Priority

Treat recruiting, including interviewing and convincing candidates to join our team, as at least as important as whatever else you do as part of your role.

9
Empowering Data and Analytics Engineers

Datafold exists to empower data and analytics engineers. The work that doesn’t eventually translate into creating value for our users has no purpose.

1
Context and Learning

Empowering your team members by helping them acquire important context and learn, by reviewing their work, and by discussing ideas is a great way to make an impact.

2
Moving Up the Stack

Empower your team by moving up the stack and automating your role away. Create processes, tools, and documentation. There is no honor in being irreplaceable but there is in being invaluable. By enabling others on the team to help themselves and automating your routine tasks, you gain leverage to create ever more value.

3
Overcommunicate

In a remote, async, cross-timezone setting, it’s ok to repeat yourself multiple times to drive alignment. Be creative, patient and persistent in communication: sometimes it’s helpful to jump on a quick call, sometimes you need to think about improving docs and pinning something to a Slack channel.

4
Default to Transparency

Highlighting your progress and celebrating success help others learn from you and boost morale. Floating up challenges lets others help you out and builds trust.

5
Pay It Forward

Give a hand; don’t let your teammates fail. We are playing a long game in tough times.

6
Give Feedback, Early and Often

Be generous when giving positive feedback. Be brave and vulnerable when giving negative feedback to help someone improve. Positive feedback can have strong positive ripple effects if given in public, e.g. in the #thanks Slack channel. Negative feedback should be given in private.

7
Assume Positive Intent

In a remote, cross-cultural setting and amidst all the pressure, it’s really easy to go down the spiral of taking things personally, and the results can be devastating. Assuming positive intent means that when you take offense, you assume the person on the other side means well, even if they haven’t communicated it in the exact way you’d have wanted. Offer feedback and use it as an opportunity to grow. If that doesn’t work, or if the behavior is egregious or violates our Code of Conduct, reach out to your manager or the founders for help.

8
Intrinsic Value of Quality Work

Work done well is its own end. Raise the bar for yourself and keep it high for others.

1
Startup Equation: Growth

Startup = growth. We only succeed if we grow fast, which requires a lot of leverage. A well-functioning, high-performing team of great people is our highest leverage. The company doesn’t grow unless its individuals grow. Improve, learn, and take feedback with gratitude; treat it as a great opportunity to grow.

2
Value of Growth Curve's Slope

When hiring and evaluating performance, we value the slope of an individual’s growth curve (i.e. how fast they have been growing and learning) more than the y-intercept (how accomplished they are by now).

3
Sloppiness is Unacceptable

There is no place for sloppiness. Our customers trust us with their most sensitive asset – data. Sloppiness is unacceptable because it’s destructive. Our users perceive it as indifference, and it erodes the culture and the morale of the team.

4
Move Fast and Learn

We move fast and occasionally break things. It’s ok to make mistakes; it’s not ok to fail to learn from them. Every major incident (not only in engineering) deserves a blameless post-mortem, with the lessons learned shared across the larger team.

5
Self-Care

Take care of yourself. Working yourself to the bone is bad for you and bad for the team. Knowing what drains you, what gives you energy, and how to balance the two is a great skill in itself.

Founders
Founder & CEO
Gleb Mezhanskiy
Co-founder & CTO
Alex Morozov

Our investors

Backed by world-class partners

Datafold is used by data teams at Patreon, Thumbtack, Substack, and AngelList, among others, and has raised $22M from YC, NEA, and Amplify Partners.

In the media

Strategies For A Successful Data Platform Migration
Build Better Tests For Your dbt Projects With Datafold And data-diff
Datafold Raises $20M Series A To Help Data Teams Deliver Reliable Products Faster
Datafold Raises $20M in Series A Funding
Datafold Raises $20 Million in Series A
Data reliability platform Datafold raises $20M
Datafold Secures $20 Million Series A
NEA Invests in $20 Million Series A for Datafold
Datafold raises $20M for data reliability engineering
Chamath Palihapitiya’s Metromile SPAC fails to live up to its promise
How To Effectively Reduce Data Quality Incidents 10x with Datafold
Episode 72: Folding Data with Gleb Mezhanskiy
Strategies For Proactive Data Quality Management - Episode 205
Analytics startups to watch in the coming year
Datafold raises seed from NEA to keep improving the lives of data engineers
Launch HN: Datafold (YC S20) – Diff Tool for SQL Databases
Datafold is solving the chaos of data engineering