Enterprises Whose Bad Data Cost Them Millions: Lessons from Samsung and Uber

Bad data costs businesses an average of $15 million annually, according to Gartner’s Data Quality Market Survey. The same survey found that “nearly 60% of businesses” don’t even measure the financial impact of bad data. Of course, the value of good data is hard to measure too, but that’s a blog for another day.

Companies must prioritize high-quality data despite the up-front costs, because bad data can ultimately cost far more. Samsung and Uber learned that lesson the hard way.

Samsung: Data Entry Error Issued $105 Billion in Shares

In 2018, a Samsung Securities employee in South Korea made a “fat-finger” error, mistaking won (South Korea’s currency) for shares. Instead of paying workers a dividend of 1,000 won per share, the firm paid out 1,000 Samsung Securities shares.

That single human error cost the company $300 million in the end. For a short period, the company had issued a mind-blowing $105 billion worth of shares, although that was fixed within 37 minutes, according to IEEE. Ultimately, Samsung Securities paid dividends worth 1,000 times the value of each share to 2,018 of its employees.

Similar fat-finger problems can afflict any organization that has no protocols in place to protect itself. In Samsung Securities’ case, a review step that routed the entry to a second employee, or an automated range check, could have caught the error. Had either control been in place before the employee paid out the shares, a prompt would have flagged the mistake, and the loss could have been avoided with some fairly simple data processes.
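To make the idea of an automated check concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `DividendEntry` schema, the `validate_entry` function, and the 5,000 won limit are hypothetical and do not describe Samsung Securities’ actual systems.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DividendEntry:
    """One dividend line as keyed in by an operator (hypothetical schema)."""
    employee_id: str
    amount_per_share: int
    unit: str  # "KRW" for cash, "SHARES" for stock


def validate_entry(entry: DividendEntry, max_won_per_share: int = 5_000) -> list[str]:
    """Return the problems found; an empty list means the entry passes."""
    problems = []
    # A cash dividend must be denominated in won, never in shares.
    if entry.unit != "KRW":
        problems.append(f"expected a cash dividend in KRW, got unit {entry.unit!r}")
    # The per-share amount must fall inside an agreed range (assumed limit).
    if not 0 < entry.amount_per_share <= max_won_per_share:
        problems.append(
            f"amount {entry.amount_per_share} is outside 1..{max_won_per_share}"
        )
    return problems


# The intended entry passes; the fat-finger version is held for review.
intended = DividendEntry("emp-001", 1_000, "KRW")
mistyped = DividendEntry("emp-001", 1_000, "SHARES")

for entry in (intended, mistyped):
    issues = validate_entry(entry)
    print(entry.unit, "OK" if not issues else f"HOLD FOR REVIEW: {issues}")
```

In this sketch the unit check is the one that fires, since 1,000 is a plausible number in either unit. A range check on the total payout value would be a natural second guard, and any entry that returns problems could be routed to a second employee for sign-off.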

Uber: Accounting Error Resulted in “Tens of Millions” in Underpaid Compensation

In 2017, Uber admitted that its accounting system had overcalculated its commission, causing drivers to be underpaid. The error dated back to a 2014 update to Uber’s terms of service. From then on, Uber collected its 25% commission without first subtracting taxes and fees.
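A quick worked example shows how a small per-trip difference like this adds up. The sketch below is an illustration only: the $20.00 fare and $2.00 in taxes and fees are made-up numbers, and the function names are hypothetical. Only the 25% rate comes from the reporting above.

```python
from decimal import Decimal

COMMISSION_RATE = Decimal("0.25")  # the 25% rate cited above


def commission_on_net(gross_fare: Decimal, taxes_and_fees: Decimal) -> Decimal:
    """Commission taken after taxes and fees are subtracted."""
    return (gross_fare - taxes_and_fees) * COMMISSION_RATE


def commission_on_gross(gross_fare: Decimal) -> Decimal:
    """Commission taken on the full fare, with the subtraction skipped."""
    return gross_fare * COMMISSION_RATE


# Hypothetical trip: a $20.00 fare that includes $2.00 of taxes and fees.
gross = Decimal("20.00")
taxes_and_fees = Decimal("2.00")

overcharge = commission_on_gross(gross) - commission_on_net(gross, taxes_and_fees)
print(f"Driver underpaid by ${overcharge:.2f} on this trip")
```

On this hypothetical trip the driver loses 50 cents, which is simply 25% of the $2.00 that should have been subtracted first. A difference that small is easy to miss on any single trip, which is why it needs a systematic check rather than a manual one.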

Because of the accounting error, Uber had to repay its drivers “tens of millions” of dollars. In the end, the repayment cost Uber $900 per driver.

Uber didn’t have a process in place to check for data quality in its accounting organization. “Uber says it discovered the error when updating its terms of service for the recent launch of its new route-based pricing,” says Business Insider. Unfortunately, that means the error went unnoticed for roughly three years, until someone manually updated the terms of service.

If checking data had been routine and systematic, the error could have been avoided in the first place. With good data quality assurance practices, Uber would have been notified of the incorrect formula calculating its commission.
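As a rough sketch of what a routine, systematic check could look like, the Python below recomputes the expected commission for each trip and flags any record that disagrees. The record layout, the field names, and the `find_commission_mismatches` helper are all hypothetical, and the 25% rate and one-cent tolerance are assumptions for the example. It is not a description of Uber’s accounting system.

```python
from decimal import Decimal


def find_commission_mismatches(trips, rate=Decimal("0.25"), tolerance=Decimal("0.01")):
    """Return (trip_id, expected, charged) for trips whose commission looks wrong."""
    mismatches = []
    for trip in trips:
        # Expected commission: the rate applied after taxes and fees come out.
        expected = (trip["gross_fare"] - trip["taxes_and_fees"]) * rate
        if abs(trip["commission_charged"] - expected) > tolerance:
            mismatches.append((trip["trip_id"], expected, trip["commission_charged"]))
    return mismatches


# Hypothetical ledger rows: trip "A" is correct, trip "B" was charged on the gross fare.
ledger = [
    {"trip_id": "A", "gross_fare": Decimal("20.00"),
     "taxes_and_fees": Decimal("2.00"), "commission_charged": Decimal("4.50")},
    {"trip_id": "B", "gross_fare": Decimal("20.00"),
     "taxes_and_fees": Decimal("2.00"), "commission_charged": Decimal("5.00")},
]

for trip_id, expected, charged in find_commission_mismatches(ledger):
    print(f"Trip {trip_id}: expected {expected:.2f}, charged {charged:.2f}")
```

Run on a schedule or on every change to the commission logic, a check like this turns a silent formula error into an alert on the first affected trip.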

Lessons Learned: What You Can Do to Prevent Bad Data

Harvard Business Review says only 3% of companies’ data meets basic quality standards. Here are some steps your organization can take to prevent the kinds of problems Samsung and Uber had:

  • Make data quality part of your KPIs or OKRs and work towards building a culture of data quality and observability.
  • Do an organization-wide study to see if you have a data quality problem, as Nagle, Redman, and Sammon recommend.
  • Follow data quality best practices and automate elements of your data QA to ensure every change is checked and tested.
  • Make data quality a part of everyone’s job. Every person in your organization should prioritize good data.
  • Write well-documented processes that are shared with your entire organization. Make sure everyone is following the same rules.

As HBR reports, “it costs ten times as much to complete a unit of work when the data are flawed in any way as it does when they are perfect.” So, make a relatively small investment to prevent bad data in the first place to save money in the long run.

Promote Data Quality Management in Your Company

To ensure that bad data doesn’t make it into your pipeline, every code change should be tested. But this doesn’t have to be a grueling process of manual work. Data Diff can help you quickly see how any code changes can impact your data, while column-level lineage can make it easy to track downstream impacts. Plus, smart alerts can ping you when something doesn’t look right without adding even more noise to your busy day.
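To illustrate the general idea of diffing data before and after a code change, here is a minimal sketch in plain Python. It is not Datafold’s API or implementation: the `diff_rows` function, the `id` key, and the sample rows are all assumptions made for the example.

```python
def diff_rows(before, after, key="id"):
    """Compare two versions of a table (lists of dicts) by primary key."""
    before_by_key = {row[key]: row for row in before}
    after_by_key = {row[key]: row for row in after}

    # Keys present only in the new version, only in the old one, or in both.
    added = sorted(after_by_key.keys() - before_by_key.keys())
    removed = sorted(before_by_key.keys() - after_by_key.keys())
    changed = sorted(
        k for k in before_by_key.keys() & after_by_key.keys()
        if before_by_key[k] != after_by_key[k]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical output of the same model before and after a code change.
production = [
    {"id": 1, "commission": "4.50"},
    {"id": 2, "commission": "3.00"},
]
development = [
    {"id": 1, "commission": "5.00"},  # value changed by the new code
    {"id": 3, "commission": "2.00"},  # new row; id 2 has disappeared
]

print(diff_rows(production, development))
```

Even a summary this small tells a reviewer whether a change touched the rows it was supposed to touch, and nothing else, before it reaches production.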

Whatever you do to improve your data quality and observability, make sure that you prioritize a culture of good data. Samsung and Uber faced dire consequences from bad data, and their experience shows that the value of good data can’t be overstated.

Datafold is the fastest way to validate dbt model changes during development, deployment & migrations. Datafold allows data engineers to audit their work in minutes without writing tests or custom queries. Integrated into CI, Datafold enables data teams to deploy with full confidence, ship faster, and leave tedious QA and firefighting behind.
