Enterprises Whose Bad Data Cost Them Millions: Lessons from Samsung and Uber
Bad data costs businesses an average of $15 million annually, according to Gartner's Data Quality Market Survey. The survey shows how "nearly 60% of businesses" don't even measure the financial implications of having bad data. Of course, it's hard to even measure the value of good data, but that's a blog for another day.
Companies must prioritize high-quality data despite the up-front costs, because poor data quality could ultimately cost far more. Samsung and Uber learned that lesson the hard way.
Samsung: Data Entry Error Briefly Issued $105 Billion in Shares
In 2018, a Samsung Securities employee in South Korea made a "fat-finger" error, mistaking won (South Korea's currency) for shares and paying out 1,000 Samsung Securities shares to workers instead of 1,000 won per share in dividends.
That single human error cost the company $300 million in the end! (For a short period of time, the company had issued a mind-blowing $105 billion worth of shares, but that was fixed within 37 minutes, according to IEEE.) Ultimately, Samsung Securities paid dividends worth 1,000 times the value of each share to 2,018 of its employees.
Similar fat-finger problems can afflict any organization without protocols in place to protect itself. Had Samsung Securities used an assurance process that sent the entry to a second employee for review, or that automatically checked whether the value fell within an expected range, a prompt would have flagged the error before the shares were paid out. Some fairly simple data processes could have averted the situation and avoided the loss.
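A unit-and-range check of the kind described above can be quite small. The Python sketch below is illustrative only: `PayoutRequest`, `validate_payout`, and the 5,000-won ceiling are hypothetical names and values, not anything Samsung Securities is known to use.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class PayoutRequest:
    """A dividend payout as entered by an operator (hypothetical structure)."""
    amount_per_share: Decimal
    unit: str  # "KRW" for a cash dividend, "SHARES" for a stock payout


def validate_payout(request, expected_unit="KRW", max_per_share=Decimal("5000")):
    """Return a list of problems; an empty list means the payout may proceed.

    The default ceiling of 5,000 is an assumed threshold for illustration.
    """
    problems = []
    if request.unit != expected_unit:
        problems.append(f"unit is {request.unit}, expected {expected_unit}")
    if not (Decimal("0") < request.amount_per_share <= max_per_share):
        problems.append(
            f"amount {request.amount_per_share} is outside (0, {max_per_share}]"
        )
    return problems


# An entry of 1,000 shares where 1,000 won was intended fails the unit check.
issues = validate_payout(PayoutRequest(Decimal("1000"), "SHARES"))
```

In a real workflow, a non-empty result could block the payout outright or route it to a second employee for sign-off, which covers both safeguards mentioned above.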
Uber: Accounting Error Resulted in "Tens of Millions" of Underpaid Compensation
In 2017, Uber admitted that an overcalculation of its commission in its accounting system had caused drivers to be underpaid. The error dates to a 2014 update to Uber's terms of service. As a result, Uber failed to subtract taxes and fees before collecting its 25% commission, so its cut was calculated on the gross fare rather than the net.
Because of the accounting error, Uber had to repay its drivers "tens of millions" of dollars. In the end, it cost Uber $900 per driver.
Uber didn't have a process in place to check for data quality in its accounting organization. "Uber says it discovered the error when updating its terms of service for the recent launch of its new route-based pricing," says Business Insider. Unfortunately, that means no one at Uber noticed the data error until someone manually updated its terms of service.
If checking data had been routine and systematic, the error could have been avoided in the first place. With good data quality assurance practices, Uber would have been notified of the incorrect formula calculating its commission.
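To make the commission error concrete, here is a minimal Python sketch of the two formulas and a routine check that would flag the discrepancy. The 25% rate comes from the account above; the function names, the sample fare, and the one-cent tolerance are assumptions for illustration, not Uber's actual accounting logic.

```python
from decimal import Decimal

COMMISSION_RATE = Decimal("0.25")  # the 25% commission cited above


def commission_on_net(gross_fare, taxes_and_fees):
    """Intended behavior: subtract taxes and fees, then take the commission."""
    return (gross_fare - taxes_and_fees) * COMMISSION_RATE


def commission_on_gross(gross_fare):
    """Erroneous behavior: take the commission on the full fare."""
    return gross_fare * COMMISSION_RATE


def check_commission(gross_fare, taxes_and_fees, charged, tolerance=Decimal("0.01")):
    """Return True if the charged commission matches the intended formula."""
    expected = commission_on_net(gross_fare, taxes_and_fees)
    return abs(charged - expected) <= tolerance


# Hypothetical trip: a 20.00 fare that includes 2.00 in taxes and fees.
fare, fees = Decimal("20.00"), Decimal("2.00")
overcharge = commission_on_gross(fare) - commission_on_net(fare, fees)
```

For this sample trip the intended commission is 4.50 and the erroneous one is 5.00, a 0.50 overcharge. Running a check like `check_commission` against every trip record is the kind of routine, systematic validation that surfaces a wrong formula soon after it ships.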
Lessons Learned: What You Can Do to Prevent Bad Data
Harvard Business Review says only 3% of companies' data meets basic quality standards. Here are some steps your organization can take to prevent the kinds of problems Samsung and Uber had:
- Make data quality part of your KPIs or OKRs and work towards building a culture of data quality and observability.
- Do an organization-wide study to see if you have a data quality problem, as Nagle, Redman, and Sammon recommend.
- Follow data quality best practices and automate elements of your data QA to ensure every change is checked and tested.
- Make data quality a part of everyone's job. Every person in your organization should prioritize good data.
- Write well-documented processes that are shared with your entire organization. Make sure everyone is following the same rules.
As HBR reports, "it costs ten times as much to complete a unit of work when the data are flawed in any way as it does when they are perfect." So, make a relatively small investment to prevent bad data in the first place to save money in the long run.
Promote Data Quality Management in Your Company
To ensure that bad data doesn't make it into your pipeline, every code change should be tested. But this doesn't have to be a grueling process of manual work. Data Diff can help you quickly see how any code changes can impact your data, while column-level lineage can make it easy to track downstream impacts. Plus, smart alerts can ping you when something doesn't look right without adding even more noise to your busy day.
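The general idea of diffing data before and after a code change can be illustrated in a few lines of Python. This is a generic sketch with hypothetical names (`diff_rows` and the sample trip rows); it is not Datafold's Data Diff implementation or API.

```python
def diff_rows(before, after):
    """Compare two snapshots of a table, each a dict keyed by primary key."""
    before_keys, after_keys = set(before), set(after)
    return {
        "added": sorted(after_keys - before_keys),
        "removed": sorted(before_keys - after_keys),
        "changed": sorted(
            key for key in before_keys & after_keys if before[key] != after[key]
        ),
    }


# Hypothetical output of the same model before and after a code change,
# keyed by trip id.
before = {
    1: {"fare": 20.0, "commission": 5.0},
    2: {"fare": 10.0, "commission": 2.5},
}
after = {
    1: {"fare": 20.0, "commission": 4.5},
    3: {"fare": 8.0, "commission": 2.0},
}
report = diff_rows(before, after)
```

Here `report` lists trip 3 as added, trip 2 as removed, and trip 1 as changed, giving a reviewer a short summary of what the code change did to the data.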
Whatever you do to improve your data quality and observability, make sure that you prioritize a culture of good data. While Samsung and Uber may have faced dire consequences from bad data, the value of good data can't be overstated.
Datafold is the fastest way to validate dbt model changes during development, deployment & migrations. Datafold allows data engineers to audit their work in minutes without writing tests or custom queries. Integrated into CI, Datafold enables data teams to deploy with full confidence, ship faster, and leave tedious QA and firefighting behind.