How Evri migrated from SAP HANA and Talend to Databricks in under a year

Evri is the UK’s biggest dedicated parcel delivery company. With 28K+ couriers and ~1B parcels a year, data runs the operation: the Ops team tracks SLA compliance in near-real-time to ensure parcels are delivered on time, while data science is applied to harder problems across the business, such as optimizing final mile delivery.

The legacy data stack wasn’t keeping up. Evri had an Oracle warehouse feeding Business Objects, SAP HANA for intraday ops reporting, and Talend for ETL. Each made sense at the time, but together they added up to duplicated data everywhere, high licensing costs, and systems that predated GDPR and offered limited built-in governance. The stack also couldn’t run Evri’s data science and operational research workloads.

After an RFP and pilot, Evri picked Databricks. One platform for reporting, ML, and near-real-time analytics, with built-in governance and support for batch, streaming, and API ingestion. Fewer platforms, fewer places for things to break. Databricks’ rapid pace of development, including its AI capabilities, was also a significant factor in the decision.

The migration challenge

With legacy platforms’ end-of-life and renewal deadlines looming, Evri needed a migration approach that was faster and lower risk than a traditional manual rewrite. The migration presented significant challenges:

  • Hard contractual deadlines: Both SAP HANA and Talend carried hard contractual commitments with no viable extension path.
  • Multi-system complexity: The SAP HANA environment comprised 790 calculation views, 220+ stored procedures and table functions, and multiple flowgraphs, all tightly coupled to downstream Power BI dashboards. Talend had 200+ active production jobs handling data movement and transformation across Evri’s operational environment.
  • Performance requirements: Operational dashboards had to keep data latency under 10 minutes through a complex chain of transformations to support near-real-time decision making.
  • Zero disruption: Power BI dashboards serving Evri’s operations team had to remain live throughout the transition.

Evri’s team estimated the full migration would take two to three years if done using the traditional outsourcing model. That concern was amplified by undocumented legacy logic:

“Every company has a data warehouse, which absolutely everyone is relying on, that was created 10 or 15 years ago. Most of the people who understood it long since left the company, and very little is properly documented.”

— Harvinder Atwal

Evri engaged Datafold to compress this timeline to under a year while maintaining confidence in production outputs.

“We thought that going down the AI-powered route might be a more efficient and quicker way - potentially a less risky way of migrating.”

— Harvinder Atwal

The solution

AI-powered Migration & Modernization

Datafold’s Migration Agent handled the full AI-powered code translation and refactoring from both SAP HANA and Talend into Databricks. The Agent analyzed the full dependency graph to determine which procedures and jobs could be eliminated (made redundant by Databricks’ native incremental processing), which needed refactoring into Lakeflow Spark Declarative Pipelines, and which required direct translation.
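One thing a dependency graph makes possible is ordering the work so that every object is handled after the objects it reads from. The sketch below is only an illustration of that idea using Python's standard `graphlib` module; it is not Datafold's implementation, and every object name in it is hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical legacy objects mapped to the objects they read from.
# All names are illustrative and not taken from Evri's environment.
dependencies = {
    "cv_parcel_status": {"sp_load_parcels"},
    "sp_load_parcels": {"raw_scan_events"},
    "cv_sla_dashboard": {"cv_parcel_status", "cv_depot_dim"},
    "cv_depot_dim": {"raw_depots"},
}


def migration_order(deps):
    """Return objects ordered so each appears after everything it reads from."""
    return list(TopologicalSorter(deps).static_order())


order = migration_order(dependencies)
print(order)
```

In this toy graph the raw sources come first and the dashboard-facing view comes last, which is the order in which each object's inputs would already be available for validation.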

The SAP HANA NDSO pipelines, responsible for the near-real-time representation of core business dimensions, consisted of deeply nested stored procedures and views, with over 6,000 lines of SQL per pipeline and complex flag-based backfill logic. Although Databricks supports stored procedures, a 1:1 migration wasn’t feasible because of the looping logic and reference complexity. Instead, Datafold’s Migration Agent refactored each pipeline into a Lakeflow Spark Declarative Pipeline, enabling efficient near-real-time incremental processing with automatic orchestration and backfilling. Modernizing the SAP HANA stored procedures into Spark Declarative Pipelines reduced code volume threefold, making the code more readable and maintainable.

Talend ETL jobs that transformed raw data from the Bronze to the Silver layer were refactored into Lakeflow Jobs or Lakeflow Spark Declarative Pipelines, depending on the transformation type, which also simplified the code significantly.

Accelerating Oracle Migration UAT

The migration of workloads from Oracle had started prior to the Datafold engagement. The Evri team is using Datafold to automate and accelerate user acceptance testing for the Oracle migration — systematically comparing Oracle outputs against the new Databricks models to reduce the manual effort required to validate each migrated pipeline.

Value-level validation across thousands of datasets

For every delivered data asset, Datafold produced a data diff report showing how closely the Databricks output matched the legacy platform at the value level. The vast majority of delivered assets achieved 100% parity. All remaining discrepancies, such as those arising from differences in source data, were explained in detail and accepted by the data users.
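To make "value-level" concrete, here is a minimal sketch of the idea behind a keyed data diff, written in plain Python. It is an assumption-laden illustration rather than Datafold's algorithm: the `diff_tables` function, the sample rows, and the parity formula (fully matching rows divided by all distinct keys) are all hypothetical.

```python
def diff_tables(legacy_rows, new_rows, key):
    """Compare two row sets by primary key and summarize value-level differences."""
    legacy = {row[key]: row for row in legacy_rows}
    new = {row[key]: row for row in new_rows}

    # Keys present on only one side.
    only_legacy = sorted(legacy.keys() - new.keys())
    only_new = sorted(new.keys() - legacy.keys())

    # For keys on both sides, list the columns whose values differ.
    shared_keys = legacy.keys() & new.keys()
    mismatches = {}
    for pk in shared_keys:
        columns = legacy[pk].keys() | new[pk].keys()
        changed = sorted(c for c in columns if legacy[pk].get(c) != new[pk].get(c))
        if changed:
            mismatches[pk] = changed

    total = len(shared_keys) + len(only_legacy) + len(only_new)
    matching = len(shared_keys) - len(mismatches)
    return {
        "only_legacy": only_legacy,
        "only_new": only_new,
        "mismatches": mismatches,
        "parity": matching / total if total else 1.0,
    }


# Hypothetical sample data, not taken from Evri's systems.
legacy_rows = [
    {"parcel_id": 1, "status": "delivered", "depot": "LDS"},
    {"parcel_id": 2, "status": "in_transit", "depot": "MAN"},
    {"parcel_id": 3, "status": "delivered", "depot": "BHX"},
]
new_rows = [
    {"parcel_id": 1, "status": "delivered", "depot": "LDS"},
    {"parcel_id": 2, "status": "delivered", "depot": "MAN"},
    {"parcel_id": 4, "status": "in_transit", "depot": "LDS"},
]

report = diff_tables(legacy_rows, new_rows, key="parcel_id")
print(report)
```

A report like this separates rows missing from one side from rows whose values changed, which is the distinction that matters when explaining a discrepancy to data users.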

Both the Datafold and Evri teams used Datafold’s Knowledge Graph to automate investigation of discrepancies, tracing root causes through column-level lineage. This allowed the team to pass UAT on time for operational cutover.

“What set Datafold apart is the data reconciliation and automating migration at scale. It’s not just converting the code — validation and reconciliation is really important.”

— Harvinder Atwal

Staging environment and Power BI lineage

Evri established a staging environment midway through the project, enabling Datafold’s Agents to test changes before deploying to production. Datafold’s Knowledge Graph also provided column-level lineage capabilities so Evri’s BI team could trace exactly which Databricks models underpinned each of their ~40 production Power BI dashboards — giving the team confidence to migrate dashboard by dashboard, validating each endpoint before marking the corresponding HANA object for decommission.
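Tracing a dashboard back to the models beneath it amounts to walking a lineage graph upstream. The following sketch shows that traversal in plain Python under stated assumptions: the lineage mapping, the `table.column` naming scheme, and the `upstream_models` helper are hypothetical and do not represent Datafold's Knowledge Graph API.

```python
from collections import deque

# Hypothetical column-level lineage: each column maps to the columns it is derived from.
lineage = {
    "dash_sla.on_time_pct": ["gold_sla.on_time_pct"],
    "gold_sla.on_time_pct": ["silver_parcels.delivered_at", "silver_parcels.promised_at"],
    "silver_parcels.delivered_at": ["bronze_scans.event_ts"],
    "silver_parcels.promised_at": ["bronze_orders.promised_at"],
}


def upstream_models(column, lineage):
    """Return the sorted names of all tables that feed the given column."""
    seen = set()
    queue = deque([column])
    # Breadth-first walk from the dashboard field towards the raw sources.
    while queue:
        current = queue.popleft()
        for parent in lineage.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    # Keep only the table part of each "table.column" identifier.
    return sorted({name.split(".", 1)[0] for name in seen})


models = upstream_models("dash_sla.on_time_pct", lineage)
print(models)
```

With a mapping like this, a team could list the models behind one dashboard field, validate those models, and only then mark the corresponding legacy object for decommission.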

The results

Both platforms decommissioned on time

SAP HANA and Talend were retired on schedule — with no extensions required and significant renewal costs avoided.

Under one year — versus the two to three years anticipated

Evri’s team had estimated a two-to-three-year manual migration effort. Datafold delivered both the SAP HANA and Talend phases in under one year, with all production monitors validated and all Power BI dashboards cut over.

Lower delivery cost than a traditional migration approach

Evri also reported a clear cost advantage versus a traditional services-led migration:

“After working with Datafold, I realized just how differently we can approach migrations, and how much more efficiently we can deliver them. This is the future of migrations.”

— Harvinder Atwal

Looking ahead

The successful HANA and Talend migration immediately opened the door to Evri’s next phase. Following Evri’s acquisition of DHL UK, the team kicked off the fourth Datafold engagement to migrate DHL’s Microsoft SQL Server data warehouse, including adjacent SSIS, SSRS, and SSAS services, to Databricks.

Post-migration, Evri’s data engineers are using Datafold’s MCP integration with Knowledge Graph and data diffs as part of their ongoing development workflow. Engineers query column-level lineage directly from their AI-assisted development environment, enabling them to trace upstream dependencies and run data diffs without leaving their IDE.

Challenge
SAP HANA
Talend
Oracle
Databricks
Outcome
< 1 year vs. 2–3 year timeline
12+ months saved
SAP HANA & Talend migrated to Databricks and decommissioned on time, avoiding renewal