Using AI, Datafold delivered the project on a timeline significantly shorter than every other vendor estimate.
AstraZeneca is one of the world’s biggest pharmaceutical companies, with medicines reaching patients globally — including the COVID-19 vaccine, which reached billions of people during the pandemic.
A Regional Data Warehouse (RDW) built on AWS Redshift is the operational heart of AstraZeneca Canada’s entire sales team. The 100+ person team relies on the RDW for planning, execution, and tracking across a complex business.
For almost ten years, the data feeding the RDW, including prescriptions, sales, and other critical feeds, had flowed through Informatica PowerCenter, the ETL layer that pulled data from every source, transformed it, and loaded it into Redshift. But PowerCenter’s days were numbered: vendor support for PowerCenter 10.5 ended in March 2026. Staying put wasn’t an option.
Working with Adastra, a data and analytics consultancy and longtime partner, AstraZeneca settled on dbt Cloud and Airflow as the stack to replace Informatica: transformations running directly on Redshift instead of round-tripping through a separate engine, the pipeline in SQL under version control, and a stack the team would own instead of rent. Delivery was the harder question — another vendor had scoped the migration at ~12 months and ~$1M, past the PowerCenter deadline. But Adastra’s familiarity with AstraZeneca’s technology and business processes, combined with Datafold’s migration automation, made a much more aggressive timeline possible.
“By the time PowerCenter support was ending, we already knew the architecture needed to evolve. This wasn’t only about replacing Informatica. We wanted a stack that gave us more flexibility and less dependency on a single vendor going forward.”
— Naveen Kumar Dhinakaran, Enterprise Integration Architect
The challenge
Picking the new stack was the easy part. Delivery was not:
- Big scope. Close to 800 workflows had to be migrated.
- Short timeline. PowerCenter 10.5’s standard support ended in March 2026.
- Complex workflows. Over 80% of the workflows carried complex incremental behavior, calls to custom scripts, data quality gates, and ETL metadata logging.
The AstraZeneca team was well aware of the challenge:
“The start of the project was difficult, as data access, orchestration planning, and technical requirements all needed to come together.”
— Shakti Bhati, Data Solution Architect & Delivery Lead
The solution
This is what Datafold’s migration platform is built for: complex migrations delivered fast and affordably, with parity proven at every step. Here’s how the project ran.
Project blueprint
One of Adastra’s principal architects led the design of the target dbt project — capturing repeating patterns as standardized macros and project-wide configuration. AstraZeneca supplied the execution order in clear documentation, which Datafold encoded as dbt tags. AstraZeneca also supplied a wave-based delivery plan, slicing the scope into milestones the team could ship against.
Translation backed by lineage
Datafold used the Data Knowledge Graph to map Informatica’s lineage across the in-scope Redshift warehouse. That graph became the backbone for the Data Migration Agent (DMA), which rebuilt the same dependencies in dbt and converted the Informatica XMLs into dbt code, workflow by workflow.
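To make the idea concrete, here is a minimal Python sketch of how a lineage graph can yield a valid build order for the rebuilt models. It is an illustration only, not Datafold's implementation: the model names and the `lineage` mapping are hypothetical, and the ordering relies on Python's standard-library `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical lineage: each model maps to the set of models it reads from.
# None of these names come from the actual AstraZeneca project.
lineage = {
    "stg_prescriptions": set(),
    "stg_sales": set(),
    "int_sales_by_territory": {"stg_sales"},
    "rpt_rep_targets": {"int_sales_by_territory", "stg_prescriptions"},
}


def execution_order(graph):
    """Return model names so that every model comes after its upstream models."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(lineage)
print(order)
```

In a dbt project the same dependencies would be expressed through references between models, so the build order falls out of the graph rather than being maintained by hand.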
Validation in the loop
For every model it produced, DMA used Data Diff to compare its output to Informatica’s, then refined the model until it reached data parity. Datafold added custom unit tests on top to further harden the models, and defined the dbt Cloud jobs as version-controlled code.
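The parity check described here can be pictured with a small Python sketch. It is a simplified illustration under stated assumptions, not Data Diff itself: rows are plain dictionaries, the primary key column `id` and the helper names `diff_rows` and `at_parity` are hypothetical, and a real comparison would run inside the warehouse rather than in memory.

```python
def diff_rows(legacy_rows, new_rows, key):
    """Compare two row sets by primary key and report the differences."""
    legacy = {row[key]: row for row in legacy_rows}
    new = {row[key]: row for row in new_rows}
    return {
        # Keys the legacy output has but the new model does not produce.
        "missing": sorted(legacy.keys() - new.keys()),
        # Keys the new model produces that the legacy output lacks.
        "extra": sorted(new.keys() - legacy.keys()),
        # Keys present on both sides whose column values differ.
        "changed": sorted(
            k for k in legacy.keys() & new.keys() if legacy[k] != new[k]
        ),
    }


def at_parity(report):
    """Parity means no missing, extra, or changed rows."""
    return not any(report.values())


# Hypothetical sample data standing in for the Informatica and dbt outputs.
legacy_rows = [{"id": 1, "units": 10}, {"id": 2, "units": 5}, {"id": 3, "units": 7}]
new_rows = [{"id": 1, "units": 10}, {"id": 2, "units": 6}, {"id": 4, "units": 1}]

report = diff_rows(legacy_rows, new_rows, key="id")
print(report, at_parity(report))
```

A report like this gives the refinement loop something specific to act on: each missing, extra, or changed key points at a row the translated model still gets wrong.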
Continuous delivery, weekly checkpoints
Datafold released models in batches throughout the project and ran a weekly progress review with AstraZeneca’s management team. In the final phase, Data Diff Monitors validated parity on live production data, and Datafold ran knowledge-transfer sessions and produced documentation for a warm handover.
“A couple of weeks into the project, the team got rolling. Around December, Datafold delivered about one-third of the dbt project; from then on, delivery was stable and exactly on track. Even against the steep timeline, Datafold was able to deliver the dbt project by March, before PowerCenter support ended.”
— Shakti Bhati, Data Solution Architect & Delivery Lead
The AstraZeneca team is currently handling the tail of the cut-over process, replacing the legacy Informatica processes with dbt processes one module at a time. This phase is led by Adastra, with Datafold in a supporting role.
The results
“By leveraging an AI-driven conversion process to perform the bulk transformation and testing of our existing Informatica codebase into dbt, Datafold rapidly and reliably completed this potentially lengthy task. This enabled our internal team to focus on higher-value project activities.”
— Chad Barrett, Senior Manager, Information Management
- Modern data stack. dbt Cloud and Airflow, with transformations in SQL under version control.
- Fast and cost-effective. 7 months, versus the ~12 months and ~$1M scoped by another vendor.
- No service disruption. 100+ reps and managers in Canada kept their reports, targets, and cycle workflows running throughout cutover.