By clicking the Apply button, I understand that my employment application process with Takeda will commence and that the information I provide in my application will be processed in line with Takeda's Privacy Notice and Terms of Use. I further attest that all information I submit in my employment application is true to the best of my knowledge.
Job Description
The Future Begins Here
At Takeda, we are leading a digital evolution and global transformation. By building innovative solutions and future-ready capabilities, we are meeting the needs of patients, our people, and the planet.
Bengaluru, the city that is India's epicenter of innovation, has been selected as the home of Takeda's recently launched Innovation Capability Center. We invite you to join our digital transformation. In this role, you will have the opportunity to boost your skills and become the heart of an innovative engine contributing to global impact and improvement.
At Takeda's ICC, we Unite in Diversity
Takeda is committed to creating an inclusive and collaborative workplace where individuals are recognized for the backgrounds and abilities they bring to our company. We are continuously improving our collaborators' journey at Takeda, and we welcome applications from all qualified candidates. Here, you will feel welcomed, respected, and valued as an important contributor to our diverse team.
About the role:
As a Data Engineer Lead, you will provide technical and team leadership in designing, building, and optimizing scalable enterprise data architectures and high-quality data pipelines that deliver trusted, actionable insights across the business. You will lead and develop a high-performing data engineering team, driving strong engineering standards and ensuring the consistent delivery of secure, reliable, and well-governed data assets that power business intelligence, analytics, and AI-driven decision-making.
You will oversee the end-to-end data lifecycle, ensuring production data remains accurate, timely, and enterprise-ready. By embedding data quality, governance, reconciliation, and performance optimization into pipeline design, you will enable scalable analytics and advanced AI/ML use cases.
Partnering closely with Design & Engineering Leads, Delivery Leads, Business Intelligence, Data Science, AI, and other global PDT DD&T stakeholders across India, Europe, and the United States, you will help advance modern data capabilities and enterprise-wide insight generation.
This role reports to the PDT Delivery Lead, ICC India, and is aligned with the Data Chapter.
How you will contribute:
Engineering & Design
- Lead the end-to-end architecture, design, and implementation of scalable batch, micro-batch, and streaming data platforms aligned to the enterprise data strategy.
- Define, implement, and enforce data engineering standards, design patterns, and governance controls to ensure secure, reliable, and production-ready data assets.
- Shape storage, compute, and processing architectures across the ingestion, transformation, serving, and observability layers, ensuring high availability, resiliency, and recoverability.
- Establish and own the engineering quality strategy, including testing frameworks, release readiness, and continuous improvement of platform reliability and performance.
Databricks, Spark & Performance Engineering
- Provide deep technical leadership in Spark, distributed computing, and Databricks Lakehouse architecture, guiding solution design and engineering best practices.
- Lead large-scale performance optimization, including cluster configuration, autoscaling, caching, storage formats, and workload tuning for cost and efficiency.
- Diagnose and resolve complex platform or workload issues using the Spark UI, Ganglia, and Databricks observability metrics, driving measurable improvements in stability and throughput.
- Drive pipeline redesign and storage optimization strategies (Delta, Parquet, partitioning, Z-ordering) to balance scalability, performance, and cloud cost.
- Implement robust observability, error-handling, retry, and checkpointing mechanisms, and define SLOs/SLAs to ensure consistent production reliability.
Team Leadership & Capability Building
- Lead, mentor, and grow a high-performing data engineering team within the ICC's strategic capability, fostering engineering excellence, ownership, and continuous learning.
- Translate enterprise architectural direction and global priorities into clear technical roadmaps, execution plans, and measurable outcomes for engineering teams.
- Coach engineers on distributed data processing, Databricks engineering patterns, ETL design, and production readiness, elevating overall team capability.
- Conduct design walkthroughs, architecture reviews, and code quality governance, ensuring scalable, maintainable, and secure implementations.
- Build a culture of agile delivery, reuse, automation, operational stability, and accountability across the data engineering lifecycle.
Global Collaboration & Stakeholder Engagement
- Partner closely with geographically distributed business product owners, architects, and platform teams to clarify requirements, constraints, and acceptance criteria.
- Communicate technical recommendations, trade-offs, and architectural decisions clearly to both technical and non-technical stakeholders.
- Collaborate across data science, AI/ML, analytics, cloud, security, and integration teams to enable enterprise-wide insight generation and AI adoption.
- Represent the ICC's engineering capability in global forums, design discussions, and strategic initiatives, ensuring alignment with enterprise standards and outcomes.
Data Governance, Quality & Cost Stewardship
- Embed data quality, reconciliation, lineage, and governance controls into pipeline and platform design, including secure access models and metadata management.
- Leverage governed data platform capabilities to ensure trusted, compliant, and discoverable enterprise data.
- Drive cloud cost optimization strategies across storage, compute, and workload design while maintaining performance and scalability.
- Ensure production data accuracy, timeliness, and reliability for downstream analytics, reporting, and AI-driven decision-making.
Minimum Requirements/Qualifications:
- Bachelor's degree in Engineering, Computer Science, Data Science, or a related field.
- 10 years of experience in software development, data engineering, ETL, and analytics reporting, including proven team leadership experience.
- Expertise in building and maintaining data and system integrations using dimensional data modeling and optimized ETL pipelines.
- Advanced experience with modern data architectures and frameworks (data mesh, data fabric, data products) and scalable multi-source data integration across structured and unstructured data.
- Proven track record of designing and implementing complex, enterprise-scale data solutions.
- Strong proficiency in Python, SQL, and PySpark, with hands-on experience in Spark and distributed data processing, including real-time pipelines using Spark Structured Streaming.
- Experience with AWS cloud services (e.g., Lambda, DMS, Step Functions, S3, EventBridge, CloudWatch, Aurora, RDS) and DevOps/CI practices, including automated deployments via GitHub Actions.
- Deep understanding of database architecture, data modeling, relational databases, data lakes, data warehouses, and Databricks/Delta Lakehouse.
- Experience extracting, transforming, and consolidating multi-source enterprise data into governed, analytics-ready platforms supporting BI and visualization.
- Familiarity with code repositories and version control (GitHub, GitLab, or similar).
- Strong experience in code reviews, performance tuning, and the scalability and maintainability of data engineering solutions.
- Ability to optimize AWS/Databricks cloud costs and ensure efficient infrastructure utilization.
- Experience with Databricks Unity Catalog for centralized governance, lineage, and secure access control.
- Excellent communication, storytelling, and stakeholder engagement across cross-functional and global teams.
- Strong organizational, troubleshooting, and problem-solving capabilities, with the ability to manage multiple concurrent initiatives in fast-paced environments.
- Experience working in globally distributed delivery models and leading engineering best practices.
Preferred requirements:
- Master's degree in Engineering with a specialization in Computer Science or a related field.
- Demonstrated understanding of and experience using:
  - Knowledge of CDK
  - Experience with the IICS data integration tool
  - Job orchestration tools such as Tidal, Airflow, or similar
  - Knowledge of NoSQL databases
  - ETL tools such as DataStage, Ab Initio, or Talend
- Databricks Certified Data Engineer Professional
- AWS Certified Data Engineer Associate
BENEFITS:
It is our priority to provide competitive compensation and a benefit package that bridges your personal life with your professional career. Amongst our benefits are:
- Competitive Salary and Annual Performance Bonus
- Flexible work environment including hybrid working
- Comprehensive Healthcare Insurance Plans for self, spouse, and children
- Group Term Life Insurance and Group Accident Insurance programs
- Employee Assistance Program
- Broad variety of learning platforms
- Diversity, Equity, and Inclusion Programs
- Reimbursements: Home Internet & Mobile Phone
- Employee Referral Program
- Leaves: Paternity Leave (4 weeks), Maternity Leave (up to 26 weeks), Bereavement Leave (5 calendar days)
ABOUT ICC IN TAKEDA:
- Takeda is leading a digital revolution. We're not just transforming our company; we're improving the lives of millions of patients who rely on our medicines every day.
- As an organization, we are committed to our cloud-driven business transformation and believe the ICCs are the catalysts of change for our global organization.
#LI-Hybrid
Locations
IND - Bengaluru
Worker Type
Employee
Worker Sub-Type
Regular
Time Type
Full time