By clicking the Apply button, I understand that my employment application process with Takeda will commence and that the information I provide in my application will be processed in line with Takeda's Privacy Notice and Terms of Use. I further attest that all information I submit in my employment application is true to the best of my knowledge.
Job Description
The Future Begins Here
At Takeda, we are leading a digital evolution and global transformation. By building innovative solutions and future-ready capabilities, we are meeting the needs of patients, our people, and the planet.
Bengaluru, the city that is India's epicenter of innovation, has been selected as the home of Takeda's recently launched Innovation Capability Center. We invite you to join our digital transformation. In this role, you will have the opportunity to boost your skills and become the heart of an innovative engine contributing to global impact and improvement.
At Takeda's ICC, we Unite in Diversity
Takeda is committed to creating an inclusive and collaborative workplace where individuals are recognized for the backgrounds and abilities they bring to our company. We are continuously improving our collaborators' journey at Takeda, and we welcome applications from all qualified candidates. Here, you will feel welcomed, respected, and valued as an important contributor to our diverse team.
About the role:
As a Data Engineer Lead, you will provide technical and team leadership in designing, building, and optimizing scalable enterprise data architectures and high-quality data pipelines that deliver trusted, actionable insights across the business. You will lead and develop a high-performing data engineering team, driving strong engineering standards and ensuring the consistent delivery of secure, reliable, and well-governed data assets that power business intelligence, analytics, and AI-driven decision-making.
You will oversee the end-to-end data lifecycle, ensuring production data remains accurate, timely, and enterprise-ready. By embedding data quality, governance, reconciliation, and performance optimization into pipeline design, you will enable scalable analytics and advanced AI/ML use cases.
Partnering closely with Design & Engineering Leads, Delivery Leads, Business Intelligence, Data Science, AI, and other global PDT DD&T stakeholders across India, Europe, and the United States, you will help advance modern data capabilities and enterprise-wide insight generation.
This role reports to the PDT Delivery Lead, ICC India, and is aligned with the Data Chapter.
How you will contribute:
Engineering & Design
- Lead the end-to-end architecture, design, and implementation of scalable batch, micro-batch, and streaming data platforms aligned to the enterprise data strategy.
- Define, implement, and enforce data engineering standards, design patterns, and governance controls to ensure secure, reliable, and production-ready data assets.
- Shape storage, compute, and processing architectures across the ingestion, transformation, serving, and observability layers, ensuring high availability, resiliency, and recoverability.
- Establish and own the engineering quality strategy, including testing frameworks, release readiness, and continuous improvement of platform reliability and performance.
Databricks, Spark & Performance Engineering
- Provide deep technical leadership in Spark, distributed computing, and Databricks Lakehouse architecture, guiding solution design and engineering best practices.
- Lead large-scale performance optimization, including cluster configuration, autoscaling, caching, storage formats, and workload tuning for cost and efficiency.
- Diagnose and resolve complex platform or workload issues using the Spark UI, Ganglia, and Databricks observability metrics, driving measurable improvements in stability and throughput.
- Drive pipeline redesign and storage optimization strategies (Delta, Parquet, partitioning, Z-ordering) to balance scalability, performance, and cloud cost.
- Implement robust observability, error-handling, retry, and checkpointing mechanisms, and define SLOs/SLAs to ensure consistent production reliability.
Team Leadership & Capability Building
- Lead, mentor, and grow a high-performing data engineering team within the ICC's strategic capability, fostering engineering excellence, ownership, and continuous learning.
- Translate enterprise architectural direction and global priorities into clear technical roadmaps, execution plans, and measurable outcomes for engineering teams.
- Coach engineers on distributed data processing, Databricks engineering patterns, ETL design, and production readiness, elevating overall team capability.
- Conduct design walkthroughs, architecture reviews, and code quality governance, ensuring scalable, maintainable, and secure implementations.
- Build a culture of agile delivery, reuse, automation, operational stability, and accountability across the data engineering lifecycle.
Global Collaboration & Stakeholder Engagement
- Partner closely with geographically distributed business product owners, architects, and platform teams to clarify requirements, constraints, and acceptance criteria.
- Communicate technical recommendations, trade-offs, and architectural decisions clearly to both technical and non-technical stakeholders.
- Collaborate across data science, AI/ML, analytics, cloud, security, and integration teams to enable enterprise-wide insight generation and AI adoption.
- Represent the ICC's engineering capability in global forums, design discussions, and strategic initiatives, ensuring alignment with enterprise standards and outcomes.
Data Governance, Quality & Cost Stewardship
- Embed data quality, reconciliation, lineage, and governance controls into pipeline and platform design, including secure access models and metadata management.
- Leverage governed data platform capabilities to ensure trusted, compliant, and discoverable enterprise data.
- Drive cloud cost optimization strategies across storage, compute, and workload design while maintaining performance and scalability.
- Ensure production data accuracy, timeliness, and reliability for downstream analytics, reporting, and AI-driven decision-making.
Minimum Requirements/Qualifications:
- Bachelor's degree in Engineering, Computer Science, Data Science, or a related field.
- 10 years of experience in software development, data engineering, ETL, and analytics reporting, including proven team leadership experience.
- Expertise in building and maintaining data and system integrations using dimensional data modeling and optimized ETL pipelines.
- Advanced experience with modern data architectures and frameworks (data mesh, data fabric, data products) and scalable multi-source data integration across structured and unstructured data.
- Proven track record of designing and implementing complex, enterprise-scale data solutions.
- Strong proficiency in Python, SQL, and PySpark, with hands-on experience in Spark and distributed data processing, including real-time pipelines using Spark Structured Streaming.
- Experience with AWS cloud services (e.g., Lambda, DMS, Step Functions, S3, EventBridge, CloudWatch, Aurora, RDS) and DevOps/CI practices, including automated deployments via GitHub Actions.
- Deep understanding of database architecture, data modeling, relational databases, data lakes, data warehouses, and Databricks/Delta Lakehouse.
- Experience extracting, transforming, and consolidating multi-source enterprise data into governed, analytics-ready platforms supporting BI and visualization.
- Familiarity with code repositories and version control (GitHub, GitLab, or similar).
- Strong experience in code reviews, performance tuning, and the scalability and maintainability of data engineering solutions.
- Ability to optimize AWS/Databricks cloud costs and ensure efficient infrastructure utilization.
- Experience with Databricks Unity Catalog for centralized governance, lineage, and secure access control.
- Excellent communication, storytelling, and stakeholder engagement across cross-functional and global teams.
- Strong organizational, troubleshooting, and problem-solving capabilities, with the ability to manage multiple concurrent initiatives in fast-paced environments.
- Experience working in globally distributed delivery models and leading engineering best practices.
Preferred requirements:
- Master's degree in Engineering with a specialization in Computer Science or a related field.
- Demonstrated understanding of and experience using:
  - Knowledge of CDK
  - Experience with the IICS data integration tool
  - Job orchestration tools such as Tidal, Airflow, or similar
  - Knowledge of NoSQL databases
  - ETL tools such as DataStage, Ab Initio, or Talend
- Databricks Certified Data Engineer Professional
- AWS Certified Data Engineer Associate
BENEFITS:
It is our priority to provide competitive compensation and a benefit package that bridges your personal life with your professional career. Amongst our benefits are:
- Competitive Salary and Annual Performance Bonus
- Flexible work environment including hybrid working
- Comprehensive Healthcare Insurance Plans for self, spouse, and children
- Group Term Life Insurance and Group Accident Insurance programs
- Employee Assistance Program
- Broad variety of learning platforms
- Diversity, Equity, and Inclusion Programs
- Reimbursements: Home Internet & Mobile Phone
- Employee Referral Program
- Leaves: Paternity Leave (4 weeks), Maternity Leave (up to 26 weeks), Bereavement Leave (5 calendar days)
ABOUT ICC IN TAKEDA:
- Takeda is leading a digital revolution. We're not just transforming our company; we're improving the lives of millions of patients who rely on our medicines every day.
- As an organization, we are committed to our cloud-driven business transformation and believe the ICCs are the catalysts of change for our global organization.
#LI-Hybrid
Locations
IND - Bengaluru
Worker Type
Employee
Worker Sub-Type
Regular
Time Type
Full time