We are seeking an experienced Senior Data Engineer to join our AI team. In this role, you will lead the development and optimization of data infrastructure supporting our Agentic AI initiatives. You will collaborate with ML engineers, AI scientists, and product managers to architect, implement, and maintain robust data pipelines powering autonomous AI agents. As a senior member of the R&DS AI Innovation Program, you will help shape data strategy and ensure our data solutions scale to meet the demanding requirements of next-generation AI systems.
Design, develop, and maintain scalable data pipelines and ETL processes supporting AI research and development.
Design and maintain scalable data models (e.g., star schemas, feature-ready datasets, semantic layers) for analytics, ML training, and agent workflows.
Collaborate with AI scientists and engineers to gather data requirements and ensure data availability and quality.
Implement data governance and security measures to protect sensitive information.
Establish observability, lineage tracking, and monitoring frameworks to detect anomalies, freshness issues, and operational failures.
Implement data partitioning, indexing, and storage optimization techniques for large-scale AI datasets.
Monitor and troubleshoot data pipeline issues to ensure continuity and reliability.
Stay current with emerging data engineering and AI technologies.
Drive data platform reliability, scalability, and cost optimization across cloud-based infrastructure.
Design and implement scalable, resilient data architectures for AI agent training, fine-tuning, and inference workflows.
Build streaming and event-driven pipelines enabling real-time agent feedback, telemetry, and adaptive learning.
Develop and maintain high-performance pipelines using modern orchestration frameworks to support real-time agent interactions.
Create specialized storage and retrieval systems for vector embeddings, knowledge graphs, and symbolic reasoning components.
Implement automated data validation, schema testing, and quality checks to ensure reliable AI training datasets.
Implement comprehensive monitoring and governance frameworks ensuring high-quality training data and compliance with privacy regulations.
Continuously optimize system performance with a focus on reducing latency for agent decision-making.
Education
Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field; advanced degree preferred.
Experience
5 years of professional experience in data engineering, including at least 2 years focused on ML/AI data infrastructure.
Advanced proficiency in Python and Scala; experience with Rust, Go, Java, or Julia is valued.
Expert-level knowledge of SQL and NoSQL databases.
Hands-on experience with vector databases (e.g., Pinecone, Weaviate, Milvus).
Proficiency with modern data orchestration platforms (e.g., Airflow 2.x).
Extensive experience with at least one major cloud platform (AWS, Azure, or GCP).
Expertise in containerization and orchestration (Docker, Kubernetes).
Experience with Infrastructure as Code tooling (e.g., Terraform).
Experience with distributed computing frameworks (Spark, Dask, Ray).
Proficiency with streaming technologies (Kafka, Flink).
Knowledge of modern data lakehouse architectures.
Certifications in cloud platforms, big data technologies, engineering, or ML operations.
Experience collaborating with ML engineers on CI/CD pipelines for data processing and model deployment.
Working knowledge of ML frameworks (PyTorch, TensorFlow).
Experience with feature stores and experiment-tracking platforms.
Understanding of LLM fine-tuning data requirements and processing.
Experience developing data systems for autonomous AI agents or agentic AI applications.
Background in prompt engineering or retrieval-augmented generation systems.
Experience with semantic caching and efficient storage/retrieval of AI-generated artifacts.
Familiarity with LLM evaluation metrics and benchmarking frameworks.
Expertise in hybrid architectures combining traditional databases with vector stores.
Experience with RAG systems and related data pipelines.
Knowledge of RLHF data workflows.
Experience mentoring junior engineers, establishing best practices, and contributing to architectural decisions.
IQVIA is a leading global provider of clinical research services, commercial insights, and healthcare intelligence to the life sciences and healthcare industries. We create intelligent connections to accelerate the development and commercialization of innovative medical treatments to help improve patient outcomes and population health worldwide.
IQVIA is committed to integrity in our hiring process and maintains a zero-tolerance policy for candidate fraud. All information and credentials submitted in your application must be truthful and complete. Any false statements, misrepresentations, or material omissions during the recruitment process will result in immediate disqualification of your application, or termination of employment if discovered later, in accordance with applicable law. We appreciate your honesty and professionalism.
At IQVIA, we believe that diversity, inclusion, and belonging empower our mission to accelerate innovation for a healthier world. We create a culture of belonging by valuing the perspectives of all talented employees worldwide and providing them with the opportunity to power smarter healthcare for everyone, everywhere. When our talented employees bring their authentic selves and their diverse experiences to work, they enable us to accomplish extraordinary things. Multifaceted thought processes spark innovation. Multi-talented collaboration harnesses innovation to deliver superior outcomes. Likewise, as part of this culture, IQVIA is committed to ensuring effective equality between women and men, integrating it as a strategic principle in its corporate and human resources policies.
Required Experience:
Senior IC
IQVIA is the Human Data Science Company™. We are inspired by the industry we serve and provide solutions that enable life sciences companies to innovate with confidence, maximize opportunities and ultimately drive human health outcomes forward. Our approach is Human Data Science – a d ...