Principal Data Scientist
Morrisville, NC - USA
Job Summary
Work Schedule
Standard (Mon-Fri)
Environmental Conditions
Office
Job Description
At Thermo Fisher's PPD clinical research business, we're using digital innovation, data science, and AI to reimagine how life-changing therapies reach patients. Our teams combine deep scientific expertise with advanced analytics, automation, and digital platforms to make research smarter, faster, and more connected.
We know that innovation happens when diverse minds meet. Our Digital Science, Data, and AI professionals collaborate closely with scientists, clinicians, and operational experts to solve real-world challenges in clinical research. Alongside our partnership with OpenAI, you can be part of the collaboration that will help improve the speed and success of drug development, enabling customers to get medicines to patients faster and more cost-effectively.
You'll join a culture that values experimentation, learning, and collaboration, where your ideas can help shape how we deliver life-saving solutions and improve global health outcomes. Whether you're a data engineer, product manager, software developer, or AI scientist, you'll find opportunities here to apply your skills to work that truly matters.
Principal Data Scientist - Patient Analytical Services Division (PASD)
The Principal Data Scientist is a senior individual contributor and the deepest technical voice on the PASD data science team, focused on applying machine learning, advanced analytics, and modern AI to patient-level healthcare data. This role partners closely with epidemiologists, statisticians, RWE scientists, data engineers, and consulting teams to build scalable analytical and AI solutions that power evidence generation and decision support for biopharmaceutical, biotech, and medical device clients.
The role balances three areas: rigorous ML and advanced analytics on complex patient data (claims, EHR, registries, linked datasets); responsible adoption of generative AI and agentic solutions for analytics productivity and client-facing workflows; and fluent collaboration within RWE and patient analytics contexts. This is a hands-on technical leadership role; influence is exercised through technical depth, mentorship, and setting engineering and modeling standards, not through people or portfolio management.
Key Responsibilities
Technical Leadership (Individual Contributor)
- Serve as a senior technical expert across the full analytics lifecycle, including problem framing, data strategy, model development, validation, deployment, and monitoring.
- Set and uphold high standards for modeling rigor, reproducibility, and engineering quality across the data science team.
- Mentor data scientists and engineers, review code and modeling approaches, and raise the technical bar on projects without owning delivery management.
- Evaluate emerging methods, tools, and frameworks, and guide adoption where they add measurable value.
Machine Learning & Advanced Analytics for Patient Data
- Build predictive and descriptive models on patient-level healthcare data to support use cases such as patient stratification, risk prediction, text analytics, workflow prioritization, and decision support.
- Apply appropriate methods across classical statistical modeling, machine learning, and deep learning, including survival analysis, causal inference, propensity scoring, and longitudinal modeling where relevant.
- Design feature engineering, evaluation, and validation approaches suited to the complexities of real-world healthcare data, including missingness, censoring, bias, and longitudinal structure.
- Develop reproducible, well-tested pipelines using modern data science tooling, experiment tracking, and scalable compute.
Generative AI & Agentic Solutions
- Identify and implement high-value applications of generative AI to improve analytics productivity, scientific review, knowledge retrieval, and internal and client-facing workflows.
- Design and evaluate LLM-powered assistants, retrieval workflows, and agentic applications with appropriate human oversight, traceability, and quality controls.
- Partner with platform and engineering teams to operationalize AI applications using enterprise tooling for experimentation, tracing, evaluation, and monitoring, ensuring responsible deployment in regulated and client-facing environments.
Cross-Functional Partnership
- Partner with RWE scientists, epidemiologists, statisticians, data engineers, product owners, and consulting teams to translate scientific and business questions into sound analytical approaches.
- Communicate methods, assumptions, limitations, and findings clearly to both technical and non-technical audiences, including client-facing contexts.
- Translate technical outputs into scientific and business value for internal teams and client stakeholders.
Key Technologies
Languages and Analytics: Python, SQL, R
ML / AI: scikit-learn, XGBoost, PyTorch, TensorFlow, NLP libraries, LLM APIs
Statistics for RWD: survival analysis, causal inference, propensity scoring, longitudinal modeling
Data Platforms: Databricks, Spark, Delta Lake, Snowflake, AWS, Azure
LLMOps / Agentic AI: MLflow, prompt and version tracking, tracing, evaluation frameworks, RAG architectures, vector search, agent orchestration frameworks
Engineering and Delivery: Git, CI/CD, notebooks, APIs
Qualifications
- Bachelor's degree in data science, computer science, statistics, biostatistics, epidemiology, mathematics, bioinformatics, or a related quantitative field, or
- Master's degree with significant progressive experience in data science, machine learning, or healthcare analytics (preferred).
- Previous experience in data science that provides the knowledge, skills, and abilities to perform the job (comparable to 8-10 years of experience).
- Hands-on experience applying ML and advanced analytics to real-world healthcare data such as claims, EHR, registries, or other patient-level longitudinal datasets.
- Strong programming skills in Python and SQL; working proficiency in R.
- Solid grounding in statistical modeling, machine learning, and model evaluation.
- Experience working in modern cloud and data platforms such as Databricks, Spark, AWS, Azure, or Snowflake.
- Strong software engineering fundamentals, including version control, modular code, testing, documentation, and reproducibility.
- Strong written and verbal communication skills, with the ability to present methods and findings clearly to diverse audiences.
Preferred Qualifications
- Experience applying ML and advanced analytics within RWE, HEOR, epidemiology, pharmacoepidemiology, or patient analytics at a pharma, biotech, CRO, medical device, or healthcare analytics organization.
- Domain familiarity with oncology, immunology, rare disease, or therapeutic-area-specific patient analytics.
- Experience with survival analysis, causal inference, propensity scoring, and longitudinal modeling applied to real-world data.
- Experience with NLP, unstructured clinical text, knowledge retrieval, LLM applications, prompt evaluation, and agentic workflows.
- Practical experience with MLOps / LLMOps capabilities such as experiment tracking, tracing, evaluation frameworks, model monitoring, and deployment governance.
- Experience mentoring data scientists and contributing to technical standards in a matrixed environment.
At Thermo Fisher Scientific, we are committed to fostering a healthy and harmonious workplace for our employees. We understand the importance of creating an environment that allows individuals to excel. Please see below for the required qualifications for this position, which also include the possibility of equivalent experience:
- Able to communicate, receive, and understand information and ideas with diverse groups of people in a comprehensible and reasonable manner.
- Able to work upright and stationary for typical working hours.
- Ability to use and learn standard office equipment and technology with proficiency.
- Able to perform successfully under pressure while prioritizing and handling multiple projects or activities.
- May require as-needed travel (0-20%).
*Location: Remote US (East Coast preferred). Relocation assistance is NOT provided.
*Must be legally authorized to work in the United States without sponsorship.
*Must be able to pass a comprehensive background check, which includes a drug screening.
The annual salary range estimated for this position in North Carolina is $185,000-$215,000 USD. This position may also be eligible to receive a variable annual bonus based on company, team, and/or individual performance results, in accordance with company policy.
Required Experience:
Staff IC
About Company
Electron microscopes reveal hidden wonders that are smaller than the human eye can see. They fire electrons and create images, magnifying micrometer and nanometer structures by up to ten million times, providing a spectacular level of detail, even allowing researchers to view single a ...