Data Scientist
Job Summary
Work Schedule
Standard (Mon-Fri)

Environmental Conditions
Office

Job Description
As part of the Thermo Fisher Scientific team, you'll discover meaningful work that makes a positive impact on a global scale. Join our colleagues in bringing our Mission to life every single day to enable our customers to make the world healthier, cleaner, and safer. We provide our global teams with the resources needed to achieve individual career goals while helping to take science a step beyond by developing solutions for some of the world's toughest challenges, like protecting the environment, making sure our food is safe, or helping find cures for cancer.
DESCRIPTION:
Key responsibilities include, but are not limited to:
- Design, develop, and maintain scalable data pipelines and data processing solutions using Python, SQL, and PySpark in AWS environments
- Build, train, evaluate, and deploy machine learning and AI models to solve business problems and improve decision-making
- Apply Generative AI (GenAI) techniques (e.g., LLMs, prompt engineering, embeddings) to develop innovative data products and automation solutions
- Collaborate with cross-functional teams (data engineers, BI developers, business stakeholders) to translate business requirements into data science solutions
- Perform data exploration, feature engineering, and data validation to ensure high-quality datasets
- Contribute to the deployment and monitoring of models using MLOps best practices (CI/CD, versioning, model tracking)
- Optimize data processing workflows and model performance for scalability and efficiency in cloud environments (AWS)
- Stay updated on the latest advancements in AI, ML, and GenAI, and actively apply best practices in ongoing projects
- Participate in team meetings, code reviews, and knowledge-sharing sessions
REQUIREMENTS:
- 3-5 years of experience in data science, machine learning, or applied AI
- Strong programming skills in Python and experience with PySpark for large-scale data processing
- Hands-on experience with AWS services (e.g., S3, Lambda, Glue, SageMaker, EMR)
- Practical experience with machine learning frameworks (e.g., scikit-learn, TensorFlow, PyTorch)
- Experience or exposure to Generative AI (LLMs, prompt engineering, vector databases, RAG pipelines)
- Strong knowledge of SQL and working with structured and unstructured data
- Understanding of data modeling, ETL processes, and big data architectures
- Experience with version control (e.g., Git) and collaborative development practices
- Strong analytical and problem-solving skills, with the ability to work independently and manage multiple tasks
COMPETENCIES:
- Data Analysis and Data Engineering: Strong ability to preprocess, clean, and transform large datasets, including distributed data processing
- Machine Learning & AI Expertise: Solid understanding of supervised and unsupervised learning, model evaluation, and optimization techniques
- Generative AI & Innovation: Ability to apply GenAI techniques (LLMs, embeddings, RAG) to real-world use cases
- Cloud & Big Data Technologies: Experience working with scalable architectures and distributed systems in AWS
- MLOps Awareness: Understanding of model lifecycle management, deployment, and monitoring
- Communication & Collaboration: Ability to explain complex technical concepts to non-technical stakeholders
- Continuous Improvement: Proactive in learning new tools, frameworks, and industry trends
Required Experience:
IC
About Company
Electron microscopes reveal hidden wonders that are smaller than the human eye can see. They fire electrons and create images, magnifying micrometer and nanometer structures by up to ten million times, providing a spectacular level of detail, even allowing researchers to view single a ...