Description
The Machine Learning Engineer I will be primarily responsible for contributing to the development and enhancement of machine learning applications and systems. They will work closely with other engineers and data scientists to design and implement scalable and efficient machine learning systems.
We are recruiting a Machine Learning Engineer I to support the lab's core projects in multimodal AI for women's health. The engineer will be responsible for building, optimizing, and deploying ML pipelines at scale, working with both postdocs and clinicians. This role is ideal for an applied researcher who is excited about translational machine learning and thrives in a collaborative, interdisciplinary environment.
Heavy menstrual bleeding affects nearly one in three women of reproductive age and is a leading cause of iron deficiency worldwide. Yet it remains one of the most under-recognized challenges in medicine. Our lab, at the intersection of the Artificial Intelligence and Human Health Department and the Department for Obstetrics, Gynecology, and Reproductive Sciences at Mount Sinai, has been awarded a Wellcome Leap Missed Vital Sign grant to change this.
We are building a new interdisciplinary group at the intersection of AI, human health, and obstetrics & gynecology. Our mission is to harness state-of-the-art methods in machine learning and multimodal data integration to close critical gaps in women's health, and to translate these advances into solutions that matter for patients and clinicians.
As a founding member, you will help shape a lab designed for openness, collaboration, and translation. You will have access to unique resources, including Mount Sinai's genome-linked EHR biobank (the Sinai Million), AIRMS (AI-ready Mount Sinai Integrated Data and Analytics Platform), the Minerva HPC cluster, and eHive, a digital platform for wearable and real-world data collection. Partnerships with the Hasso Plattner Institute in Germany create further opportunities for international collaboration.
This is a chance to join at the ground level of a lab committed to impact: bringing computational innovation directly into women's health.
Responsibilities
- Build, train, and evaluate machine learning models on large-scale multimodal datasets (wearables, imaging, genomics, EHR)
- Develop and maintain reproducible, scalable ML pipelines using PyTorch
- Run experiments on HPC clusters (Minerva) and support distributed learning (e.g., Accelerate, Lightning)
- Optimize workflows for compute and data efficiency
- Collaborate with post-doctoral fellows and clinical researchers to translate models into practice
- Contribute to codebases, documentation, and open-source tools
- Assist in the collection, cleaning, and curation of large data sets.
- Assist in the operationalization of machine learning models.
- Participate in evaluating model performance and contribute to model refinement.
- Work with other team members to deploy machine learning models.
- Contribute to maintaining clear and organized documentation of machine learning systems.
- Stay updated with the latest trends and technologies in the machine learning field.
- Work collaboratively with a multidisciplinary team to ensure the effectiveness of machine learning systems.
- Develop and maintain project work plans, including critical tasks, milestones, timelines, interdependencies, and contingencies. Track and report progress. Keep stakeholders apprised of project status and implications for completion.
- Prepare clear, well-organized, project-specific documentation, including at a minimum the analytic methods used, key decision points, and caveats, with sufficient detail to support comprehension and replication.
- Share development and process knowledge with other analysts in order to assure redundancy and continuously build a core of analytical strength within the organization.
- Adhere to corporate standards for performance metrics, data collection, data integrity, query design, and reporting format to ensure high-quality, meaningful analytic output.
- Work closely with IT on the ongoing improvement of Mount Sinai's integrated data warehouse, driven by strategic and business needs and designed to ensure data and reporting consistency throughout the organization.
- Demonstrate advanced-level proficiency with the principles and methodologies of process improvement. Apply these in the execution of responsibilities in support of a process-focused approach.
- Other duties as assigned.
Qualifications
Requirements
- Bachelor's degree in Computer Science, Statistics, Mathematics, Data Science, Biomedical Informatics, or related field.
- Experience in applied machine learning and deep learning using PyTorch
- Experience in HPC environments, distributed training, and large-scale data processing
- Familiarity with version control, containerization (Docker, Singularity), and reproducible research practices
- Experience with clinical data and biomedical informatics (OMOP, FHIR) preferred
- Background in multimodal machine learning, time series analysis, or computer vision preferred
- Azure Cloud experience preferred
- Interest in translational applications in women's health preferred
- Knowledge of at least one programming language among Scala, Python, Java, C++, or C#.
- Knowledge of big data technologies (e.g., Hadoop, Spark)
- Knowledge of Software Development Lifecycle.
- Self-motivated with a demonstrated ability to work independently and to exercise independent judgment in developing complex techniques or programs in a dynamic environment.
- Act as the major contributor in the development and operationalization of four different applications.
- Play a key technical role in maintaining deployed products
- Understanding of machine learning algorithms (supervised and unsupervised).
- Familiarity with SQL or other database languages.