Senior Data Engineer ADMET

Job Location

Berlin - Germany

Monthly Salary

Not Disclosed

Vacancy

1 Vacancy

Job Description

About the role
At Apheris, we power federated data networks in life sciences to address the data bottleneck in training highly performant ML models. Publicly available molecular datasets are insufficient to train high-quality ML models that meet industry requirements. Our product addresses this by hosting networks where biopharma organizations collaboratively train higher-quality models on their combined data. The Apheris product is a set of drug discovery applications enriched with the proprietary data of network participants. Our federated computing infrastructure, with built-in governance and privacy controls, ensures that the data IP and ownership always stay with the data custodians.

As we are doubling down on ADMET (absorption, distribution, metabolism, excretion, and toxicity) use cases as a focus area within our drug discovery work, we are looking for a Senior Data Engineer to help us build great ADMET models. This is a hands-on, high-impact role focused on advancing the state of the art in applying foundational models to drug discovery problems. You'll work closely with our ADMET team and will serve as the technical authority on data preparation, data harmonization, and data pipelines in this domain.

You should bring deep expertise in data infrastructure and data preparation, along with domain knowledge in pharmacokinetics and toxicity, with a focus on ADMET modelling and related tasks. You must also understand how these models are applied within industrial drug discovery workflows.

If you want to be part of a mission-driven team building cutting-edge AI systems for life sciences, and you know what it takes to leverage domain-specific data, this role is for you.
What you will do
  • ADMET Data Pipeline Development: Design, build, and maintain scalable pipelines for ingesting, processing, and harmonizing diverse ADMET datasets from public sources (e.g. ChEMBL, PubChem) and proprietary assays.
  • Data Harmonization: Standardize heterogeneous ADMET data formats (e.g. in vitro assays, in silico predictions) across network participants so that the data is ready for modelling.
  • Model-Ready Dataset Curation: Preprocess raw ADMET data (e.g. normalizing units, handling missing values) to support AI/ML model training for a variety of endpoints such as bioavailability, hERG inhibition, or CYP450 interactions.
  • Data Quality Assurance: Implement and automate validation checks to ensure ADMET data integrity.
  • Cross-Functional Integration: Work with computational chemists to optimize data structures for AI-driven ADMET models (e.g. graph-based representations for metabolic pathways).
  • Work with our customers and, potentially, academic partners to define data preprocessing, selection, and benchmarking strategies for novel training tasks involving ADMET data, including leveraging and harmonizing assay data from different sources.
  • Collaborate cross-functionally to ensure data and resulting models address real-world drug discovery needs.
  • Mentor and guide team members on a content level, supporting the planning and breakdown of complex ADMET data preparation.
  • Influence strategic decisions on data infrastructure and data quality assurance.
  • Contribute to publications or open-source projects where relevant.

What we expect from you
  • By month 3: Develop a deep technical understanding of the Apheris product and how it maps to the current ADMET use cases we are working on. Take ownership of an ADMET data preparation stream. Build relationships with product and engineering leadership. Develop a roadmap and experiment plan for preparing data and adapting models to one high-value use case.
  • By month 12: Lead multiple data preparation efforts in ADMET and demonstrate measurable progress in model performance and real-world impact. Mentor colleagues and set strategic direction for the domain.
You should apply if
  • You have a background in computational chemistry, cheminformatics, computational biology, bioinformatics, data engineering, or computer science, and a track record of preparing data for ML models addressing real-world drug discovery problems.
  • You have deep experience in pharma/biotech ADMET data pipelines for machine learning.
  • You have deep experience in ADMET data, including an understanding of assay protocols and how to map protocols to each other.
  • You're comfortable navigating complex technical landscapes and can break down and drive ambitious modeling plans.
  • You understand how ADMET data and models are used in the drug discovery lifecycle and can align your work to practical use cases.
Bonus points if
  • You have experience in federated learning, privacy-preserving ML, or secure model training.
  • You have experience in benchmarking predictive models against standardized datasets.
  • You have experience working with ML and MLOps systems at scale, including CI/CD, model versioning, Docker, Kubernetes, cloud platforms, and orchestration tools.
  • You've contributed to open-source data or cheminformatics tooling.
  • You have hands-on experience working with ADMET assays and DMPK stakeholders.
  • You have experience guiding technical direction in a fast-paced, research-oriented environment.
What we offer you
  • Industry-competitive compensation, incl. early-stage virtual share options
  • Remote-first working: work where you work best, whether from home or a co-working space near you
  • Great suite of benefits, including a wellbeing budget, mental health benefits, a work-from-home budget, a co-working stipend, and a learning and development budget
  • Regular team lunches and social events
  • Generous holiday allowance
  • Quarterly All Hands meetup at our Berlin HQ or a different European location
  • A fun, diverse team of mission-driven individuals with a drive to see AI and ML used for good
  • Plenty of room to grow personally and professionally and shape your own role
About Apheris
Apheris powers federated life sciences data networks, addressing the critical challenge of accessing proprietary data locked in silos due to IP and privacy concerns. Publicly available datasets are insufficient to train high-quality ML models that meet industry requirements. Our product addresses this by enabling life sciences organizations to collaboratively train higher-quality models on complementary data from multiple parties. We are now doubling down on two key areas of interest: structural biology and ADMET.
Logistics
Our interview process is split into three phases:
  1. Initial Screening: If your application matches our requirements, we invite you to an initial video call to explore the fit. In this 30 to 45 minute interview, you will get to know us and the role. The interviewer will be interested in your relevant experience and skills and will answer any questions you may have about the company and the role.
  2. Deep Dive: In this phase, a domain expert from our team will assess the skills and knowledge required for the role by asking you about meaningful experiences or your solutions to specific scenarios in line with the role we are staffing.
  3. Final Interview: Finally, we invite you for up to three hours of targeted sessions with our founders, talking about our culture and meeting future coworkers on the ground.

Required Experience:

Senior IC

Employment Type

Full-Time
