Data Scientist – Python & PySpark

Job Location:

Salt Lake, UT - USA

Monthly Salary: Not Disclosed
Posted on: 21 hours ago
Vacancies: 1

Job Summary

Remote Job
Must Have Technical/Functional Skills:
7-10 years of hands-on experience with Python for machine learning, especially XGBoost, scikit-learn, and NumPy/pandas.
Proficiency in PySpark for reading, transforming, and analyzing large datasets stored in Parquet.
Experience validating or reverse-engineering ML models from business logic or legacy implementations.
Exposure to Java-based ML libraries, or an understanding of how model internals map across languages.
Hands-on experience with Python meta-modelling frameworks and libraries.
Roles & Responsibilities:
Interpret data transformation logic and validate feature pipelines from existing Java implementations.
Run Python-converted models on historical datasets and validate output metrics against Java model benchmarks.
Collaborate with model validation teams to review performance consistency and explain any metric deviations.
Design unit tests and validation scenarios to support each migrated model's readiness for sign-off.
Ingest model input data from Parquet files using PySpark and pandas to reproduce training and scoring workflows.
Conduct EDA and spot-check row-level predictions where needed.
Collaborate with the customer team to understand the logic, structure, and parameters of the Java-based XGBoost models.

Key Skills

  • Laboratory Experience
  • Immunoassays
  • Machine Learning
  • Biochemistry
  • Assays
  • Research Experience
  • Spectroscopy
  • Research & Development
  • cGMP
  • Cell Culture
  • Molecular Biology
  • Data Analysis Skills