Lead Data Science Engineer

Warsaw - Poland

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

We are looking for a skilled Data Scientist to collaborate with financial experts on the development and prototyping of machine learning algorithms that drive pricing optimization, forecasting, and business decision-making. You will leverage Python and modern data science libraries, working with large datasets provided as Parquet files in an Azure cloud infrastructure. Experience with the Kedro framework, the Azure platform, and financial sector challenges is highly valued. The ideal candidate is adept at building data pipelines, conducting advanced analytics, and communicating insights in a consulting environment.

Responsibilities:

Collaborate with financial experts to understand business problems and frame them as machine learning tasks.
Prototype, build, tune, and deploy machine learning models, especially gradient boosted tree models using LightGBM (preferred) and XGBoost, to address price optimization, forecasting, and other financial decision processes.
Perform feature engineering, data preprocessing, and model validation on large structured datasets provided in Parquet format.
Analyze feature importance and interpret model results to drive actionable insights for clients.
Develop reproducible data pipelines using Python, employing modular frameworks such as Kedro when possible.
Validate model performance through experiment frameworks such as A/B testing and industry-standard metrics (e.g. RMSE, MAE, AUC).
Collaborate within an Azure-based infrastructure to integrate models into operational workflows and support production deployment.
Communicate findings and present actionable insights to both technical and non-technical stakeholders.
Contribute to a collaborative team environment and support knowledge-sharing.

Requirements:

Proven experience deploying end-to-end data science solutions, ideally within financial services or economic domains.
Previous experience with Kedro for pipeline orchestration and modular data science development is a significant plus.
Gradient Boosted Tree experience: hands-on familiarity with implementing, tuning, and deploying machine learning models based on the gradient boosting algorithm. Experience with the LightGBM and/or XGBoost open-source frameworks is a significant plus.
Experience in building or deploying models within enterprise cloud environments (AWS, Azure).
Exposure to working with automated machine learning tools and scalable data engineering platforms.
Solid understanding of:
- A/B testing methodologies - experiment design and statistical analysis (classical t-tests, zero-inflated models, etc.).
- Revenue forecasting models (Darts, Prophet, sktime, etc.).
- Price optimization (regression analysis and, optionally, linear programming).
Advanced SQL skills and experience with data manipulation tools (e.g. Pandas).
Experience working with Parquet files and manipulating large structured datasets.
Hands-on experience in Azure cloud environments (Databricks, Azure ML, Data Lake, etc.).
Experience with CI/CD workflows (Continuous Integration / Continuous Deployment) for data science and machine learning projects, enabling automated model testing, integration, and deployment.
Understanding of ETL/ELT processes and data pipeline orchestration.
Strong collaboration and communication skills, with the ability to interact with subject matter experts and clients.
Strong organizational skills with keen attention to detail.
Excellent communication, analytical, and interpersonal skills.
English - Upper-Intermediate or Advanced.

We offer*:

Flexible working format - remote, office-based, or flexible
A competitive salary and good compensation package
Personalized career growth
Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
Active tech communities with regular knowledge sharing
Education reimbursement
Memorable anniversary presents
Corporate events and team buildings
Other location-specific benefits

*not applicable for freelancers
