Expected start date: 1st April 2025
Contract Duration: 1 year, extendable
Languages Required: English; German (nice to have)
Remote/Onsite/Hybrid: Hybrid (2-3 days onsite per week)
Travel cost reimbursement: No
Location: Essen, Germany
Years of exp needed: 10 years
ETRM (Energy Trading and Risk Management) Data Scientist
Experience: 7 years
Job Description:
- Education Requirements:
- Master’s degree in Mathematics, Statistics, Data Science, or related fields is mandatory.
- A Ph.D. in Mathematics, Statistics, Data Science, or similar areas is preferred but not mandatory.
- Mandatory skills:
- Data Science:
- Extensive experience in time-series forecasting, predictive modelling, and deep learning.
- Proficient in designing reusable and scalable machine learning systems.
- Proficiency in implementing techniques such as ARIMA, LSTM, Prophet, Linear Regression, and Random Forest to ensure accurate forecasting and insights.
- Strong command of machine learning libraries including scikit-learn, XGBoost, Darts, TensorFlow, and PyTorch, along with data manipulation tools like Pandas and NumPy.
- Proven expertise in designing and implementing explicit ensemble techniques such as stacking, boosting, and bagging to improve model accuracy and robustness.
- Proven track record of analyzing and optimizing the performance of operational machine learning models to ensure long-term efficiency and reliability.
- Expertise in retraining and fine-tuning models based on evolving data trends and business requirements.
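To make the ensemble requirement above concrete, here is a minimal bagging sketch in plain NumPy: each ensemble member fits a linear model to a bootstrap resample of lag-window/target pairs, and predictions are averaged. All function and variable names are illustrative, not part of the role's actual stack.

```python
import numpy as np

def bagging_forecast(y, n_models=25, window=3, seed=0):
    """Toy bagging ensemble: each member is a least-squares linear model
    fit on a bootstrap resample of (lag-window, next-value) pairs; the
    ensemble one-step forecast is the average of the members."""
    rng = np.random.default_rng(seed)
    # Build a supervised dataset from the series: rows of lag windows and targets.
    X = np.array([y[i:i + window] for i in range(len(y) - window)])
    t = y[window:]
    last_window = y[-window:]
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X), size=len(X))  # bootstrap resample
        # Least-squares fit with an intercept column on the resampled rows.
        A = np.hstack([X[idx], np.ones((len(idx), 1))])
        coef, *_ = np.linalg.lstsq(A, t[idx], rcond=None)
        preds.append(np.append(last_window, 1.0) @ coef)
    return float(np.mean(preds))

# Usage: forecast the next point of a noiseless linear trend (0, 1, ..., 19).
series = np.arange(20, dtype=float)
print(round(bagging_forecast(series), 2))  # close to 20.0
```

The same averaging-over-resamples idea is what scikit-learn's `BaggingRegressor` automates for arbitrary base estimators.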
- MLOps Implementation:
- Proficiency in leveraging Python-based MLOps frameworks for automating machine learning pipelines, including model deployment, monitoring, and periodic retraining.
- Advanced experience in using the Azure Machine Learning Python SDK to design and implement parallel model training workflows, incorporating distributed computing, parallel jobs, and efficient handling of large-scale datasets in managed cloud environments.
- PySpark Proficiency:
- Strong experience in PySpark for scalable data processing and analytics.
- Azure Expertise:
- Azure Machine Learning: Managing parallel model training, deployment, and operationalization using the Python SDK.
- Azure Databricks: Collaborating on data engineering and analytics tasks using PySpark/Python.
- Azure Data Lake: Implementing scalable storage and processing solutions for large datasets.
- Preferred skills:
- K-Means Clustering: Experience in applying k-means clustering for data segmentation and pattern identification.
- Bottom-Up Forecasting: Skilled in creating granular bottom-up forecasting models for hierarchical insights.
- Azure Data Factory: Designing, orchestrating, and managing pipelines for seamless data integration and processing.
- Knowledge of power trading concepts.
- Generative AI (GenAI): Experience in applying generative AI models such as GPT or similar frameworks.
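As a rough illustration of the k-means segmentation listed under preferred skills, below is a minimal NumPy implementation of Lloyd's algorithm on synthetic 1-D data. The data and names are illustrative only; in practice one would use scikit-learn's `KMeans` or a Spark-based equivalent.

```python
import numpy as np

def kmeans(points, k=2, iters=50, seed=0):
    """Minimal Lloyd's algorithm on 1-D data: alternate assigning points
    to the nearest centroid and recomputing centroids as cluster means."""
    rng = np.random.default_rng(seed)
    # Initialize centroids as k distinct data points.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assignment step: nearest centroid per point.
        dists = np.abs(points[:, None] - centroids[None, :])
        labels = dists.argmin(axis=1)
        # Update step: each centroid becomes the mean of its cluster.
        centroids = np.array([points[labels == j].mean() for j in range(k)])
    return labels, centroids

# Two well-separated groups should come out as two clusters.
data = np.array([1.0, 1.2, 0.9, 10.0, 10.3, 9.8])
labels, centroids = kmeans(data, k=2)
print(sorted(np.round(centroids, 2)))
```

In a segmentation setting, the cluster labels would then be joined back onto the source records to define the segments.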