Job Responsibilities:
- Data Cleaning, Preprocessing & Exploration: Prepare data for analysis, ensuring quality, consistency, and completeness by handling missing values and outliers and by transforming data. Explore and analyze large, complex datasets to identify patterns, trends, and anomalies.
- Machine Learning Model Development: Build, train, and deploy machine learning models on the Databricks platform, leveraging tools such as MLflow to experiment with techniques like regression, classification, clustering, and time series analysis.
- Model Evaluation & Deployment: Develop and select features to improve model performance, leveraging Databricks' distributed computing capabilities for efficient processing. Familiarity with CI/CD tools (e.g., Jenkins, GitLab) for automating deployment and testing of data pipelines.
- Collaboration: Collaborate with data engineers, analysts, and business stakeholders to understand business requirements and translate them into data-driven solutions.
- Data Visualization and Reporting: Create visualizations and dashboards within Databricks, Power BI, and other tools to communicate insights to technical and non-technical stakeholders.
- Continuous Learning: Stay up to date with the latest developments in data science, machine learning, and industry best practices to continually enhance skills and processes.
Technical Skills:
- Knowledge of statistical analysis techniques, hypothesis testing, and machine learning.
- Familiarity with NLP, time series analysis, and computer vision or A/B testing.
- Databricks and Apache Spark: Proficiency with Databricks, Spark DataFrames, and MLlib.
- Programming: Proficiency in Python (TensorFlow, Pandas, scikit-learn, PySpark, NumPy), with experience writing scalable code for large datasets.
- SQL: Strong SQL skills for data extraction, manipulation, and analysis.
- Familiarity with MLflow for tracking, model versioning, and reproducibility.
- Familiarity with cloud data storage and processing tools (e.g., Azure Data Lake, AWS S3).
Qualifications:
- Education: Bachelor's degree in Statistics, Mathematics, Computer Science, or a related field.
- Experience: 3 years of experience in a data science or analytical role.
Remote Work:
No