Roles and Responsibilities
1. MLOps Strategy & Implementation
- Design and implement scalable MLOps pipelines for the end-to-end lifecycle of machine learning models (from data ingestion to model deployment and monitoring).
- Automate model training, testing, validation, and deployment using CI/CD practices.
- Collaborate with data scientists to productize ML models.
2. Infrastructure Management
- Build and maintain cloud-native infrastructure (e.g., AWS, GCP, or Azure) for training, deploying, and monitoring ML models.
- Optimize compute and storage resources for ML workloads.
- Containerize ML applications using Docker and orchestrate them with Kubernetes.
3. Model Monitoring & Governance
- Set up monitoring for ML model performance (drift detection, accuracy drops, and latency).
- Ensure compliance with ML governance policies, including versioning and auditing.
4. Collaboration & Communication
- Work with cross-functional teams (Data Engineering, DevOps, and Product) to ensure smooth ML model deployment and maintenance.
- Provide mentorship and technical guidance to junior engineers.
5. Automation & Optimization
- Automate feature extraction, model retraining, and deployment processes.
- Improve the latency, throughput, and efficiency of deployed models in production.