Role Overview
We are seeking a Software Engineer with MLOps skills to contribute to the deployment, automation, and monitoring of GenAI and LLM-based applications. You will work closely with AI researchers, data engineers, and DevOps teams to ensure seamless integration, scalability, and reliability of AI systems in production.
Key Responsibilities
1. Deployment & Integration
- Assist in deploying and optimizing GenAI/LLM models on cloud platforms (AWS SageMaker, Azure ML, GCP Vertex AI).
- Integrate AI models with APIs, microservices, and enterprise applications for real-time use cases.
2. MLOps Pipeline Development
- Contribute to building CI/CD pipelines for automated model training, evaluation, and deployment using tools like MLflow, Kubeflow, or TFX.
- Implement model versioning, A/B testing, and rollback strategies.
3. Automation & Monitoring
- Help automate model retraining, drift detection, and pipeline orchestration (Airflow, Prefect).
- Assist in designing monitoring dashboards for model performance, data quality, and system health (Prometheus, Grafana).
4. Data Engineering Collaboration
- Work with data engineers to preprocess and transform unstructured data (text, images) for LLM training/fine-tuning.
- Support the maintenance of efficient data storage and retrieval systems (vector databases such as Pinecone and Milvus).
5. Security & Compliance
- Follow security best practices for MLOps workflows (model encryption, access controls).
- Ensure compliance with data privacy regulations (GDPR, CCPA) and ethical AI standards.
6. Collaboration & Best Practices
- Collaborate with cross-functional teams (AI researchers, DevOps, product) to align technical roadmaps.
- Document MLOps processes and contribute to reusable templates.
Qualifications
Technical Skills
- Languages: Proficiency in Python and familiarity with SQL/Bash.
- ML Frameworks: Basic knowledge of PyTorch/TensorFlow, Hugging Face Transformers, or LangChain.
- Cloud Platforms: Experience with AWS, Azure, or GCP (e.g., SageMaker, Vertex AI).
- MLOps Tools: Exposure to Docker, Kubernetes, MLflow, or Airflow.
- Monitoring: Familiarity with logging/monitoring tools (Prometheus, Grafana).
Experience
- 2 years of software engineering experience with exposure to MLOps/DevOps.
- Hands-on experience deploying or maintaining AI/ML models in production.
- Understanding of CI/CD pipelines and infrastructure as code (IaC) principles.
Education
- Bachelor's degree in Computer Science, Data Science, or a related field.
Preferred Qualifications
- Familiarity with LLM deployment (e.g., GPT, Claude, Llama) or RAG systems.
- Knowledge of model optimization techniques (quantization, LoRA).
- Certifications in cloud platforms (AWS/Azure/GCP) or Kubernetes.
- Contributions to open-source MLOps projects.
Remote Work:
Yes
Employment Type:
Full-time