Description
This role combines advanced AI/ML expertise with IT operations experience (Infrastructure, Networks, Applications, and Digital integrated experience) to build scalable solutions that automate processes such as incident management, resource allocation, and infrastructure optimization. Key objectives include:
- Reducing mean time to resolution (MTTR) through predictive maintenance and AI-driven anomaly detection.
- Deploying Gen AI for log analysis, code generation, and automated documentation using models and frameworks such as GPT-4, LangChain, or Vertex AI.
- Designing Agentic AI systems in which autonomous agents collaborate to execute tasks (e.g., autoscaling, security threat mitigation).
- Integrating AI with DevOps pipelines (CI/CD, Kubernetes) and IT tools (ServiceNow, Ansible, Terraform).
Responsibilities
- AI/ML Model Development:
  - Build predictive models for IT automation (e.g., incident prediction, resource optimization).
  - Implement reinforcement learning for self-healing infrastructure and autoscaling.
- Gen AI Integration:
  - Develop LLM/RAG-based conversational AI for automated code reviews, log parsing, and knowledge-base curation.
  - Fine-tune models for domain-specific tasks (e.g., IT ticket classification, root-cause analysis).
- Agentic AI Design:
  - Architect multi-agent systems to automate workflows such as ticket routing, patch management, and compliance checks.
  - Enable real-time collaboration between AI agents and human operators.
- Automation Pipeline Engineering:
  - Deploy AI models into production using MLOps tools (MLflow, Kubeflow) and orchestration platforms (Airflow, Prefect).
  - Ensure seamless integration with DevOps pipelines and observability tools (Datadog, Splunk).
- Governance & Ethics:
  - Enforce transparency, fairness, and compliance (GDPR, SOC 2) in AI-driven workflows.
  - Monitor model drift and performance in production systems.
Qualifications
- 10 years in data science, with 3 years focused on AI-driven IT automation.
- Proficiency in Python, LLM frameworks, AI agents, Gen AI applications, and cloud platforms (AWS/GCP/Azure).
- Advanced degree in Computer Science, Data Science, or a related field. Certifications in AI/ML or MLOps (e.g., AWS ML Specialty, Google Cloud ML Engineer) preferred.