As AI Ops Technical Leader, you drive the intelligent transformation of operations support. This player-coach role combines hands-on technical delivery, team leadership, and AI architecture governance to achieve operational excellence. You apply deep technical expertise and strategic leadership to design, build, and evolve AI and data solutions that improve incident management, major incident response, problem management, change enablement, service desk support, observability, and overall operational resilience.
What You'll Be Doing:
Hands-on Data & AI Solutions for Operations Support
Lead and contribute to high-impact data and AI initiatives that improve operations support outcomes, including real-time incident enrichment, automated root-cause analysis, predictive alerting, ticket clustering and auto-triage, change risk scoring, knowledge mining, and intelligent runbooks.
Design and deliver scalable AI-enabled features embedded into operations support platforms such as ServiceNow, Jira Service Management, monitoring/observability tools, and ITSM systems.
Ensure all solutions meet strict operational SLAs for reliability, low latency, auditability, explainability, and zero-downtime deployment.
Stay up to date with emerging AIOps tools, research, and trends, and apply them to enhance operations support.
AIOps Tools & Platform Leadership
Lead the architecture, development, and continuous improvement of internal AIOps platforms and reusable components supporting operations teams.
Integrate AIOps tools with ITSM systems, observability platforms (Prometheus, Grafana, ELK, Dynatrace, Splunk), ticketing systems, and automation frameworks.
Apply best practices in MLOps/AIOps tailored to production environments: model monitoring, drift detection, automated rollback, performance checks, and cost optimization at scale.
AI Technical Leadership for Operations Support Initiatives
Serve as the principal AI technical authority for operations support transformation programs across service operations, NOC, support desks, infrastructure operations, and reliability engineering.
Lead technical discussions, architecture reviews, proofs of concept, vendor evaluations, and solution selection involving AI for operations.
Identify, prioritize, and drive high-value AI use cases focused on reducing MTTR/MTTD, automating L1 triage, predicting major incidents, generating postmortems, optimizing shift handovers, and enabling proactive operations.
Team & People Leadership
Build, mentor, and lead a high-performing squad of AIOps specialists focused on measurable operations support improvements.
Foster a culture of experimentation, production-first thinking, and commitment to operational impact: reduced toil, faster resolution, and higher availability.
Provide technical coaching, conduct design/code reviews, and guide career development, with emphasis on operations and support domain expertise.
Stakeholder & Cross-Functional Collaboration
Work closely with operations support leaders, incident managers, service owners, reliability engineers, ITSM teams, infrastructure groups, and other stakeholders to align AI solutions with operational needs.
Collaborate deeply with DS&AI Competency teams to ensure high-quality, scalable, and sustainable AI delivery.
What We're Looking For:
Strong background in data engineering, AI/ML, or operations support technology, including technical leadership in operations, IT, or service environments.
Proven track record delivering production AI/ML/data solutions that improve MTTR, MTTD, availability, and ticket deflection.
Hands-on expertise with Python, Spark, Kafka, Airflow, cloud data platforms, PyTorch/TensorFlow, LLMs, and integrations with tools like ServiceNow, PagerDuty, Splunk, Datadog, Moogsoft, BigPanda, Databricks, and Azure/ADF.
Deep knowledge of AIOps practices, including event correlation, anomaly detection, predictive analytics, automated actions, and GenAI for operations.
Experience designing, building, or enhancing AIOps and internal tooling platforms.
Familiarity with ITIL processes (incident, problem, change, service request, knowledge management).
Experience with GenAI/LLM applications for operations, such as copilots, auto-remediation, knowledge search, and alert/incident summarization.
Proven ability to scale AIOps in large operations or NOC environments while balancing hands-on work with strategy.
Strong communication skills, able to translate complex AI concepts for operations teams and executives, focusing on action and automation to reduce operational toil.
Missing one or two of these qualifications? We still want to hear from you! If you bring a positive mindset, we'll provide an environment where you feel valued and empowered to learn and grow.
This is a remote position.