DevOps Engineer – AIOps

Employer Active

1 Vacancy
Job Location

Frisco, TX - USA

Monthly Salary

Not Disclosed

Job Description

Job Title: DevOps Engineer – AIOps

Location: Frisco, TX

Duration: 8 months

Term: Contract

Job Description:

Experience Desired: 10 Years

The AIOps Engineer is responsible for integrating machine learning and advanced analytics into our existing monitoring and logging systems. This role will leverage artificial intelligence to automate routine operational tasks, detect anomalies proactively, and implement self-healing frameworks that improve the stability and performance of our infrastructure. The ideal candidate will be proactive in identifying gaps, creating strategic roadmaps, and implementing phased improvements to achieve operational excellence.

Key Responsibilities:

  • Apply machine learning algorithms to existing operational data (logs, metrics, events) to predict system failures and proactively address potential incidents.
  • Implement automation for routine DevOps practices, including automated scaling, resource optimization, and controlled restarts.
  • Develop and maintain self-healing systems to reduce manual intervention and enhance system reliability.
  • Build anomaly detection models to quickly identify and address unusual operational patterns.
  • Collaborate closely with SREs, developers, and infrastructure teams to continuously enhance the operational stability and performance of the system.
  • Provide insights and recommend improvements through visualizations and reports built on AI-driven analytics.
  • Create a phased roadmap to incrementally enhance operational capabilities and align with strategic business goals.

Required Skills and Qualifications:

  • Strong experience with AI/ML frameworks and tools (e.g., TensorFlow, PyTorch, scikit-learn).
  • Proficiency in data processing and analytics tools (e.g., Splunk, Prometheus, Grafana, ELK stack).
  • Solid background in scripting and automation (Python, Bash, Ansible, etc.).
  • Experience with cloud environments and infrastructure automation.
  • Proven track record in implementing proactive monitoring, anomaly detection, and self-healing techniques.
  • Excellent analytical, problem-solving, and strategic planning skills.
  • Strong communication skills and the ability to effectively collaborate across teams.

Preferred Experience:

  • Background in DevOps/Site Reliability Engineering.
  • Familiarity with containerization and orchestration platforms (Kubernetes, Docker).
  • Experience in building scalable distributed systems.

This role is pivotal in enabling our organization to achieve and sustain operational excellence through intelligent automation and proactive monitoring practices.

Key Skills:

DevOps, SRE, Monitoring, Python, Cloud, AI, Machine Learning

Employment Type

Full Time
