Senior AI Ops Engineer

Job Location

Bengaluru - India

Monthly Salary

Not Disclosed

Vacancy

1 Vacancy

Job Description

Job Description Summary

The Senior AI Ops Engineer is a strategic leader responsible for driving the
evolution of IT operations through the innovative application of AI/ML
technologies. This role focuses on building and scaling AI-driven systems that
enhance IT infrastructure performance, automate routine tasks, and ensure
system reliability. The Senior AI Ops Engineer collaborates closely with IT
leadership, DevOps teams, and data scientists to design solutions that
proactively identify and resolve operational issues, minimize downtime, and
drive efficiency at scale. This role also requires expertise in managing large
datasets, implementing predictive models, and ensuring seamless integration of
AI tools into complex IT environments. You will support enterprise-scale AI
initiatives leveraging Bedrock foundation models, Azure OpenAI, and Google
Gemini. The core platform is based on AWS, with additional integrations into
Azure for specific AI use cases. As a senior member of the team, you will
mentor others, contribute to long-term IT strategies, and champion AI adoption
across the organization.

As a GE Vernova accelerator, GE Vernova Advanced Research is driving
strategy and leading research & development efforts to execute on the
business's mission to help power the energy transition. We forge the
collaborations and help invent the technologies required to electrify and
decarbonize for a zero-carbon future.

Representing virtually every major scientific and engineering discipline, our
researchers are collaborating with GE Vernova's businesses, the U.S.
government, and more than 420 entities at the forefront of technology to
execute on 150 energy-focused projects. Collectively, these research programs
and initiatives aim to solve near-term technical challenges, deliver
next-generation product advances, and drive long-term breakthrough innovation
to enable more affordable, reliable, sustainable, and secure energy.

Responsibilities:

  • Architect and deploy advanced AI/ML solutions to monitor, analyze, and optimize IT operations.
  • Automate critical processes, including anomaly detection, root cause analysis, and resolution workflows, leveraging advanced AI/ML and/or GenAI technology.
  • Lead collaboration with IT and DevOps teams to integrate AI tools into cloud and on-premises use case solutions across multiple environments.
  • Establish, maintain, and improve data pipelines to support the performance of AI and GenAI solution applications.
  • Research, recommend, and implement the latest advancements in AI/ML technologies to maintain a cutting-edge IT infrastructure (e.g., newly developed Large Language Models, agentic frameworks, OCR tooling, advanced chunking and embedding methodologies).
  • Drive the interpretation and translation of enterprise goals into technical specifications, delivering a point of view on cloud-agnostic technologies.
  • Support projects as a trusted technical advisor to team members to solve complex technical challenges.
  • Own, develop, and maintain processes to support IT Operations Management, Discovery, Monitoring, and AIOps solutions using current industry platforms.
  • Leverage artificial intelligence (AI) and machine learning (ML) technologies and frameworks to drive greater observability and service operations automation.
  • Align AI Ops initiatives with broader organizational goals and long-term IT strategies.
  • Optimize LLM performance, scalability, and cost-efficiency using techniques like model pruning, quantization, or distributed inference.
  • Monitor and troubleshoot production deployments to ensure model accuracy, latency, and uptime requirements are met.
  • Implement robust security controls for AI/ML workflows, including data encryption, IAM policies, and secure API integrations.
  • Ensure compliance with data governance and regulatory requirements across cloud environments.

Key Technical Skills:

  • Deep knowledge of AI/ML frameworks (e.g., TensorFlow, PyTorch, scikit-learn) and algorithms.
  • Advanced proficiency in scripting and programming languages (e.g., Python, Bash, PowerShell).
  • Experience orchestrating the entire AI/ML lifecycle (data ingestion, model training, validation, deployment, monitoring).
  • Familiarity with tools like Kubeflow, MLflow, Airflow, or Argo Workflows.
  • Expertise in cloud platforms like AWS, Azure, or Google Cloud Platform (GCP).
  • Proficiency in Kubernetes, Docker, and container orchestration.
  • Experience with frameworks like Hugging Face Transformers, LangChain, or OpenAI APIs.
  • Advanced skills in Natural Language Processing, including summarization, translation, and augmentation (experience with advanced prompting and/or model fine-tuning preferred).
  • Experience with Infrastructure-as-Code (IaC) tools like Terraform, Ansible, or CloudFormation.
  • Expertise in IT monitoring tools (e.g., AWS CloudWatch, Azure Monitor, Splunk, Dynatrace, Prometheus, Datadog).
  • Experience with automated alerting and logging best practices for large-scale AI systems.
  • Proficiency in GPU/TPU acceleration and parallelization techniques.
  • Familiarity with performance tuning, autoscaling, and load balancing for high-throughput AI workloads.
  • Experience building CI/CD pipelines for machine learning, using tools like GitLab CI/CD or Jenkins to automate workflows.
  • Familiarity with DevOps principles, CI/CD pipelines, and ITIL best practices.
  • Strong experience in programming/scripting languages (e.g., Python, PySpark), ETL pipelines, data lakes, and data warehousing.
  • Proven proficiency with tools like Apache Spark, Kafka, Snowflake, and Redshift.
  • Strong knowledge of database systems (SQL and NoSQL).

Position Requirements:

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field (Master's degree preferred).
  • 7 years of experience in IT operations, DevOps, or AI/ML systems implementation.
  • Expertise in one or more of the following is desirable: DevOps, Serverless, Networking, Security, Storage, Databases, IoT, AI/ML, Cloud Migration, and IT Transformation.
  • Proven ability to lead and deliver AI solutions in large-scale IT environments.
  • Experience working with BMC Observability and AIOps technologies for monitoring cloud-based environments (AWS, Azure, Google Cloud Platform) and their key technologies.
  • Strong analytical, strategic thinking, and leadership skills.
  • Excellent communication and collaboration abilities to work effectively with stakeholders across all levels.
  • Must be willing to work out of an office located in Bangalore, India.

Additional Information

Relocation Assistance Provided: Yes


Required Experience:

Senior IC

Employment Type

Full-Time
