Site Reliability Engineer

Job Location:

Cincinnati, OH - USA

Monthly Salary: Not Disclosed
Posted on: 13 hours ago
Vacancies: 1 Vacancy

Job Summary

Site Reliability Engineer

Location: Cincinnati, OH (100% Onsite).

Position Summary

As a Site Reliability Engineer/DevOps Engineer, you will be responsible for ensuring the availability, performance, and reliability of Fulfillment Technology solutions for Kroger to support its omni-channel strategy. You will work closely with the development, testing, and operations teams to design, implement, and maintain scalable, reliable, and efficient solutions for the production environment. You will also troubleshoot and resolve issues that arise in the production systems using tools and techniques such as monitoring, logging, alerting, automation, and incident management. You will contribute to the continuous improvement of DevOps practices and processes such as CI/CD, configuration management, infrastructure as code, and cloud computing. You will have a strong background in software engineering, system administration, networking, and cloud technologies. You will also have excellent communication and collaboration skills, as well as a passion for learning new technologies and solving complex problems.

Minimum Position Qualifications

  • 4 years of experience in cloud, SRE/DevOps, infrastructure, or a related field
  • 4 years of experience working with databases, web applications and microservices, event-driven applications, messaging systems, REST APIs and integrations, cloud support tools, observability, and containerization technologies
  • Knowledge of Java, Spring Boot, microservices, Kafka, Cassandra, and SQL Server
  • Proficiency in scripting languages such as Python and shell
  • 1 year of experience managing system observability tools (Dynatrace, ELK, PagerDuty, Datadog, Azure Monitor, Grafana, etc.)
  • Hands-on experience with GitHub Actions for CI/CD automation
  • Knowledge of Linux architecture, security, administration, performance monitoring/tuning, troubleshooting, and production operations
  • Demonstrated skill in working in an Agile environment
  • Demonstrated skill in working with multi-location global teams
  • Proven ability to think and contribute at the strategic level
  • Demonstrated knowledge of eCommerce Fulfillment or Retail Technology solutions
  • Demonstrated written, oral, and presentation/public speaking communication skills

Desired Previous Experience/Education

  • 4 years of experience designing or working on high-volume eCommerce applications
  • 2 years of experience configuring and managing cloud infrastructure (Azure, AWS, GCP)
  • 1 year of experience with technologies such as Apache Kafka, Azure Cosmos DB, Apache Cassandra, Ansible, Terraform, Docker, and Kubernetes
  • Experience with Nginx, HAProxy, and Squid
  • Experience with CI/CD pipelines using tools such as Jenkins, Spinnaker, Azure DevOps, TeamCity, etc.
  • Proficiency in implementing and managing RoyalTS or similar cross-platform remote management solutions, ensuring secure and efficient remote access and system administration across diverse environments

Key Responsibilities

  • Partner and collaborate with application engineering, observability, and other support teams within Kroger, as well as our business operation partners and third parties (as appropriate), to prioritize, address, and drive the resolution of issues and incidents that impact customer pickup or delivery domains
  • Drive root-cause analysis of critical business and production issues to prevent future occurrences and review/approve potential solutions
  • Lead Major Incident calls impacting the Pickup Fulfillment domain and provide clear, timely updates on the status of service restoration to key stakeholders
  • Work with the engineering teams to continuously implement and improve reliable, fast build environments
  • Increase automation to improve efficiency and quality
  • Ensure traceability, observability, and retrievability of system behavior
  • Build logging, monitoring, and alerting systems to identify bottlenecks and assist with debugging, analysis, and optimization in cloud, on-prem, and store environments
  • Craft solid and clearly explained designs, playbooks, and documentation
  • Participate in an off-hours on-call rotation and perform periodic off-hours work during maintenance windows

Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting