Senior Site Reliability Engineer

Lytx

Job Location:

Bengaluru - India

Monthly Salary: Not Disclosed
Posted on: 20 hours ago
Vacancies: 1 Vacancy

Job Summary

Why Lytx:
Join our dynamic and passionate team of driven, low-ego engineers who are at the forefront of designing and supporting cutting-edge IoT infrastructure. As we rapidly grow and transition to the cloud, we're diving into the exciting realms of Operations as Code, Infrastructure as Code, and innovative infrastructure automation.

Our Site Reliability Engineering (SRE) team is pivotal in ensuring the availability, reliability, observability, and resilience of Lytx services, both on-premises and in the cloud. We're not just keeping the lights on; we're engineering the future of our business's continuity.
If you're energized by crafting transformative solutions and excel at designing robust, detailed cloud infrastructure with a focus on continuous improvement, this could be the perfect role for you!
Responsibilities:
System Design and Architecture: Design, implement, and maintain scalable and reliable systems, ensuring they can handle both current and future demands.
Incident Management: Lead incident response efforts, diagnose root causes, and implement long-term solutions to prevent recurrence. Ensure effective communication during outages.
Monitoring and Observability: Develop and maintain comprehensive monitoring and alerting systems to proactively identify and address issues before they impact users.
Automation and Efficiency: Automate repetitive tasks and processes to improve operational efficiency and reduce manual intervention.
Performance Tuning: Continuously optimize system performance, including fine-tuning applications, databases, and infrastructure to meet service level objectives (SLOs).
Capacity Planning: Forecast future system requirements based on growth trends and current usage, and plan capacity upgrades to ensure system reliability.
Collaboration and Mentoring: Work closely with development teams to integrate reliability into the software development lifecycle. Mentor junior SREs and share best practices.
Documentation and Knowledge Sharing: Create and maintain detailed documentation on system design, incident response procedures, and operational practices to ensure knowledge is preserved and accessible.
Requirements:
5 years of experience as an SRE within AWS environments at medium to large-scale organizations.
5 years of hands-on experience implementing and managing observability tools such as Prometheus, New Relic, Grafana, or similar.
Advanced programming skills in Python, Groovy, and Bash.
Strong understanding of database technologies, including both SQL and NoSQL systems.
3 years of experience developing and managing infrastructure deployment pipelines using Git, Terraform, Helm, Jenkins/Jenkins X/ArgoCD, or similar tools.
Proven expertise in designing, evaluating, and supporting production environments in AWS, including VPCs, EKS, IAM, AMI, EC2, CloudWatch, CloudTrail, Control Tower, GuardDuty, MSK, S3, Glacier, Gateways, Direct Connect, Route 53, RDS, ALBs, Autoscaling, and more.
Hands-on experience with Linux systems and with protocols and technologies such as HTTP, REST, TCP/IP, SSL, DNS, SMTP, SSH, NTP, Load Balancing, SQL/NoSQL, Message Brokers, Nginx, Vault, etc.
Extensive experience with Kubernetes and various container and cloud-native technologies.
Significant experience in managing 24/7 on-call rotations, creating runbooks, establishing support procedures, and proactively monitoring systems across multiple geographic locations.
Ability to thrive under pressure and excel in a technically challenging environment.

Innovation Lives Here


Together we help save lives on our roadways.

Find out how good it feels to be a part of an inclusive, collaborative team. We're committed to delivering an environment where everyone feels valued, included, and supported to do their best work and share their voices.

Lytx Inc. is proud to be an equal opportunity/affirmative action employer and maintains a drug-free workplace. We're committed to attracting, retaining, and maximizing the performance of a diverse and inclusive workforce. EOE/M/F/Disabled/Vet.


Required Experience:

Senior IC

Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root Cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting

About Company

Since 1998, Lytx has led the video telematics industry using proprietary machine vision, artificial intelligence, and big data to protect and connect thousands of fleets and millions of drivers in more than 85 countries worldwide. At Lytx, you'll be a part of something good - helping sav ...
