WHO WE ARE
Job Overview:
We are looking for a dynamic and highly skilled Senior SRE Engineer to join the team.
Key Responsibilities:
Implement and manage SLOs, SLIs, and error budgets.
Lead and promote postmortems, ensuring robust root cause analysis to drive continuous system improvement. Analyze historical data to identify improvement areas.
Implement full observability across systems using OpenTelemetry.
Reduce toil through runbook automation.
Record and track key MTTx metrics (MTTA, MTTR, MTTF, etc.).
Lead design sessions on capacity planning, reliability by design, automation, and alerting.
Collaborate with product teams to enhance system reliability.
Engage in strategic initiatives for capacity, reliability, and automation, ensuring alignment with business goals.
What We're Looking For:
3 years of experience as an SRE.
2 years of software development experience with a strong emphasis on automation.
3 years of experience in AWS.
Experience managing and designing high-throughput systems processing millions of transactions daily.
Deep understanding of observability, with hands-on experience implementing SLIs, SLOs, and error budgets.
Proficiency in Kubernetes, Terraform, and cloud platforms (AWS), with a focus on scalability and reliability.
Hands-on experience with Infrastructure as Code (IaC) tools.
Experience with distributed systems and microservices architecture (MSA).
Production experience with distributed tracing.
Strong software development background in Python, Golang, and Bash scripting.
Experience with OTEL Collectors, including collection, sampling, and customizations.
Solid understanding of SLIs, SLOs, and error budgets.
Hands-on experience with CI/CD platforms (GitOps, GitLab, Jenkins, ArgoCD, etc.).
Expertise in incident management and root cause analysis.
Knowledge of modern deployment strategies (Canary, Blue-Green, etc.).
Familiarity with resiliency patterns (circuit breakers, retry mechanisms, load balancing, etc.).
Experience with SQL and NoSQL databases and understanding of distributed systems.
Proficiency in statistical analysis applied to metrics.
Experience with high-performance, low-latency systems.
Proven experience in cloud cost optimization strategies.
Containerization experience (Docker, k8s), on-prem and cloud.
Experience with Kafka or other distributed messaging systems.
Strong understanding of security and compliance standards within DevOps/SRE environments.
BENEFITS & PERKS
COMPENSATION RANGE
The compensation range for this role is $185,000.00 to $200,000.00, depending on location and experience.
PEOPLE & CULTURE AT ZETA
Zeta considers applicants for employment without regard to, and does not discriminate on the basis of, an individual's sex, race, color, religion, age, disability, status as a veteran, or national or ethnic origin; nor does Zeta discriminate on the basis of sexual orientation, gender identity, or expression.
We're committed to building a workplace culture of trust and belonging, so everyone feels invited to bring their whole selves to work. We provide a forum for employees to celebrate, support, and advocate for one another. Learn more about our commitment to diversity, equity, and inclusion here: https://zetaglobal/blog/alookintozetasergs/
ZETA IN THE NEWS!
https://zetaglobal/press/catpressrelease
Required Experience:
Staff IC
Full Time