Site Reliability Engineer W2 Role

Saransh Inc

Job Location:

Palo Alto, CA - USA

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1

Job Summary

Role: Site Reliability Engineer (SRE)

Location: Palo Alto, CA (Onsite from Day 1)

Job Type: Contract (W2)

Skill Matrix:

Name	Required
Programming	Yes
SRE	Yes
Grafana	Yes
Prometheus	Yes
AWS	Yes
Cloud Infrastructure	Yes
Linux	Yes
UNIX	Yes

Top skills required for this role:

Programming: Proficiency in languages like Python, Java, or Go.

System Administration: Strong understanding of Linux/Unix systems.

Cloud Infrastructure: Experience with AWS.

Infrastructure as Code (IaC): Knowledge of tools like Terraform or Ansible.

Monitoring Tools: Proficiency with tools such as Prometheus, Grafana, or Datadog.

Job Description / Responsibilities:

Automation and Tooling: Writing code to automate operational tasks such as provisioning, configuration changes, and system updates, reducing manual work and human error.

System Monitoring and Alerting: Developing and maintaining observability stacks (logs, metrics, tracing) to proactively detect issues before they impact users.

Incident Response and On-Call: Managing a 24/7 on-call rotation to respond to, troubleshoot, and resolve production incidents.

Post-Incident Reviews (Postmortems): Conducting blameless, in-depth reviews of incidents to identify root causes and implement preventive measures.

Capacity Planning: Analyzing system resource utilization to ensure infrastructure can scale to handle future load requirements.

Performance Optimization: Identifying and fixing bottlenecks in software and infrastructure to improve system efficiency and responsiveness.

Error Budget Management: Setting and managing Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to determine if a service is reliable enough to allow new feature deployments.

Chaos Engineering: Testing system resilience by intentionally introducing failures to ensure systems are fault-tolerant.

Years of Experience: 8 years
