Site Reliability Engineer (SRE)

Job Location:

Washington, AR - USA

Monthly Salary: Not Disclosed
Posted on: 7 hours ago
Vacancies: 1 Vacancy

Job Summary

Randstad is seeking a proactive Site Reliability Engineer to join a high-impact team supporting a premier client in the Washington, D.C. area. In this role, you will bridge the gap between development and operations by building automated, resilient infrastructure and maintaining deep visibility into system performance. You will be the primary steward of the Dynatrace observability stack, ensuring that distributed services remain highly available and performant through sophisticated CI/CD integration, Infrastructure-as-Code (IaC), and the application of SRE principles such as SLOs and error budgets. This is a dynamic position for a technical problem-solver who enjoys maturing automation frameworks while acting as a reliable line of defense for production stability in a hybrid AWS/Azure environment.

Key Responsibilities
  • Automation & Deployment: Design and maintain robust CI/CD pipelines using GitHub Actions, AWS CodePipeline, and Jenkins; manage infrastructure provisioning via Terraform, CloudFormation, or AWS CDK.
  • Observability Mastery: Act as the subject matter expert for Dynatrace, overseeing automated installation, tagging standards, distributed tracing, and the creation of advanced anomaly detection dashboards.
  • Reliability & Incident Response: Serve as a production on-call responder, performing deep-dive root cause analysis (RCA) and applying ITIL frameworks within ServiceNow to resolve complex incidents.
  • Performance Engineering: Manage capacity planning, auto-scaling policies, and operational cost optimization while executing resiliency and performance test plans.
  • Security & Compliance: Administer service accounts, digital certificates, and access permissions, ensuring all remediation tasks align with strict security standards.
  • Operations Strategy: Define and track SLIs, SLOs, and error budgets to balance the velocity of feature delivery with the stability of the platform.
Qualifications
  • Education: Bachelor's degree in Computer Science, Engineering, or a related technical field.
  • Experience: 2 to 4 years of professional experience in SRE, DevOps, or cloud infrastructure roles.
  • Cloud & Containers: Practical proficiency in AWS and Azure environments, including hands-on knowledge of Docker and Kubernetes/ECS.
  • Technical Stack: Mid-level expertise in Python scripting and configuration management tools such as Ansible.
  • Systems Knowledge: Strong foundational understanding of Linux systems, networking, and both relational and NoSQL databases.
  • Soft Skills: Excellent written and verbal communication skills with the ability to manage priorities independently in a fast-paced environment.
  • Flexibility: Availability to participate in an on-call rotation and work outside standard business hours as required.

This is a high PRIORITY requisition. This is a PROACTIVE requisition.

Background Check : No

Drug Screen : No

Stipend : No

Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root Cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting