Senior DevOps and SRE Engineer

Rishabh RPO

Job Location:

Washington, AR - USA

Monthly Salary: Not Disclosed
Posted on: 30+ days ago
Vacancies: 1

Job Summary

Title: Senior DevOps and SRE Engineer

Location: Washington, DC

Duration: 11 months

Responsibilities

Deployment & Automation Engineering

  • Implement, maintain, and optimize robust CI/CD pipelines using tools such as GitHub Actions, AWS CodePipeline, and Jenkins.
  • Automate infrastructure provisioning and configuration management using Infrastructure-as-Code (IaC) tools like Terraform, CloudFormation, or AWS CDK.
  • Design and develop automation scripts and self-service tools to enhance development and operational efficiency.
  • Apply proficiency in multiple programming languages (Python, Go, Java) to develop automation and troubleshoot applications.

Site Reliability & Observability

  • Serve as a production on-call responder, leading incident management and coordinating the response to critical service outages and disaster recovery failover activities.
  • Facilitate detailed post-mortem meetings and drive systemic improvements across teams.
  • Define, monitor, and enforce Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Error Budgets.
  • Leverage observability tools (Dynatrace, AppDynamics, ELK Stack; Dynatrace strongly preferred) for proactive monitoring and troubleshooting.
  • Utilize distributed tracing and context propagation to identify performance bottlenecks and root causes of failures.
  • Design and implement custom dashboards and anomaly detectors to generate actionable insights.

Capacity, Performance & Cost Management

  • Develop sophisticated capacity models and forecasting systems to ensure service scalability.
  • Lead cost optimization initiatives, identifying and implementing efficiency gains across cloud services.
  • Design and execute comprehensive resiliency and performance testing frameworks.
  • Configure and maintain dynamic auto-scaling policies and thresholds for optimal resource utilization.

Security & Governance

  • Lead security incident investigations and execute swift remediation plans.
  • Design and implement automated compliance validation and security automation frameworks.
  • Drive the implementation of zero-trust architecture patterns within the cloud environment.
  • Apply ITIL framework principles, preferably using ITSM tools such as ServiceNow.

Qualifications

Education & Experience

  • Bachelor's degree in Computer Science, Engineering, or a related technical field.
  • 5 to 8 years of progressive experience in DevOps, Site Reliability Engineering (SRE), or Platform Engineering.
  • 3 years of experience maintaining and optimizing high-availability production environments.
  • Proven track record of leading complex technical initiatives from conception to completion.

Technical Expertise

  • Expert-level knowledge of at least one major cloud platform, with AWS strongly preferred.
  • Deep expertise in cloud architecture, networking, and core services.
  • High proficiency in IaC tools such as Terraform, CloudFormation, or AWS CDK.
  • Expert-level experience with observability and APM tools, with a strong preference for Dynatrace.
  • Proficiency in modern programming languages like Python, Go, or Java.
  • Knowledge of relational, cloud-native, and NoSQL database technologies.

Professional & Leadership Skills

  • Strong leadership and mentoring capabilities, with the ability to elevate the technical skills of the team.
  • Exceptional ability to influence without direct authority across engineering and product teams.
  • Excellent technical writing and documentation skills (e.g., RCA development, knowledge articles).
  • Ability to maintain flexible availability for on-call duties and to work outside of standard business hours as required for incident response.

Key Skills

  • Editorial
  • Academics
  • Engineering
  • Cruise
  • Datawarehousing
  • Fabrication