Chaos Engineering Specialist

Job Location

Santa Clara County, CA - USA

Monthly Salary

Not Disclosed

Vacancy

1 Vacancy

Job Description

We are seeking a highly skilled and proactive Chaos Engineering Specialist to join our team. This individual will play a crucial role in ensuring system resilience and reliability by executing chaos engineering experiments, reviewing prior outages, and collaborating with stakeholders. The ideal candidate will bring expertise in observability platforms and chaos engineering tools, and will either help build a tool that meets clients' needs or customize and automate existing tools on the market to run chaos tests at scale.

Technical Skills

Chaos Engineering:

Hands-on experience with chaos engineering tools such as AWS FIS, Gremlin, Chaos Monkey, Chaos Toolkit, or similar platforms.

Experience conducting experiments in production or controlled staging environments.

Ability to design, execute, and analyze chaos experiments to improve system reliability.

Ability to document findings and observations and propose solutions to mitigate future issues.

Tool Development Expertise:

Proficiency in programming languages such as Python, Java, or Go for building scalable applications.

Ability to customize and extend existing frameworks for specific testing needs.

Knowledge of automation frameworks such as Selenium, Robot Framework, or custom-built solutions.

Familiarity with REST APIs and the ability to integrate chaos testing tools with other platforms.

Architectural Knowledge:

Experience preparing architectural flow diagrams to illustrate system designs and processes.

Ability to identify and document potential failure points in complex systems.

Observability Platforms:

Proficiency in New Relic and Grafana for monitoring, logging, and visualization.

Ability to create custom dashboards and trace system performance.

Infrastructure as Code (IaC):

Expertise in tools such as Terraform, Ansible, or CloudFormation.

Strong understanding of infrastructure automation principles.

Familiarity with scripting languages such as Python, Bash, or Groovy.

CI/CD Pipelines:

Deep understanding of Jenkins, including writing custom jobs and automating chaos engineering experiments.

Familiarity with containerization (e.g. Docker, Kubernetes) is a plus.

Professional Experience

6 years of experience in Chaos Engineering, SRE, or similar roles.

Proven track record of identifying system failures and implementing solutions.

Experience reviewing outages, gathering data, and presenting findings.

Soft Skills

Excellent communication and collaboration skills for working with application stakeholders and SRE engineers.

Strong organizational and project management abilities for scheduling and leading meetings.

Analytical mindset to gather data, assess system behavior, and propose recommendations.

AWS FIS, Chaos

Employment Type

Full Time
