Line of Service: Advisory
Industry/Sector: Not Applicable
Specialism: Microsoft
Management Level: Manager
Job Description & Summary
At PwC, our people in integration and platform architecture focus on designing and implementing seamless integration solutions and robust platform architectures for clients. They enable efficient data flow and optimise technology infrastructure for enhanced business performance.
Why PwC
At PwC, you will be part of a vibrant community of solvers that leads with trust and creates distinctive outcomes for our clients and communities. This purpose-led and values-driven work, powered by technology in an environment that drives innovation, will enable you to make a tangible impact in the real world. We reward your contributions, support your wellbeing, and offer inclusive benefits, flexibility programmes, and mentorship that will help you thrive in work and life. Together, we grow, learn, care, collaborate, and create a future of infinite experiences for each other.
At PwC, we believe in providing equal employment opportunities without any discrimination on the grounds of gender, ethnic background, age, disability, marital status, sexual orientation, pregnancy, gender identity or expression, religion or other beliefs, perceived differences, and status protected by law. We strive to create an environment where each one of our people can bring their true selves and contribute to their personal growth and the firm's growth. To enable this, we have zero tolerance for any discrimination and harassment based on the above considerations.
Responsibilities:
We are seeking a highly skilled Architect/Lead with deep expertise in Chaos Engineering and Automated Root Cause Analysis (RCA) to drive the resilience, observability, and reliability strategy of our customers. You will be responsible for designing, implementing, and evangelizing chaos practices and automated diagnostics across diverse platforms, ensuring that customers' applications and systems can withstand and recover gracefully from unexpected failures.
Key experience requirements (must have):
a) Demonstrable experience in preparing chaos engineering strategy, architecture, and roadmap for clients.
b) Demonstrable experience in implementing chaos engineering solutions for clients.
c) Demonstrable experience in microservices development and in programming using Python, C#, or Java.
d) Demonstrable experience in cloud platforms, monitoring, log analysis, and DevOps (Azure preferred).
e) Demonstrable experience with one chaos engineering tool (Gremlin, Harness, etc.).
Key Responsibilities:
Define and own the Chaos Engineering strategy, architecture, and roadmap.
Architect automated RCA systems leveraging observability platforms, AI/ML techniques, and event correlation tools.
Develop resilience patterns and chaos experimentation frameworks integrated into the SDLC.
Design and orchestrate controlled fault injection experiments to validate system robustness (e.g. latency injection, dependency failures, resource exhaustion).
Evaluate and deploy Chaos Engineering tools (such as Harness, Gremlin, Chaos Mesh, LitmusChaos, Simian Army, etc.) tailored to cloud-native, hybrid, and legacy environments.
Establish guardrails, blast radius controls, and automated rollback procedures for experiments.
Architect solutions to automatically detect, triage, and pinpoint root causes for production incidents.
Integrate logs, metrics, traces, and events across monitoring tools (Datadog, New Relic, Splunk, ELK, Prometheus) for correlation and insights.
Develop or integrate ML models and rules engines to accelerate RCA and reduce MTTR.
Define policies, processes, and success criteria for chaos experiments and RCA automation.
Create reusable playbooks, runbooks, and knowledge artifacts.
Mentor engineering teams on resilience and reliability engineering.
Partner with SRE, Platform Engineering, Application Development, and Security teams.
Champion a culture of proactive failure testing and continuous improvement.
Key Skills & Qualifications:
Deep understanding of distributed systems, microservices, and cloud-native architectures (Kubernetes, Microsoft Azure).
Strong knowledge of observability pillars (logging, monitoring, tracing).
Hands-on experience with Azure Monitor, Azure Log Analytics, Azure Application Insights, Azure Sentinel, Azure Automation, Logic Apps, Azure Machine Learning, and Azure Data Explorer.
Hands-on experience with Chaos Engineering tools and fault injection practices (Azure Chaos Studio, Harness, Gremlin).
Familiarity with AIOps and intelligent RCA frameworks.
Proficiency in scripting/programming languages (Python, Go, Java).
Experience automating experiments and RCA workflows via pipelines (GitHub, GitHub Actions, Azure DevOps).
Strong analytical mindset for dissecting failures and correlating signals across multiple systems.
Excellent communication and influencing skills.
Proven ability to lead cross-functional initiatives and drive cultural change.
Mandatory skill sets:
Chaos Engineering and DevSecOps
Preferred skill sets:
Site Reliability experience, and experience automating experiments and RCA workflows via pipelines (GitHub, GitHub Actions, Azure DevOps).
Years of experience required:
8-10 Years
Education qualification:
B.E./
Education (if blank, degree and/or field of study not specified)
Degrees/Field of Study required: Bachelor of Technology, Bachelor of Engineering
Degrees/Field of Study preferred:
Certifications (if blank, certifications not specified)
Required Skills
Chaos Engineering
Optional Skills
Accepting Feedback, Active Listening, Amazon Web Services (AWS), Analytical Thinking, Architectural Engineering, Brainstorm Facilitation, Business Impact Analysis (BIA), Business Process Modeling, Business Requirements Analysis, Business Systems, Business Value Analysis, Cloud Strategy, Coaching and Feedback, Communication, Competitive Advantage, Competitive Analysis, Conducting Research, Creativity, Embracing Change, Emotional Regulation, Empathy, Enterprise Architecture, Enterprise Integration, Evidence-Based Practice (EBP)
Desired Languages (If blank, desired languages not specified)
Travel Requirements
Available for Work Visa Sponsorship
Government Clearance Required
Job Posting End Date
Required Experience:
Manager
Full-Time