Job Description
SRE Practitioner

As an SRE Practitioner, you will ensure the scalability, performance, and reliability of our large-scale cloud-based systems and infrastructure. You will collaborate with development and operations teams to develop self-healing systems with reduced human intervention.

Responsibilities and Skills Required

Experience: 8 years of experience in site reliability engineering, DevOps, and cloud.
Technical Skills: Proficiency in programming languages such as Python, Go, or Java. Hands-on experience with cloud platforms (AWS, Azure, GCP) and containerization tools (Docker, Kubernetes).
Observability Tools: Familiarity with observability tools such as Prometheus, Grafana, Splunk, Datadog, the ELK stack, and New Relic.
Linux: 5 years of experience working in a Linux environment, including at least one Linux variant (Red Hat, CentOS, Debian).
Configuration Management: Good knowledge of configuration management tools such as Puppet or Ansible. Familiarity with monitoring tools such as Dynatrace and Kibana.
Tools: Familiarity with monitoring tools (Prometheus, Grafana), CI/CD pipelines (GitHub Actions, Jenkins, GitLab), and configuration management tools (Ansible, Terraform).
System Reliability and Availability: Ensure system reliability and availability.
Automation: Create and deploy automation scripts to minimize manual intervention.
Performance Tuning: Fine-tune system performance.
Monitoring and Observability: Establish monitoring and observability frameworks to detect and resolve issues.
Disaster Recovery: Develop and maintain disaster recovery.
Improvement: Identify areas for improvement and implement measures to increase system dependability.
Leadership and Mentorship: Offer technical leadership and mentorship to junior SREs and other team members.
Strategic Planning: Assist in the strategic planning of infrastructure and reliability.
Proactive Remediation: Create and execute proactive remediation techniques to prevent incidents before they happen.
Fault-Tolerant Design: Design and implement fault-tolerant systems that provide high availability and resilience.
Communication: Strong communication and collaboration skills.
Certifications: Appropriate certifications such as AWS Certified DevOps Engineer or Google Professional Cloud DevOps Engineer.

(1.) Create solution and architectural views (logical, conceptual, development, execution, infrastructure, and operations architecture).
(2.) Conduct domain IT landscape assessment and application portfolio optimization for gap analysis.
(3.) Contribute to white papers, technical papers, and the knowledge base.
(4.) Keep knowledge up to date and work with new and emerging products and technologies.
(5.) Manage the adaptation of non-functional requirements for the solution.
(6.) Study and define system requirements that address stakeholder portfolio concerns.