We are seeking a highly skilled Site Reliability Engineer (SRE) with expertise in GitHub Actions, AWS DevOps, Helm Charts, and YAML configuration. The ideal candidate will be responsible for ensuring the reliability, scalability, and efficiency of our cloud-based applications. You will work closely with development teams to implement and manage automation processes, infrastructure, and deployment strategies.
Key Responsibilities
Develop and maintain CI/CD pipelines using GitHub Actions to streamline the software development lifecycle.
Design, deploy, and manage AWS infrastructure, ensuring high availability and security.
Implement and manage Helm Charts for Kubernetes to automate the deployment of applications.
Use YAML configuration files to define and manage infrastructure and application settings.
Apply SRE principles to enhance system reliability, performance, and capacity through automation and monitoring.
Collaborate with development teams to integrate reliability and scalability into the software development process.
Monitor application and infrastructure performance, troubleshoot issues, and implement solutions to improve system reliability.