Work Schedule
Standard (Mon-Fri)
Environmental Conditions
Office
Job Description
We are seeking a dedicated and skilled DevOps/SRE Engineer to join our dynamic team. The ideal candidate will have extensive experience with Kubernetes clusters, Linux servers, AWS, and Python scripting. A strong understanding of infrastructure as code, automation, and CI/CD pipelines is essential. Experience with GitHub Actions and observability stacks is highly desirable.
Key Responsibilities:
- Kubernetes Management: Design, implement, and maintain Kubernetes clusters and associated infrastructure to ensure high availability and scalability.
- Server and Cloud Management: Oversee and monitor Linux servers and AWS resources, ensuring optimal performance and security.
- Automation and Scripting: Develop and maintain automation scripts and tools using Python to streamline and enhance operational tasks.
- CI/CD Pipelines: Build, implement, and maintain CI/CD pipelines to facilitate rapid and reliable application deployments.
- Collaboration: Work closely with development teams to ensure seamless integration of new applications and services.
- Observability and Monitoring: Implement and maintain observability tools and practices (e.g., Prometheus, Grafana, Promtail, and the OpenTelemetry stack) to ensure system reliability and performance.
Key Requirements:
- Linux and AWS Expertise: Proven experience in Linux system administration and AWS cloud services.
- Python Proficiency: Strong skills in Python scripting for automation and tool development.
- CI/CD Tools: Experience with CI/CD tools such as GitHub Actions and Jenkins.
- Familiarity: Understanding of for building and maintaining applications.
- Observability Tools: Knowledge of observability stacks such as Prometheus, Grafana, and the ELK stack.
- Problem-Solving Skills: Excellent problem-solving abilities and the capability to troubleshoot complex issues in a distributed environment.