Job description:
Intangles Lab is looking for a hands-on Senior Site Reliability Engineer to manage large 24/7 cloud operations. The ideal candidate has 2 years of hands-on experience with the technologies below:
Automation Tools: Terraform*, CircleCI*/Jenkins/Argo
Cloud Provider: AWS*, Azure/GCP (optional)
Containerization Tools: Docker*/Podman
Orchestration: Kubernetes*, Helm, Apache Airflow
Scripting: Python*, Bash*/Shell*
Others: Strong Linux* & Networking* basics
Database Administration (MongoDB, PostgreSQL & Elasticsearch)
Monitoring Stack: Prometheus*, Grafana*, Istio, Jaeger, Datadog, PagerDuty*
All skills marked with * are mandatory.
Responsibilities:
1. To work in a production environment with technologies like Linux, AWS, Terraform, Kubernetes, and MongoDB, Elasticsearch & PostgreSQL administration.
2. To keep the production environment up and running, i.e. to ensure its reliability.
3. To troubleshoot, debug, and fix failures in the production and QA environments, and to provide technical solutions.
4. To own on-call responsibilities as per the team's policy, and to write and enhance automation as and when needed.
5. To work closely with internal teams and customers to follow the agreed processes and uptime SLAs.
6. To write, update, and enhance documentation, including runbooks/playbooks, and to prepare postmortem reports for production incidents.
7. Since the role is to ensure the platform's reliability, to be ready to work in a 24/7 environment when required.
Requirements:
Should be aware of change, incident, problem, issue, and risk management, as well as escalations.
Should be flexible about working in rotational shifts (including weekends).
Excellent thinking and problem-solving skills.
Optional Skills:
1. A medium to high level of application development experience in languages like JavaScript, Python, or Java will be a bonus.
2. Understanding of N-tier architectures
3. Understanding of REST & gRPC API frameworks
4. Understanding of web servers in Node.js