Site Reliability Engineer (SRE)

Dhaka - Bangladesh

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1

Job Summary

As a Site Reliability Engineer (SRE), you will be responsible for ensuring the reliability, scalability, and performance of our production systems. You will work closely with Development, QA, and Infrastructure teams to maintain high availability, optimize system performance, and implement SRE best practices. Your role will focus on operational excellence, incident management, and building resilient systems while collaborating with engineering teams to improve application reliability.

Responsibilities:

Monitor, maintain, and improve system reliability, availability, and performance.
Participate in on-call rotations, respond to incidents, conduct root cause analysis (RCA), and implement preventive measures.
Define and enforce Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets.
Reduce manual toil through automation and self-healing systems.
Analyze system performance, identify bottlenecks, and optimize infrastructure.
Conduct capacity planning and develop scaling strategies to handle growth.
Work with Development teams to ensure deployment strategies (blue-green, canary) minimize downtime.
Enhance monitoring, logging, and alerting (e.g., Prometheus, Grafana, ELK, Datadog).
Ensure proper observability for proactive issue detection.
Implement distributed tracing for microservices troubleshooting.
Manage cloud/infrastructure components (AWS/Azure, Kubernetes, Terraform).
Automate operational tasks using scripting (Bash/Python) and Infrastructure as Code (IaC).
Collaborate with Infrastructure teams to improve deployment reliability.
Partner with Development teams to improve application resilience (retries, circuit breakers, graceful degradation).
Work with QA teams to ensure reliability testing is part of the development lifecycle.
Document runbooks, operational procedures, and postmortems.

Qualifications:

Years of Experience: 3-5 Years
Education: BS/MS in Computer Science preferred, but can be waived for exceptional candidates.

Requirements:

3 years in SRE, Production Engineering, or Cloud Operations.
Strong experience with Linux, Kubernetes, Docker, and cloud platforms (AWS/GCP/Azure).
Proficiency in monitoring (Prometheus, Grafana, Datadog).
Coding/scripting skills (Python, Bash) for automation.
Experience with IaC (Terraform, CloudFormation).
Knowledge of networking, security, and database performance tuning.

Nice to Have:

Knowledge of GitOps (ArgoCD, Flux).
Certifications such as AWS, CKA (Kubernetes), or Google SRE.

Additional Information:

Employment Type: Full-time
Weekend: 2 Days
Work Model: Hybrid

Compensation and Benefits:

Join a Workplace That Values You

At Nifty Coders Pvt. Ltd., we celebrate innovation, collaboration, and the unique contributions each of our employees brings. We prioritize a work environment that encourages growth, well-being, and a healthy work-life balance. Here, you'll be part of a team that values creativity, promotes flexibility, and empowers individuals to thrive.

As part of our commitment to supporting you, we offer a range of benefits and perks designed to enhance your work experience:

Competitive compensation plans
Two annual bonuses
Paid Maternity Leave (4 months) and Paternity Leave (5 working days)
Comprehensive medical insurance for you and your dependents
Monthly and quarterly team-building events
Transport allowance
Mobile allowance
Corporate home internet support
Subsidized daily lunch
A dynamic performance review process that fosters ongoing transparency between managers and team members
Company-sponsored certification programs for internal career growth and development

At Nifty Coders, we foster a culture of collaboration, continuous learning, and innovation, ensuring that every employee has the opportunity to grow and succeed.

Application Deadline: February 10, 2026

Key Skills

Kubernetes
FMEA
Continuous Improvement
Elasticsearch
Go
Root Cause Analysis
Maximo
CMMS
Maintenance
Mechanical Engineering
Manufacturing
Troubleshooting

About Company

Nifty Coders Pvt. Ltd.

We are Nifty Coders Pvt. Ltd., a leading provider of Enterprise-Grade Software Engineering Services for startups and corporates. As experts in DevOps & Infrastructure, Application Development, and Service Reliability Engineering (SRE), we deliver innovative, reliable, and scalable sol ...
