Job Title: Site Reliability Engineer (SRE)
Location: New York/EST area, Hybrid/Remote
Duration: 6 Months (extendable)
Experience Required: 10 Years
Certification: AWS Certification is Mandatory (e.g. AWS Certified DevOps Engineer, Solutions Architect, or SysOps Administrator)
Job Summary:
- We are seeking an experienced Senior Site Reliability Engineer (SRE) with a strong background in AWS Cloud, DevOps automation, and system reliability engineering. The ideal candidate should bring hands-on expertise in cloud infrastructure, CI/CD, monitoring, and automation, along with proven experience in supporting large-scale, high-availability systems within the Telecom, Banking, or Retail industries.
- This role is responsible for ensuring platform stability, reliability, scalability, and continuous improvement of infrastructure through automation and DevOps best practices.
Key Responsibilities:
- Design, implement, and maintain highly available and scalable cloud infrastructure on AWS.
- Build and manage end-to-end CI/CD pipelines to enable efficient and reliable software delivery.
- Develop and maintain Infrastructure as Code (IaC) using Terraform, CloudFormation, or Ansible.
- Monitor, automate, and enhance system reliability, performance, and incident response processes.
- Implement observability solutions (Prometheus, Grafana, ELK/EFK, Splunk, or Datadog).
- Collaborate with development teams to improve application reliability and deployment processes.
- Participate in on-call rotations, incident management, and root cause analysis (RCA).
- Optimize infrastructure costs and ensure cloud security and compliance with enterprise standards.
- Develop automation scripts and tools using Python, Go, or Shell to eliminate manual tasks.
- Prepare and maintain documentation including architecture diagrams and operational runbooks.
Primary Skills:
- Cloud Platform: AWS (EC2, S3, EKS, Lambda, CloudWatch, RDS, IAM, etc.)
- DevOps & SRE Practices: CI/CD, automation, monitoring, incident response, performance tuning
- Infrastructure as Code (IaC): Terraform, AWS CloudFormation, Ansible
- CI/CD Tools: Jenkins, GitLab CI, GitHub Actions, Argo CD, or Spinnaker
- Containers & Orchestration: Docker, Kubernetes, Helm, EKS, OpenShift
- Monitoring & Logging: Prometheus, Grafana, ELK/EFK, Splunk, Datadog, CloudWatch
- Scripting / Programming: Python, Go, Bash, or Shell
- Version Control: Git, GitHub, Bitbucket
- Networking & Security: VPC, VPN, Load Balancers, DNS, SSL, Security Groups, IAM
Required Qualifications:
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
- 10 years of hands-on experience in SRE / DevOps / Cloud Infrastructure roles.
- Mandatory AWS Certification (e.g. AWS Certified DevOps Engineer, Solutions Architect Associate/Professional, or SysOps Administrator).
- Proven experience in Telecom, Banking, or Retail domain infrastructure and platform operations.
- Strong expertise in microservices, distributed systems, and containerized environments.
- Experience in monitoring, alerting, observability, and automated remediation.
- Excellent problem-solving, incident management, and communication skills.
Preferred / Nice-to-Have:
- Experience with Kafka, RabbitMQ, or other messaging platforms.
- Familiarity with service mesh (Istio, Linkerd, Consul) and API Gateway solutions.
- Exposure to data pipeline management and streaming frameworks.
- Knowledge of FinOps and cost optimization strategies on AWS.
- Experience with security compliance frameworks (ISO, PCI-DSS, GDPR, etc.).