AWS Cloud Ops SRE

New York City, NY - USA

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

We have a full-time position for an AWS Cloud Ops SRE (onsite, New York, NY). If you are interested, please send your resume to

Job Position: AWS Cloud Ops SRE

Location: New York, NY

Duration: Full-time

Job Description

AWS Cloud Ops SRE

The AWS Cloud Operations / Site Reliability Engineer (SRE) is responsible for delivering secure, reliable, and scalable cloud infrastructure. This role covers Infrastructure as a Service, AWS platform release activities, AMI lifecycle management, patching, infrastructure design documentation, Terraform scripting, and maintaining visibility into the application layer and how it functions in production environments. Experience with Harness for DevOps pipelines is a strong plus.

Required Qualifications

10 years in SRE, Cloud Ops, or DevOps with heavy AWS experience.

Strong hands-on experience with:

o AWS compute (EC2, ASG, EKS/ECS, Lambda)

o Networking (VPC, Route 53, SG/NACL, ALB/NLB)

o Storage (S3, EBS, EFS)

o Databases (RDS, Aurora, DynamoDB)

Expertise in AMI pipeline management, image building, and OS-level hardening.

Solid experience with Terraform or CloudFormation for IaC.

Demonstrated ability to troubleshoot AWS and application stack issues end-to-end.

1. AWS Platform Operations & Releases

Own and execute AWS platform release management across environments, including validation, regression checks, and readiness reviews.

Operate and evolve AWS core services: VPC, IAM, KMS, Route 53, networking baselines, proxy layers, and organizational guardrails.

2. Infrastructure as a Service (IaaS) using Terraform

Build, manage, and scale cloud infrastructure using Terraform as the primary IaC tooling.

Create reusable Terraform modules covering networking, compute, storage, EKS, and security.

Ensure IaC follows best practices: versioned, immutable, peer-reviewed, and automated through CI/CD.

3. Amazon EKS (Kubernetes) Operations

Deploy, manage, and maintain production-grade Amazon EKS clusters, node groups, and cluster add-ons.

Implement Kubernetes platform standards for security, networking, namespaces, RBAC, and secrets management.

Work closely with application teams to ensure workloads run reliably and securely within EKS.

Optimize cluster scaling, workload scheduling, resource limits, and performance tuning.

4. AMI Lifecycle & Image Management

Manage the complete AMI lifecycle: creation, CIS hardening, vulnerability scanning, tagging, publishing, and deprecation.

Build automated AMI pipelines using image builders, Packer (if applicable), and validation workflows.

Maintain golden images for EC2 fleets, containers, and hybrid workloads.

5. VIT (Vulnerability / Integration / Integrity Testing) & Patch Management

Lead the VIT process, including vulnerability assessments, remediation workflows, compliance tracking, and closure.

Own OS-level and image patching using AWS Systems Manager (SSM) Patch Manager and automated maintenance windows.

Generate patch baselines, dashboards, and compliance reports, and ensure measurable SLA adherence.

6. Observability & Application Layer Insights

Build and maintain an observability stack with CloudWatch, X-Ray, OpenTelemetry, and log analytics.

Establish deep visibility into application behavior, dependencies, performance, and error patterns.

Create golden-signals dashboards covering latency, traffic, errors, and saturation for both infrastructure and applications.

7. CI/CD & DevOps Automation

Implement and maintain CI/CD pipelines for infrastructure and application deployments.

Harness experience is an added advantage, including its workflows, verification steps, and deployment strategies (canary, blue/green).

Integrate Terraform, AMI pipelines, EKS updates, and patch automation into CI/CD systems.

8. Reliability Engineering & Incident Response

Participate in the on-call rotation; lead incident triage and root cause analysis.

Build automation and runbooks to reduce operational toil.

Drive architectural improvements to increase availability, resiliency, and performance.

9. Documentation & Architecture

Produce high-quality Infrastructure Design Documents (IDDs), runbooks, DR procedures, release notes, and architectural diagrams.

Conduct operational readiness reviews, capacity planning, and cost-optimization assessments.

About Company

Siri InfoSolutions Inc
