Title: Site Reliability Engineer (SRE) / Application Support Engineer
Work Location: Detroit, MI - Onsite
Job Type: W2 Contract
YOE: 5-8
Must Haves: SRE, DevSecOps, SRE Roadmaps, AWS, CI/CD
JD: Are you passionate about ensuring the reliability and scalability of complex systems? Do you thrive on implementing efficient solutions to prevent and resolve incidents? We are seeking a talented and motivated Site Reliability Engineer (SRE) to join our dynamic team.
The Work:
Collaborate with cross-functional teams to design, build, and maintain robust, scalable, and fault-tolerant systems
Work closely with development teams and architects to advocate for reliability best practices during the application development lifecycle
Design and implement monitoring and alerting to provide real-time visibility into user experience and system health and performance
Monitor and analyze system performance, proactively identifying potential issues and implementing solutions to ensure optimal performance and reliability
Develop and maintain automated tools and processes to streamline operational tasks and reduce manual interventions
Participate in incident response and post-mortems, contributing to continuous improvement efforts
Conduct capacity planning and resource optimization to handle growing demands on our infrastructure
Continuously research and evaluate new technologies and practices to enhance the reliability and efficiency of our systems
The Skills You Bring:
Bachelor's degree in Computer Science, Engineering, or a related field preferred (or equivalent practical experience)
Strong verbal and written communication skills
4-8 years of overall experience managing an SRE or DevOps team with an observability workload
4-8 years of Agile management, owning SRE roadmaps and deliverables using Scrum/Kanban
4-8 years of delivering projects alongside a constant flow of side intake and production response workloads
Experience presenting to leadership, collaborating effectively, and communicating technical concepts to non-technical business stakeholders
Proven 5 years of experience as a Site Reliability Engineer or in a similar role in a production environment
Applied AWS/Cloud certification (AWS Cloud Architect, DevOps/SysOps), including experience with ASG, Fargate, Lambda, Aurora DB, DynamoDB, and ALB/NLB
5 years of working experience with CI/CD pipelines (GitLab) and developing infrastructure-as-code (Terraform, Python, Ansible, etc.)
Applied experience with Linux and Windows platforms, Java EE, JavaScript, Spring, Spring Boot, REST API/microservices, shell scripting, Python, PL/SQL, and databases, specifically Oracle
Working knowledge of observability platforms like Splunk and Dynatrace
Working experience with designing observability for enterprise applications
Experienced knowledge of system administration and DevSecOps
Development experience, along with experience working on cloud and physical servers
Understanding of and experience working with business, product, and engineering teams in developing SLIs, SLOs, and SLAs
Other Skills & Experience Desired:
Strong knowledge of Linux/Unix systems and network protocols
Familiarity with cybersecurity best practices and principles
Ability to lead triage calls, including working across multiple divisions to resolve issues