Role: SRE Infrastructure Engineer
Location: SFO, CA (5 Days Onsite)
Duration: Long term
Employment Type: Contract W2
Job Description:
We are seeking an SRE Infrastructure Resource with 8 years of professional experience ensuring the reliability, scalability, and performance of Google Cloud-based services through automation, monitoring, and proactive engineering. Key responsibilities include managing infrastructure as code (Terraform), optimizing GKE/Kubernetes, handling incident response, and implementing SLIs/SLOs to minimize manual toil.
This role requires close collaboration with cross-functional teams, adherence to DevOps and Agile practices, and ownership of service quality and delivery.
Key Responsibilities
- GCP Infrastructure Management: Design, deploy, and maintain robust infrastructure components, including VPCs, Compute Engine, GKE (Kubernetes), and storage solutions.
- Automation & IaC: Use Terraform or Deployment Manager to manage cloud resources and build CI/CD pipelines to automate deployments. Minimize manual, repetitive tasks by developing automation scripts and custom tools to streamline deployments and operations.
- Observability & Incident Management: Develop monitoring, alerting, and logging systems (e.g., Cloud Monitoring, Prometheus, Grafana). Act as primary on-call to troubleshoot production incidents.
- Incident Management: Serve as a first responder for system outages and conduct deep-dive root cause analysis (post-mortems) to prevent recurrence.
- CI/CD Pipeline Management: Design and support automated deployment pipelines using Jenkins, ArgoCD, Artifactory, GitLab CI, or GitHub Actions, following DevSecOps practices.
- Reliability Engineering: Define and maintain Service Level Indicators (SLIs) and Service Level Objectives (SLOs) covering latency, traffic, errors, and saturation.
- Optimization & Security: Proactively optimize infrastructure for cost, performance, and security compliance.
- AI Workload Reliability on Google Compute Engine (GCE): Focus specifically on AI workload health and GCE visibility.
Mandatory Technical Skills & Competencies
- Experience: 8 years in SRE, DevOps, or systems engineering, specifically with Google Cloud Platform.
- Technical Skills: Deep knowledge of Linux, Kubernetes (GKE), networking (VPCs, CDNs), and containerization.
- Programming: Proficiency in scripting/programming languages like Python, Go, or Shell.
- Methodologies: Strong understanding of GitOps, CI/CD pipelines, and SRE principles (error budgets, toil reduction).
- Strong troubleshooting skills across the full stack (network, OS, application).
- Ability to balance system stability with the need for rapid deployment.
- Observability Tools: Experience implementing monitoring and logging stacks like Prometheus, Grafana, or Google Cloud Operations Suite.
- Excellent collaboration skills for working with development teams on service ownership.
Soft Skills
- Strong problem-solving and analytical skills
- Clear communication with technical and non-technical stakeholders
- Ownership mindset and production-grade engineering discipline
- Ability to work independently and within cross-functional teams
About Next Gen Software Solutions LLC:
Next Gen Software Solutions is a trusted provider of IT staffing and consulting services, dedicated to empowering businesses with cutting-edge technology solutions and exceptional talent. We specialize in delivering tailored IT consulting services and innovative software solutions, and in connecting businesses with highly skilled IT professionals. Founded and led by a dedicated U.S. Army soldier, Next Gen Software Solutions is deeply rooted in the core values of integrity, discipline, commitment, and experience, principles that guide every aspect of our operations.
Equal Employment Opportunity Statement:
Next Gen Software Solutions LLC is an Equal Opportunity Employer. We are committed to fostering an inclusive and diverse workplace where all employees and applicants are treated with respect and dignity. We do not discriminate based on race, color, religion, sex (including pregnancy, sexual orientation, or gender identity), national origin, age, genetic information, veteran status, or any other legally protected characteristic under applicable federal, state, or local laws.