Google Cloud DevOps Site Reliability Engineer (SRE)

Alpharetta, GA - USA

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

Role: Google Cloud DevOps / Site Reliability Engineer (SRE)

Location: Alpharetta, GA
Experience: 8-12 Years (Senior Level)

Job Summary

We are seeking an experienced Google Cloud DevOps Engineer / SRE to design, build, and operate highly reliable, scalable, and secure cloud infrastructure on Google Cloud Platform (GCP). The ideal candidate will bring deep Linux expertise, strong cloud networking and security knowledge, and hands-on experience with automation, CI/CD, and Kubernetes-based deployments. This role plays a critical part in ensuring system reliability, performance, and operational excellence across large-scale distributed systems.

Key Responsibilities

Cloud Infrastructure & Platform Engineering

Design, deploy, and manage cloud infrastructure using Google Cloud Platform services, including Compute Engine, GKE, VPC, IAM, Cloud Storage, and Cloud SQL.
Architect and support highly available, scalable, and fault-tolerant systems on GCP.
Implement and manage Shared VPCs, VPC peering, firewall rules, load balancers, DNS, and VPN tunnels.

DevOps & Automation

Build and maintain CI/CD pipelines using Jenkins (Declarative & Scripted) and GitHub Actions.
Automate infrastructure provisioning and configuration using Terraform, including module development, remote state management, dependency handling, and DRY principles.
Implement modern deployment strategies such as Canary releases and Blue/Green deployments.
Manage container artifacts using Docker and Helm.

Site Reliability & Operations

Ensure high availability, performance, and reliability of production systems.
Troubleshoot complex system issues, including CPU, memory, and disk I/O bottlenecks, kernel issues, and system boot failures.
Analyze logs and metrics to proactively identify and resolve performance and stability issues.
Support incident response, root cause analysis, and post-incident reviews.

Linux Systems Engineering (Must Have)

Demonstrate deep hands-on expertise with Linux systems (RHEL, Ubuntu, CentOS).
Perform kernel tuning, system optimization, storage management (LVM), and systemd administration.
Maintain OS-level security patching and performance best practices.

Security & Identity Management

Implement and troubleshoot Cloud IAM, service accounts, and Workload Identity Federation.
Enforce least-privilege access and security best practices across environments.
Partner with security teams to maintain compliance and secure cloud operations.

Collaboration & Process

Work closely with application teams, architects, and security stakeholders.
Participate in on-call rotations and incident management processes.
Contribute to operational documentation, runbooks, and best practices.

Required Skills & Qualifications

Must-Have Skills

Strong hands-on experience with Google Cloud Platform (GCP).
Deep expertise in Linux systems engineering (RHEL, Ubuntu, CentOS).
Proficiency in at least one programming language: Python, Go (Golang), or Java.
Strong troubleshooting and debugging skills across infrastructure and application layers.
Hands-on experience with Terraform for infrastructure as code.
Experience with CI/CD pipelines using Jenkins and/or GitHub Actions.
Kubernetes experience with GKE, Docker, and Helm.

Preferred Qualifications

GCP Certifications:
- Google Professional Cloud DevOps Engineer
- Google Professional Cloud Architect
CKA (Certified Kubernetes Administrator).
Experience supporting large-scale distributed systems and microservices architectures.
Familiarity with ITIL processes, Change Advisory Board (CAB) workflows, and incident management.

Soft Skills

Strong analytical and problem-solving abilities.
Excellent communication skills with the ability to collaborate across teams.
Ownership mindset with a focus on reliability and continuous improvement.
Ability to work in fast-paced, production-critical environments.

Key Skills

Kubernetes
FMEA
Continuous Improvement
Elasticsearch
Go
Root Cause Analysis
Maximo
CMMS
Maintenance
Mechanical Engineering
Manufacturing
Troubleshooting

About Company

Purple Drive
