Job Title: Senior Site Reliability Engineer / DevOps Engineer
Location: Bothell, WA
Duration: Contract
Term: 6 months
Job Description:
Experience Desired: 7 years
Key Responsibilities
Platform Reliability & Operations
- Own the reliability, availability, scalability, and performance of API Gateway services running on Kubernetes
- Design and implement SRE best practices, including SLIs, SLOs, SLAs, error budgets, and incident management
- Lead production readiness reviews, root cause analysis (RCA), and post-incident improvements
- Drive capacity planning, performance tuning, and resilience testing
Kubernetes & Cloud Engineering
- Manage and optimize Kubernetes clusters (EKS / AKS / GKE / On-prem)
- Develop and maintain Helm charts, manifests, and deployment strategies
- Implement rollout strategies such as blue-green, canary, and rolling deployments
- Collaborate with development teams to ensure services follow cloud-native design patterns
Observability & Monitoring (Strong Focus)
- Build and maintain enterprise-grade observability (O11y) solutions:
  - Prometheus & Grafana for metrics and dashboards
  - Splunk for centralized logging and alerting
  - OpenTelemetry for distributed tracing
- Define actionable alerts and dashboards for platform and application health
- Improve MTTR through better visibility and automation
CI/CD & Automation
- Design and maintain CI/CD pipelines (Jenkins, GitHub Actions, GitLab CI, etc.)
- Automate infrastructure using Infrastructure as Code (Terraform, CloudFormation, etc.)
- Develop automation scripts using Python, Bash, or Groovy
Security & Compliance
- Implement DevSecOps practices, including secrets management, image scanning, and RBAC
- Work closely with security teams on vulnerability remediation and compliance controls
Innovation & POCs
- Actively contribute to POCs for AI Gateway / Intelligent API Gateway initiatives
- Evaluate and prototype integrations with AI/ML-driven routing, observability, and security features
- Stay current with emerging SRE, cloud, and AI gateway technologies
Soft Skills
- Strong troubleshooting and problem-solving skills
- Ability to work cross-functionally with developers, architects, and security teams
- Proactive mindset with a passion for automation and reliability
- Good documentation and communication skills
Key Skills:
SRE, DevOps, Java, Kubernetes, Observability