Senior Site Reliability Engineer (SRE) / DevOps Engineer

TekWissen LLC

Job Location:

Aliso Viejo, CA - USA

Monthly Salary: Not Disclosed
Posted on: 6 hours ago
Vacancies: 1 Vacancy

Job Summary

Overview:
TekWissen is a global workforce management provider headquartered in Ann Arbor, Michigan, that offers strategic talent solutions to clients worldwide. Our client is a provider of digital technology and transformation, information technology, and services.
Position: Senior Site Reliability Engineer (SRE) / DevOps Engineer
Location: Aliso Viejo, CA
Duration: 11 Months
Job Type: Temporary Assignment
Work Type: Onsite
Job Description
  • We are seeking a highly experienced SRE / DevOps Engineer to support and scale a Kubernetes-based API Gateway platform built on a Java technology stack.
  • The role focuses on reliability, observability, automation, and performance, while also contributing to POCs around next-generation AI Gateway capabilities.
Key Responsibilities

Platform Reliability & Operations
  • Own the reliability, availability, scalability, and performance of API Gateway services running on Kubernetes
  • Design and implement SRE best practices, including SLIs, SLOs, SLAs, error budgets, and incident management
  • Lead production readiness reviews, root cause analysis (RCA), and post-incident improvements
  • Drive capacity planning, performance tuning, and resilience testing
Kubernetes & Cloud Engineering
  • Manage and optimize Kubernetes clusters (EKS / AKS / GKE / On-prem)
  • Develop and maintain Helm charts, manifests, and deployment strategies
  • Implement rollout strategies such as blue-green, canary, and rolling deployments
  • Collaborate with development teams to ensure cloud-native design patterns
Observability & Monitoring (Strong Focus)
  • Build and maintain enterprise-grade observability (O11y) solutions:
      • Prometheus & Grafana for metrics and dashboards
      • Splunk for centralized logging and alerting
      • OpenTelemetry for distributed tracing
  • Define actionable alerts and dashboards for platform and application health
  • Improve MTTR through better visibility and automation
CI/CD & Automation
  • Design and maintain CI/CD pipelines (Jenkins, GitHub Actions, GitLab CI, etc.)
  • Automate infrastructure using Infrastructure as Code (Terraform, CloudFormation, etc.)
  • Develop automation scripts using Python, Bash, or Groovy
Security & Compliance
  • Implement DevSecOps practices, including secrets management, image scanning, and RBAC
  • Work closely with security teams on vulnerability remediation and compliance controls
Innovation & POCs
  • Actively contribute to POCs for AI Gateway / Intelligent API Gateway initiatives
  • Evaluate and prototype integrations with AI/ML-driven routing, observability, and security features
  • Stay current with emerging SRE, cloud, and AI gateway technologies
Required Skills & Qualifications

Must Have
  • 7-8 years of experience in SRE / DevOps / Platform Engineering
  • Strong hands-on experience with Kubernetes in production environments
  • Solid understanding of Java-based applications and JVM performance considerations
  • Deep expertise in Splunk, Prometheus, Grafana, and observability practices
  • Experience operating API Gateway platforms (Kong, Apigee, NGINX, Istio, etc.)
  • Strong Linux fundamentals and networking knowledge (TCP/IP, DNS, HTTP, TLS)
  • Experience with cloud platforms (AWS / Azure / GCP)
Nice to Have
  • Experience with OpenTelemetry and distributed tracing
  • Exposure to AI Gateway / Intelligent Traffic Management concepts
  • Experience with service mesh (Istio / Linkerd)
  • Certification in Kubernetes (CKA / CKAD) or Cloud platforms
Soft Skills
  • Strong troubleshooting and problem-solving skills
  • Ability to work cross-functionally with developers, architects, and security teams
  • Proactive mindset with a passion for automation and reliability
  • Good documentation and communication skills
TekWissen Group is an equal opportunity employer supporting workforce diversity.

Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting