Anticipated Contract End Date/Length: August 28, 2026
Work Set Up: Hybrid (must be eligible for BPSS)
Our client in the Information Technology and Services industry is looking for a Site Reliability Engineer (SRE) to support and enhance a complex multi-cloud Kubernetes platform environment. This role is focused on driving platform reliability, automation, observability, and security across AWS, Azure, and on-premise infrastructure.
The successful candidate will play a key role in improving uptime, reducing operational toil through GitOps and automation, strengthening platform security posture, and enabling scalable onboarding of new tenants and workloads. This is a hands-on engineering role operating within regulated environments and modern cloud-native architectures.
What you will do:
- Operate and enhance Kubernetes platforms across AWS, Azure, and on-premise environments.
- Lead incident response, problem management, and root cause analysis activities.
- Deliver cluster lifecycle management, including upgrades, patching, node pool management, CNI and CSI configuration, ingress management, and Rancher operations.
- Own observability strategy, including dashboards, alerting, monitoring, and definition of SLOs and SLIs.
- Implement GitOps practices using Fleet and reduce operational toil through automation and governance.
- Apply secure API gateway and Web Application Firewall (WAF) patterns.
- Design and support distributed systems including event brokers and asynchronous messaging architectures.
- Maintain platform security posture, including CVE remediation, GRC controls, and security scanning pipelines.
- Provision and manage infrastructure using Terraform and Crossplane as orchestration layers.
- Implement and maintain CI/CD pipelines using Concourse, GitHub Actions, and Azure DevOps.
- Ensure compliance with PCI DSS and GDPR security patterns.
Qualifications:
- Deep expertise in Kubernetes, Rancher, GitOps, Linux, and cloud networking.
- Strong experience operating in hybrid cloud environments across AWS, Azure, and on-premise platforms.
- Strong automation and scripting skills in Python, Go, Bash, and PowerShell.
- Proven experience with Infrastructure as Code using Terraform and Crossplane.
- Experience implementing and managing observability tooling, including Grafana, Prometheus, Jaeger or Tempo, CloudWatch, Loki, and OpenTelemetry.
- Strong understanding of API gateway and Web Application Firewall patterns.
- Experience working with distributed systems and event-driven architectures.
- Experience operating within regulated environments including PCI DSS and GDPR.
- Knowledge of service mesh technologies such as Istio or Kuma is desirable.
- AWS operational experience is advantageous.
- Experience within payments or other regulated industries is beneficial.
Additional Information:
All your information will be kept confidential according to EEO guidelines.
Candidates must be legally authorized to live and work in the country where the position is based without requiring employer sponsorship.
HelloKindred is committed to fair, transparent, and inclusive hiring practices. We assess candidates based on skills, experience, and role-related requirements.
We appreciate your interest in this opportunity. While we review every application carefully, only candidates selected for an interview will be contacted.
HelloKindred is an equal opportunity employer. We welcome applicants of all backgrounds and do not discriminate on the basis of race, colour, religion, sex, gender identity or expression, sexual orientation, age, national origin, disability, veteran status, or any other protected characteristic under applicable law.
Remote Work: No
Employment Type: Contract