T3 Operations & Support Specialist — Compute & OS (PID9066)

Interval


Job Location:

Berlin - Germany

Monthly Salary: Not Disclosed
Posted on: 5 days ago
Vacancies: 1 Vacancy

Job Summary

This is a remote position.

T3 Operations & Support Specialist Compute & OS (PID9066)

  • Contract / Freelance
  • Full-time
  • Remote with travel readiness required (Germany)
  • Start: ASAP

About the role

We are working with a long-standing anchor client to source a T3 Operations & Support Specialist (Compute & OS) for a large-scale cloud-native platform programme supporting a major energy transmission operator in Germany. The platform is a service-oriented hybrid cloud environment providing application teams with self-service capabilities to develop, run and operate software products across private and public cloud infrastructure.

In this role, you will provide Tier-3 operational ownership for Compute & Operating System services within Local Production (DE), handling complex incidents, deep troubleshooting and root cause analysis, and driving permanent fixes and preventive measures.

What you'll be doing

  • Providing T3 operational ownership for Compute & OS services: handling complex incidents, troubleshooting and RCA, and driving permanent fixes and preventive measures
  • Ensuring compute/OS readiness for releases and changes: monitoring/alerting coverage, performance baselines, hardening, patch strategy, rollback and recovery procedures, and runbooks
  • Executing and improving standard operational procedures through automation to reduce toil and improve MTTR and stability
  • Coordinating with Kubernetes, Data, Network and Storage SMEs to resolve cross-domain production issues
  • Validating deployment artefacts from an operations perspective and enforcing quality assurance measures
  • Monitoring system health, performance metrics and service availability across multi-tenant environments
  • Identifying, analysing and resolving incidents to minimise service disruption, and triggering RCA and corrective actions
  • Implementing monitoring and logging strategies to support audit and compliance requirements
  • Performing routine security scans and remediating identified vulnerabilities


Requirements

What you'll need

  • 5 to 10 years in IT operations, service delivery or platform operations, with demonstrated leadership in mission-critical environments
  • Proven experience implementing and leading Incident, Problem, Change and Release governance in production
  • Hands-on experience with VMware 8 virtualisation
  • Operating Systems: Red Hat Enterprise Linux and Ubuntu
  • OS tooling: Satellite, IPA, Certificate Server
  • ITSM/collaboration tooling: Jira Service Management, Jira, Confluence
  • Fundamental understanding of core operations processes (Incident, Change and Problem management, ITSM) and SRE concepts
  • Experience gathering operational insights from monitoring/observability, including SLI/SLA/SLO management and tracking
  • Hands-on experience documenting procedures and enforcing clear runbooks and playbooks
  • Hands-on experience with monitoring and logging tools (e.g. Prometheus, Grafana, Datadog, Mimir, Loki)
  • Understanding of modern platform operations (Kubernetes/containers, automation, observability) sufficient to govern specialists
  • Fluent English and German (C1 minimum in both)

Desirable

  • Experience operating in regulated or high-availability industries (banking, telco, public sector, healthcare)
  • Experience with SRE practices (SLOs/SLIs, error budgets) and reliability management
  • Familiarity with enterprise DevOps toolchains (GitLab, JFrog Artifactory, Backstage, Harness)
  • GitOps and IaC awareness (Terraform/OpenTofu, ArgoCD, Helm)


Benefits

As a freelancer / contractor with us, you will enjoy flexible working hours and the freedom to choose your own projects. Our platform gives you access to exciting projects in various industries and supports you in advancing your career. You'll benefit from competitive pay and a dedicated team to help you with any questions you may have. Work independently and utilise our strong network to achieve your professional goals.
