Site Reliability Engineer

Madrid - Spain

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

We are looking for an experienced Site Reliability Engineer to ensure the stability, scalability, and operational excellence of a Kubernetes-based platform running in a hybrid environment.

The project is entering a pivotal phase, with a major go-live planned for mid-February and a target audience of 75,000 users. User onboarding is already underway, with over 5,000 users connected and expected to be active by year-end. While the system is stable, we anticipate increased activity and new challenges in January, February, and after the go-live, making this an exciting opportunity to make a real impact.

The role focuses on performance optimization, scaling strategies, observability, and reliability engineering.

Required Skills:

4 years of experience as an SRE / DevOps Engineer
Strong hands-on experience with Kubernetes in production
Experience working with hybrid infrastructure (on-prem and cloud)
Solid knowledge of PostgreSQL performance tuning and scaling
Experience with Qdrant or other vector databases
Experience with Helm, Kubernetes autoscaling, and resource optimization
Familiarity with observability stacks (Prometheus, Grafana, ELK/Loki)
Understanding of performance engineering and load testing
Experience with Linux systems and networking
Strong troubleshooting and incident-management skills

Nice to Have:

Experience with STACKIT or other sovereign clouds
Experience with PgBouncer
Knowledge of SRE practices (SLO/SLI)
Experience in regulated or public-sector environments
German language skills

Responsibilities:

Operate and optimize hybrid infrastructure (on-prem & STACKIT)
Manage and scale Kubernetes clusters
Optimize Helm charts, resource usage, and autoscaling
Conduct performance, load, and stress testing
Ensure reliability, availability, and monitoring of production systems
Tune and operate PostgreSQL
Operate and optimize vector databases (e.g. Qdrant)
Implement monitoring, logging, and alerting
Support incident response and capacity planning

We offer*:

Flexible working format - remote, office-based, or flexible
A competitive salary and good compensation package
Personalized career growth
Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
Active tech communities with regular knowledge sharing
Education reimbursement
Memorable anniversary presents
Corporate events and team buildings
Other location-specific benefits

*not applicable for freelancers
