Senior Infrastructure Engineer

Kraków - Poland

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

Our client, a new, profitable Silicon Valley-based B2C product startup building innovative mobile solutions for the planet, is now looking for an experienced Infrastructure Engineer (SRE Focus) to help build reliable, scalable, and observable systems.

Location: Poland
Type: Remote Full-time
Start date: ASAP
About the project and position:

Based in Silicon Valley and backed by top-tier VCs, the company is a new mobile innovator delivering exciting new products for consumers across the planet.
Its flagship VPN application, with over 1B downloads, ensures online privacy and anonymity for users by creating a private network from a public internet connection.

We are looking for a Senior Infrastructure Engineer (SRE Focus) who combines strong hands-on experience with bare-metal systems, Linux, and automation, and who brings an SRE mindset to drive reliability, observability, and continuous improvement across the global infrastructure.
Roughly 80% of your work will focus on bare-metal systems, networking, and automation, while the remaining 20% will involve cloud platforms (AWS, DigitalOcean) and modern observability systems.

Responsibilities:

Deploy and manage bare-metal infrastructure: provisioning, OS tuning, lifecycle management, and hardware and network troubleshooting.
Automate infrastructure operations using tools such as Ansible and Python/Go.
Build and enhance observability across systems and services (metrics, logging, alerting, tracing) using Prometheus, Grafana, ClickHouse, etc.
Ensure system reliability and performance by defining SLIs/SLOs, developing dashboards and alerts, and leading incident response and postmortems to strengthen the reliability culture.
Define and maintain SRE policies, procedures, and best practices, ensuring consistent standards for reliability, automation, and monitoring across teams.
Improve operational workflows by contributing to Jira backlog management, ticket tracking, and sprint planning, and by helping teams translate SRE priorities into actionable tasks.
Contribute to CI/CD and GitOps workflows, ensuring reliable deployments and secure automation pipelines.
Mentor team members and help establish technical standards for reliability, automation, and monitoring.

Requirements:

5 years in infrastructure/systems engineering with proven experience managing bare-metal systems, and 3 years in a cloud infrastructure/SRE role.
Good understanding of Linux systems, including performance tuning, networking, and kernel-level debugging.
Strong knowledge of observability platforms (Grafana, Prometheus, ClickHouse, etc.) and SRE metrics (SLIs/SLOs).
Practical experience with infrastructure automation (Ansible, Terraform, Vagrant, TestInfra, or similar).
Experience defining or improving incident response, on-call, and postmortem processes.
Hands-on experience with cloud platforms (AWS, Google Cloud, DigitalOcean).
Knowledge of infrastructure security (GPG, OpenSSL, security keys, HSMs).
Experience programming with Python, Go, Rust, or similar languages, including writing test cases and performing integration testing.
English: advanced, spoken and written.
