Senior Observability Engineer

Sofia - Bulgaria

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Department: Engineering

Job Summary

Our mission is to enable everyone to build wealth.

We reinvent how trading and investing work by creating exceptional products people love.

Fostering a culture of excellence and high velocity is the key to our success.

Today we serve over 4.5 million clients with more than 30 billion in assets under management - a testament to the scale and trust we've built in just a few years.

Own and evolve Trading 212's observability and performance ecosystem across cloud and on-prem Kubernetes environments.

What you'll do

Design, automate, and optimize observability infrastructure (Prometheus, CloudWatch, Elasticsearch, Kafka, etc.) using IaC and GitOps.
Build Grafana dashboards and implement a smart alerting strategy to surface actionable insights.
Monitor and analyze system performance, identify bottlenecks, and drive improvements in reliability and cost-efficiency.
Collaborate with product, QA, and engineering teams to embed observability best practices.
Maintain clear documentation and mentor engineers, fostering a culture of data-driven performance.
Plan and test Multi-AZ/Region DR and resilience scenarios.

What you need to have

5 years of experience in DevOps, SRE, or Systems Engineering, focusing on observability for large-scale distributed systems.
Proven experience deploying and maintaining observability tools:
- Metrics & Monitoring: Strong proficiency with Prometheus and Grafana; experience with AWS CloudWatch.
- Log Management: Deep knowledge of the ELK stack (Elasticsearch, Logstash, Kibana, Fluent Bit).
Cloud & Containers: Hands-on experience with AWS, Docker, and Kubernetes.
Automation & IaC: Skilled in Python, Go, or Bash for scripting and proficient with Terraform (Ansible/Puppet a plus).
Systems Knowledge: Strong grasp of distributed systems, networking, and Linux/Unix internals.
Problem-Solving: Analytical, detail-oriented, and methodical in root cause analysis and troubleshooting.

Nice to have

Experience managing and scaling high-throughput Kafka clusters.
Experience with CI/CD pipelines (e.g., GitHub Actions) for managing infrastructure deployments.
Familiarity with distributed tracing systems (Jaeger, OpenTelemetry).
A background in Site Reliability Engineering (SRE), with an understanding of SLOs, SLIs, and error budgets.

We offer

Challenges that will help you grow and realize your potential really fast
Opportunity to make a big impact - you will build innovative services used by millions of investors to build wealth
Work with smart, spirited, helpful, high-performing colleagues with a common goal
An environment where nothing is set in stone
Appreciation for your talent and ideas
Generous remuneration package including annual bonuses
Excellent social benefits package including private health insurance and sports card
25 days of paid vacation per year
Delicious treats and a spacious game room

Are you ready to accelerate your career with us? We'd love to hear from you!

We thank all applicants, but only candidates selected for an interview will be contacted.

All personal data of applicants is protected by the law and will be treated with strict confidentiality.

Required Experience: Senior IC

Key Skills

APIs
C/C++
Computer Graphics
Go
React
Redux
Node.js
AWS
Library Services
Assembly
GraphQL
High Voltage

About Company

Trading212

Invest in stocks and ETFs with zero commission. Trusted and secure. Practise with virtual £50,000.
