Observability Platform Engineer

Job Location

Dallas, USA

Monthly Salary

Not Disclosed

Vacancy

1 Vacancy

Job Description

Do you want to tackle the biggest questions in finance with near-infinite compute power at your fingertips?

G-Research is a leading quantitative research and technology firm with offices in London and Dallas.

We are proud to employ some of the best people in their field and to nurture their talent in a dynamic, flexible and highly stimulating culture where world-beating ideas are cultivated and rewarded.

This is a hybrid role based in our new Dallas infrastructure hub, where we work on the latest technologies in a cutting-edge environment.

The role

As an Engineer on the Observability Platform team, you'll manage the critical entry and exit points to our telemetry services, ensuring engineers across the business can reliably produce and consume telemetry data for their services.

You'll work closely with the Observability Engineering team to design and implement robust, scalable data pipelines that ingest, route and visualise telemetry in predictable and composable ways. Your work will empower engineers to gain actionable insight into their systems, enabling informed decision-making and operational efficiency.

Operating under the broader Platform Engineering department, our team also holds responsibility for enhancing the reliability of our entire High-Performance Computing (HPC) stack, from networking and storage through to compute and application platforms.

We're looking for an engineer with deep expertise in observability stacks and a keen understanding of the unique challenges associated with managing telemetry at cloud-scale volumes. You're passionate about building systems that give customers clear, consistent access to telemetry data, helping them run their services as effectively as possible.

Experience running large-scale observability platforms for a diverse customer base is essential. Familiarity with core Site Reliability Engineering (SRE) principles is highly beneficial.

Key responsibilities of the role include:

  • Being a key contributor to the development of our observability and reliability platforms

  • Contributing to the roadmap for observability tooling, ensuring alignment with business goals and scalability requirements

  • Working with telemetry data at enormous scale, ingesting data from industry-leading GPU clusters

  • Working with AWS services, ensuring seamless integration with the observability platform

  • Collaborating with cross-functional engineering teams to establish observability as a core function of the development lifecycle

  • Working closely with application teams to ensure observability systems are fully integrated and provide the necessary insights

  • Enabling SRE frameworks, promoting SLAs, SLOs and SLIs, and working closely with platform teams to ensure reliability is constantly improving

  • Helping to foster a culture of continuous learning and improvement, encouraging adoption of new observability tools and techniques

Who are we looking for?

The ideal candidate will have the following skills and experience:

  • Proven experience on observability or SRE teams in a cloud-native or hybrid-cloud environment, running platforms in production and at scale

  • Well versed in reliability engineering concepts, including different types of testing, progressive deployments, error budgets, the role observability plays, and fault-tolerant design

  • Hands-on experience with modern observability tools and frameworks such as Prometheus, OTEL (OpenTelemetry) and Grafana, and enterprise SaaS observability platforms such as Datadog and Dynatrace

  • Expertise in designing, building and scaling observability solutions for distributed systems

  • Customer focused, with an enthusiasm for providing infrastructure as a service and defaulting to a product lens when evaluating platform-scale problems

  • Excellent communication skills and the ability to collaborate with cross-functional teams

  • Experience with cloud platforms such as AWS, Azure or Google Cloud

  • Familiarity with microservices architecture and containerised environments such as Kubernetes and Docker

  • Knowledge of infrastructure as code (IaC) and automation tools such as Terraform and Ansible

Why should you apply?

  • Market-leading compensation plus annual discretionary bonus

  • Lunch provided in the office (via GrubHub)

  • Informal dress code and excellent work/life balance

  • Excellent paid time off allowance of 25 days

  • Sick days, military leave, and family and medical leave

  • Generous 401(k) plan

  • 16 weeks of fully paid parental leave

  • Medical and Prescription, Dental, and Vision insurance

  • Life and Accidental Death & Dismemberment (AD&D) insurance

  • Employee Assistance and Wellness programs

  • Generous relocation allowance and support

  • Great selection of office snacks and hot and cold drinks

  • Free on-site gym and car parking

This role is employed through our US affiliate.

G-Research is committed to cultivating and preserving an inclusive work environment. We are an ideas-driven business and we place great value on diversity of experience and opinions.

We want to ensure that applicants receive a recruitment experience that enables them to perform at their best. If you have a disability or special need that requires accommodation, please let us know in the relevant section.

Employment Type

Full Time
