Sr. Manager, Observability Platform Engineering

Databricks

Job Location:

Mountain View, CA - USA

Monthly Salary: Not Disclosed
Posted on: Yesterday
Vacancies: 1 Vacancy

Job Summary

RDQ427R138

At Databricks, we are passionate about enabling data teams to solve the world's toughest problems, from making the next mode of transportation a reality to accelerating the development of medical breakthroughs. We do this by building and running the world's best data and AI infrastructure platform so our customers can use deep data insights to improve their business. Founded by engineers and customer obsessed, we leap at every opportunity to tackle technical challenges, from designing next-gen UI/UX for interfacing with data to scaling our services and infrastructure across millions of virtual machines. And we're only getting started.

As the Manager of the Observability Platform team, you will lead the engineers responsible for building and scaling the next generation of Databricks' global observability systems. Your team enables every Databricks engineer, and our customers, to monitor, diagnose, and improve the reliability of our platform at massive scale. You will guide the strategy, architecture, and execution of systems that handle billions of active time series and process petabytes of logs daily, ensuring world-class visibility into the health and performance of our products.

In this role, you will:

  • Lead the design and development of the next-generation observability platforms that support billions of active time series and process petabytes of logs every day.
  • Oversee infrastructure deployed across nearly a hundred cloud regions, empowering internal engineers and customers to effectively monitor the reliability and performance of Databricks.
  • Drive the creation of advanced troubleshooting workflows that accelerate incident diagnosis, enabling engineers to rapidly derive insights from logs, metrics, and other telemetry.
  • Leverage Databricks' own data intelligence platform to push the boundaries of observability, setting new standards for industry-leading incident analysis and reliability practices.
  • Establish and uplevel monitoring and reliability best practices across Databricks engineering by developing opinionated tools and standards for structured logs, metrics, alerts, dashboards, and on-call operations.
  • Mentor, grow, and inspire engineers, fostering a culture of technical excellence and strengthening the broader observability community within Databricks.

What We Look For:

  • 5 years of experience in the performance analysis discipline. Ability to identify performance issues, root-cause problems, and come up with potential solutions.
  • Ability to build strong working relationships with developers and field engineers to facilitate triaging and mitigation of performance problems.
  • At least 3 years of experience managing top-tier engineering teams.
  • BS in Computer Science (Master's or higher level of education preferred).
  • Expertise in attracting, hiring, and coaching engineers who will meet the Databricks hiring standards. Experience up-leveling teams by hiring top-notch talent and growing existing team members.


Required Experience:

Manager

Key Skills

  • Hospitality Experience
  • Go
  • Management Experience
  • React
  • Redux
  • Node.js
  • AWS
  • Mechanical Engineering
  • Team Management
  • Leadership Experience
  • Mentoring
  • Distributed Systems

About Company

The Databricks Platform is the world’s first data intelligence platform powered by generative AI. Infuse AI into every facet of your business.
