CSQ127R23
At Databricks, an Incident Manager uses their technical experience and resourcefulness to lead urgent customer situations to resolution. Responsible for managing frequent, high-quality updates to all internal and external stakeholders, Incident Managers advocate with engineering and leadership on behalf of their customers to ensure that stakeholders handle escalations with the appropriate level of urgency.
The impact you will have:
- Drive critical customer escalations or widespread outages to conclusion and resolution.
- Escalate to on-call resources in support and engineering, and establish checkpoint calls and action items to ensure that progress is made and status updates are delivered on time.
- Demonstrate cross-functional leadership while establishing ownership of escalations and outages.
- Compile and deliver frequent, high-quality communications to internal and external stakeholders, including executive staff. Candidate should be comfortable creating concise and effective messaging that is tailored to a technical or executive audience with minimal assistance from others.
- Commence and lead war rooms while establishing other temporary communication channels as warranted for the duration of an outage.
- Multitask across several incidents and/or projects at once.
- Be a leader who identifies product and process improvements from every incident and submits necessary feedback for improvements.
- Participate in on-call rotations.
What we look for:
- Minimum 8 years of experience in customer support, support escalation, and incident management is required.
- Excellent contextual interpretation and writing skills, with the ability to summarize and communicate effectively to technical and business audiences, are required.
- Demonstrated strong ability to make timely decisions from both business and technical perspectives.
- Excellent analytical and troubleshooting skills are required. Candidate should be able to demonstrate technical excellence by applying engineering principles to solve complex problems.
- Hands-on experience developing production-scale industry use cases in any two or more of the following: Big Data, Hadoop, Spark, Machine Learning, Artificial Intelligence, Streaming, Kafka, Data Science, ElasticSearch.
- Hands-on experience in the performance tuning/troubleshooting of Spark-based applications at production scale.
- Proven, real-world experience with JVM and memory management techniques such as garbage collection and heap/thread dump analysis is required.
- Working knowledge of Data Lakes, preferably including SCD (slowly changing dimension) type use cases at production scale.
- Working, hands-on experience with any SQL-based databases and Data Warehousing/ETL technologies such as Informatica, DataStage, Oracle, Teradata, SQL Server, and MySQL.
- Linux/Unix administration skills and hands-on experience with AWS, Azure, or GCP are required.
Required Experience:
Manager