The ML Observability team is committed to empowering our customers with an advanced observability platform specifically designed for applications that increasingly integrate machine learning components, such as large language models and generative AI. We provide comprehensive monitoring and diagnostics for ML-based components, tracking model performance, drift, fairness, and system stability. Our platform also offers model prediction explainability and root-cause analysis, enhancing organizations' confidence in the reliability of their deployments.
Team Tenets
- Customer-centric innovation: Always prioritize the needs and feedback of our customers. Our solutions must simplify their challenges and enhance their machine learning journey.
- User experience matters: A powerful tool is only as good as its usability. Always strive for intuitive, user-friendly designs that enhance customer engagement.
- Collaborative excellence: Every voice matters. Leverage the diverse expertise within our team - software engineering, UX design, product management, applied science - to create well-rounded and effective solutions.
- End-to-end ownership: Take responsibility for our creations from ideation to deployment and maintenance. Pride in our work translates to higher-quality products.
- Efficient execution: Balance innovation with practicality. Prioritize tasks and features that align closely with our mission and offer the most value to our customers.
- Agility in action: While planning is essential, the ability to adapt and iterate based on real-world feedback is paramount. Embrace change as an opportunity for improvement.
- Transparent communication: Promote open dialogue about challenges, successes, and decisions. Honesty fosters trust and accelerates problem-solving.
- Scientific rigor: Ensure that our solutions are backed by solid research and empirical evidence. Prioritize robustness and accuracy in our observability tools.
- Continuous learning: Stay updated with the latest advancements in observability for machine learning. Embrace opportunities for professional growth and encourage knowledge-sharing.
- Ethical responsibility: Commit to ensuring that our tools promote fairness, transparency, and accountability in AI. Drive ethical considerations in all that we build.
As the Engineering Manager, you will lead a team focused on enhancing and expanding Datadog's ML Observability product. Positioned at the forefront of R&D, you will emphasize rigor and experimentation to design, refine, and implement advanced techniques for evaluating and monitoring AI components - LLMs in particular - in our customers' applications. Your leadership and expertise in engineering will be pivotal in shaping the direction of our product, ensuring Datadog remains a key player in this rapidly evolving field.
At Datadog, we place value in our office culture - the relationships and collaboration it builds and the creativity it brings to the table. We operate as a hybrid workplace to ensure our Datadogs can create a work-life harmony that best fits them.
What You'll Do:
- Manage and mentor a team of 3-5 engineers, fostering a collaborative and innovative work environment
- Leverage your technical expertise in software engineering to guide the team in building robust and scalable solutions
- Apply your experience with LLMs to enhance the product's capabilities in evaluating and monitoring LLM-based applications
- Explore and implement new techniques and tools to provide deeper insights into model behavior, drift, fairness, and interpretability
- Engage with senior management and executives, articulating complex technical concepts clearly and precisely
- Stay current with industry trends and advancements in machine learning and observability, driving innovation within the team
Who You Are:
- Proven experience in software engineering, with a focus on engineering LLM-based systems
- Demonstrated experience managing small teams of software engineers and/or applied scientists, with a track record of delivering high-quality products
- Strong software development skills and proficiency in Python and Go
- Excellent communication abilities to convey complex technical concepts clearly
- A collaborative mindset and proven experience working in cross-functional teams
- A proactive approach, with a passion for continuous learning and innovation
- Experience in applied science and a strong understanding of machine learning theory, statistics, and fundamentals are a bonus
Datadog values people from all walks of life. We understand not everyone will meet all the above qualifications on day one. That's okay. If you're passionate about technology and want to grow your skills, we encourage you to apply.
Benefits and Growth:
- New hire stock equity (RSUs) and employee stock purchase plan (ESPP)
- Continuous professional development, product training, and career pathing
- Intradepartmental mentor and buddy program for in-house networking
- An inclusive company culture, with the ability to join our Community Guilds (Datadog employee resource groups)
- Access to Inclusion Talks, our internal panel discussions
- Free, global Spring Health benefits for employees and dependents age 6+
- Competitive global benefits
Benefits and Growth listed above may vary based on the country of your employment and the nature of your employment with Datadog.
Required Experience:
Manager