Voice AI Engineering Principal
About the Role
Zendesk is seeking an innovative and visionary Voice AI Engineering Director to lead and accelerate our voice and conversational AI. In this pivotal role, you will spearhead the development and deployment of cutting-edge AI/ML technologies focused on Speech and Natural Language Processing (NLP), shaping the future of voice-enabled customer experiences at scale.
You will oversee researchers innovating across Automatic Speech Recognition (ASR), Text-to-Speech (TTS), Large Language Models (LLMs), and voice conversational systems, driving impactful solutions that power Zendesk's intelligent voice products.
What You'll Do
Lead the research, design, and engineering of next-generation Voice AI solutions, including noise-robust multilingual ASR, neural TTS, and advanced QA dialog systems fine-tuned with state-of-the-art pretrained models (e.g., BERT, GPT).
Build, mentor, and scale a high-performing AI/ML engineering team specialized in speech processing, NLP, and deep learning, while fostering an innovative, research-driven culture.
Drive collaboration across research scientists, software engineers, and product teams to transform advanced AI models into robust, scalable production systems.
Oversee large-scale AI research and development projects, ensuring delivery of high-quality, real-world solutions optimized for diverse tasks and computing environments.
Architect and implement AI models leveraging deep learning algorithms such as DNNs, CNNs, RNNs, and Transformer-based architectures across speech and NLP pipelines.
Champion best practices in software development, including CI/CD, code reviews, version control (Git), and refactoring, to support efficient and maintainable codebases.
Stay ahead of the curve by continuously researching and applying the latest breakthroughs in AI/ML to enhance Zendesk's voice capabilities.
Collaborate with stakeholders to define the technical vision, roadmap, and strategy for voice AI products that deliver superior user experiences and business impact.
Who You Are
Passionate about the frontiers of AI/ML and driven to apply breakthrough technologies to real-world voice and language problems.
Proven expertise developing and applying speech and NLP models, with extensive hands-on experience using DL frameworks such as PyTorch, TensorFlow, Keras, and Hugging Face Transformers.
Deep knowledge of AI architectures, including DNNs, CNNs, RNNs, and Transformers, and experience fine-tuning large pre-trained models (e.g., BERT, GPT).
Skilled in programming languages and tools including Python, C, Java, R, and Linux/Shell scripting, with strong engineering discipline across the software development lifecycle.
Demonstrated leadership in building and guiding AI/ML teams through complex research and engineering challenges.
Experience deploying voice AI systems in production, including ASR, diarization, TTS, NMT, and dialog systems, with a focus on noise robustness and multilingual capabilities.
Track record of managing large-scale research projects with real-world impact, combining fundamental research with prototyping and product delivery.
Background in developing AI-driven speech technologies for complex domains such as autonomous pilot systems or court reporting is a highly valued asset.
Hold an M.S. in Engineering, Computer Science, or a related field, with a strong foundation in machine learning, speech processing, and on-device AI for real-time and low-power applications.
Alternative for a more hands-on role matching the team size now and in the future:
The Agentic Tribe is revolutionizing the voice assistance landscape with Gen3, a cutting-edge AI Agent system that is pushing the boundaries of conversational AI. Gen3 is a goal-oriented, dynamic, and truly conversational system capable of complex reasoning, planning, and adapting to user needs in real-time spoken dialogue.
As a Staff AI Agent Engineer & Team Lead specializing in Voice AI, you will be the definitive technical authority and hands-on leader for the Voice AI Agent platform. You will be responsible for defining the architecture, setting the technical direction for the team, leading major cross-functional initiatives, and mentoring senior engineers. This role requires an individual who can balance deep technical work with strategic leadership, ensuring our Voice AI system is not only robust and low-latency but also scalable, safe, and aligned with the company's long-term product vision.
Technical Leadership & Architecture
Architectural Ownership: Define the technical vision and architect the next generation of our voice-first AI Agent platform, ensuring it meets extreme requirements for low latency, high availability, and scalability for millions of concurrent voice interactions.
Technical Roadmap: Own and drive complex, multi-quarter technical initiatives from concept to production, solving ambiguous or highly complex challenges that impact multiple engineering teams across the organization.
Core Systems Design: Lead the design and development of critical real-time voice components, including the strategic selection and integration of best-in-class real-time Speech-to-Text (STT), Text-to-Speech (TTS), and Voice Activity Detection (VAD) services.
Define Standards: Establish and enforce engineering best practices, design patterns, and coding standards for Python-based voice agent development, focusing on robust state management, dynamic tool use, and sophisticated reasoning models (e.g., Tree-of-Thought, CoT).
Team Lead & Mentorship
Team Leadership: Provide technical leadership and guidance to a dedicated project team, including task delegation, daily technical direction, and ensuring high-quality, on-time project delivery.
Mentorship: Actively mentor Senior and mid-level engineers, fostering a culture of technical excellence, deep ownership, and continuous learning within the Voice AI team and the broader engineering organization.
Cross-Functional Strategy: Serve as the primary technical partner for Product Leadership, ML Science, and Infrastructure teams, aligning technical implementation plans with product strategy and influencing the long-term Voice AI roadmap.
Evaluation & Reliability
Evaluation Platform: Design, establish, and continuously improve the organizational platforms and methodologies for evaluating voice agent performance and behavior, setting key success metrics (e.g., WER, conversational naturalness, latency budget adherence) and driving iterative improvements across the Agentic Tribe.
Safety & Defense: Architect and implement advanced safety and reliability mechanisms, including robust prompt injection defenses, comprehensive LLM guardrails, sophisticated fallback strategies, and advanced error handling to manage noisy audio input and speech recognition inaccuracies at scale.
10 years of progressive experience in software engineering, with 4 years focused on AI/ML applications and 2 years operating in a Staff, Principal, or equivalent technical leadership capacity.
Expertise in LLM-Oriented System Architecture: Proven ability to architect and lead the development of complex, multi-step, tool-using agents (e.g., LangChain, AutoGen, custom orchestrators).
Mastery in Voice AI/Spoken Dialogue Systems: Extensive hands-on experience building mission-critical, low-latency, streaming voice applications. This includes deep proficiency with:
Integrating and managing real-time STT/TTS models and APIs.
Advanced techniques for Voice Activity Detection (VAD) and noise suppression.
Architecting robust barge-in and interruption logic in real-time voice streams.
Platform & Deployment Expertise: Deep expertise in deploying complex, large-scale AI applications to cloud platforms (AWS, GCP, or Azure) using advanced infrastructure-as-code and CI/CD best practices. Proven experience optimizing LLM token budgets, latency, and cost through sophisticated model routing, caching (e.g., Redis), and quantization techniques.
Advanced ML & System Knowledge: Comprehensive understanding of foundational ML concepts, Retrieval-Augmented Generation (RAG) pipelines, vector databases, and advanced context management to ensure deterministic and accurate agent behavior in complex production environments.
Programming Mastery: Expert-level proficiency in Python and modern web frameworks (e.g., FastAPI, gRPC for streaming services).
M.S. or Ph.D. in Computer Science, NLP, Machine Learning, or a related technical field.
Experience with real-time streaming architectures such as WebRTC or gRPC.
A track record of technical presentation, publication, or open-source contribution in the field of conversational AI or generative agents.
Experience driving organizational adoption of new technologies and influencing company-wide architectural decisions.
Hybrid: In this role, our hybrid experience is designed at the team level to give you a rich onsite experience packed with connection, collaboration, learning, and celebration, while also giving you flexibility to work remotely for part of the week. This role must attend our local office for part of the week. The specific in-office schedule is to be determined by the hiring manager.
The intelligent heart of customer experience
Zendesk software was built to bring a sense of calm to the chaotic world of customer service. Today, we power billions of conversations with brands you know and love.
As part of our commitment to fairness and transparency, we inform all applicants that artificial intelligence (AI) or automated decision systems may be used to screen or evaluate applications for this position, in accordance with Company guidelines and applicable law.
Zendesk is an equal opportunity employer, and we're proud of our ongoing efforts to foster global diversity, equity, & inclusion in the workplace. Individuals seeking employment and employees at Zendesk are considered without regard to race, color, religion, national origin, age, sex, gender, gender identity, gender expression, sexual orientation, marital status, medical condition, ancestry, disability, military or veteran status, or any other characteristic protected by applicable law. We are an AA/EEO/Veterans/Disabled employer. If you are based in the United States and would like more information about your EEO rights under the law, please click here.
Zendesk endeavors to make reasonable accommodations for applicants with disabilities and disabled veterans pursuant to applicable federal and state law. If you are an individual with a disability and require a reasonable accommodation to submit this application, complete any pre-employment testing, or otherwise participate in the employee selection process, please send an e-mail to with your specific accommodation request.
Required Experience:
IC