Your Career
With Prisma AIRS, Palo Alto Networks is building the world's most comprehensive AI security platform. Organizations are increasingly building complex ecosystems of AI models, applications, and agents, creating dynamic new attack surfaces with risks that traditional security approaches cannot address. Prisma AIRS delivers model security posture management, AI red teaming, and runtime protection. Our customers can confidently deploy AI-driven innovation while maintaining a formidable security posture from development through runtime.
As a Principal Machine Learning Inference Engineer, you will serve as a technical authority and visionary for the Prisma AIRS team. You will be responsible for the architectural design and long-term strategy of ML inference for our AI platform. Beyond individual contribution, you will lead complex technical projects, mentor senior engineers, and set the standard for performance, scalability, and engineering excellence across the organization. Your decisions will have a profound and lasting impact on our ability to deliver cutting-edge AI security solutions at massive scale.
Your Impact
Architect and Design: Lead the architectural design of a highly scalable, low-latency, and resilient ML inference platform capable of serving a diverse range of models for real-time security applications.
Technical Leadership: Provide technical leadership and mentorship to the team, driving best practices in MLOps, software engineering, and system design.
Strategic Optimization: Drive the strategy for model and system performance, guiding research and implementation of advanced optimization techniques such as custom kernels, hardware acceleration, and novel serving frameworks.
Set the Standard: Establish and enforce engineering standards for automated model deployment, robust monitoring, and operational excellence across all production ML systems.
Cross-Functional Vision: Act as a key technical liaison to other principal engineers, architects, and product leaders to shape the future of the Prisma AIRS platform and ensure end-to-end system cohesion.
Solve the Hardest Problems: Tackle the most ambiguous and challenging technical problems in large-scale inference, from mitigating novel security threats to achieving unprecedented performance goals.
Qualifications:
Your Experience
BS/MS or Ph.D. in Computer Science, a related technical field, or equivalent practical experience.
Extensive professional experience in software engineering with a deep focus on MLOps, ML systems, or productionizing machine learning models at scale.
Expert-level programming skills in Python are required; experience in a systems language such as Go, Java, or C is nice to have.
Deep hands-on experience designing and building large-scale distributed systems on a major cloud platform (GCP, AWS, Azure, or OCI).
Proven track record of leading the architecture of complex ML systems and MLOps pipelines using technologies such as Kubernetes and Docker.
Mastery of ML frameworks (TensorFlow, PyTorch) and extensive experience with advanced inference optimization tools (ONNX, TensorRT).
A strong understanding of popular model architectures (e.g., Transformers, CNNs, GNNs) is a significant plus.
Demonstrated expertise with modern LLM inference engines (e.g., vLLM, SGLang, TensorRT-LLM) is required. Open-source contributions in these areas are a significant plus.
Experience with low-level performance optimization, such as custom CUDA kernel development or the Triton language, is a plus.
Experience with data infrastructure technologies (e.g., Kafka, Spark, Flink) is a plus.
Familiarity with CI/CD pipelines and automation tools (e.g., Jenkins, GitLab CI, Tekton) is a plus.
Additional Information:
The Team
Our Prisma AIRS team is a group of highly motivated and innovative engineers and researchers dedicated to solving the most challenging problems in AI security. We thrive in a collaborative environment where we value creativity, ownership, and a commitment to excellence. You will have the opportunity to work with cutting-edge technology and make a significant impact on the future of cybersecurity.
Compensation Disclosure
The compensation offered for this position will depend on qualifications, experience, and work location. For candidates who receive an offer at the posted level, the starting base salary (for non-sales roles) or base salary + commission target (for sales/commissioned roles) is expected to be between $151,000/YR and $246,500/YR. The offered compensation may also include restricted stock units and a bonus. A description of our employee benefits may be found here.
Our Commitment
We're problem solvers that take risks and challenge cybersecurity's status quo. It's simple: we can't accomplish our mission without diverse teams innovating together.
We are committed to providing reasonable accommodations for all qualified individuals with a disability. If you require assistance or accommodation due to a disability or special need, please contact us at .
Palo Alto Networks is an equal opportunity employer. We celebrate diversity in our workplace, and all qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or other legally protected characteristics.
All your information will be kept confidential according to EEO guidelines.
Is this role eligible for Immigration Sponsorship: Yes
Remote Work:
No
Employment Type:
Full-time
Our enterprise security platform detects and prevents known and unknown threats while safely enabling an increasingly complex and rapidly growing number of applications. Come be part of the team that redefined the firewall industry and is now one of the fastest-growing security companies.