Sr. Principal Software Engineer, LLM Engineering

JPMorganChase

Job Location:

Palo Alto, CA - USA

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

Description

We're looking for a tech leader ready to take their career to new heights. Join the ranks of top talent at one of the world's most influential companies.

As a Senior Principal Software Engineer at JPMorganChase within the Commercial & Investment Bank Trust & Safety Fraud Prevention team, you provide deep engineering expertise and work across agile teams to enhance, build, and deliver trusted, market-leading technology products in a secure, stable, and scalable way. Leverage your deep expertise to consistently challenge the status quo, innovate for business impact, lead the strategic development behind new and existing products and technology portfolios, and remain at the forefront of industry trends, best practices, and technological advances.

Job responsibilities

Advises and leads on the strategy, architecture, and development of model serving solutions for different model architectures, including LLMs and GNNs, across cloud and on-premises environments, aligning initiatives to business outcomes.
Defines and implements MLOps and LLMOps strategies for end-to-end model lifecycle management, including training, versioning, deployment, monitoring, and governance.
Drives optimization of model inferencing for high throughput and low latency using quantization, model parallelism, intelligent batching, and hardware acceleration for all model architectures.
Creates durable, reusable software and platform frameworks to standardize ML engineering services, enabling scale across teams and functions.
Establishes best practices for automation, CI/CD, and infrastructure-as-code using containerization and orchestration technologies.
Partners closely with data science, platform engineering, and SRE teams to productionize the models on AWS, ensuring observability, reliability, and cost efficiency.
Leads deployment and optimization using model inference servers such as Triton Inference Server and vLLM for high-throughput, low-latency serving at scale.
Oversees production operations for AI workloads, including monitoring, incident response, security, and compliance, with continuous improvement.
Translates highly complex technical concepts and emerging trends into actionable strategies for executive and product leadership.
Influences senior stakeholders and cross-functional partners to prioritize and deliver AI/ML capabilities that drive measurable business impact.
Promotes the firm's culture of diversity, opportunity, inclusion, and respect across teams and communities.

Required qualifications, capabilities, and skills

Formal training or certification on software engineering concepts and 10 years of applied experience.
8 years of AI/ML engineering experience with significant expertise in LLMs, GNNs, and other model architectures (e.g., GPT, Llama, Falcon, Mistral).
Demonstrated success architecting and deploying LLM and GNN solutions on AWS (e.g., SageMaker, Bedrock, EKS) at enterprise scale; experience with Azure ML or GCP Vertex AI.
Experience building LLM and GNN serving platforms in large-scale environments typical of major tech firms.
Hands-on experience building LLM inference engines using Triton Inference Server and vLLM, including autoscaling, caching, and throughput optimization.
Advanced proficiency in Python and optimization techniques applied to deep learning frameworks (PyTorch, TensorFlow, Hugging Face Transformers).
Deep understanding of LLMOps/MLOps (e.g., MLflow, SageMaker Pipelines, Kubeflow) with a track record of implementing best practices at scale.
Expertise in inference optimization and distributed systems for large models, focused on high-throughput, low-latency applications.
Practical experience delivering system design, application development, testing, and operational stability for enterprise AI platforms.
Proven collaboration with SRE to implement observability, incident response, and SLIs/SLOs for LLM services.
Excellent communication skills with the ability to influence both technical and non-technical stakeholders and deliver value across functions at scale.

Preferred qualifications, capabilities, and skills

Master's or PhD in Computer Science, Engineering, or a related field (or equivalent experience).
Practical cloud-native experience, including containerization (Docker), orchestration (Kubernetes), and infrastructure-as-code (Terraform, CloudFormation).
Expertise in security, compliance, and governance for AI/ML deployments in regulated environments.
Experience in trust and safety or fraud prevention domains; familiarity with payments platforms is a plus.
Track record of contributions to open-source LLM projects or peer-reviewed research, and/or experience presenting at industry conferences or leading technical communities.
Familiarity with hardware acceleration strategies across GPUs, TPUs, and specialized inference runtimes.
Experience in building Java-based applications.

This position is subject to Section 19 of the Federal Deposit Insurance Act. As such, an employment offer for this position is contingent on JPMorganChase's review of criminal conviction history, including pretrial diversions or program entries.

Required Experience:

Staff IC

About Company

JPMorganChase

JPMorganChase, one of the oldest financial institutions, offers innovative financial solutions to millions of consumers, small businesses and many of the world’s most prominent corporate, institutional and government clients under the J.P. Morgan and Chase brands. Our history spans ov ...
