As a Senior Machine Learning Engineer in the Perception team, you will play a pivotal role in building and maintaining the backbone of our L2 ADAS stack. This senior role calls for an experienced engineer who can think critically, execute independently, and deliver results on scalable deep learning infrastructure, optimize massive data ingestion pipelines, and ensure maximum efficiency across our compute clusters.
You will be responsible for the entire DL infrastructure lifecycle, from managing Azure storage and hybrid Kubernetes clusters to designing efficient data loaders for multimodal training. You will work at the intersection of infrastructure, data engineering, and deep learning, enabling feature teams to train complex models (single-frame, temporal, and multimodal) with speed and reliability. Your ability to solve abstract infrastructure challenges and apply T-shaped expertise (going deep in areas like infrastructure and multitask deep learning, among others, while maintaining breadth in software design) will be key to our success.
Responsibilities
Deep Learning Infrastructure & Compute:
Manage and optimize the entire DL infrastructure, including Azure Blob Storage integration, VNET setups, and hybrid compute resources (Cloud and On-premise/Frankfurt clusters).
Lead performance investigations and benchmarking for next-gen hardware (e.g., comparing H200 vs. H100, Azure native vs. deployment nodes) to ensure cost and speed efficiency.
Maintain and scale Kubernetes clusters for training and inference workloads.
Data Pipelines & Efficient Loading:
Architect and develop high-performance data loaders for complex multimodal datasets (camera, radar, temporal/non-temporal data).
Modernize data processing pipelines using Ray and Kubernetes to parallelize data caching, shuffling, and oversampling.
Leverage PyArrow and SQL to optimize data consumption and integration with data loops.
Implement efficient dataset update strategies (handling deltas) and ensure seamless integration of new tasks into the multimodal multi-task network.
CI/CD, Monitoring & Quality:
Design and maintain robust GitHub Workflows and CI pipelines for new and existing feature teams.
Develop KPI dashboards using Grafana to monitor compute usage, GPU efficiency, unit test durations, and overall system health.
Manage dependency updates (Torch upgrades, Ubuntu updates, Hydra maintenance, Dependabot) to ensure a secure and modern stack.
Drive software design excellence by performing thorough code reviews (PRs) and enforcing high standards in software architecture.
Embedded & Evaluation:
Establish scalable evaluation pipelines for embedded targets, specifically for QNN boards and other edge devices.
Collaborate with feature teams to support model compression experiments and on-target performance verification.
Required Qualifications
Education: Bachelor's degree in Computer Science, Electrical Engineering, or a related field. An advanced degree is an advantage.
Experience: 5 years of industry experience in MLOps, Data Engineering, or Software Infrastructure, with a focus on Deep Learning systems.
Programming & Software Design: Expert-level proficiency in Python with a strong emphasis on clean software design, object-oriented programming, and architectural patterns.
Infrastructure & Orchestration: Deep hands-on experience with Kubernetes, Docker, and Cloud platforms (specifically Azure ML, Azure Storage/Networking).
Big Data & Optimization: Proficiency with high-performance data processing tools such as Ray, PyArrow, and SQL. Experience optimizing data loading bottlenecks for GPU training.
DevOps & Monitoring: Experience setting up complex CI/CD pipelines (GitHub Actions) and observability stacks (Grafana, Prometheus).
Soft Skills: Strong problem-solving abilities, proactiveness, and ownership of complex topics. Ability to adapt quickly to new technologies and work collaboratively in a supportive, high-performance team.
Qualifications:
Bachelor's degree in Computer Science, Electrical Engineering, or a related field. An advanced degree is an advantage.
Additional Information:
- Deep Learning Knowledge: While deep algorithm knowledge is not mandatory, a strong understanding of Deep Learning workflows (PyTorch training loops, multi-task learning) is highly beneficial.
- Embedded Systems: Experience with deploying or evaluating models on embedded hardware (Qualcomm QNN, etc.) or setting up hardware-in-the-loop pipelines.
- T-Shaped Skills: Ability to specialize deeply in infrastructure while lending a hand to feature teams on diverse topics (e.g., documentation builds, visualization tools).
Remote Work:
No
Employment Type:
Full-time