Responsibilities
1. Full-Stack AI Infrastructure Architecture & Development:
2. Intelligent Computing Power Scheduling System Design:
3. Hardware-Software Co-Optimization & System Reliability:
4. Technical Foresight & Architecture Evolution:
Qualifications
1. Bachelor's/Master's degree in Computer Science or a related field, with 5-10 years of experience, strong self-motivation, and the execution ability to identify and resolve technical bottlenecks.
2. Deep expertise in AI infrastructure: Kubernetes, GPU resource management, RDMA/high-performance networking, and large-scale distributed AI system design/deployment.
3. Proficient in *Golang/Python*, with solid system programming and automation skills. Priority given to candidates with experience in *Volcano/Kueue schedulers, K8s Operator development, or open-source contributions*.
4. Familiar with core resource scheduling principles, GPU lifecycle management (allocation, isolation, elasticity, fault tolerance), and designing high-availability, low-latency strategies for quantitative tasks.
5. Knowledge of mainstream AI frameworks (PyTorch/TensorFlow), with experience in training/inference performance optimization and in cross-team collaboration on framework-infrastructure co-optimization.
6. Preferred: Experience in *FinTech/quantitative AI infrastructure*, an understanding of business-critical computing demands, and the ability to drive cross-team collaboration and value delivery.
Full Time