Salary Not Disclosed
1 Vacancy
Role : AI Site Reliability Engineer (W2)
Location : Reston, Virginia
Skills : SRE, NVIDIA (DGX), Python, Ansible, Terraform, Site Reliability Engineer, Linux
Role : AI Site Reliability Engineer
No. Positions : 2
Location: Remote
Notice period: 2 weeks
Visa: Any (Except OPT and CPT)
Note: Need at least 1 or 2 resumes by EOD today; please try to submit profiles.
Your Role as an AI Site Reliability Engineer
We are building, developing, and expanding our artificial intelligence platforms, which will empower the business to fundamentally change the world. You will be an AI Site Reliability Engineer in the IT Infrastructure Services organization. You will use SRE mechanisms to reduce toil and maintain Service Level Objectives (SLOs) for our internal NVIDIA DGX and Cisco UCS-based AI platforms. You will lead, build, and run fully automated pipelines through our Continuous Integration/Continuous Delivery (CI/CD) system to deliver operational capabilities and improvements.
Responsibilities include
Who You Are
You are an experienced Site Reliability Engineer for high-performance computing, artificial intelligence, machine learning, and/or integrated computer systems. You take a software engineering approach to solving operational problems. You know HPC and are familiar with Kubernetes. You have experience delivering software solutions and working with Linux operating systems. You understand IT infrastructure customers and are passionate about diving deep into problems and fixing them.
Our Minimum Requirements include:
Preferred Qualifications
Full-time