Site Reliability Engineer

Employer Active

1 Vacancy

Job Location

Austin - USA

Monthly Salary

Not Disclosed

Job Description

Hello,

My name is Shubham Pal, and I am a Staffing Specialist at Sapear Inc. I am reaching out about an exciting job opportunity with one of our clients.

Title: Site Reliability Engineer (SRE), ML Platform

Location: Austin, TX or Sunnyvale, CA

Type: FTE/FTC

Responsibilities:

  • Continuous deployment using GitHub Actions, Flux, and Kustomize
  • Design and implement cloud solutions and build MLOps pipelines on AWS
  • Containerize and deploy data science models using Docker, vLLM, and Kubernetes
  • Communicate with a team of data scientists, data engineers, and architects, and document the processes
  • Develop and deploy scalable tools and services for our clients to handle machine learning training and inference
  • Knowledge of ML models and LLMs

Qualifications:

  • 6 years of experience in MLOps with strong knowledge of Kubernetes, Python, MongoDB, and AWS
  • Good understanding of Apache Solr
  • Proficient with Linux administration
  • Knowledge of ML models and LLMs
  • Ability to understand tools used by data scientists, and experience with software development and test automation
  • Ability to design and implement cloud solutions and to build MLOps pipelines on cloud platforms (AWS)
  • Experience working with cloud computing and database systems
  • Experience building custom integrations between cloud-based systems using APIs
  • Experience developing and maintaining ML systems built with open-source tools
  • Experience with MLOps frameworks such as Kubeflow, MLflow, DataRobot, and Airflow, and experience with Docker and Kubernetes
  • Experience developing containers and working with Kubernetes in cloud computing environments
  • Familiarity with one or more data-oriented workflow orchestration frameworks (Kubeflow, Airflow, Argo, etc.)
  • Ability to translate business needs into technical requirements
  • Strong understanding of software testing, benchmarking, and continuous integration
  • Exposure to machine learning methodology and best practices
  • Good communication skills and ability to work in a team

Note: The focus of this role is 60% SRE and 40% MLOps.

Skill Area | Includes | Weight (%)
Platform Reliability & Containerization | Kubernetes, Docker, microservices, Linux | 30%
MLOps & AWS Cloud | Model deployment, versioning, monitoring, AWS (SageMaker, S3, Lambda, EKS) | 25%
CI/CD & GitOps | GitHub Actions, Flux | 15%
Monitoring & Observability | Splunk, Grafana, Prometheus, performance tracking | 15%
Integration & Collaboration | Python scripting, API integrations, Apache Solr, LLM awareness, teamwork with data scientists & engineers | 15%

Regards,

Shubham Pal

Lead Business Development Manager

Sapear Inc.

Email :

Cell : 1

Employment Type

Full-time
