ML Infrastructure Engineer

Employer Active

1 Vacancy
The job posting is outdated and position may be filled

Job Location

Tel Aviv - Israel

Monthly Salary

Not Disclosed

Job Description

This role is a member of the AI/ML Infrastructure Engineering team and will be dedicated to implementing and supporting AI/ML infrastructure solutions in cloud and on-premises environments. The role will work directly with infrastructure teams and may interface with data scientists, machine learning engineers, application developers, and quantitative analysts, functioning both as a solutions architect who helps them implement their own AI/ML solutions and as a professional services engineer who implements solutions for them in cloud environments such as AWS, GCP, and Kubernetes.

This is a hands-on developer role. Candidates ideally have experience deploying and supporting their own production-ready AI/ML models in cloud environments, as well as automating the build and management of a broad range of cloud infrastructure using tools like Terraform. Candidates should be familiar with developing unit and functional tests, have experience designing and implementing CI/CD tools with infrastructure-as-code pipelines, and have knowledge of Linux systems administration, containerization, networking, security, automated configuration and state management, cross-system orchestration, configuration management, logging, metrics, monitoring, and alerting.

Principal Responsibilities:

Architect, develop, and maintain internal AI/ML infrastructure components, frameworks, and offerings

Architect, develop, and maintain AI/ML solutions for customers in cloud environments

Help customers architect, develop, and maintain their own AI/ML solutions in cloud environments

Implement CI/CD pipelines that include application tests, security tests, and gates

Implement availability, security, and performance monitoring and alerting for AI/ML solutions

Automate data resiliency and replication for AI/ML models

Manage multiple environments and promote code between them

Automate systems configuration and orchestration using tools such as Terraform, Chef, Ansible, or Salt

Automate creation of machine images and containers

Required Qualifications/Skills

6 years of experience designing and supporting production cloud environments

Experience consulting with customers to develop AI/ML solutions

Experience developing collaboratively, including infrastructure as code, preferably in Python

Systems engineering knowledge, including an understanding of Linux, security, and networking

Cloud templating tools such as Terraform

Experience with AI/ML frameworks (e.g. TensorFlow, PyTorch)

Experience with distributed computing tools (e.g. Ray, Dask)

Experience with model serving tools (e.g. vLLM, KFServing)

Experience building monitoring and alerting on logs and metrics

Cloud networking, including connectivity, routing, DNS, VPCs, proxies, and load balancers

Cloud security, including IAM, certificate management, and key management

Excellent written and verbal communication skills

Excellent troubleshooting and analytical skills

Self-starter able to execute independently on a deadline and under pressure

Employment Type

Full-Time
