The Role
As an AI Solution Architect at Gcore, you will serve as a trusted advisor to our AI-focused customers. You'll collaborate closely with clients to design and deploy large-scale GPU clusters, containerized training pipelines, and production inference systems. Your expertise in automation, infrastructure as code, and orchestration will ensure seamless, repeatable deployments across hundreds to thousands of GPUs.
Your Responsibilities
- Architect & Deploy: Design end-to-end GPU cluster architectures (on-premises and cloud) using Ansible, Terraform, Kubernetes, and Slurm.
- Customer Engagement: Lead technical deep-dives, conduct workshops, and present solutions to stakeholders at all levels.
- Automation & IaC: Build and maintain Infrastructure as Code modules to automate provisioning, scaling, and monitoring of GPU resources.
- Documentation & Enablement: Produce whitepapers, runbooks, and training materials; host webinars and training sessions.
- Feedback Loop: Partner with Gcore's engineering and product teams to relay customer insights and drive product enhancements.
Qualifications :
What We're Looking For
- Experience: 3 years in Cloud or GPU AI Infrastructure DevOps.
- Infrastructure Skills: Proven track record deploying GPU clusters at scale, including multi-node, multi-GPU setups.
- Automation Expertise: Hands-on with Ansible or similar configuration management tools; Terraform (IaC).
- Orchestration & Scheduling: Strong familiarity with Kubernetes (K8s) and Slurm.
- Programming: Proficient in Python / Go.
- ML Proficiency: Solid understanding of ML ecosystems: models, tooling, and production deployment patterns.
- Communication: Excellent verbal and written skills; ability to translate complex technical concepts for diverse audiences.
Nice-to-Haves
- Experience deploying high-availability inference infrastructure for production AI workloads.
- ML Ops Pipelines: Implement and optimize distributed training and inference pipelines with MLflow, REST APIs, and popular frameworks (PyTorch, TensorFlow, JAX).
- Demonstrated ability to transition ML pipelines from proof-of-concept to robust, scalable production systems.
- Familiarity with GitOps workflows, Docker, Helm charts, and CI/CD for ML.
- Knowledge of Hugging Face Transformers, scikit-learn, and experiment tracking best practices.
Additional Information :
What We Offer:
We value our employees and offer a benefits package designed to support your health, well-being, and professional growth throughout your journey at Gcore:
- Competitive salary
- Flexible working hours
- Remote, hybrid, or office work options, depending on your role
- Work from anywhere in the world for up to 45 days per year
- Private medical insurance for you and your family*
- 5 additional vacation days*
- Additional fully paid sick leave days*
- Allowance for significant life events and birthdays
- Language classes
- Modern office space with free snacks, drinks, and entertainment options*
- Team sports activities*
*Please be aware that this benefit may vary depending on your country.
About the Company
Gcore is an international cloud and edge leader in providing first-class web performance, content delivery, and security. Headquartered in Luxembourg with offices around the world, the company provides its solutions to global leaders in numerous industries.
Millions of people worldwide use apps and play games based on our infrastructure and services: we are trusted by World of Tanks, Albion Online, Avast, Photon, Unity, Sandbox Interactive, and others.
Equal Opportunity Employer
We provide equal opportunity to all applicants without regard to race, color, religion, sex, sexual orientation, age, gender identity, gender expression, national origin, disability, or any other legally protected characteristics.
Remote Work :
Yes
Employment Type :
Full-time