Inference Platform Engineer (LLM & Kubernetes)

N-iX

Job Location:

Bucharest - Romania

Monthly Salary: Not Disclosed
Posted on: 17 hours ago
Vacancies: 1 Vacancy

Job Summary

N-iX is a global software development service company that helps businesses across the globe create next-generation software products. Founded in 2002, we unite 2,400 tech-savvy professionals across 40 countries, working on impactful projects for industry leaders and Fortune 500 companies. Our expertise spans cloud, data, AI/ML, embedded software, IoT, and more, driving digital transformation across finance, manufacturing, telecom, healthcare, and other industries. Join N-iX and become part of a team where your ideas make a real impact.

We are looking for an Inference Platform Engineer (LLM & Kubernetes) to join our team.

Our client is a leading European AI company developing large language models and generative platforms for enterprise and government clients.
Their products combine high-performance technologies, transparency, accessibility, and data security, fully aligned with European regulatory and ethical standards.

As an Inference Platform Engineer (LLM & Kubernetes), you will take ownership of inference API integration, operations, and platform reliability across production AI systems.
This role is designed to be covered by 12 FTE split across several senior specialists, ensuring continuity of inference services and full coverage during planned and unplanned absences as we take over end-to-end LLM inference responsibility.

Responsibilities:

  • Take ownership of inference API integration, orchestration, and long-term platform reliability
  • Lead operations for LLM inference services as they transition under internal ownership
  • Ensure inference API availability, latency, and performance in production environments
  • Design and maintain multi-turn conversation handling, chat templates, and prompt orchestration
  • Proactively monitor, troubleshoot, and resolve inference platform issues, logs, and errors
  • Manage Kubernetes deployments, Helm charts, and ArgoCD workflows for inference services
  • Ensure platform security, CVE monitoring, and compliance with internal and regulatory standards
  • Collaborate closely with backend, platform, and infrastructure teams
  • Maintain clear operational documentation to support shared ownership across multiple FTEs

Requirements:

  • 5 years of Python programming experience
  • Strong Kubernetes (k8s) experience, including deployment, scaling, and monitoring
  • Experience handling large-scale logs, monitoring, and observability in production
  • Basic knowledge of LLM fundamentals and the surrounding industry (e.g., what types of models exist and how an LLM generates output)
  • Experience from the user side developing against an inference API (e.g., OpenAI, Anthropic, OpenRouter) and an understanding of their structure (experience providing or deploying a similar API yourself is a strong plus)
  • Ability to independently own and operate inference services in a shared-responsibility model (12 FTE split across multiple specialists)
  • Strong communication skills and experience working with cross-functional engineering teams
  • Solid Linux fundamentals

Nice to have:

  • Hands-on experience with Helm charts, ArgoCD, and CI/CD for AI services
  • Interest in working partly with Rust
  • Senior-level experience with production LLM inference or AI platform operations
  • Experience building or operating multi-turn conversational AI systems
  • Familiarity with real-time API orchestration or streaming inference workloads
  • Background in MLOps, AI platform engineering, or SRE
  • Experience with cloud-based inference deployments and scaling
  • Knowledge of security, CVE scanning, and operational best practices

Technology Stack:

  • Inference: OpenAI, Anthropic, or other LLM inference APIs
  • Focus Areas: API integration, multi-turn conversation orchestration, tool calling, platform reliability
  • Infrastructure: Kubernetes, Helm, ArgoCD, cloud or hybrid environments
  • Monitoring: Logs, metrics, observability tools for inference systems
  • Workflow: Git, CI/CD pipelines, documentation, operational runbooks, incident handling
  • Standards: Reliability, latency, performance, security, maintainability

We offer*:

  • Flexible working format - remote, office-based, or flexible
  • A competitive salary and good compensation package
  • Personalized career growth
  • Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
  • Active tech communities with regular knowledge sharing
  • Education reimbursement
  • Memorable anniversary presents
  • Corporate events and team buildings
  • Other location-specific benefits

*not applicable for freelancers


Required Experience:

IC

About Company

N-iX is a global software development company that helps the world's leading organizations achieve lasting business value using advanced technology.