Senior ML Ops / LLM Ops Engineer

Caixa Mágica Software

Job Location:

Lisbon - Portugal

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

Overview:

This role focuses on building and operating the ML Ops / LLM Ops pipeline that closes the loop: ingest production signal, redact it, store it, slice it, classify it, surface the failures, mine new eval cases, and alert on regressions. You drive the toolchain decisions, the data-governance posture, and the day-to-day reliability of the pipeline itself. The Head of AI sets vision and priorities, and you own the technical execution end-to-end.

What will you do

Design and build a source-agnostic ingestion pipeline for production ML / LLM traffic
Design storage tiering based on automotive and company requirements, policy-driven retention windows, and privacy requirements
Build slicing dashboards and the query path engineers use to debug production at 11 p.m.
Enable autoraters and lightweight LLM classifiers across production traffic
Build the rule-based triage layer for obvious failures
Stand up the eval-mining workflow and wire regression alerts to model and prompt deploys
Implement PII redaction at the ingestion boundary and safety / abuse classification on inbound content
Define dashboard architecture, wipeout mechanisms, and tool and hosting selection, and operate the pipeline end-to-end

What are we looking for

Must Have

Proven experience building and operating data or ML platform systems in production, covering ingest, schema, storage, access control, alerts, and on-call
Hands-on experience building and running ML / LLM evaluation systems in production (offline regression sets, online autoraters, LLM-as-judge pipelines, golden datasets)
Hands-on experience with LLM tracing and observability tooling
Experience shipping PII redaction or comparable data-handling controls in a regulated or multi-tenant environment, with a pragmatic approach to data governance
Strong understanding of how ML and LLM-based systems fail in production: hallucination, retrieval failures, agent loops that don't terminate, ASR / TTS degradation, and prompt or model regressions across deploys
Production Python proficiency; a hands-on engineer, not an advisor. Comfortable leveraging AI in everything you build

Nice to Have

Multi-tenant or white-label SaaS experience with per-tenant data isolation
Azure experience and the ability to weigh tradeoffs when making self-host vs. managed SaaS calls
Experience with autorater methodology and contamination defenses
Knowledge of vector databases, embedding-based clustering, or unsupervised failure-mode discovery
Experience with data-versioning tooling (LakeFS, DVC, Delta Lake)
GDPR / right-to-erasure work
Embedded automotive or other constrained-environment context
Working knowledge of a language beyond English sufficient to validate non-English failure modes
Prior experience with cloud platforms (Microsoft Azure and AWS);
Prior experience with Claude Code;
Prior experience with GitHub;
Languages: Python primary, SQL, and some TypeScript for dashboards;
LLM APIs: Claude (Anthropic), OpenAI, and open-source models as needed
Android/AAOS ecosystem as clients

What can you expect from us

A permanent job contract for a long-term project;
Tech equipment, SIM card, personal smartphone;
Health and Life Insurance;
Social events and team buildings;
A commitment to letting you grow with us and be rewarded accordingly;
A dynamic and young team that will always be there to support you;
Training in the latest technologies;
Coffee, fruit, snacks, and a warm welcome when you pass by the office.

Required Experience:

Senior IC
