As an MLOps & AI Infrastructure Technical Referent at dLocal, you will be the senior technical reference for how we build, operate, and evolve our ML and AI infrastructure.
Your mission is to enable Data Science and AI teams to take models and AI-powered services from idea to production in a reliable, observable, and compliant way. You will own the technical direction of our MLOps stack, introduce AI safely into our engineering workflows, and help the team scale its impact as usage and complexity grow.
A core part of this role is using agents and AI services to automate as much of our MLOps work as possible, from feature store and platform operations to fraud/anomaly workflows and ML cost optimization, working side by side with the AI team.
This is a hands-on architecture and leadership role: you won't own product models yourself, but you will deeply influence how every model and AI component is trained, deployed, monitored, and run in production.
1. Technical strategy & architecture (MLOps)
- Define and evolve the end-to-end ML platform architecture (data, training, registry, serving, monitoring, governance) used by multiple squads.
- Design standard patterns for:
  - Reproducible training pipelines and experiment tracking.
  - Model packaging, versioning, and promotion flows (dev, staging, production).
  - Online and batch inference with safe rollout strategies (canary, shadow, rollback).
- Balance reliability, performance, and cost for ML workloads, working closely with SRE/Infra and Finance/FinOps.
2. Day-to-day MLOps enablement & operations
- Act as the go-to person for complex MLOps questions: how to structure pipelines, choose serving patterns, or design monitoring and rollback.
- Review and challenge designs and deployments for new models and data pipelines, ensuring they follow platform standards and non-functional requirements.
- Partner with Fraud, Anomaly, and other product squads to ensure:
  - Clear SLAs/SLOs for ML components.
  - Proper logging, metrics, and alerts for incidents and regressions.
- Contribute to on-call readiness: playbooks, dashboards, incident reviews, and continuous improvement of our operational posture.
3. AI infrastructure & AI-assisted operations
- Define infrastructure contracts and guardrails so that we can safely consume agents and AI services built by the AI team and, when needed, extend them from MLOps.
- Design patterns and tooling so that AI and agents automate as much of our MLOps work as possible, for example:
  - Feature platform operations (feature store pipelines, backfills, parity checks, DQ/drift monitoring).
  - MLOps platform workflows (training/eval pipelines, promotion gates, rollbacks, documentation and runbook generation).
  - Operational flows in Fraud / Anomaly (alert triage, log/metric analysis, enrichment of incident context).
  - Platform FinOps & cost optimization (suggesting rightsizing, schedule changes, decommissioning opportunities).
- Contribute to evaluation, observability, and safety for these AI-powered automations (e.g. prompts, policies, redaction, auditability) in close collaboration with dedicated AI teams.
4. Governance, security & compliance
- Set and maintain technical standards for:
  - Model and data access control, PII handling, and redaction.
  - Auditability of model changes, deployments, and runtime behavior.
  - Environment separation and change management for ML/AI workloads.
- Work with InfoSec and Architecture to ensure the platform aligns with regulatory and internal requirements while remaining practical for engineers and data scientists.
Nice to have
- Experience rolling out AI assistants (code or infra copilots, AI log analysis, etc.) inside engineering organizations, including policies and best practices.
- Exposure to LLM and AI infrastructure (gateways, vector stores, evaluation harnesses), even if not as a core focus.
- Prior responsibilities as Technical Referent / Tech Lead / Architect for platforms or shared services.
- Contributions to internal standards, RFCs, guilds, or tech communities.