Job Description
As the Principal Machine Learning Systems Engineer, you will be the key architect and technical owner of the platform that powers Nearmap's AI innovation. You will lead a dedicated team responsible for our Kubernetes-based ML workflow orchestration platform, setting its technical vision and driving its evolution.
This is a leadership role for a seasoned engineer who thinks in systems. You will not only build but also define the future of how we develop, deploy, and operate everything from traditional ML models to foundational models at scale. Your primary objective is to create a robust, scalable, and efficient ecosystem that acts as a force multiplier for our entire AI organization.
Key Responsibilities
Architect and Own the ML Platform Vision: Define and own the technical roadmap for our core ML infrastructure. Make critical design decisions, evaluate new technologies (e.g., orchestrators, serving frameworks, vector DBs), and ensure the platform's long-term scalability, reliability, and cost-effectiveness.
Lead and Mentor the Team: Lead a team of mid-level to senior ML Systems Engineers. Provide technical guidance, foster their professional growth, and cultivate a culture of engineering excellence, pragmatic innovation, and deep ownership.
Drive High-Impact System Design: Spearhead the design and implementation of foundational platform components, including our Kubernetes-based workflow orchestration, model training/inference services, and observability stacks.
Define and Evangelize Engineering Excellence: Establish and champion best practices for MLOps and AIOps across the organization. Drive the adoption of automation, Infrastructure as Code (IaC), CI/CD, and robust security principles within the AI lifecycle.
Strategic Partnership and Influence: Act as a key technical partner to Data Science, ML Engineering, and other engineering teams. Proactively understand their challenges and translate their needs into a strategic, actionable platform roadmap.
Personal Attributes We Value
Strategic Pragmatism: You possess a deep understanding of complex systems and can make informed, practical trade-offs that balance immediate needs with a long-term architectural vision. You build for the future without over-engineering for the present.
Collaborative Leadership: You are a natural mentor and technical leader. You elevate the performance of your team by sharing knowledge, setting high standards, and fostering a collaborative environment where the best ideas win.
First-Principles Thinking: You have a deep-seated drive to understand the "why" behind technical choices. You cut through the noise to identify the core problem and design robust, elegant solutions from the ground up.
Ownership Mentality: You take ultimate responsibility for the systems you and your team build. You are relentless in the pursuit of reliability, efficiency, and user impact.
Qualifications:
Key Requirements:
8 years of experience in software engineering, with at least 4 years focused on building and operating large-scale infrastructure, platform engineering, or distributed systems in a production environment.
Proven experience leading technical projects and/or mentoring a team of engineers.
Deep, production-level expertise in designing, building, and operating systems on Kubernetes.
Expert-level proficiency in a high-level programming language (Python preferred).
Demonstrable experience building and managing cloud infrastructure (AWS, GCP, Azure) using Infrastructure as Code (IaC) tools like Terraform or Pulumi.
Strong, practical understanding of modern software development practices (Git, CI/CD, monitoring, alerting, automated testing).
Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience.
Highly Desirable:
Hands-on experience with dedicated workflow orchestration frameworks such as Argo Workflows, Kubeflow Pipelines, or Flyte.
Experience designing and managing large-scale, GPU-intensive workloads for ML workflows.
Deep familiarity with the MLOps/LLMOps ecosystem (e.g., model registries, feature stores, model serving frameworks, RAG systems).
Experience with advanced Kubernetes concepts, including service mesh (e.g., Istio), custom operators, and multi-cluster networking.
A track record of contributing to open-source projects in the cloud-native or MLOps space.
Additional Information:
What we offer:
Remote Work: No
Employment Type: Full-time
The sky's not the limit at Nearmap. Nearmap is the Australian-founded, global tech pioneer innovating the location intelligence game. Customers rely on Nearmap for consistent, reliable, high-resolution imagery, insights, and answers to create meaningful change in the world and propel ...