AI Infrastructure Engineer

Percepta

Job Location:

New York City, NY - USA

Monthly Salary: $ 180 - 300

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Department:

Engineering

Job Summary

Who we are

Percepta's mission is to transform critical institutions with applied AI. We care that industries that power the world (healthcare, manufacturing, energy) benefit from frontier technology.

We collaborate with industry-leading customers to drive AI transformation. We bring:

Forward-deployed expertise in engineering, product, and research
Mosaic, our in-house toolkit for rapidly deploying agentic architectures
Strategic partnerships with Anthropic, McKinsey, AWS, and the General Catalyst portfolio

Our team is a fast-growing group of Applied AI Engineers, Embedded Product Managers, and Researchers motivated by getting frontier AI into the places that actually run the world.

Percepta is a direct partnership with General Catalyst.

About the role

We're hiring an AI Infrastructure Engineer to own the infrastructure, deployment, and operational reliability that powers Percepta's AI systems, including the autonomous agents at the core of what we ship.

Part of the work is hardening what exists: tightening our Terraform footprint, strengthening deployment pipelines, and bringing more rigor to how we manage infrastructure across regions and providers. Part of it is building what's missing. And part of it is genuinely new territory: figuring out what SRE means when the systems you're operating make autonomous decisions.

The infrastructure patterns for the agentic systems of the future don't exist yet. You'll help define them.

Why this is different

You're deploying autonomous systems. The infrastructure contract changes when your workloads have agency.
Observability means understanding why an agent made a decision, not just whether a pod is healthy.
The gap between research and production is real here. Our teams move optimization algorithms and AI systems from research environments into production, and you'll be part of that handoff. MLOps experience isn't required, but you'll be closer to that boundary than most infra roles.
Small team. Real ownership. You're making foundational decisions, not inheriting someone else's.

What you'll do

Define infrastructure patterns for multi-agent systems that need to be observable, controllable, and recoverable in ways traditional apps don't require
Own and evolve our IaC stack: Terraform and Kubernetes across AWS, GCP, and Azure
Build observability primitives for agentic workflows: tracing agent decisions and execution paths, not just service latency and pod health
Design and maintain CI/CD pipelines that give teams fast, trustworthy feedback from commit to production
Build operational foundations: monitoring, alerting, incident response, and the new patterns that emerge when AI systems are participants in that response
Work across engineering teams to meet the reliability and compliance requirements of the institutions we serve (SOC 2, HIPAA, regulated environments in healthcare and energy)

What we're looking for

5 years building and operating production infrastructure in DevOps or SRE roles
The kind of engineer who sees a manual process and can't rest until it's automated well, not just scripted
Strong hands-on Terraform experience
Deep experience with at least 1 major cloud provider (AWS, GCP, or Azure): networking, IAM, cost management, the operational realities of production workloads
Solid Docker and Kubernetes experience in production. We run managed clusters across all 3 major clouds; this is a core part of the role
Experience designing and maintaining CI/CD pipelines (GitHub Actions, GitLab CI, or similar)
Scripting proficiency in Python, Bash, or similar
High agency: you don't wait for a ticket to fix what's broken, but you communicate, collaborate, and bring the team along
Genuine curiosity about AI systems, not just the infrastructure running them. You want to understand what you're operating
You find it interesting (not alarming) that some systems you'll operate will be making decisions on their own

Nice to have

Multi-region and multi-cloud experience across 2 providers
Experience with single-tenant or on-prem deployments alongside multi-tenant SaaS
Familiarity with GitOps patterns and progressive delivery
Familiarity with the Grafana stack (Prometheus, Grafana, Loki) or equivalent
Experience with compliance frameworks (HIPAA, SOC 2) and how they shape infrastructure decisions in regulated environments
Background supporting ML or research workflows moving to production: model deployment, pipeline orchestration, or similar
You've thought about what observability means for non-deterministic systems and have opinions about it

The infrastructure patterns for autonomous AI systems are still being written. If you want to be one of the people writing them, let's talk.

Our Values

Dream bigger: We have the unique privilege of taking on the most ambitious problems, and we should chase them with optimism, responsibility, and genuine belief that we can make it happen. We have to embrace the hard things when no one else will.

Heart in the game: What we're doing matters, and we have to give a shit. Internally, that means fixing badness when you find it. Externally, it means honoring the trust our customers place in us with their most important problems. This isn't a 9-5, nor is it a job where we're ever going to monitor your hours. We promise to put work in front of you that matters, and in return we ask you to promise to care.

Win for the customer: Everyone is an engineer, and the job of an engineer is to deliver outcomes, not outputs. Everything we do (the products we build, the partnerships we launch, the strategy we set) exists to make our customers successful. Delivery is the strategy.

Make the call: Organizations are only as strong as the pace at which they make decisions. Everyone at Percepta should feel empowered to commit and shape the ambiguity in front of them. But "make the call" cuts both ways: make the decision and make the phone call. High-agency decision-making only works with high-bandwidth communication, and we commit to never operating in silos.

Intensity with kindness: We believe in excellence in execution, candor in feedback, ruthlessness in prioritization, and survivalist urgency. We also believe you don't need to be an asshole to deliver on any of this. The trust built through shared kindness and vulnerability is what makes the intensity sustainable.


About Company

Percepta

Transforming critical institutions using applied AI. Let's harness the frontier.
