Platform Engineer
Austin, TX - USA
Job Summary
Company Overview
Allen Control Systems (ACS) is a cutting-edge defense startup founded by two former Navy electrical engineers with a proven track record in robotics and software. We are developing a small autonomous gun turret that employs advanced computer vision and control systems to precisely target and neutralize small drones and loitering munitions. Our innovative approach requires overcoming significant technical challenges, making this an exciting and dynamic environment for experienced engineers.
With an engineering-first culture, ACS values technical excellence and innovation. Backed by our founders' successful exits from two previous ventures, acquired for a combined $180M in 2022, we are committed to ensuring that the groundbreaking technologies we develop have a real-world impact.
Position Overview
We are seeking an experienced Platform Engineer to design, build, and own the infrastructure powering the development of ACS's autonomous counter-drone systems. You will manage a 130-GPU bare-metal Kubernetes cluster, own our CI/CD pipelines, and ensure our systems run reliably in both lab and field environments.
What You'll Do
- Deploy and operate Kubernetes clusters on bare-metal infrastructure hosting 130 NVIDIA GPUs, with hybrid burst capability to AWS for scalable compute and storage workloads.
- Manage NVIDIA GPU clusters for ML training.
- Own the full CI/CD pipeline from source to deployment, including artifact signing, build automation, and version pinning, ensuring repeatable delivery to cloud and edge targets.
- Build and maintain the observability stack, including log aggregation, metrics collection, alerting, and dashboards, providing real-time visibility into cluster health and system performance.
- Collaborate with computer vision, robotics, and software engineering teams to build low-friction developer tooling that accelerates iteration on the ACS turret platform.
- Define and enforce infrastructure-as-code practices using Terraform, Helm, or Ansible across on-prem and cloud deployments.
- Manage network configuration, storage provisioning, and security hardening for the bare-metal cluster in compliance with applicable defense security requirements.
What You'll Need
- Proficiency in Python programming and Bash scripting.
- 2 years of experience in platform engineering, DevOps, or infrastructure engineering, with hands-on experience in production Kubernetes environments.
- Deep expertise in bare-metal Kubernetes administration, including CNI configuration, storage backends, node management, and cluster upgrades.
- Hands-on experience with NVIDIA GPU infrastructure, including CUDA, device plugins, GPU scheduling in Kubernetes, and Kubeflow or similar ML orchestration tooling.
- Strong CI/CD experience, including Debian packaging, build automation, artifact management, and pipeline tooling (e.g., GitLab CI, GitHub Actions, Jenkins, or equivalent).
- Proficiency with observability tooling (e.g., ELK) for log aggregation, metrics, and alerting in distributed Linux environments.
- Experience building C and Python toolchains on Linux using CMake, and familiarity with cross-compilation for ARM targets such as NVIDIA Jetson.
- Strong Linux systems knowledge (Debian/Ubuntu preferred), including networking, storage, kernel tuning, and security hardening for production environments.
What We Offer
- Competitive salary
- Health, dental, and vision insurance
- Paid Time Off
Allen Control Systems is an Equal Opportunity Employer providing equal employment opportunities to all employees and applicants for employment. Allen Control Systems prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local laws.
Required Experience:
IC
About Company
Allen Control Systems is a robotics defense company purpose-built to deliver advanced robotic capabilities to fill modernization demands across the defense industry and national security communities.