Career Area:
Technology, Digital and Data
Job Description:
Your Work Shapes the World at Caterpillar Inc.
When you join Caterpillar, you're joining a global team who cares not just about the work we do but also about each other. We are the makers, problem solvers, and future world builders who are creating stronger, more sustainable communities. We don't just talk about progress and innovation here; we make it happen, with our customers, where we work and live. Together, we are building a better world, so we can all enjoy living in it.
Senior Engineering Manager, Data, ML, DevOps & AI Ops
About the Role
We are seeking a Senior Engineering Manager to lead our Data, ML, DevOps, and AI Ops capabilities, driving the design, development, deployment, and intelligent operation of enterprise-scale data platforms, machine learning systems, and cloud-native infrastructure.
This role is accountable for operationalizing data and AI at scale, ensuring reliability, performance, security, and continuous optimization across data pipelines, ML platforms, application infrastructure, and production environments. You will enable advanced analytics, AI-driven applications, and digital transformation initiatives by embedding automation, observability, and AI-powered operations into the core engineering ecosystem.
You will lead a multidisciplinary organization spanning Data Engineering, ML Engineering, Platform Engineering, DevOps, and AI Ops, and play a critical role in enabling real-time insights, predictive intelligence, resilient platforms, and intelligent automation across the enterprise.
Key Responsibilities
Leadership & Strategy
- Provide strategic direction and technical leadership across Data Ops, ML Ops, DevOps, and AI Ops, fostering a culture of engineering excellence, automation, and operational rigor.
- Define and execute the end-to-end platform strategy spanning data pipelines, ML lifecycle, CI/CD, infrastructure, and intelligent operations.
- Partner with executive leadership on technology roadmaps, platform modernization, vendor strategy, and emerging capabilities in AI, DevOps, and cloud platforms.
Data, ML & Platform Engineering
- Architect and scale cloud-native data platforms supporting real-time and batch ingestion, transformation, analytics, and AI workloads.
- Drive ML Ops best practices for model training, deployment, monitoring, retraining, and governance across the full model lifecycle.
- Ensure seamless integration of data platforms, ML services, and application ecosystems.
DevOps & Platform Reliability
- Establish and mature DevOps practices, including CI/CD pipelines, infrastructure-as-code, automated testing, and release management for data, ML, and application platforms.
- Ensure high availability, performance, scalability, and cost efficiency across cloud infrastructure and platform services.
- Embed SRE principles, SLIs/SLOs, and resilience engineering into platform operations.
AI Ops & Intelligent Operations
- Lead the adoption of AI Ops capabilities for proactive monitoring, anomaly detection, incident correlation, root cause analysis, and predictive remediation.
- Integrate observability signals (logs, metrics, traces, events) across data, ML, and application platforms to enable intelligent, self-healing systems.
- Drive automation to reduce manual operational overhead and improve MTTR, reliability, and platform insights.
Governance, Security & Compliance
- Establish enterprise standards for data governance, model governance, security, privacy, and compliance across platforms.
- Ensure platforms meet enterprise, regulatory, and cybersecurity requirements by design.
Collaboration & Talent
- Collaborate with data scientists, product teams, architects, and business stakeholders to translate AI and platform strategies into production-ready solutions.
- Lead talent development, hiring, and organizational design, building a high-performing, globally scalable engineering organization.
Required Qualifications
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 15 years of experience in software, data, or platform engineering, with 5 years in senior engineering leadership roles.
- Strong expertise across Data Engineering, ML Ops, DevOps, and production platform operations.
- Hands-on experience with cloud platforms (AWS, Azure, or GCP) and container orchestration (Docker, Kubernetes).
- Proven experience with CI/CD pipelines, infrastructure-as-code (Terraform, ARM, CloudFormation), and automation frameworks.
- Solid understanding of streaming and data platforms (Kafka, Spark, Flink) and ML Ops tooling (MLflow, Kubeflow, SageMaker).
- Experience driving platform reliability, security, governance, and compliance at enterprise scale.
- Strong leadership, communication, and stakeholder management skills.
Preferred Qualifications
- Experience with AI Ops platforms, intelligent observability, and incident automation.
- Exposure to feature stores, model registries, real-time inference, and event-driven architectures.
- Knowledge of SRE practices, error budgets, and resilience engineering.
- Familiarity with GPU acceleration, distributed training, and high-performance computing.
- Experience with observability stacks (Prometheus, Grafana, OpenTelemetry) and log analytics platforms.
- Contributions to open-source projects or published work in data platforms, ML Ops, DevOps, or AI Ops.
Why Join Us
- Lead enterprise-critical platforms at the intersection of Data, AI, DevOps, and Intelligent Operations.
- Shape how AI is built, deployed, and operated at scale, not just experimented with.
- Influence platform strategy and engineering culture across a global organization.
- Competitive compensation, flexible work options, and strong career growth opportunities.
Posting Dates:
March 13, 2026 - March 27, 2026
Caterpillar is an Equal Opportunity Employer. Qualified applicants of any age are encouraged to apply.
Not ready to apply? Join our Talent Community.
Required Experience:
Senior Manager