Principal Engineer - Site Reliability Engineering (SRE)

Job Location: Plano, TX - USA

Monthly Salary: Not Disclosed
Posted on: 30+ days ago
Vacancies: 1 Vacancy

Job Summary

Overview

Who we are

Collaborative. Respectful. A place to dream and do. These are just a few words that describe what life is like at Toyota. As one of the world's most admired brands, Toyota is growing and leading the future of mobility through innovative, high-quality solutions designed to enhance lives and delight those we serve. We're looking for talented team members who want to Dream. Do. Grow. with us.

To save time applying, Toyota does not offer sponsorship of job applicants for employment-based visas or any other work authorization for this position at this time.

Who we're looking for

Toyota's MTD Digital Platform team is seeking a passionate and highly motivated Principal Engineer - Site Reliability Engineering (SRE) to drive our Kubernetes, microservices, and containerization. In this role, you will play a pivotal part in ensuring our platform's resilience, optimizing resource utilization, and enabling seamless disaster recovery and business continuity. Reporting to the Manager of OneTech - Manufacturing Systems, you will help shape scalable, secure, and efficient infrastructure that powers Toyota's manufacturing innovation.

What you'll be doing

  • Own the end-to-end management of Kubernetes clusters across on-premises and cloud environments, ensuring high availability and performance

  • Design, deploy, and maintain scalable microservices using Helm charts, GitOps tools like Argo CD, and CI/CD pipelines built with GitHub Actions and Terraform

  • Troubleshoot and resolve complex issues spanning cluster components, networking, storage, and application layers to minimize downtime

  • Implement and enforce security best practices to protect our containerized environments and applications

  • Monitor system health and resource usage using tools like Datadog, Splunk, and Prometheus, driving continuous performance improvements

  • Collaborate closely with infrastructure, networking, security, and application teams to align solutions with business needs and accelerate delivery

  • Lead incident response efforts and conduct post-mortem analyses to prevent future disruptions

  • Automate repetitive tasks and create sustainable systems that support Toyota's growth and innovation goals

  • Document best practices, procedures, and troubleshooting guides to empower the broader team

What you bring

  • Bachelor's degree or equivalent experience, providing a strong foundation in software engineering, systems administration, or related fields

  • 7 years of hands-on experience managing Kubernetes clusters, container orchestration, and microservices deployment in high-performance environments

  • Proven expertise with DevOps automation tools such as GitHub Actions, Terraform, Ansible, Helm, Rancher, and Harness

  • Strong scripting skills in Python or similar languages to build testable automation solutions

  • Deep understanding of monitoring and logging frameworks, including Datadog, Splunk, and Prometheus

  • Advanced experience deploying and managing distributed messaging systems like Kafka, RabbitMQ, MQTT, or Amazon Kinesis

  • Experience with hybrid cloud/on-premises infrastructure, including VMware and AWS services

  • Familiarity with business process mining tools (Celonis, SAP Signavio, UiPath) and project management platforms (JIRA, MS Project)

Added bonus if you have

  • Certifications such as Certified Kubernetes Administrator (CKA), Certified Kubernetes Application Developer (CKAD), or AWS Certified DevOps Engineer

  • Background in automotive or manufacturing application development and deployment, enhancing your ability to align technology with industry needs

  • Excellent analytical, problem-solving, and communication skills, with a collaborative mindset to work effectively across teams

  • Experience with incident management platforms and leading cross-functional incident response

What we'll bring

During your interview process, our team can fill you in on all the details of our industry-leading benefits and career development opportunities. A few highlights include:

  • A work environment built on teamwork, flexibility, and respect

  • Professional growth and development programs to help advance your career, as well as tuition reimbursement

  • Team Member Vehicle Purchase Discount

  • Toyota Team Member Lease Vehicle Program (if applicable)

  • Comprehensive health care and wellness plans for your entire family

  • Flextime and virtual work options (if applicable)

  • Toyota 401(k) Savings Plan featuring a company match, as well as an annual retirement contribution from Toyota regardless of whether you contribute

  • Paid holidays and paid time off

  • Referral services related to prenatal services, adoption, child care, schools, and more

  • Tax Advantaged Accounts (Health Savings Account, Health Care FSA, Dependent Care FSA)

Belonging at Toyota

Our success begins and ends with our people. We embrace all perspectives and value unique human experiences. Respect for all is our North Star. Toyota is proud to have 10 different Business Partnering Groups across 100 different North American chapter locations that support team members' efforts to dream, do, and grow without questioning that they belong.

Applicants for our positions are considered without regard to race, ethnicity, national origin, sex, sexual orientation, gender identity or expression, age, disability, religion, military or veteran status, or any other characteristics protected by law.

Have a question, need assistance with your application, or do you require any special accommodations? Please send an email to .


Required Experience:

Staff IC

Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting