Description
The Lead Platform Engineer role is responsible for setting strategic priorities for the maintenance of platforms within the current technology environment. The role involves sharing leading best practices and providing guidance to maintain the overall integrity of the platform, including the operating systems, hardware, and software infrastructure that are critical to the company's user populace and application environments. They foster a culture of collaboration with cross-functional teams to ensure that all components of the platform are working efficiently, securely, and resiliently. The Lead Platform Engineer leverages a strategic mindset, offers advice, and drives the development and implementation of security policies to ensure high performance and reliability of the company's infrastructure, including both cloud and on-premises systems, in support of DevOps efforts and general business and IT objectives.
Responsibilities
- Set strategic priorities for platform hardware, software, and network requirements in alignment with overall organizational goals
- Share best practices for monitoring system performance and lead teams responsible for system performance and troubleshooting of issues
- Leverage advanced techniques and methods for optimizing DevOps tools (such as API Gateway and Terraform) to support processes as per organizational needs
- Foster a culture of collaboration among cross-functional teams to ensure new features and services are brought into production
- Leverage a strategic mindset to oversee the execution and maintenance of automated and orchestrated fulfillment mechanisms to optimize delivery of key platform services
- Lead cross-team collaboration and workstreams to ensure smooth operations
- Identify potential security threats, mitigate them proactively, and set standards for strict security compliance
- Drive and oversee the development and implementation of security and resiliency policies, standards, and procedures in line with organizational strategic priorities and the latest industry regulations
- Evaluate high-level design documentation, share feedback, and provide recommendations
- Interpret results and present findings from technical research regarding user requests for new/modified systems or severe problem resolution
- Foster strong relationships with vendors to create mutually beneficial opportunities for the organization
Qualifications
Education
- Bachelor's degree in Computer Science or a related field. In lieu of a degree, at least 12 years of experience in the role of Platform Engineer or a related position
Preferred
- Azure Solutions Architect, Certified Kubernetes Administrator (CKA), or AWS Certified DevOps Engineer certification
Knowledge & Experience
- 6-8 years of experience in platform engineering and infrastructure components
- Expertise in automated interaction with physical infrastructure
- Proven track record in managing and maintaining cloud-based environments such as AWS, GCP, or Azure
- Expertise in monitoring and log management tools such as Nagios, Splunk, and ELK
- Proven track record in developing and implementing backup and disaster recovery strategies
- Expertise in scripting languages such as Python, PowerShell, and Bash
- Expertise in Infrastructure as Code (IaC) and configuration management tools such as Ansible, Chef, and Terraform
Technical Skills
- Programming Languages
- Cloud Technologies & Platforms
- Continuous Integration and Continuous Deployment (CI/CD)
- DevOps Methodology
- Software Packaging and Deployment Procedures
- Operating Systems
- Network Operations, Configuration & Services
- Containerization
- Monitoring and Logging
- Integration Technology
Preferred
- 3 years of hands-on experience with containerization technologies, including Docker and Kubernetes, and with Rancher for orchestration and cluster management
- Proven expertise in maintaining platform-level runtimes, including ingress controllers (e.g., NGINX), TLS certificate lifecycle management (e.g., cert-manager), and secrets management integrations (e.g., HashiCorp Vault)
- Experience deploying and managing observability tools such as Sysdig for monitoring and CVE scanning, Fluentd for log forwarding, and Elasticsearch for log analysis
- Familiarity with GitOps practices and tools, particularly ArgoCD, to support continuous deployment workflows
- Strong communication skills with the ability to convey complex information to non-technical stakeholders and vice versa
- Ability to work with cross-functional teams, including architects and leadership, to define strategic roadmaps and drive progress on complex platform initiatives
- Experience supporting vendor engagements, including coordinating technical requirements, assisting with renewals, and maintaining productive ongoing relationships to ensure alignment with platform goals and service expectations