Important Information
Experience: 10 years
Job Mode: Full-time
Work Mode: Remote
ID: 20192
Job Summary
We are seeking a Senior AWS Infrastructure & Platform Engineer with deep expertise in designing, optimizing, and operating cloud infrastructure on AWS. This role is responsible for improving the reliability, performance, and cost-efficiency of our AWS environments while enabling engineering teams to deploy faster and with greater confidence. The ideal candidate combines hands-on AWS architecture skills with a strategic mindset for infrastructure optimization and a track record of measurable improvements in system performance, deployment velocity, and cloud spend.
Responsibilities and Duties
AWS Infrastructure Strategy & Cost Optimization
- AWS Platform Strategy: Define and implement the strategic vision for LendKeys AWS infrastructure, ensuring alignment with business goals while continuously optimizing for performance, reliability, and cost.
- Cost Optimization: Proactively analyze and reduce AWS spend using Cost Explorer, Trusted Advisor, Compute Optimizer, and custom reporting. Implement rightsizing, Reserved Instance/Savings Plan strategies, spot instance utilization, and architectural efficiencies to drive measurable cost reductions.
- Infrastructure Efficiency: Identify and eliminate waste across compute, storage, networking, and data transfer. Establish tagging strategies and cost allocation models to drive accountability across teams.
Reliability & Performance Engineering
- System Reliability: Design and implement highly available, fault-tolerant architectures across multiple AWS availability zones. Drive improvements to meet and exceed 99.9% SLA through proactive capacity planning, auto-scaling, and chaos engineering practices.
- Performance Optimization: Continuously profile, benchmark, and optimize system performance across compute, networking, storage, and database layers. Reduce latency, improve throughput, and eliminate bottlenecks.
- Disaster Recovery: Design, document, and regularly test disaster recovery procedures to ensure business continuity across all critical systems.
Deployment Speed & Developer Experience
- CI/CD Pipeline Optimization: Design, maintain, and continuously improve CI/CD pipelines to minimize build times, reduce deployment friction, and enable engineering teams to release frequently and safely.
- Deployment Strategies: Implement and manage blue/green, canary, and rolling deployment patterns to minimize downtime and reduce deployment risk.
- Developer Self-Service: Build internal tooling, templates, and self-service capabilities that enable engineering teams to independently provision resources, deploy services, and troubleshoot issues within established guardrails.
- Infrastructure as Code (IaC): Champion Terraform best practices, including module development, state management, code review workflows, and automated drift detection, to ensure all infrastructure is version-controlled and reproducible.
Monitoring, Observability & Incident Management
- Observability: Implement and maintain comprehensive monitoring, logging, and tracing solutions (e.g., CloudWatch, New Relic, Prometheus/Grafana, OpenTelemetry) to provide full-stack visibility into system health and performance.
- SLI/SLO Framework: Define and instrument Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to create data-driven reliability targets and alerting.
- Incident Response: Lead a rotational on-call schedule. Drive blameless post-incident reviews and implement preventive measures to reduce incident recurrence.
What We Are Looking For
- A hands-on AWS infrastructure engineer who architects and optimizes cloud environments for reliability, speed, and cost-efficiency.
- A systems thinker who proactively identifies bottlenecks, whether in deployment pipelines, cloud costs, or system performance, and drives measurable improvements.
- A builder who creates tools, automation, and self-service platforms that multiply the effectiveness of engineering teams.
- A leader who thrives in collaborative environments and can independently drive complex projects from concept to production.
- Someone who stays current with AWS services, cloud-native architecture patterns, and infrastructure best practices.
- A professional who brings a FinOps mindset, balancing cost, performance, and reliability in every infrastructure decision.
Essential Qualifications
10 years in the technology field, with at least 3-5 years focused specifically on AWS infrastructure engineering, platform engineering, or cloud/site reliability operations.
Proven track record of managing and optimizing large-scale AWS environments in production with demonstrated cost savings or performance improvements.
AWS Cloud Services & Architecture
- Deep hands-on experience with core AWS services: ECS, EC2, EKS, EBS, S3, RDS/Aurora, Lambda, CloudFront, Route 53, ALB/NLB, and VPC networking.
- Demonstrated ability to design highly available, fault-tolerant, and auto-scaling architectures across multiple availability zones.
- Hands-on experience with AWS cost management tools (Cost Explorer, Trusted Advisor, Compute Optimizer) and proven implementation of cost reduction strategies, including rightsizing, Reserved Instances, Savings Plans, and spot instance optimization.
- Working knowledge of the AWS Well-Architected Framework, particularly the Cost Optimization, Reliability, and Performance Efficiency pillars.
Container Orchestration
- Experience managing and scaling Kubernetes clusters (EKS) in production, including cluster lifecycle management, node group optimization, and workload right-sizing.
- Understanding of Kubernetes networking (CNI, service mesh), resource management (requests/limits, HPA/VPA), and container security best practices.
Infrastructure as Code & Automation
- Strong proficiency in Terraform for building, managing, and versioning AWS infrastructure, including module authoring, remote state management, and workspace strategies.
- Experience with configuration management tools (Ansible, Chef, or equivalent) for operational automation.
- Proficiency in scripting languages (Python, Bash, or Go) for building infrastructure automation, tooling, and custom integrations.
CI/CD & Deployment Engineering
- Hands-on experience designing and optimizing CI/CD pipelines (e.g., GitHub Actions, Jenkins, or AWS CodePipeline) for fast, safe, and repeatable deployments.
- Experience implementing and managing deployment strategies such as blue/green, canary, and rolling deployments to minimize risk and downtime.
Monitoring & Observability
- Experience implementing and managing monitoring, logging, and alerting solutions (CloudWatch, New Relic, Prometheus/Grafana, ELK/OpenSearch).
- Demonstrated ability to define SLIs/SLOs and build dashboards and alerts that drive proactive issue detection and resolution.
Documentation & Communication
- Strong capabilities in creating and maintaining comprehensive technical documentation, including runbooks, architecture decision records (ADRs), and operational playbooks.
- Ability to communicate complex technical concepts clearly to both engineering teams and non-technical business stakeholders.
Preferred Qualifications
- AWS certifications such as Solutions Architect Professional, DevOps Engineer Professional, or SysOps Administrator.
- Experience with FinOps practices including building cost visibility dashboards and driving cost accountability across engineering teams.
- Experience with GitOps workflows and tools (ArgoCD, Flux).
- Familiarity with service mesh technologies or API gateway management.
- Experience migrating or re-architecting legacy workloads to cloud-native patterns on AWS.
About Encora
Encora is a global company that offers Software and Digital Engineering solutions. Our practices include Cloud Services, Product Engineering & Application Modernization, Data & Analytics, Digital Experience & Design Services, DevSecOps, Cybersecurity, Quality Engineering, and AI & LLM Engineering, among others.
At Encora, we hire professionals based solely on their skills and do not discriminate based on age, disability, religion, gender, sexual orientation, socioeconomic status, or nationality.
Required Experience:
Senior IC