Platform Engineer SRE
Job Summary
About the Role
We are looking for a Platform Engineer with a strong SRE/DevOps mindset to join a sustainability-focused engineering team. This role is centered on operational excellence, not feature development. You will own production troubleshooting, performance tuning, and platform reliability improvements, working on the complex issues that Tier 1, 2, and 3 support teams escalate when they cannot resolve them independently.
C# and SQL Server expertise are non-negotiable. Executive-level communication is equally critical, as this role requires clear, confident interaction with senior stakeholders across engineering, product, and operations.
Key Responsibilities
- Troubleshoot and resolve complex production issues spanning the application, database, and infrastructure layers.
- Diagnose and resolve software and SQL performance problems escalated by support teams.
- Optimize SQL Server performance through query tuning, indexing strategies, and execution plan analysis.
- Define and track SLIs, SLOs, and error budgets to maintain platform reliability standards.
- Drive platform improvements focused on scalability, reliability, and operational health.
- Build and maintain runbooks, playbooks, and observability frameworks to reduce mean time to resolution (MTTR).
- Monitor system performance and proactively identify risk areas using SolarWinds DPA, Splunk, and New Relic.
- Champion automation for toil reduction, capacity planning, and self-healing infrastructure patterns using Terraform and CI/CD pipelines.
- Manage and optimize containerized workloads using Docker and Kubernetes in Azure cloud environments.
- Lead incident response, root cause analysis, and post-incident reviews.
- Collaborate with cross-functional teams and communicate findings clearly to executive stakeholders.
Required Skills
- 8 years of software engineering experience with heavy production support and SRE/platform engineering exposure.
- Expert proficiency in C# and .NET Core 8.0.
- Deep expertise in Microsoft SQL Server, including stored procedures, query optimization, and execution plan tuning.
- Hands-on experience with Azure cloud services, including Azure Functions, App Services, Service Bus, and AKS (Azure Kubernetes Service).
- Strong experience with containerization and orchestration using Docker and Kubernetes.
- Proficiency in infrastructure-as-code using Terraform for cloud provisioning and environment management.
- Hands-on experience building and maintaining CI/CD pipelines using tools such as Azure DevOps or GitHub Actions.
- Proven experience defining and operating against SLOs, error budgets, and reliability targets.
- Experience with SolarWinds DPA, Splunk, and New Relic.
- Strong understanding of microservices, distributed systems, and fault-tolerant architecture patterns.
- Excellent verbal and written communication skills, including the ability to present to executive audiences.
Required Experience:
IC