Stefanini Group is hiring!
Stefanini is looking for a Platform Engineer in Dearborn, MI (Onsite)
For quick apply, please reach out to Adil Khan at/
We are looking for a Platform Engineer to help product teams deliver securely, reliably, and quickly. This role leans toward cloud infrastructure, DevOps, and Site Reliability Engineering (SRE), with strong software development skills.
Responsibilities
- Design and Operate Cloud Infrastructure: Build and manage cloud platforms, including networking, compute, Kubernetes, CI/CD, secrets, and identity.
- Define Reliability Metrics: Establish and enhance SLIs, SLOs, and error budgets.
- Implement Observability: Set up metrics, logs, and traces with actionable alerts.
- Automate Workflows: Develop self-service workflows (e.g., infrastructure as code, GitOps, CI/CD pipelines) to reduce manual effort.
- Enhance Security & Compliance: Drive least-privilege access, secure defaults, and policy-as-code.
- Incident Management: Participate in on-call rotations, handle incidents, lead postmortems, and deliver fixes.
- Collaborate with Teams: Partner with application teams to improve deployability, resilience, and cost efficiency.
Experience Required
- Managed production-grade infrastructure on major cloud platforms like GCP. Designed multi-region GCP networks using VPCs, subnets, firewalls, and NAT, managed with Terraform and GitOps.
- Strong understanding of networking, IAM boundaries, and tradeoffs between managed services and self-hosted solutions.
- Built production-grade Python tools or automation with structured, testable, and maintainable code. Automated tasks like querying GCP Asset Inventory, generating IAM reports, and creating tickets, with retry and error handling.
- Operated GCP services like Cloud Run, Workload Identity, Secret Manager, and VPC Service Controls. Applied GCP-specific reliability and security patterns with hands-on experience.
- Supported internal developer teams by handling on-call rotations, resolving incidents, and delivering systemic fixes.
- Managed production Kubernetes clusters, performed upgrades, configured policies, and debugged issues. Configured HPA/VPA for autoscaling and troubleshot pod scheduling and service mesh connectivity. Strong understanding of Kubernetes control planes for debugging and management.
Experience Preferred
- Wrote Go for platform tooling or infrastructure automation. Developed Kubernetes admission webhooks to enforce security policies or CLI tools for secret management. Produced idiomatic Go with proper error handling, context propagation, and unit tests.
- Contributed to or led the design of multi-team or multi-service platform architectures. Designed shared service networks (hub-and-spoke models), CI/CD templates, and service mesh configurations. Documented architecture patterns adopted by teams and articulated tradeoffs in design reviews.
- Implemented SRE practices, including SLIs, SLOs, and error budgets. Configured SLO-based alerting in Prometheus/Grafana and used burn rate alerts for incident management.
Required Skills
- Cloud Platforms: Experience managing production-grade systems on GCP, AWS, or Azure with an SRE mindset.
- Linux & Networking: Strong fundamentals in Linux, distributed systems, and debugging production issues.
- Infrastructure as Code: Skilled in tools like Terraform, Helm, and Kustomize, and in GitOps practices.
- Containers & Orchestration: Proficient in Docker, Kubernetes, and modern CI/CD tools.
- Programming: Experience with languages like Python, Go, Java, or TypeScript for building tools and automation.
- Communication: Clear communicator with effective incident leadership and a customer-first approach.
Preferred Skills
- SLI/SLO Expertise: Experience defining SLIs/SLOs and implementing SLO-based alerting and dashboards.
- Observability Platforms: Familiarity with Prometheus/Grafana, OpenTelemetry, and centralized logging.
- Security Practices: Knowledge of policy-as-code, supply chain security, SBOMs, and artifact signing.
- Standardized Solutions: Experience creating reusable golden paths (e.g., container images, templates, pipelines).
- Cost Optimization: Skilled in FinOps practices, capacity planning, and multi-tenant platform controls.
- Go: Proficient in writing idiomatic Go for platform tooling or infrastructure automation.
- Cloud Architecture: Experience designing multi-service or multi-team platform architectures.
- Reliability Engineering: Practical implementation of SRE practices, including SLIs, SLOs, error budgets, and alerting.
Education Required
**Listed salary ranges may vary based on experience, qualifications, and local market. Also, some positions may include bonuses or other incentives.**
Stefanini takes pride in hiring top talent and developing relationships with our future employees. Our talent acquisition teams will never make an offer of employment without having a phone conversation with you. Those conversations will involve a description of the job for which you have applied. We will also speak with you about the process, including interviews and job offers.
About Stefanini Group
The Stefanini Group is a global provider of offshore, onshore, and nearshore outsourcing, IT digital consulting, systems integration, application, and strategic staffing services to Fortune 1000 enterprises around the world. We have a presence across the Americas, Europe, Africa, and Asia, and serve more than four hundred clients across a broad spectrum of markets, including financial services, manufacturing, telecommunications, chemical, services, technology, public sector, and utilities. Stefanini is a CMM Level 5 IT consulting company with a global presence.
#LI-AK3
#LI-ONSITE
Required Experience:
IC