Our Opportunity
We're looking for a Data Platform Reliability Engineer II to own reliability, deployments, and cost efficiency for a domain within Chewy's Enterprise Data Systems (EDS). You'll lead automation and adoption of the paved road, ensuring consistency, transparency, and scalability across Snowflake, dbt Cloud, AWS, and BI.
What You'll Do
- Own domain-level reliability: run on-call, lead sev-2/3 incidents, coordinate stakeholder communications, drive RCAs, and automate the top recovery actions to reduce MTTR.
- Define and track domain SLOs for freshness, completeness, and accuracy; deliver to SLAs through tests, monitors, and dashboards.
- Manage dbt/Snowflake deployments and environment promotion pipelines; implement pre-deploy checks, canary, automated rollback, and change-fail rate tracking.
- Lead Snowflake cost initiatives: credit allocation, budget tracking, optimization playbooks, and transparency reporting.
- Author and maintain Terraform modules for domain infrastructure; ship changes via CI/CD with plans, reviews, and rollback paths.
- Contribute to paved-road onboarding materials and guardrails; help new teams land with standard configurations and observability defaults.
- Build AI-assisted observability views for anomaly detection, drift, and warehouse optimization.
- Embed catalog and lineage coverage checks in deployments; enforce coverage thresholds while data stewards own certification and metric definitions.
- Improve runbooks, reduce operational toil, and mentor Level I engineers on best practices.
What You'll Need
- BA/BS in Computer Science, Engineering, Mathematics, or a related field.
- 3-5 years of experience in data engineering, DevOps, or platform reliability.
- Strong SQL and Python scripting skills.
- Hands-on experience with Snowflake, including credit monitoring, warehouse optimization, and performance tuning.
- Experience with dbt Cloud, AWS, and Terraform for infrastructure provisioning.
- Familiarity with CI/CD tools and environment promotion workflows.
- Excellent analytical and problem-solving skills; strong ownership mentality.
- Enthusiasm for automation, AI, observability, and continuous improvement.
Why You'll Love This Role
- Own platform reliability and cost optimization that impact every data team at Chewy.
- Use automation and infrastructure-as-code (Terraform) to scale operations sustainably.
- Enable smooth tool adoption and reduce time to value for new teams joining the paved road.
- See your efforts directly improve reliability, predictability, and cost savings across EDS.
Required Experience:
IC