About the Role
You'll build, operate, and evolve the end-to-end data platform that powers analytics, automation, and AI use cases. This is a hands-on role spanning cloud infrastructure, ingestion/ETL, and data modeling across a Medallion (bronze/silver/gold) architecture. You'll partner directly with stakeholders to turn messy source data into trusted datasets, metrics, and data products.
Who you are
- Pragmatic Builder: You write clear SQL/Python, ship durable systems, and leave pipelines more reliable than you found them.
- Data-Savvy Generalist: You're comfortable moving up and down the stack (cloud, pipelines, warehousing, and BI) and picking the right tool for the job.
- Fundamentals-First & Customer-Centric: You apply strong data modeling principles and optimize the analyst/stakeholder experience through consistent semantics and trustworthy reporting.
- Low-Ego, High-Ownership Teammate: You take responsibility for outcomes, seek feedback openly, and will roll up your sleeves to move work across the finish line.
- High-Energy Communicator: You're comfortable presenting, facilitating discussions, and getting in front of stakeholders to drive clarity and alignment.
- Self-Starter: You unblock yourself, drive decisions, and follow through on commitments; you bring a strong work ethic and invest in continuous learning.
What you will do
- Ingestion & ETL: Build reusable ingestion and ETL frameworks (Python and Spark) for APIs, databases, and un/semi-structured sources; handle JSON/Parquet and evolving schemas (a minimal sketch follows this list).
- Medallion Architecture: Own and evolve Medallion layers (bronze/silver/gold) for key domains with clear lineage, metadata, and ownership.
- Data Modeling & Marts: Design dimensional models and gold marts for core business metrics; ensure consistent grain and definitions.
- Analytics Enablement: Maintain semantic layers and partner on BI dashboards (Sigma or similar) so metrics are certified and self-serve.
- Reliability & Observability: Implement tests, freshness/volume monitoring, alerting, and runbooks; perform incident response and root-cause analysis (RCA) for data issues.
- Warehouse & Performance: Administer and tune the cloud data warehouse (Snowflake or similar): compute sizing, permissions, query performance, and cost controls.
- Standardization & Automation: Build paved-road patterns (templates, operators, CI checks) and automate repetitive tasks to boost developer productivity.
- AI Readiness: Prepare curated datasets for AI/ML/LLM use cases (feature sets, embeddings prep) with appropriate governance.
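For illustration only (not part of the posting): a minimal sketch of the kind of reusable ingestion pattern the first bullet describes, landing raw JSON from an API into a date-partitioned bronze Parquet layer while tolerating schema drift. The endpoint, bucket path, and helper name are all hypothetical.

```python
import json
from datetime import datetime, timezone

import pandas as pd
import requests

# Hypothetical endpoint and landing location -- illustrative only.
# (Writing to s3:// paths additionally requires the s3fs package.)
API_URL = "https://api.example.com/v1/orders"
BRONZE_PATH = "s3://example-lake/bronze/orders"


def land_to_bronze(api_url: str, bronze_path: str) -> str:
    """Pull raw JSON and land it as Parquet, preserving the payload."""
    records = requests.get(api_url, timeout=30).json()

    # json_normalize flattens nested objects and simply adds columns for
    # any new keys, so upstream schema drift widens the table instead of
    # breaking the load.
    df = pd.json_normalize(records)

    # Stamp ingestion metadata and keep the raw payload for replay/lineage.
    ingested_at = datetime.now(timezone.utc)
    df["_ingested_at"] = ingested_at
    df["_raw_payload"] = [json.dumps(r) for r in records]

    # Date-partitioned output: a bad run can be re-landed idempotently by
    # overwriting a single partition.
    out = f"{bronze_path}/load_date={ingested_at:%Y-%m-%d}/part-0.parquet"
    df.to_parquet(out, index=False)
    return out
```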
Must Haves
- 3-5 years of hands-on data engineering experience; strong SQL and Python; experience building data pipelines end-to-end in production.
- Strong cloud fundamentals (AWS preferred; other major clouds acceptable): object storage, IAM concepts, logging/monitoring, and managed compute.
- Experience building and operating production ETL pipelines with reliability basics: retries, backfills, idempotency, incremental processing patterns (e.g., SCDs, late-arriving data), and clear operational ownership (docs/runbooks); see the merge sketch after this list.
- Solid understanding of Medallion/layered architecture concepts (bronze/silver/gold or equivalent) and experience working within each layer.
- Strong data modeling fundamentals (dimensional modeling/star schema): can define grain, build facts/dimensions, and support consistent metrics.
- Working experience in a modern cloud data warehouse (Snowflake or similar): can write performant SQL and understand core warehouse concepts.
- Hands-on dbt experience: building and maintaining models, writing core tests (freshness/uniqueness/referential integrity), and contributing to documentation; ability to work in an established dbt project.
- Experience with analytics/BI tooling (Sigma, Looker, Tableau, etc.) and semantic layer concepts; ability to support stakeholders and troubleshoot issues end-to-end.
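As a concrete (hypothetical) illustration of the idempotency and late-arriving-data patterns named above: an incremental merge keyed on the natural key, deduplicated per batch, with a trailing grace window so late rows are still picked up. Table and column names are invented; the SQL assumes a Snowflake-style warehouse and the pyformat parameter binding used by snowflake-connector-python.

```python
# Hypothetical table/column names; targets a Snowflake-style warehouse.
MERGE_SQL = """
merge into silver.orders as tgt
using (
    select order_id, customer_id, amount, updated_at
    from bronze.orders_stage
    -- trailing 3-day grace window picks up late-arriving rows that a
    -- hard watermark cutoff would miss
    where updated_at >= dateadd('day', -3, to_timestamp(%(watermark)s))
    -- deduplicate the batch to the latest version per natural key
    qualify row_number() over (
        partition by order_id order by updated_at desc) = 1
) as src
on tgt.order_id = src.order_id
when matched and src.updated_at > tgt.updated_at then update set
    customer_id = src.customer_id,
    amount      = src.amount,
    updated_at  = src.updated_at
when not matched then insert (order_id, customer_id, amount, updated_at)
    values (src.order_id, src.customer_id, src.amount, src.updated_at)
"""


def run_incremental(cursor, watermark: str) -> None:
    """Safe to re-run: merging the same batch twice leaves the same rows."""
    cursor.execute(MERGE_SQL, {"watermark": watermark})
```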
Nice to Have
- Snowflake administration depth: warehouse sizing and cost management, advanced performance tuning, clustering strategies, and designing RBAC models.
- Advanced governance & security patterns: masking policies, row-level security, and least-privilege frameworks as a primary implementer/owner.
- Strong Spark/PySpark proficiency: deep tuning/optimization and large-scale transformations.
- dbt platform-level ownership: CI/CD-based deployments, environment/promotion workflows, advanced macros/packages, and leading large refactors or establishing standards from scratch.
- Orchestration: Airflow/MWAA DAG design patterns, backfill strategies at scale, dependency management, and operational hardening (see the DAG sketch after this list).
- Sigma-specific depth: semantic layer/metrics layer architecture in Sigma, advanced dashboard standards, and organization-wide certified metrics rollout.
- Automation/iPaaS experience: Workato (or similar) for business integrations and operational workflows.
- Infrastructure-as-code: Terraform (or similar) for data/cloud infrastructure provisioning, environment management, and safe change rollout.
- Data observability & lineage tooling: OpenLineage/Monte Carlo-style patterns, automated lineage hooks, and anomaly detection systems.
- Lakehouse/unstructured patterns: Parquet/Iceberg, event/data contracts, and advanced handling of semi-/unstructured sources.
- AI/ML/LLM data workflows: feature stores, embeddings/RAG prep, and privacy-aware governance.
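For illustration (not part of the posting): a minimal Airflow DAG shaped for the backfill and retry patterns named in the orchestration bullet. The DAG id, ingest script, and dbt selector are invented.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="orders_daily",                  # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=True,           # lets Airflow backfill one run per missed day
    max_active_runs=1,      # serialize runs so backfills do not collide
    default_args={
        "retries": 3,
        "retry_delay": timedelta(minutes=10),
    },
) as dag:
    # Each run processes exactly one data interval, so re-running a date
    # reprocesses the same slice -- idempotent by construction.
    load_bronze = BashOperator(
        task_id="load_bronze",
        bash_command=(
            "python ingest.py "
            "--start {{ data_interval_start }} --end {{ data_interval_end }}"
        ),
    )
    build_silver = BashOperator(
        task_id="build_silver",
        bash_command="dbt build --select silver_orders",  # hypothetical selector
    )
    load_bronze >> build_silver
```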
#LI-EM4
Required Experience:
IC