Sr Data Engineer
Philadelphia, PA - USA
Job Summary
Position Overview
- Enterprise data leadership: Help define and mature data integration, data consolidation, MDM integration, and data platform design patterns across IntegriChain.
- Hands-on Snowflake engineering: Design, build, optimize, and operate Snowflake data models, pipelines, stored procedures, and high-volume data processing patterns.
- MDM/Reltio enablement: Partner with MDM and Product teams to support HCO Master data ingestion, outbound extracts, cross-reference data, golden record consumption, survivorship outputs, and downstream publishing patterns.
- Cross-functional partnership: Work with Product, Engineering, MDM, Data Science, DevOps, Security, and business stakeholders to align data solutions to enterprise priorities.
- Modern ELT execution: Use dbt or similar ELT tooling to develop reliable, maintainable, testable, and observable data pipelines.
- Cost and performance ownership: Drive Snowflake performance tuning, warehouse sizing, workload management, cost tracking, and cost optimization practices.
Key Responsibilities
Data Strategy, Consolidation, and Integration
- Partner with Data Science leadership to rationalize and consolidate the enterprise data landscape across products, platforms, and acquired capabilities.
- Define reusable data integration patterns for batch, micro-batch, near-real-time, and application-to-application data exchange.
- Collaborate with cross-functional teams to understand business data needs, source-system realities, and enterprise application integration requirements.
- Design scalable patterns for ingesting, transforming, mastering, and publishing data across operational and analytical use cases.
- Help establish standards for data contracts, schema evolution, data quality, lineage, and data ownership.
MDM / Reltio Data Engineering Enablement
- Design and build data pipelines that load source data into Reltio MDM and extract mastered outputs from Reltio for downstream Snowflake, analytics, AI, and operational use cases.
- Partner with MDM configuration and Product Management teams to translate HCO mastering requirements into data pipeline, mapping, validation, reconciliation, and publishing patterns.
- Work with Reltio APIs, exports, crosswalks/XREFs, event-based integration patterns, and bulk load/extract mechanisms as needed to support inbound and outbound data flows.
- Engineer integration patterns for HCO Master data, including party/entity, address, identifier, hierarchy, relationship, match/merge, survivorship, and golden record outputs.
- Support source ingestion and reference data integration involving datasets such as HIN, DEA, NPI, NCPDP, 340B/PHS, channel outlet data, customer/account data, and other life sciences master/reference sources.
- Develop validation and reconciliation processes to compare source data, Reltio mastered data, Snowflake curated data, and downstream consumption layers.
- Help operationalize MDM outputs for business-facing data products, semantic models, reporting tables, APIs, and AI-ready datasets.
Snowflake Platform Engineering and Optimization
- Design Snowflake database, schema, table, view, and semantic-layer patterns that support performance, governance, and maintainability.
- Optimize Snowflake workloads using clustering, micro-partition awareness, warehouse sizing, query profiling, caching behavior, and workload isolation.
- Implement Snowflake cost tracking and optimization practices, including warehouse utilization monitoring, inefficient query identification, and cost allocation by workload, team, or use case.
- Build scalable SQL and Snowflake stored procedure logic for large-volume data processing and analytical workloads.
- Apply secure Snowflake design patterns, including RBAC, masking, access isolation, auditing, and environment separation.
ETL/ELT, dbt, Python, and Data Pipeline Development
- Design, build, and maintain reliable ELT pipelines using dbt or comparable modern data transformation tooling.
- Develop Python-based automation for API integration, file processing, metadata management, validation, orchestration support, and operational tooling.
- Develop modular, tested, and reusable transformation models for raw, curated, mastered, and business-ready data layers.
- Implement automated data quality checks, source freshness checks, reconciliation, logging, and exception-handling patterns.
- Build orchestration-ready pipelines that support dependency management, restartability, incremental loads, and operational monitoring.
- Collaborate with DevOps/SRE teams on CI/CD, deployment automation, environment promotion, and operational runbooks for data pipelines.
Data Modeling and Big Data Processing
- Spearhead logical and physical data modeling efforts for enterprise analytical, operational, MDM, and AI-ready datasets.
- Design models that balance normalization, dimensional modeling, medallion/lakehouse concepts, and application-specific consumption needs.
- Create denormalized reporting and semantic-model-ready structures that simplify business consumption and reduce ambiguity for AI/LLM use cases.
- Process and optimize large data volumes in Snowflake using efficient SQL, PL/SQL-style procedural logic, Snowflake Scripting, and performance-aware design.
- Create reusable patterns for historical tracking, snapshots, audit columns, data versioning, and lifecycle management.
- Ensure data models support downstream BI, AI/ML, semantic models, data apps, MDM Explorer/Entity 360 use cases, and enterprise reporting.
Qualifications:
Required Skills and Experience
- 10 years of experience in data engineering, database engineering, analytics engineering, or data platform development in production environments.
- Strong hands-on experience with Snowflake, including architecture, performance tuning, security design, cost optimization, and cost tracking.
- Thorough understanding of Snowflake design patterns for analytical workloads, high-volume data processing, data sharing, and multi-environment deployments.
- Hands-on experience with ETL/ELT tools; dbt experience is strongly preferred.
- Strong SQL and PL/SQL-style development experience, including complex transformations, stored procedures, performance tuning, and large-scale data processing.
- Python experience for data automation, API integration, file handling, data validation, metadata processing, or operational tooling.
- Experience designing and implementing enterprise data models, curated data layers, semantic layers, and reusable data products.
- Experience with data integration patterns across enterprise applications, APIs, files, cloud storage, operational systems, MDM platforms, and analytical platforms.
- Working understanding of Master Data Management concepts such as golden records, crosswalks/XREFs, match/merge, survivorship, hierarchies, entity relationships, stewardship, and data quality.
- Experience partnering with MDM, Product, or business teams to translate mastering requirements into source-to-target mappings, transformation logic, validations, and downstream data consumption patterns.
- Ability to work directly with cross-functional stakeholders to gather requirements, explain design tradeoffs, and drive alignment.
- Experience implementing data quality, lineage, auditability, observability, and operational monitoring within data pipelines.
- Comfortable operating as a hands-on senior individual contributor who can also influence strategy and engineering standards.
Preferred Experience
- Experience with Reltio MDM, including inbound data loads, outbound exports, Reltio APIs, crosswalks, match/merge outputs, survivorship outputs, and operational troubleshooting.
- Experience in life sciences, healthcare, pharma commercialization, HCO/HCP mastering, patient data, channel data, customer master, or commercial data platforms.
- Experience with life sciences reference and commercial datasets such as HIN, DEA, NPI, NCPDP, 340B/PHS, chargebacks, gross-to-net, government pricing, PBR, or UBR.
- Experience with orchestration frameworks such as Airflow, Dagster, dbt Cloud jobs, cloud-native schedulers, or similar tools.
- Experience with cloud platforms and storage patterns, especially Azure or AWS object storage integrated with Snowflake.
- Exposure to AI-ready data architecture, feature stores, ML datasets, semantic models, or AI/ML pipeline enablement.
- Experience with Terraform, CI/CD, Git-based development, and infrastructure-as-code practices.
- Snowflake SnowPro, Reltio, dbt, or equivalent cloud/data engineering certifications.
Additional Information:
What does IntegriChain have to offer?
- Mission driven: Work with the purpose of helping to improve patients' lives!
- Excellent and affordable medical benefits and non-medical perks, including Flexible Paid Time Off and much more!
- Robust Learning & Development opportunities, including over 700 development courses free to all employees.
IntegriChain is committed to equal treatment and opportunity in all aspects of recruitment, selection, and employment without regard to race, color, religion, national origin, ethnicity, age, sex, marital status, physical or mental disability, gender identity, sexual orientation, veteran or military status, or any other category protected under the law. IntegriChain is an equal opportunity employer, committed to creating a community of inclusion and an environment free from discrimination, harassment, and retaliation.
Our policy on visa sponsorship for US-based positions: Applicants for employment in the US must have valid work authorization that does not now and/or will not in the future require sponsorship of a visa for employment authorization in the US by IntegriChain.
Remote Work:
Yes
Employment Type:
Full-time
About Company
IntegriChain is the data and application backbone for market access departments of Life Sciences manufacturers. We deliver the data, the applications, and the business process infrastructure for patient access and therapy commercialization. More than 250 manufacturers rely on our ICyt ...