Seeking a highly skilled Senior Data Architect / Platform Data Engineer to support the design and implementation of a secure, scalable Integrated Data Hub (IDH) leveraging Databricks and the Medallion Architecture. This role will focus on designing security and access controls, data modeling, metadata management, and high-volume data processing across the bronze, silver, and gold layers. Experience with FHIR data standards at the gold layer is a strong asset.
Requirements
Must Haves:
- Hands-on experience with Databricks (including Unity Catalog, Delta Lake, Auto Loader, and PySpark)
- Knowledge of Medallion Architecture patterns in Databricks, including designing and supporting data pipelines across Bronze/Silver/Gold layers (see the ingestion sketch after this list)
- Experience conducting data profiling to identify structure, completeness, and data quality issues
- Experience in Azure cloud data architecture
- Extensive experience designing and managing ETL pipelines, including Change Data Capture (CDC)
- Experience implementing role-based access control (RBAC)
- Demonstrated ability to lead data platform initiatives from requirements gathering through design, development, and deployment
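To make the Databricks expectations above concrete, here is a minimal Auto Loader ingestion sketch that lands raw files in a Bronze Delta table. All paths, the `idh` catalog, and the `patients` dataset are hypothetical placeholders, and the snippet assumes a Databricks runtime (the `cloudFiles` source is Databricks-specific).

```python
# Minimal Auto Loader ingestion into a Bronze Delta table (PySpark).
# Paths and catalog/schema/table names are illustrative placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

raw = (
    spark.readStream.format("cloudFiles")                # Auto Loader source
    .option("cloudFiles.format", "json")                 # incoming file format
    .option("cloudFiles.schemaLocation", "/mnt/idh/_schemas/patients")  # schema tracking
    .load("/mnt/idh/landing/patients/")                  # landing-zone path
)

bronze = (
    raw.withColumn("_ingested_at", F.current_timestamp())            # audit column
       .withColumn("_source_file", F.col("_metadata.file_path"))     # file lineage
)

(
    bronze.writeStream
    .option("checkpointLocation", "/mnt/idh/_checkpoints/bronze_patients")
    .trigger(availableNow=True)                          # incremental batch-style run
    .toTable("idh.bronze.patients")                      # Unity Catalog three-level name
)
```

The `availableNow` trigger lets the same streaming pipeline run as a scheduled incremental batch job, a common pattern for Bronze-layer ingestion.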
Technical Knowledge (60%)
- Expert knowledge of data warehouse design methodologies, including Delta Lake and the Medallion Architecture, with a deep understanding of Delta Lake optimizations.
- Proficient in Azure Data Lake, Delta Lake, Azure DevOps, Git, and API testing tools such as Postman.
- Strong proficiency in relational databases, with expertise in writing, tuning, and debugging complex SQL queries.
- Experienced in integrating and managing REST APIs for downstream systems such as MDM and FHIR services.
- Skilled in designing and optimizing ETL/ELT pipelines in Databricks using PySpark, SQL, and Delta Live Tables, including implementing Change Data Capture in batch and streaming modes (see the CDC sketch after this list).
- Experienced in metadata-driven ingestion and transformation pipelines with Python and PySpark.
- Familiar with Unity Catalog structure and management, including configuring fine-grained permissions and workspace ACLs for secure data governance (see the permissions sketch after this list).
- Ability to lead logical and physical data modeling across lakehouse layers (Bronze, Silver, Gold) and define business and technical metadata.
- Experienced with Databricks job and all-purpose cluster configuration and optimization, and with DevOps practices such as notebook versioning and environment management.
- Proficient in assessing and profiling large volumes of data to ensure data quality and support business rules.
- Able to collaborate effectively with ETL developers and business analysts to translate user stories into technical pipeline logic.
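As a concrete illustration of the CDC expectation, below is a minimal batch upsert sketch that applies a change feed from Bronze into Silver with a Delta Lake MERGE. The table names, the `patient_id` key, and the `op`/`updated_at` change-tracking columns are illustrative assumptions, not a prescribed schema.

```python
# Minimal batch CDC upsert from Bronze into Silver via Delta MERGE.
# Table names and the op/updated_at columns are illustrative assumptions.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.getOrCreate()

changes = spark.read.table("idh.bronze.patients")

# Keep only the latest change per key so the MERGE source has unique keys.
latest = (
    changes.withColumn(
        "_rn",
        F.row_number().over(
            Window.partitionBy("patient_id").orderBy(F.col("updated_at").desc())
        ),
    )
    .filter("_rn = 1")
    .drop("_rn")
)

silver = DeltaTable.forName(spark, "idh.silver.patients")

(
    silver.alias("t")
    .merge(latest.alias("s"), "t.patient_id = s.patient_id")
    .whenMatchedDelete(condition="s.op = 'D'")         # apply deletes
    .whenMatchedUpdateAll(condition="s.op != 'D'")     # apply updates
    .whenNotMatchedInsertAll(condition="s.op != 'D'")  # apply inserts
    .execute()
)
```

Deduplicating the source before the MERGE avoids the multiple-matches error Delta raises when two change rows hit the same target row.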
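Similarly, a sketch of Unity Catalog fine-grained permissions, issued as SQL through PySpark for consistency with the other examples. The catalog, schemas, group names, and the `phn` column are placeholders, and the row-filter/mask feature assumes a Unity Catalog-enabled workspace.

```python
# Illustrative Unity Catalog grants enforcing layer-level access control.
# Catalog, schema, group, and column names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Engineers read/write Silver; analysts see only curated Gold.
spark.sql("GRANT USE CATALOG ON CATALOG idh TO `data_engineers`")
spark.sql("GRANT USE SCHEMA, SELECT, MODIFY ON SCHEMA idh.silver TO `data_engineers`")
spark.sql("GRANT USE CATALOG ON CATALOG idh TO `analysts`")
spark.sql("GRANT USE SCHEMA, SELECT ON SCHEMA idh.gold TO `analysts`")

# Column masks give finer-grained control where needed, e.g. masking a
# personal health number for anyone outside a hypothetical phi_readers group.
spark.sql("""
  CREATE OR REPLACE FUNCTION idh.gold.mask_phn(phn STRING)
  RETURNS STRING
  RETURN CASE WHEN is_account_group_member('phi_readers') THEN phn ELSE '***' END
""")
spark.sql("ALTER TABLE idh.gold.patients ALTER COLUMN phn SET MASK idh.gold.mask_phn")
```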
General Skills (40%)
- 5 years of experience in data engineering, ideally in cloud data lake environments
- Ability to translate business requirements into scalable data architectures, data models, and governance frameworks
- Able to serve as technical advisor during sprint planning and backlog grooming.
- Skilled in conducting data discovery, profiling, and quality assessments to guide architecture and modeling decisions (see the profiling sketch after this list)
- Capable of conducting performance diagnostics and root cause analysis across multiple layers (database, ETL, infrastructure)
- Strong communication skills for working with business stakeholders, developers, and executives
- Passion for mentoring, training, and establishing reusable frameworks and best practices
- Experience with agile practices, including sprints, user stories, and iterative development, particularly in agile data environments
- Experience assembling requirements into coherent user stories and use cases, managing and refining Product Backlog Items, and communicating changes to the project manager/Team Lead
- Analyze current and future data needs, data flows, and data governance practices to support enterprise data strategies
- Lead data discovery efforts and participate in the design of data models and data integration solutions
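As a small example of the profiling work referenced above, here is a minimal PySpark pass that computes a row count plus per-column null and distinct counts. The table name is an illustrative placeholder, and any thresholds for flagging quality issues would be project-specific.

```python
# Minimal data profiling sketch: row count plus per-column null and
# distinct counts. The table name is an illustrative placeholder.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.read.table("idh.bronze.patients")

print(f"rows: {df.count()}")

exprs = []
for c in df.columns:
    exprs.append(F.count(F.when(F.col(c).isNull(), 1)).alias(f"{c}_nulls"))
    exprs.append(F.countDistinct(F.col(c)).alias(f"{c}_distinct"))

df.agg(*exprs).show(truncate=False)
```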