Position: GCP Data Engineer (Healthcare Background Required)
Location: Anywhere in the USA
Employment Type: Full-time
Summary: Strong experience architecting enterprise data platforms on Google Cloud (GCP) is required. The architect will work as a strategic technical partner to design and build a GCP BigQuery-based Data Lake & Data Warehouse ecosystem.
The role requires deep hands-on expertise in data ingestion, transformation, modeling, enrichment, and governance, combined with a strong understanding of clinical healthcare data standards, interoperability, and cloud architecture best practices.
Key Responsibilities:
1. Data Lake & Data Platform Architecture (GCP)
Architect and design an enterprise-grade GCP-based data lakehouse leveraging BigQuery, GCS, Dataproc, Dataflow, Pub/Sub, Cloud Composer, and BigQuery Omni.
Define data ingestion, hydration, curation, processing, and enrichment strategies for large-scale structured, semi-structured, and unstructured datasets.
Create data domain models, canonical models, and consumption-ready datasets for analytics, AI/ML, and operational data products.
Design federated data layers and self-service data products for downstream consumers.
2. Data Ingestion & Pipelines
Architect batch, near-real-time, and streaming ingestion pipelines using GCP Cloud Dataflow, Pub/Sub, and Dataproc.
Set up data ingestion for clinical (EHR/EMR, LIS, RIS/PACS) datasets, including HL7, FHIR, CCD, and DICOM formats.
Build ingestion pipelines for non-clinical systems (ERP, HR, payroll, supply chain, finance).
Architect ingestion from medical devices, IoT, remote patient monitoring, and wearables, leveraging IoMT patterns.
Manage on-prem-to-cloud migration pipelines, hybrid cloud data movement, VPN/Interconnect connectivity, and data transfer strategies.
3. Data Transformation, Hydration & Enrichment
Build transformation frameworks using BigQuery SQL, Dataflow, Dataproc, or dbt.
Define curation patterns, including bronze/silver/gold layers, canonical healthcare entities, and data marts.
Implement data enrichment using external social determinants, device signals, clinical event logs, or operational datasets.
Enable metadata-driven pipelines for scalable transformations.
4. Data Governance & Quality
Establish and operationalize a data governance framework encompassing data stewardship, ownership, classification, and lifecycle policies.
Implement data lineage, data cataloging, and metadata management using tools such as Dataplex, Data Catalog, Collibra, or Informatica.
Set up data quality frameworks for validation, profiling, anomaly detection, and SLA monitoring.
Ensure HIPAA compliance, PHI protection, IAM/RBAC, VPC SC, DLP, encryption, retention, and auditing.
5. Cloud Infrastructure & Networking
Work with cloud infrastructure teams to architect VPC networks, subnetting, ingress/egress firewall policies, VPN/IPSec, Interconnect, and hybrid connectivity.
Define storage layers, partitioning/clustering design, cost optimization, performance tuning, and capacity planning for BigQuery.
Understand containerized processing (Cloud Run, GKE) for data services.
6. Stakeholder Collaboration
Work closely with clinical, operational, research, and IT stakeholders to define data use cases, schema, and consumption models.
Partner with enterprise architects, security teams, and platform engineering teams on cross-functional initiatives.
Guide data engineers and provide architectural oversight on pipeline implementation.
7. Hands-on Leadership
Be actively hands-on in building pipelines, writing transformations, building POCs, and validating architectural patterns.
Mentor data engineers on best practices, coding standards, and cloud-native development.
Required Skills & Qualifications
Technical Skills (Must-Have)
10 years of experience in data architecture, engineering, or data platform roles.
Strong expertise in the GCP data stack (BigQuery, Dataflow, Composer, GCS, Pub/Sub, Dataproc, Dataplex).
Hands-on experience with data ingestion, pipeline orchestration, and transformations.
Deep understanding of clinical data standards:
HL7 v2.x, FHIR, CCD/C-CDA
DICOM (for scans and imaging)
LIS/RIS/PACS data structures
Experience with device and IoT data ingestion (wearables, remote patient monitoring, clinical devices).
Experience with ERP datasets (Workday, Oracle, Lawson, PeopleSoft).
Strong SQL and data modeling skills (3NF, star/snowflake, canonical and logical models).
Experience with metadata management, lineage, and governance frameworks.
Solid understanding of HIPAA, PHI/PII handling, DLP, IAM, and VPC security.
Cloud & Infrastructure
Solid understanding of cloud networking, hybrid connectivity, VPC design, firewalling, DNS, service accounts, IAM, and security models.
Experience with cloud-native data movement services.
Experience with on-prem to cloud migrations.