Data Engineer – Databricks (Finance & Risk Cloudera Modernization)
Job Summary
Role Description: Seeking a highly experienced Principal Databricks Data Engineer to lead modernization of large-scale Finance and Risk data platforms from legacy Cloudera ecosystems to cloud-native Databricks Lakehouse architectures. The role requires deep hands-on expertise in enterprise data warehousing, data lakes, finance and risk data models, and semantic consumption layers, with strong experience supporting regulatory reporting, management reporting, and analytics use cases. The individual will serve as a hands-on architect and technical authority, partnering closely with Finance, Risk, Analytics, and Governance stakeholders while driving enterprise-scale platform modernization initiatives.
Experience Required
8 years across enterprise Data Warehouse and Data Lake platforms
5 years of hands-on experience with Databricks and Spark at scale
Key Responsibilities
Cloudera to Databricks Modernization
Lead modernization of legacy Cloudera platforms including:
CDH / CDP
Hive
HBase
Impala
Spark
Redesign ingestion, transformation, and consumption patterns from HDFS-centric architectures to cloud object storage and Delta Lake.
Refactor legacy Hive/Impala logic into PySpark and Spark SQL-based ELT pipelines.
Ensure data parity, reconciliation, and audit integrity during platform migration.
Enterprise Data Warehouse & Data Lake Architecture
Design and govern enterprise Data Warehouse and Data Lake/Lakehouse architectures.
Implement layered architectures including:
Raw landing zones
Curated/conformed layers
Semantic consumption layers
Modernize traditional EDW patterns into scalable, domain-aligned lakehouse designs.
Finance & Risk Data Modeling
Support implementation of finance and risk data models including:
General Ledger and Sub-ledger data
Accounting events and financial hierarchies
Risk exposure
Liquidity
Credit risk
Market risk models
Enable aggregation, drill-down, and drill-back capabilities from reports to transaction-level data.
Support:
Regulatory reporting
Management reporting
Analytics use cases
Semantic Consumption Layers
Build and manage semantic consumption layers to ensure consistent business logic across:
BI and reporting tools
Finance and Risk analytics
Self-service analytics platforms
Define:
Metrics
Dimensions
Hierarchies
KPIs aligned to finance and risk definitions
Implement semantic models using:
Databricks SQL
Delta Tables
Required Education:
Bachelor's Degree