**Job posted in Anticipation of Award**
We are seeking a Databricks Engineer to design, build, and operate a Data & AI platform with a strong foundation in the Medallion Architecture (raw/bronze, curated/silver, and mart/gold layers). This platform will orchestrate complex data workflows and scalable ELT pipelines to integrate data from enterprise systems such as PeopleSoft, D2L, and Salesforce, delivering high-quality, governed data for machine learning, AI/BI, and analytics at scale. You will play a critical role in engineering the infrastructure and workflows that enable seamless data flow across the enterprise, ensure operational excellence, and provide the backbone for strategic decision-making, predictive modeling, and innovation.
Key Responsibilities
Data & AI Platform Engineering (Databricks-Centric):
- Design, implement, and optimize end-to-end data pipelines on Databricks following Medallion Architecture principles.
- Build robust, scalable ETL/ELT pipelines using Apache Spark and Delta Lake to transform raw (bronze) data into trusted, curated (silver) and analytics-ready (gold) data layers (a minimal sketch follows this list).
- Operationalize Databricks Workflows for orchestration, dependency management, and pipeline automation.
- Apply schema evolution and data versioning to support agile data development.
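For illustration, a minimal sketch of the kind of bronze-to-silver Delta Lake transformation described above; the table and column names (bronze.raw_peoplesoft_employees, silver.employees, employee_id) are hypothetical placeholders, not part of the posting:

```python
# A minimal bronze-to-silver sketch; all table and column names are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # pre-created in Databricks notebooks

# Read the raw (bronze) Delta table as landed from the source system.
bronze_df = spark.read.table("bronze.raw_peoplesoft_employees")

# Basic cleansing and standardization for the curated (silver) layer.
silver_df = (
    bronze_df
    .filter(F.col("employee_id").isNotNull())
    .dropDuplicates(["employee_id"])
    .withColumn("hire_date", F.to_date("hire_date"))
)

# mergeSchema lets new source columns flow through (additive schema evolution).
(
    silver_df.write
    .format("delta")
    .mode("append")
    .option("mergeSchema", "true")
    .saveAsTable("silver.employees")
)
```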
Platform Integration & Data Ingestion:
- Connect and ingest data from enterprise systems such as PeopleSoft, D2L, and Salesforce using APIs, JDBC, or other integration frameworks.
- Implement connectors and ingestion frameworks that accommodate structured, semi-structured, and unstructured data.
- Design standardized data ingestion processes with automated error handling, retries, and alerting (sketched below).
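A minimal sketch of an ingestion step with retries and alerting, assuming a hypothetical JDBC source and a placeholder notify() helper; in practice credentials would come from a Databricks secret scope:

```python
# Ingestion with retries and alerting; URLs, table names, and notify() are hypothetical.
import time

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()


def notify(message: str) -> None:
    # Placeholder for real alerting (email, Teams/Slack webhook, PagerDuty, ...).
    print(message)


def ingest_with_retries(jdbc_url: str, source_table: str, target_table: str,
                        max_attempts: int = 3) -> None:
    """Copy one source table over JDBC into the bronze layer, retrying on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            df = (
                spark.read.format("jdbc")
                .option("url", jdbc_url)
                .option("dbtable", source_table)
                .load()
            )
            df.write.format("delta").mode("append").saveAsTable(target_table)
            return
        except Exception as exc:  # broad by design: any failure triggers a retry
            if attempt == max_attempts:
                notify(f"Ingestion of {source_table} failed after {max_attempts} attempts: {exc}")
                raise
            time.sleep(30 * attempt)  # simple linear backoff between attempts
```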
Data Quality, Monitoring, and Governance:
- Develop data quality checks, validation rules, and anomaly detection mechanisms to ensure data integrity across all layers (see the example after this list).
- Integrate monitoring and observability tools (e.g., Databricks metrics, Grafana) to track ETL performance, latency, and failures.
- Implement Unity Catalog or equivalent tools for centralized metadata management, data lineage, and governance policy enforcement.
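A minimal sketch of a data quality gate between layers; the table name, key column, and rules are illustrative assumptions:

```python
# A simple quality gate; silver.employees and employee_id are placeholder names.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

df = spark.read.table("silver.employees")  # hypothetical silver table

# Example validation rules: required key present, identifiers unique.
null_keys = df.filter(F.col("employee_id").isNull()).count()
duplicate_keys = df.count() - df.select("employee_id").distinct().count()

if null_keys or duplicate_keys:
    # Fail fast so bad records never reach the gold layer; a real pipeline
    # would also emit an alert and log these metrics for observability.
    raise ValueError(
        f"Quality check failed: {null_keys} null keys, {duplicate_keys} duplicate keys"
    )
```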
Security, Privacy, and Compliance:
- Enforce data security best practices, including row-level security, encryption at rest and in transit, and fine-grained access control via Unity Catalog (illustrated below).
- Design and implement data masking, tokenization, and anonymization for compliance with privacy regulations (e.g., GDPR, FERPA).
- Work with security teams to audit and certify compliance controls.
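As one illustration of fine-grained access control, a sketch of a Unity Catalog column mask applied from a notebook; the schema, table, column, and group names are hypothetical, and the exact SQL should be verified against current Databricks documentation:

```python
# Column-mask sketch; governance.mask_ssn, silver.employees, and hr_admins are hypothetical names.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# SQL function that returns the real value only to members of an admin group.
spark.sql("""
    CREATE OR REPLACE FUNCTION governance.mask_ssn(ssn STRING)
    RETURN CASE
        WHEN is_account_group_member('hr_admins') THEN ssn
        ELSE '***-**-****'
    END
""")

# Attach the mask so every query against the column is filtered transparently.
spark.sql("ALTER TABLE silver.employees ALTER COLUMN ssn SET MASK governance.mask_ssn")
```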
AI/ML-Ready Data Foundation:
- Enable data scientists by delivering high-quality, feature-rich datasets for model training and inference.
- Support AIOps/MLOps lifecycle workflows using MLflow for experiment tracking, model registry, and deployment within Databricks (see the sketch after this list).
- Collaborate with AI/ML teams to create reusable feature stores and training pipelines.
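A minimal MLflow tracking and registry sketch, assuming a scikit-learn model and an illustrative registered model name:

```python
# Experiment tracking and model registration with MLflow; the parameter,
# metric, and registered model name are illustrative only.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, random_state=42)

with mlflow.start_run(run_name="baseline"):
    model = LogisticRegression(max_iter=200).fit(X, y)
    mlflow.log_param("max_iter", 200)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    # Log and register the model so a deployment job can pick it up by name.
    mlflow.sklearn.log_model(model, "model", registered_model_name="employee_attrition")
```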
Cloud Data Architecture and Storage:
- Architect and manage data lakes on Azure Data Lake Storage (ADLS) or Amazon S3, and design ingestion pipelines to feed the bronze layer.
- Build data marts and warehousing solutions on platforms such as Databricks.
- Optimize data storage and access patterns for performance and cost efficiency (an example follows this list).
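As one example of storage optimization, Delta Lake file compaction and data clustering on a hypothetical gold table (gold.enrollment_facts, term_id are placeholder names):

```python
# Layout and cost optimization on a Delta table; all names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Compact small files and co-locate rows by a commonly filtered column.
spark.sql("OPTIMIZE gold.enrollment_facts ZORDER BY (term_id)")

# Remove data files no longer referenced by the table (default retention applies).
spark.sql("VACUUM gold.enrollment_facts")
```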
Documentation & Enablement:
- Maintain technical documentation, architecture diagrams, data dictionaries, and runbooks for all pipelines and components.
- Provide training and enablement sessions to internal stakeholders on the Databricks platform, Medallion Architecture, and data governance practices.
- Conduct code reviews and promote reusable patterns and frameworks across teams.
Reporting and Accountability:
- Submit a weekly schedule of hours worked and progress reports outlining completed tasks, upcoming plans, and blockers.
- Track deliverables against roadmap milestones and communicate risks or dependencies.
Required Skills
- Hands-on experience with Databricks, Delta Lake, and Apache Spark for large-scale data engineering.
- Deep understanding of ELT pipeline development, orchestration, and monitoring in cloud-native environments.
- Experience implementing Medallion Architecture (Bronze/Silver/Gold) and working with data versioning and schema enforcement in enterprise-grade environments.
- Strong proficiency in SQL, Python, or Scala for data transformations and workflow logic.
- Proven experience integrating enterprise platforms (e.g., PeopleSoft, Salesforce, D2L) into centralized data platforms.
- Familiarity with data governance, lineage tracking, and metadata management tools.
Preferred Skills
- Experience with Databricks Unity Catalog for metadata management and access control.
- Experience deploying ML models at scale using MLflow or similar MLOps tools.
- Familiarity with cloud platforms such as Azure or AWS, including storage, security, and networking.
- Knowledge of data warehouse design and star/snowflake schema modeling.
Preferred Qualifications
- Bachelor's degree in Computer Science, Information Security, or a related field (or equivalent experience).
- 3 years of Databricks engineering experience.