Role: Senior/Lead Data Engineer.
Location: Toronto, ON (onsite).
Duration: Long Term Contract.
Job Overview:
- Architect and build the high-performance data foundations that power enterprise analytics. This senior role demands software engineering rigor applied to data infrastructure, optimizing latency, compute costs, and scalability using cutting-edge tools like Polars, Ibis, and Apache Griffin.
High-Performance Data Engineering:
- Build optimized data structures using Polars and Ibis for sub-second query performance at scale.
- Implement memory-efficient transformations targeting compute-cost reductions of up to 50%.
Advanced Orchestration & Governance:
- Design complex Airflow DAGs managing 100 dependencies with precise SLAs.
- Deploy Apache Griffin for automated data quality profiling, anomaly detection, and lineage tracking.
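The DAG-design bullet is really about dependency ordering under fan-out and fan-in. Airflow itself is not needed to illustrate the idea; the stdlib `graphlib` module (a deliberate stand-in here, not Airflow's API) sketches how a task graph resolves into an executable order. The task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph mirroring a small Airflow DAG:
# extract -> transform -> {quality_check, load} -> publish.
# Each key maps a task to the set of tasks it depends on.
dag = {
    "transform": {"extract"},
    "quality_check": {"transform"},
    "load": {"transform"},
    "publish": {"quality_check", "load"},
}

# static_order() yields tasks in a valid execution order,
# analogous to how a scheduler walks DAG dependencies.
order = list(TopologicalSorter(dag).static_order())
```

In real Airflow the same shape is expressed with operators and `>>` dependencies, plus `sla=` on time-sensitive tasks.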
Cloud-Native Data Lake Architecture:
- Architect Azure Data Lake Storage (ADLS Gen2) with hierarchical partitioning optimized for Databricks/Synapse.
- Implement liquid clustering, Z-ordering, and predictive optimization for petabyte-scale workloads.
NoSQL & Hybrid Storage:
- Evaluate/implement Cassandra or MongoDB for high-velocity semi-structured patterns.
- Design polyglot persistence strategies balancing SQL/NoSQL for optimal access patterns.
Technical Ownership:
- Pipeline Development: Python 3.10, PySpark, Java; complex ELT patterns.
- Cloud: Azure Databricks, ADLS Gen2, ADF, Synapse Analytics.
- Orchestration: Airflow 2.7, NiFi, Hamilton.
- Data Processing: Polars, Ibis, Pandas (performance-optimized).
- DevOps: Docker, Kubernetes, GitHub Actions, Terraform.
- Quality: Griffin, Great Expectations, Monte Carlo.
- NoSQL: Cassandra, MongoDB Atlas, Cosmos DB.
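On the "Pandas (performance-optimized)" line: one of the simplest levers is dtype choice. This sketch (synthetic data, illustrative column name) shows the `category` dtype shrinking the memory footprint of a low-cardinality string column, the kind of optimization the role expects as routine.

```python
import pandas as pd

# 100k rows but only two distinct string values: a classic
# low-cardinality column where object dtype wastes memory.
df = pd.DataFrame({"region": ["east", "west"] * 50_000})

before = df["region"].memory_usage(deep=True)

# Categorical storage keeps one copy of each distinct value
# plus small integer codes per row.
df["region"] = df["region"].astype("category")

after = df["region"].memory_usage(deep=True)
```

The same reasoning drives choices like downcasting numeric columns and reading only needed columns with `usecols=`.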
Required Experience (6-10 years):
- 3 years of production Polars/Ibis (memory-efficient joins, lazy evaluation, streaming).
- 2 years of complex Airflow (dynamic DAGs, XComs, custom operators, Celery Executor).
- Azure Data Lake architecture (Delta Lake, Unity Catalog, ABFSS protocol optimization).
- PySpark mastery (Delta Live Tables, Adaptive Query Execution, Photon engine).
- NoSQL production experience (Cassandra data modeling, MongoDB aggregation pipelines).
- Long-term impact (year-long projects demonstrating sustained platform ownership).