Senior/Lead Data Engineer

VDart Inc

Job Location:

Toronto - Canada

Monthly Salary: Not Disclosed
Posted on: 2 hours ago
Vacancies: 1

Job Summary

Role: Senior/Lead Data Engineer.

Location: Toronto, ON (Onsite).

Duration: Long Term Contract.

Job Overview:

  • Architect and build the high-performance data foundations that power enterprise analytics. This senior role demands software engineering rigor applied to data infrastructure: optimizing latency, compute costs, and scalability using cutting-edge tools like Polars, Ibis, and Griffin.

High-Performance Data Engineering:

  • Build optimized data structures using Polars and Ibis for sub-second query performance at scale.
  • Implement memory-efficient transformations that reduce compute costs by 50%.

Advanced Orchestration & Governance:

  • Design complex Airflow DAGs managing 100 dependencies with precise SLAs.
  • Deploy Griffin for automated data quality profiling, anomaly detection, and lineage tracking.

Cloud-Native Data Lake Architecture:

  • Architect Azure Data Lake Storage (ADLS Gen2) with hierarchical partitioning optimized for Databricks/Synapse.
  • Implement liquid clustering, Z-ordering, and predictive optimization for petabyte-scale workloads.

NoSQL & Hybrid Storage:

  • Evaluate/implement Cassandra or MongoDB for high-velocity semi-structured patterns.
  • Design polyglot persistence strategies balancing SQL/NoSQL for optimal access patterns.

Technical Ownership:

  • Pipeline Development: Python 3.10, PySpark, Java, complex ELT patterns.
  • Cloud: Azure Databricks, ADLS Gen2, ADF, Synapse Analytics.
  • Orchestration: Airflow 2.7, NiFi, Hamilton.
  • Data Processing: Polars, Ibis, Pandas (performance-optimized).
  • DevOps: Docker, Kubernetes, GitHub Actions, Terraform.
  • Quality: Griffin, Great Expectations, Monte Carlo.
  • NoSQL: Cassandra, MongoDB Atlas, Cosmos DB.

Required Experience (6-10 years):

  • 3 years production Polars/Ibis (memory-efficient joins, lazy evaluation, streaming).
  • 2 years complex Airflow (dynamic DAGs, XComs, custom operators, Celery Executor).
  • Azure Data Lake architecture (Delta Lake, Unity Catalog, ABFSS protocol optimization).
  • PySpark mastery (Delta Live Tables, Adaptive Query Execution, Photon engine).
  • NoSQL production experience (Cassandra data modeling, MongoDB aggregation pipelines).
  • Long-term impact (1 year projects demonstrating sustained platform ownership).

Key Skills

  • Apache Hive
  • S3
  • Hadoop
  • Redshift
  • Spark
  • AWS
  • Apache Pig
  • NoSQL
  • Big Data
  • Data Warehouse
  • Kafka
  • Scala