Lead Data Engineer
Warsaw, IN - USA
Job Summary
Job Details:
Mandatory Skills:
Advanced expertise in ETL / ELT pipeline design
Handling:
o Batch data processing
o Near-real-time / streaming data
Experience with structured and semi-structured data
Strong knowledge of:
o Incremental loading
o CDC (Change Data Capture)
Pipeline orchestration and dependency management
Strong programming skills in Python, Scala, or Java (nice to have)
Performance optimization for large-scale data processing
Solid understanding of:
o Dimensional modeling (Star / Snowflake)
o Normalized and denormalized models
Strong experience with Azure, AWS, or GCP
Hands-on experience with data warehouses (Snowflake, Synapse, BigQuery, Redshift)
Data Architecture & Solution Design
Design end-to-end data engineering architectures
Define scalable solutions for:
o Data lakes / lakehouse
o Data warehouses
o Streaming and real-time systems
Ensure alignment with enterprise architecture, security, and compliance standards
Review and approve technical designs
Data Pipeline Development & Management
Lead development of ETL / ELT pipelines
Handle:
o Batch and real-time ingestion
o Structured and semi-structured data
Optimize pipelines for performance, reliability, and cost
Manage schema evolution and data dependencies
Data Quality, Reliability & Operations
Establish data quality standards and validation rules
Implement monitoring, alerting, and observability
Perform root-cause analysis for data incidents
Drive operational excellence and stability
DevOps / DataOps Enablement
Build CI/CD pipelines for data workloads
Automate testing, deployment, and rollback
Improve reliability through automation