Job Overview
We are seeking a highly experienced Lead ETL Test Engineer with strong expertise in data validation, data pipelines, and big data ecosystems. The ideal candidate will lead end-to-end testing strategies across data warehouses, ETL pipelines, and modern data platforms, ensuring data quality, integrity, and performance at scale. This role requires deep technical proficiency in big data tools, strong programming skills, and experience with cloud-native environments.
Primary Skills
Expertise in ETL testing and data validation across complex pipelines
Strong proficiency in Python programming
Hands-on experience with PySpark and/or Scala for large-scale data processing
Advanced knowledge of SQL, data modeling, and database testing (SQL and NoSQL)
Experience with data warehouses and data lakes (e.g. Snowflake)
Strong experience with Databricks
Deep understanding of Big Data ecosystems
Experience in testing data pipelines and ETL workflows
Strong understanding of data modeling techniques
Input/output validation for ML pipelines
Good to Have Skills
Experience with Apache Flink (streaming and batch processing)
Exposure to CI/CD pipelines (Jenkins, GitLab, etc.)
Knowledge of Docker and Kubernetes for containerized environments
Working knowledge of Linux and shell scripting
Hands-on experience with AWS backend services
Responsibilities and Duties
Lead end-to-end testing strategy for ETL pipelines, data warehouses, and big data platforms
Design and implement scalable data validation frameworks
Perform database testing across SQL and NoSQL systems
Validate data across data lakes (e.g. Snowflake) and distributed systems
Review and improve test code and automation frameworks
Collaborate with data engineers to ensure data quality and pipeline reliability
Build and maintain automated testing solutions for batch and streaming pipelines
Optimize test processes for performance, scalability, and accuracy
Integrate testing workflows into CI/CD pipelines
Ensure compliance with data governance and quality standards
Keywords
ETL Testing, Data Validation, Data Pipelines, Big Data Testing, Databricks, PySpark, Scala, Python, SQL, NoSQL, Snowflake, Data Lakes, Data Warehousing, Apache Flink, CI/CD, Jenkins, GitLab, Docker, Kubernetes, AWS, Linux, Shell Scripting, Data Modeling
Project Details
Domain: Data Engineering / Big Data
Focus: Testing and validation of enterprise-scale data pipelines and data platforms
Environment: Cloud-based (AWS), distributed data systems, and modern data stack
Scope: Build and lead end-to-end testing frameworks, ensuring high data quality, consistency, and reliability across batch and streaming systems
Education Qualification
BTech/BE/MTech/ME