Python Data Testing Lead Engineer


Job Location:

Bengaluru - India

Monthly Salary: Not Disclosed
Posted on: 2 hours ago
Vacancies: 1

Job Summary

Job Overview

We are seeking a highly experienced Lead ETL Test Engineer with strong expertise in data validation, data pipelines, and big data ecosystems. The ideal candidate will lead end-to-end testing strategies across data warehouses, ETL pipelines, and modern data platforms, ensuring data quality, integrity, and performance at scale. This role requires deep technical proficiency in big data tools, strong programming skills, and experience with cloud-native environments.

Primary Skills

  • Expertise in ETL testing and data validation across complex pipelines
  • Strong proficiency in Python programming
  • Hands-on experience with PySpark and/or Scala for large-scale data processing
  • Advanced knowledge of SQL, data modeling, and database testing (SQL & NoSQL)
  • Experience with data warehouses and data lakes (e.g. Snowflake)
  • Strong experience with Databricks
  • Deep understanding of Big Data ecosystems
  • Experience in testing data pipelines and ETL workflows
  • Strong understanding of data modeling techniques
  • Input/output validation for ML pipelines

Good to Have Skills

  • Experience with Apache Flink (streaming and batch processing)
  • Exposure to CI/CD pipelines (Jenkins, GitLab, etc.)
  • Knowledge of Docker and Kubernetes for containerized environments
  • Working knowledge of Linux and shell scripting
  • Hands-on experience with AWS backend services

Responsibilities and Duties

  • Lead end-to-end testing strategy for ETL pipelines, data warehouses, and big data platforms
  • Design and implement scalable data validation frameworks
  • Perform database testing across SQL and NoSQL systems
  • Validate data across data warehouses and data lakes (e.g. Snowflake) and distributed systems
  • Review and improve test code and automation frameworks
  • Collaborate with data engineers to ensure data quality and pipeline reliability
  • Build and maintain automated testing solutions for batch and streaming pipelines
  • Optimize test processes for performance, scalability, and accuracy
  • Integrate testing workflows into CI/CD pipelines
  • Ensure compliance with data governance and quality standards

Keywords

ETL Testing, Data Validation, Data Pipelines, Big Data Testing, Databricks, PySpark, Scala, Python, SQL, NoSQL, Snowflake, Data Lakes, Data Warehousing, Apache Flink, CI/CD, Jenkins, GitLab, Docker, Kubernetes, AWS, Linux, Shell Scripting, Data Modeling

Project Details

  • Domain: Data Engineering / Big Data
  • Focus: Testing and validation of enterprise-scale data pipelines and data platforms
  • Environment: Cloud-based (AWS) distributed data systems and modern data stack
  • Tools & Technologies: Databricks, Snowflake, PySpark/Scala, Apache Flink, Python, CI/CD tools, containerized environments
  • Scope: Build and lead end-to-end testing frameworks, ensuring high data quality, consistency, and reliability across batch and streaming systems

Education Qualification

BTech/BE/MTech/ME
