Responsibilities:
Design, develop, and maintain automated data quality checks for complex data pipelines handling petabyte-scale datasets.
Implement scalable data validation frameworks using PySpark, PyTest, Python, SQL, and Hive, ensuring comprehensive test coverage.
Collaborate with Data Engineers and DevOps teams to integrate automated tests into CI/CD workflows.
Analyze test results, identify data anomalies, and provide actionable insights to resolve data quality issues.
Develop monitoring and alerting solutions for data quality in production environments.
Document test processes, standards, and best practices, and mentor junior engineers on data quality automation.
Continuously improve test frameworks and processes to optimize performance, scalability, and reliability.
This is a hybrid position. Expectation of days in office will be confirmed by your hiring manager.
Qualifications:
Basic Qualifications
-Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
-At least 1 year of experience in test engineering or data engineering roles with a focus on data quality and automation.
-Advanced proficiency in PySpark and Python for building scalable data processing and testing solutions.
-Strong experience with SQL and Hive for querying, validating, and profiling data in large datasets.
-Hands-on exposure to Hive and HDFS.
-Solid understanding of data pipeline architectures, ETL processes, and best practices for big data environments.
-Experience implementing CI/CD pipelines for automated data testing.
-Strong analytical, problem-solving, and communication skills.
-Ability to work independently and in a collaborative, fast-paced team environment.
Preferred Qualifications
-Experience with data quality frameworks (e.g., Deequ, Great Expectations).
-Familiarity with workflow orchestration tools (e.g., Airflow, Step Functions).
-Exposure to data cataloging, data lineage, and metadata management.
Additional Information:
Visa is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability, or protected veteran status. Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law.
Remote Work:
No
Employment Type:
Full-time