Position: Data Engineer
Location: Norwell, MA ***Day 1 Onsite***
Duration: 1 Year
Key Skills: Python, PySpark, CI/CD pipelines, ARM templates, Great Expectations

Screening Questions:
- Can you explain your experience with building data pipelines using PySpark?
- How do you ensure data consistency and scalability across datasets?
- Have you worked with ARM templates for automated infrastructure provisioning?
- How do you integrate data quality checks into CI/CD workflows?
- Can you provide an example of a data quality rule you have defined and automated?

Job Summary:

Key Responsibilities:
- Design and implement Silver and Gold layer data models following medallion architecture best practices.
- Perform data cleansing, standardization, enrichment, and aggregation to support analytics and reporting.
- Build optimized PySpark-based transformations for large-scale data processing.
- Ensure data consistency, performance, and scalability across datasets.
- Build and maintain CI/CD pipelines using Git-based workflows (Azure DevOps / GitHub).
- Use ARM templates (or IaC equivalents) for automated infrastructure provisioning.
- Enable automated deployment of data pipelines, notebooks, and configurations.
- Follow DevOps best practices for version control, branching, and release management.
- Create modular, maintainable, and testable Python code.
- Support automation of metadata logging, alerting, and operational tasks.
- Implement data quality libraries such as Great Expectations.
- Define and automate data quality rules (completeness, accuracy, freshness, consistency).
- Monitor, log, and troubleshoot data quality issues proactively.
- Work closely with data architects, analysts, QA, and business stakeholders.
- Translate business and analytical requirements into robust data engineering solutions.
- Participate in Agile ceremonies and support sprint-based delivery.

Required Skills and Qualifications:
- Strong hands-on experience with Silver & Gold layer development, Python (automation and data processing), and PySpark.
- Experience with Great Expectations (data quality framework) and with CI/CD pipelines using Git-based tools.
- Hands-on experience with ARM templates or infrastructure-as-code concepts.
- Strong understanding of data modeling and medallion architecture.
- Experience working with large datasets in distributed environments.

Good to Have:
- Microsoft Certified: Azure Data Engineer Associate or Azure Enterprise Data Analyst Associate.
- Waste management or oil and gas domain knowledge.
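For illustration only (not part of the client's job description): a minimal sketch of the kind of Silver-layer PySpark transformation the responsibilities above describe. All table and column names (bronze.customers, silver.customers, customer_id, email, signup_date) are assumptions chosen for the example.

```python
# Illustrative Silver-layer transform; table/column names are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("silver_customers").getOrCreate()

# Read raw (Bronze) data -- "bronze.customers" is an assumed table name.
bronze = spark.table("bronze.customers")

# Cleanse and standardize: drop duplicates, normalize strings, parse dates.
silver = (
    bronze
    .dropDuplicates(["customer_id"])
    .withColumn("email", F.lower(F.trim(F.col("email"))))
    .withColumn("signup_date", F.to_date(F.col("signup_date"), "yyyy-MM-dd"))
    .filter(F.col("customer_id").isNotNull())
)

# Write to the Silver layer, partitioned for downstream query performance.
(silver.write
    .mode("overwrite")
    .partitionBy("signup_date")
    .saveAsTable("silver.customers"))
```

Keeping each transform this small and stateless is what makes the code "modular, maintainable, and testable" in the sense the posting asks for: each step can be unit-tested against a tiny DataFrame.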
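Likewise, a hedged sketch of the kind of data quality rule the screening questions ask candidates to describe, using the legacy Great Expectations 0.x dataset API (SparkDFDataset); the newer fluent API differs, and the table and column names are again assumptions.

```python
# Illustrative data quality rules via the legacy Great Expectations
# dataset API (0.x); schema and names are assumptions, not the client's.
from pyspark.sql import SparkSession
from great_expectations.dataset import SparkDFDataset

spark = SparkSession.builder.appName("dq_checks").getOrCreate()

# Wrap the Silver table so expectation methods are available on it.
checked = SparkDFDataset(spark.table("silver.customers"))

# Completeness rule: every record must carry a customer_id.
completeness = checked.expect_column_values_to_not_be_null("customer_id")

# Consistency rule: emails must match a basic address pattern.
consistency = checked.expect_column_values_to_match_regex(
    "email", r"^[^@\s]+@[^@\s]+\.[^@\s]+$"
)

# In a CI/CD stage, a failed expectation fails the deployment.
if not (completeness.success and consistency.success):
    raise ValueError("Data quality checks failed for silver.customers")
```

Raising on failure is what lets the check gate a CI/CD pipeline: the build step exits non-zero, so a Git-based workflow (Azure DevOps or GitHub Actions) blocks the release until the data issue is resolved.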