Hello
Position: Senior Data Engineer
Location: Pittsburgh PA (On-site)
Duration: 6 months
Job Description:
Experience: 8-10 years in the required skills
- Must Have: Strong proficiency in Python (including PySpark) and SQL is essential; additional experience in Java or Scala is a plus.
- A Senior Data Engineer focusing on Python and Hadoop is responsible for designing, building, and maintaining robust data pipelines and infrastructure using the Hadoop ecosystem and advanced Python programming.
- This role involves leading technical projects, ensuring data quality and scalability, and collaborating with cross-functional teams.
Key Responsibilities:
- Data Pipeline Development: Design, build, and maintain scalable ETL/ELT processes and data pipelines using Python, SQL, and big data technologies (Hadoop, Spark, Hive, Kafka).
- Big Data Management: Work within the Hadoop technology stack, including HDFS, Hive, YARN, Impala, and HBase, to manage and store large datasets.
- Performance Optimization & Automation: Troubleshoot, tune, and optimize data processing jobs and database performance while identifying opportunities for automation in testing and deployment processes (CI/CD).
- Architecture & Design: Lead the development of data solutions and the design of data service infrastructure, contributing to overall data architecture decisions.
- Collaboration & Mentorship: Collaborate with data scientists, analysts, and business stakeholders to understand data requirements, and provide technical guidance and mentorship to junior team members.
- Data Quality & Governance: Ensure data accuracy, integrity, and security by implementing validation checks and adhering to data governance standards.
Experience:
- Typically requires 5 years of experience in data engineering or a related role, with a proven track record of deploying and managing large-scale distributed systems.
Required Skills:
- Programming Languages: Strong proficiency in Python (including PySpark) and SQL is essential; additional experience in Java or Scala is a plus.
- Big Data Technologies: Expertise in the Hadoop ecosystem and its components, along with distributed computing frameworks such as Apache Spark and Kafka, is crucial.
- Databases & Cloud Platforms: Experience with relational (e.g., PostgreSQL, MySQL) and NoSQL databases, and familiarity with cloud services (AWS, GCP, or Azure).
- Problem-Solving: Strong analytical and problem-solving skills to resolve complex technical data issues.