Data Engineer with PySpark & DPL

Purple Drive

Job Location:

Jersey, NJ - USA

Monthly Salary: Not Disclosed
Posted on: Yesterday
Vacancies: 1

Job Summary

Required Skills & Qualifications

  • Strong hands-on expertise with:
    • PySpark (RDDs, DataFrames, Spark SQL, performance tuning)
    • DPL (Data Pipeline Language / relevant tool-specific DPL)
  • Proficiency in Python for data engineering workflows.
  • Experience with distributed computing and big data technologies (Spark, Hadoop, Delta Lake).
  • Strong SQL skills and experience with relational and NoSQL databases.
  • Experience building ETL/ELT pipelines on cloud platforms (AWS / Azure / GCP).
  • Familiarity with CI/CD, Git, and containerization (Docker/Kubernetes) is a plus.
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

Preferred Skills

  • Experience with orchestration tools (Airflow, ADF, Argo, Prefect).
  • Knowledge of data warehousing concepts (star schema, SCD, normalization).
  • Experience with streaming platforms (Kafka, Kinesis, Spark Streaming).
  • Exposure to data governance, security, and compliance frameworks.
  • Experience working in Agile environments.

Key Skills

  • Apache Hive
  • S3
  • Hadoop
  • Redshift
  • Spark
  • AWS
  • Apache Pig
  • NoSQL
  • Big Data
  • Data Warehouse
  • Kafka
  • Scala