Big Data Engineer

Smart IT Frame

Job Location:

Irving, TX - USA

Monthly Salary: Not Disclosed
Posted on: 30+ days ago
Vacancies: 1 Vacancy

Job Summary

Job Title: PySpark Consultant // Big Data Engineer

Location: Irving, TX (3 days in office)

Job Type: Full-time only

Experience Required: Minimum 10 Years

Responsibilities:

  • Experience with big data processing and distributed computing systems like Spark.
  • Implement ETL pipelines and data transformation processes.
  • Ensure data quality and integrity in all data processing workflows.
  • Troubleshoot and resolve issues related to PySpark applications and workflows.
  • Understand source dependencies and data flow from converted PySpark code.
  • Strong programming skills in Python and SQL.
  • Experience with big data technologies like Hadoop, Hive, and Kafka.
  • Understanding of data warehousing concepts and relational (SQL) databases.
  • Demonstrate and document code lineage.
  • Integrate PySpark code with frameworks such as Ingestion Framework, DataLens, etc.
  • Ensure compliance with data security and privacy regulations and organizational standards.
  • Knowledge of CI/CD pipelines and DevOps practices.
  • Strong problem-solving and analytical skills.
  • Excellent communication and leadership abilities.

Qualifications:

  • 4 years of experience in big data development with Hadoop, Hive, and the Spark framework.
  • Experience with SAS is good to have.
  • Strong Python, PySpark development, and SQL knowledge.
  • Certification in big data or cloud technologies is preferred.
  • Excellent communication, collaboration, and problem-solving skills.

Key Skills

  • Apache Hive
  • S3
  • Hadoop
  • Redshift
  • Spark
  • AWS
  • Apache Pig
  • NoSQL
  • Big Data
  • Data Warehouse
  • Kafka
  • Scala