Data Engineer


Job Location:

Irving, TX - USA

Monthly Salary: Not Disclosed
Posted on: 3 days ago
Vacancies: 1 Vacancy

Job Summary

Hi,
I hope you're doing well. I had a chance to review your profile and wanted to discuss a full-time hire position with our client, a major Systems Integrator.
Please review the JD below and let me know if you would be interested in exploring the opportunity.



Job Title: Data Engineer

Location: Irving, TX (onsite; in-person interview)

Duration: Full-time

Job Description:

We are seeking a highly skilled and motivated Data Engineer to play a pivotal role in designing, building, and optimizing our next-generation scalable data pipelines. This position requires expertise in processing massive datasets using technologies such as Apache Spark, PySpark, and Hive within a dynamic cloud environment. Your primary objective will be to ensure data reliability, speed, and efficiency, providing a robust foundation for downstream business intelligence and advanced analytics initiatives.

Roles & Responsibilities:

Data Pipeline Development & Maintenance: Design, build, and maintain highly scalable and efficient ETL/ELT data pipelines utilizing PySpark and Spark SQL for complex data transformations.

Cloud Data Infrastructure Management: Deploy, manage, and scale critical data infrastructure components on leading cloud platforms such as Amazon Web Services (AWS) (e.g., EMR, Glue), Microsoft Azure (e.g., Databricks, Synapse), or Google Cloud Platform (GCP).

Data Warehousing & Storage Optimization: Strategically manage data layout, partitioning, and indexing within Apache Hive and various cloud data lake solutions to optimize performance and accessibility.

Performance Tuning & Optimization: Proactively identify and resolve performance bottlenecks in Spark jobs, leveraging Spark UI for in-depth analysis, effectively managing data skew, and optimizing memory utilization.

Diverse Data Integration: Develop robust solutions for ingesting high-volume and diverse datasets from both structured relational databases and unstructured flat files into our data ecosystem.

Automated Workflow Orchestration: Implement and manage automated data workflows using industry-standard scheduling tools like Apache Airflow or platform-native schedulers, ensuring timely and reliable data delivery.

Strategic Collaboration: Partner closely with data scientists, business analysts, and cross-functional enterprise teams to translate complex business requirements into technically sound and efficient data solutions.

Qualifications:

Big Data Frameworks Expertise: Demonstrated high proficiency in Apache Spark architecture, including a deep understanding of drivers, executors, and Directed Acyclic Graphs (DAGs).

Advanced Programming: Exceptional coding skills in Python and extensive experience with the PySpark API for developing intricate data transformations and processing logic.

Querying & Schema Management: Strong command of HiveQL and ANSI SQL, coupled with expertise in data partitioning techniques and effective schema definition.

Optimized Storage Formats: In-depth understanding of and practical experience with optimized big data storage file formats such as Parquet, ORC, and Avro.

Cloud Ecosystem Development: Hands-on development experience utilizing cloud-native big data utilities (e.g., AWS EMR, Azure Databricks) within major cloud platforms.

Data Warehousing Fundamentals: Solid foundation in Dimensional Data Modeling, including Star and Snowflake schemas, and practical experience with Data Lake concepts and implementation.

Preferred Qualifications:

CI/CD & DevOps Automation: Experience with Continuous Integration/Continuous Deployment (CI/CD) practices and automation tools like Git, Jenkins, or Ansible.

NoSQL Database Integration: Exposure to and experience with NoSQL databases such as HBase, Cassandra, or MongoDB.

Professional Cloud Certifications: Relevant professional cloud certifications (e.g., AWS Certified Data Engineer, Microsoft Certified: Azure Data Engineer Associate) are highly valued.

Thanks & Regards

Sumit Goyal

Lead Recruiter
