Data Engineer

Job Location:

Princeton, NJ - USA

Monthly Salary: Not Disclosed
Posted on: 6 hours ago
Vacancies: 1

Job Summary

Position: Data Engineer

Location: Princeton, NJ (Day 1 Onsite)

Duration: 1 Year

Mandatory Skills

Key Skills & Technologies
Programming Languages: Python (primary), SQL
Cloud Platforms: AWS (S3, Glue, Lambda, Redshift, EC2, EMR)
Data Tools: Apache Spark, Pandas, PySpark, Airflow
Databases: PostgreSQL, MySQL, NoSQL (e.g., DynamoDB)
ETL & Workflow Orchestration: AWS Glue, Apache Airflow
Version Control: Git
DevOps & CI/CD: Basic understanding of CI/CD pipelines and infrastructure as code (e.g., Terraform, CloudFormation)

Job Description

Data Pipeline Development - Design, build, and maintain scalable and reliable data pipelines to ingest, process, and transform data from various sources.
Data Integration & Management - Integrate structured and unstructured data from internal and external systems.
Ensure data quality, consistency, and availability across platforms.
Cloud-Based Data Engineering - Leverage AWS services (e.g., S3, Lambda, Glue, Redshift, EMR) to build cloud-native data solutions.
Optimize cloud resources for performance and cost-efficiency.
Programming & Automation - Use Python for data manipulation, ETL workflows, and automation of data tasks.
Develop reusable scripts and modules for data processing.
Collaboration & Stakeholder Engagement - Work closely with data scientists, analysts, and business teams to understand data needs.
Translate business requirements into technical solutions.
Monitoring & Optimization - Monitor data pipelines and troubleshoot issues proactively.
Continuously improve performance, scalability, and reliability of data systems.

Key Skills

  • Apache Hive
  • S3
  • Hadoop
  • Redshift
  • Spark
  • AWS
  • Apache Pig
  • NoSQL
  • Big Data
  • Data Warehouse
  • Kafka
  • Scala