Hi,
I hope you're doing well. I had a chance to review your profile and wanted to discuss a full-time position with our client, a major Systems Integrator.
Please review the JD below and let me know if you would be interested in exploring the opportunity.
Job Title: Data Engineer
Location: New York City - Onsite
Duration: Full-time
Job Description
Must Have Technical/Functional Skills
Hands-on experience building ETL pipelines on Databricks SaaS infrastructure.
Experience developing data pipeline solutions to ingest and exploit new and existing data sources.
Expertise in programming languages such as SQL and Python, and in ETL tools such as Databricks.
Perform code reviews to ensure requirements are met, execution patterns are optimal, and established standards are followed.
Expertise in AWS Compute (EC2, EMR), AWS Storage (S3, EBS), AWS Databases (RDS, DynamoDB), and AWS Data Integration (Glue).
Advanced understanding of container orchestration services, including Docker and Kubernetes, and a variety of AWS tools and services.
Good understanding of AWS Identity and Access Management (IAM), AWS Networking, and AWS Monitoring tools.
Proficiency in CI/CD and deployment automation using GitLab pipelines.
Proficiency in cloud infrastructure provisioning tools, e.g. Terraform.
Proficiency in one or more programming languages, e.g. Python or Scala.
Experience with Starburst Trino and building SQL queries in a federated architecture.
Good knowledge of Lakehouse architecture.
Design, develop, and optimize scalable ETL/ELT pipelines using Databricks and Apache Spark (PySpark and Scala).
Build data ingestion workflows from various sources (structured and semi-structured).
Develop reusable components and frameworks for efficient data processing.
Implement best practices for data quality, validation, and governance.
Collaborate with data architects, analysts, and business stakeholders to understand data requirements.
Tune Spark jobs for performance and scalability in a cloud-based environment.
Maintain robust data lake or Lakehouse architecture.
Ensure high availability, security, and integrity of data pipelines and platforms.
Support troubleshooting, debugging, and performance optimization in production workloads.
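For context on the data-quality validation work described above, here is a minimal sketch in plain Python; the rule names, field names, and records are purely illustrative and not taken from the client's environment:

```python
# Minimal data-quality validation sketch: each rule is a (name, predicate)
# pair applied to every record; failures are collected for reporting.
# All names here are illustrative examples, not part of the job description.

def validate(records, rules):
    """Return a list of (record_index, rule_name) for every failed check."""
    failures = []
    for i, record in enumerate(records):
        for name, predicate in rules:
            if not predicate(record):
                failures.append((i, name))
    return failures

# Hypothetical rules for a payments-style dataset.
rules = [
    ("id_present", lambda r: r.get("id") is not None),
    ("amount_non_negative", lambda r: r.get("amount", 0) >= 0),
]

records = [
    {"id": 1, "amount": 10.0},
    {"id": None, "amount": -5.0},
]

print(validate(records, rules))
# → [(1, 'id_present'), (1, 'amount_non_negative')]
```

In a Databricks/PySpark pipeline the same idea would typically be expressed as column-level checks over a DataFrame rather than per-record predicates.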
Roles & Responsibilities
Work on migrating applications from on-premises locations to cloud service providers.
Develop products and services on the latest technologies through contributions to development, enhancements, testing, and implementation.
Develop, modify, and extend code for building cloud infrastructure, and automate it using CI/CD pipelines.
Partner with business and peers in pursuit of solutions that achieve business goals through an agile software development methodology.
Perform problem analysis, data analysis, reporting, and communication.
Work with peers across the system to define and implement best practices and standards.
Assess applications and help determine the appropriate application infrastructure patterns.
Use best practices and knowledge of internal or external drivers to improve products or services.
Thanks & Regards
Sumit Goyal
Sr. Technical Recruiter