Data Engineer

CAI

Job Location: Bengaluru - India

Monthly Salary: Not Disclosed
Posted on: 30+ days ago
Vacancies: 1 Vacancy

Job Summary

Data Engineer

Req number: R6413
Employment type: Full time
Worksite flexibility: Remote

Who we are

CAI is a global technology services firm with over 8,500 associates worldwide and a yearly revenue of $1 billion. We have over 40 years of excellence in uniting talent and technology to power the possible for our clients, colleagues, and communities. As a privately held company, we have the freedom and focus to do what is right, whatever it takes. Our tailor-made solutions create lasting results across the public and commercial sectors, and we are trailblazers in bringing neurodiversity to the enterprise.

Job Summary

We are seeking a motivated Data Engineer to join our dynamic team. As a Data Engineer, you will play a crucial role in building cloud-based data lake and analytics architectures using AWS and Databricks, and you should be proficient in Python programming for data processing and automation. This is a full-time, remote position.

Job Description

We are looking for a Data Engineer who has experience building data products using Databricks and related technologies. This is a full-time, remote position.

What You'll Do

  • Design, develop, and maintain data lakes and data pipelines on AWS using ETL frameworks and Databricks

  • Integrate and transform large-scale data from multiple heterogeneous sources into a centralized data lake environment

  • Implement and manage Delta Lake architecture using Databricks Delta or Apache Hudi

  • Develop end-to-end data workflows using PySpark, Databricks Notebooks, and Python scripts for ingestion, transformation, and enrichment

  • Design and develop data warehouses and data marts for analytical workloads using Snowflake, Redshift, or similar systems

  • Design and evaluate data models (Star, Snowflake, Flattened) for analytical and transactional systems

  • Optimize data storage, query performance, and cost across the AWS and Databricks ecosystem

  • Build and maintain CI/CD pipelines for Databricks notebooks, jobs, and Python-based data processing scripts

  • Collaborate with data scientists, analysts, and stakeholders to deliver high-performance, reusable data assets

  • Maintain and manage code repositories (Git) and promote best practices in version control, testing, and deployment

  • Participate in making major technical and architectural decisions for data engineering initiatives

  • Monitor and troubleshoot Databricks clusters, Spark jobs, and ETL processes for performance and reliability

  • Coordinate with business and technical teams through all phases of the software development life cycle

What You'll Need

Required

  • 5 years of experience building and managing Data Lake Architecture on AWS Cloud

  • 3 years of experience with AWS data services such as S3, Glue, Lake Formation, EMR, Kinesis, RDS, DMS, and Redshift

  • 3 years of experience building Data Warehouses on Snowflake, Redshift, HANA, Teradata, or Exasol

  • 3 years of hands-on experience working with Apache Spark or PySpark on Databricks

  • 3 years of experience implementing Delta Lakes using Databricks Delta or Apache Hudi

  • 3 years of experience in ETL development using Databricks, AWS Glue, or other modern frameworks

  • Proficiency in Python for data engineering, automation, and API integrations

  • Experience in Databricks Jobs, Workflows, and Cluster Management

  • Experience with CI/CD pipelines and Infrastructure as Code (IaC) tools like Terraform or CloudFormation is a plus

  • Bachelor's degree in Computer Science, Information Technology, Data Science, or a related field

Physical Demands

  • This role involves mostly sedentary work, with occasional movement around the office to attend meetings, etc.

  • Ability to perform repetitive tasks on a computer using a mouse, keyboard, and monitor

Reasonable accommodation statement

If you require a reasonable accommodation in completing this application, interviewing, completing any pre-employment testing, or otherwise participating in the employment selection process, please direct your inquiries to or (888).

Key Skills

  • Apache Hive
  • S3
  • Hadoop
  • Redshift
  • Spark
  • AWS
  • Apache Pig
  • NoSQL
  • Big Data
  • Data Warehouse
  • Kafka
  • Scala

About Company

CAI helps organizations leverage technology, people, and processes to solve business problems, enable savings, and spur innovation.