Role Overview
We are looking for a Senior Big Data Engineer with strong hands-on experience in the Hadoop ecosystem and Scala-based development. The candidate will be responsible for building, optimizing, and maintaining large-scale distributed data processing systems, including batch and real-time streaming pipelines, to support high-performance analytics and data-driven solutions.
Roles & Responsibilities
Big Data & Hadoop Development
- Design, develop, and maintain large-scale Big Data solutions using the Hadoop ecosystem.
- Work extensively with HDFS, MapReduce, HBase, Hive, Spark, Kafka, Presto, and related technologies.
- Develop high-performance data processing jobs using Scala (mandatory) and Spark (an illustrative sketch follows this list).
- Optimize Hadoop jobs for performance, scalability, reliability, and cost efficiency.
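For illustration only, a minimal sketch of the kind of Scala/Spark batch job this responsibility covers. The HDFS paths and column names (event_ts, user_id) are hypothetical, not part of the actual stack.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object DailyAggregationJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("daily-aggregation")
      .getOrCreate()

    // Assumed HDFS location for raw event data.
    val events = spark.read.parquet("hdfs:///data/raw/events")

    // Aggregate event counts per user and day, then write back partitioned by date.
    events
      .withColumn("event_date", to_date(col("event_ts")))
      .groupBy(col("event_date"), col("user_id"))
      .agg(count(lit(1)).as("event_count"))
      .write
      .mode("overwrite")
      .partitionBy("event_date")
      .parquet("hdfs:///data/curated/daily_event_counts")

    spark.stop()
  }
}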
Data Engineering & ETL Pipelines
- Design and build end-to-end data pipelines and ETL workflows, ingesting data from heterogeneous sources.
- Implement data ingestion using Kafka, Flume, Sqoop, Spark Streaming, and batch ingestion frameworks.
- Ensure data quality, consistency, and fault tolerance across ingestion and transformation layers (an illustrative sketch follows this list).
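As a hedged illustration of the ingest-then-quality-gate pattern described above, the sketch below reads from a relational source with Spark and applies basic validity checks before landing curated data. The JDBC connection string, table, and column names (order_id, customer_id, amount) are assumptions for the example only.

import org.apache.spark.sql.{SaveMode, SparkSession}
import org.apache.spark.sql.functions._

object OrdersEtl {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("orders-etl").getOrCreate()

    // Hypothetical JDBC source; in practice this could equally be Sqoop-landed files.
    val raw = spark.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://source-db:5432/sales")
      .option("dbtable", "public.orders")
      .option("user", sys.env.getOrElse("DB_USER", "etl"))
      .option("password", sys.env.getOrElse("DB_PASSWORD", ""))
      .load()

    // Basic data-quality gate: drop records missing keys, flag negative amounts.
    val clean = raw
      .filter(col("order_id").isNotNull && col("customer_id").isNotNull)
      .withColumn("amount_valid", col("amount") >= 0)

    clean.write.mode(SaveMode.Overwrite).parquet("hdfs:///data/curated/orders")

    spark.stop()
  }
}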
Streaming & Real-Time Processing
- Build and maintain real-time stream processing systems using Spark Streaming and Kafka (a minimal sketch follows this list).
- Process high-velocity data streams and enable near-real-time analytics use cases.
- Handle event-driven architectures and ensure low-latency data delivery.
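A minimal Spark Structured Streaming sketch reading from Kafka, shown only to illustrate the stream-processing responsibility. The broker addresses, the clickstream topic, and the console sink are placeholders, and the spark-sql-kafka connector dependency is assumed.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object ClickstreamStream {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("clickstream-stream").getOrCreate()

    // Hypothetical broker and topic names.
    val kafkaSource = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
      .option("subscribe", "clickstream")
      .option("startingOffsets", "latest")
      .load()

    // Count events per one-minute window; the watermark bounds state for late data.
    val counts = kafkaSource
      .select(col("timestamp"), col("value").cast("string").as("payload"))
      .withWatermark("timestamp", "5 minutes")
      .groupBy(window(col("timestamp"), "1 minute"))
      .count()

    val query = counts.writeStream
      .outputMode("update")
      .format("console") // placeholder sink; real jobs would write to HBase, Druid, etc.
      .option("truncate", "false")
      .start()

    query.awaitTermination()
  }
}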
Automation, CI/CD & Cloud
- Implement CI/CD pipelines for Big Data applications using modern DevOps practices.
- Automate build, deployment, and monitoring processes for distributed data systems.
- Work with cloud-based deployments and hybrid architectures.
Open-Source & Advanced Tools
Collaboration & Delivery
- Collaborate with cross-functional teams, including data scientists, architects, QA, and business stakeholders.
- Participate in design discussions, code reviews, and architectural decisions.
- Troubleshoot production issues and provide root cause analysis and permanent fixes.
- Ensure adherence to coding standards, security practices, and enterprise data governance.
Mandatory Skills (Top 3 Must Have)
- Hands-on experience with the Hadoop ecosystem
- Data Pipeline & ETL Development
- Strong Programming & Automation Skills:
  - Expert-level coding in Scala (mandatory)
  - Comfortable with Python or Java
  - Strong development, automation, and scripting skills
Desired / Good-to-Have Skills
- Python
- Java
- Experience with Druid, Elasticsearch, Logstash
- Cloud-based Big Data deployments
- DevOps and CI/CD integration
Key Competencies
- Strong problem-solving and analytical skills
- Ability to adapt quickly to new Big Data tools and frameworks
- Experience working in fast-paced, enterprise-scale environments
- Excellent communication and collaboration skills