Data Engineer

Job Location: Colombo - Sri Lanka

Monthly Salary: Not Disclosed
Posted on: 5 hours ago
Vacancies: 1

Job Summary

Key Responsibilities

Data Operations & Pipeline Support

  • Assist in ingesting, collecting, validating, and storing structured/unstructured batch data arriving through edge nodes or direct database connections
  • Support ETL/ELT jobs running on Hadoop, Hive, Impala, and Spark
  • Monitor daily data loads, troubleshoot failures, and ensure data availability for analytics use cases
  • Maintain the HDFS directory structure, Hive tables, and data partitions
  • Perform file-level data quality checks, checksum validations, and table-level validations for data consistency

Platform & Infrastructure Operations

  • Support the operation of on-prem Hadoop clusters (Cloudera)
  • Assist in OS-level tasks: log checks, service restarts, disk usage monitoring, and user/permission handling
  • Assist in regular Big Data cluster health checks
  • Support platform upgrades, patches, configuration changes, and security hardening efforts managed by the senior engineer
  • Work with network and system teams during installation, troubleshooting, or hardware issues

Tools & Technologies

  • Assist in running and maintaining data flows involving Hive, Impala, HDFS, Spark, Kafka (basic), HBase (basic), and Linux environments
  • Use tools such as NiFi and SFTP for data movement, including NiFi flow development and NiFi cluster management
  • Support API-based data push/pull if required for integrations

Data Governance & Documentation

  • Maintain metadata, data dictionary updates, and platform documentation
  • Ensure compliance with Kerberos/LDAP authentication and Cloudera Navigator governance processes
  • Record operational runbooks and incident logs

Collaboration & Support

  • Work under the senior engineer to ensure continuous operations of the client environment
  • Participate in joint troubleshooting with the client team during data-source onboarding
  • Provide L1/L2 support for data ingestion, cluster operations, and daily job executions

Work Complexity and Role Expectation

  • Work on assigned operational tasks within the Big Data platform under guidance
  • Support development, testing, and automation of simple data flows
  • Take part in routine batch workloads and testbed validations
  • Participate as a team member in platform enhancements, monitoring improvements, and data integration activities

Person Specifications

Education

  • Bachelor's degree in Computer Science, IT, Electronics/Telecom Engineering, or a related field

Technical Skills

  • Basic knowledge of the Hadoop ecosystem: HDFS, Hive, Spark, YARN (hands-on exposure is an added benefit)
  • Familiarity with Linux shell commands; ability to navigate logs and services
  • Good understanding of SQL; able to write and troubleshoot complex queries
  • Exposure to Python/Scala/Java is an added advantage
  • Basic understanding of data pipelines, ETL processes, and batch data workflows
  • Exposure to the Cloudera platform is a plus

Experience

  • 12 years of experience in Data Engineering, database operations, or Big Data platform support
  • Experience in the telecom domain or enterprise data environments is an added advantage

Soft Skills

  • Good analytical and troubleshooting mindset
  • Ability to collaborate with senior engineers and follow structured operational practices
  • Effective communication and willingness to learn complex distributed systems

Key Skills

  • Apache Hive
  • S3
  • Hadoop
  • Redshift
  • Spark
  • AWS
  • Apache Pig
  • NoSQL
  • Big Data
  • Data Warehouse
  • Kafka
  • Scala