As part of the Data Engineering team, you will be responsible for the design, development, and operation of large-scale data systems operating at petabyte scale. You will focus on real-time data pipelines, streaming analytics, distributed big data, and machine learning infrastructure. You will interact with engineers, product managers, BI developers, and architects to provide scalable, robust technical solutions.
- Experience working in agile development models
- Design, develop, implement, and tune large-scale distributed systems and pipelines that process large volumes of data, focusing on scalability, low latency, and fault tolerance in every system built (see the streaming sketch after this list)
- Experience with Java or Python for writing data pipelines and data processing layers
- Experience with Airflow and GitHub (see the DAG sketch after this list)
- Experience writing MapReduce jobs (see the MapReduce sketch after this list)
- Demonstrated expertise in writing complex, highly optimized queries across large data sets (see the SQL sketch after this list)
- Proven working expertise with big data technologies: Hadoop, Hive, Kafka, Presto, Spark, and HBase
- Highly proficient in SQL
- Experience with cloud technologies (GCP, Azure)
- Experience with relational and in-memory data stores is desirable (Oracle, Cassandra, Druid)
- Provide and support the implementation and operation of data pipelines and analytical solutions
- Experience performance-tuning systems that work with large data sets
- Experience with REST API data services and data consumption
- Retail experience is a huge plus.
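
To give a concrete sense of the pipeline work above, here is a minimal sketch of a Spark Structured Streaming job consuming events from Kafka. The broker address, topic name, event schema, and checkpoint path are all hypothetical placeholders; the team's actual stack may differ.

```python
# A minimal sketch of a streaming pipeline using PySpark Structured Streaming
# and its Kafka source. Connection details and the event schema are assumed.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json, window
from pyspark.sql.types import DoubleType, StringType, StructType, TimestampType

spark = SparkSession.builder.appName("orders-stream").getOrCreate()

# Assumed shape of the incoming JSON events.
schema = (StructType()
          .add("order_id", StringType())
          .add("amount", DoubleType())
          .add("event_time", TimestampType()))

# Read raw events from Kafka (broker and topic names are placeholders).
raw = (spark.readStream.format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")
       .option("subscribe", "orders")
       .load())

# Parse the JSON payload and aggregate per one-minute window; the watermark
# bounds streaming state, keeping the job low-latency and fault-tolerant.
events = (raw.select(from_json(col("value").cast("string"), schema).alias("e"))
             .select("e.*"))
per_minute = (events.withWatermark("event_time", "10 minutes")
                    .groupBy(window(col("event_time"), "1 minute"))
                    .agg({"amount": "sum"}))

# Checkpointing lets the query recover where it left off after a failure.
query = (per_minute.writeStream.outputMode("update")
         .format("console")
         .option("checkpointLocation", "/tmp/checkpoints/orders")
         .start())
query.awaitTermination()
```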
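On the SQL side, a hedged example of the kind of optimized analytical query the role involves, executed through Spark SQL. The orders table and its columns are hypothetical and assumed to already be registered in the catalog.

```python
# A hedged example of an optimized analytical query, run through Spark SQL.
# The "orders" table and its columns are hypothetical and assumed to be
# registered in the Spark catalog (e.g., as a partitioned Hive table).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spend-report").getOrCreate()

top_spenders = spark.sql("""
    SELECT customer_id,
           SUM(amount) AS total_spend,
           COUNT(*)    AS order_count
    FROM   orders
    WHERE  order_date >= DATE '2024-01-01'  -- predicate enables partition pruning
    GROUP  BY customer_id
    HAVING SUM(amount) > 1000
    ORDER  BY total_spend DESC
    LIMIT  100
""")
top_spenders.show()
```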
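For the orchestration bullet, a minimal Airflow DAG sketch for a daily batch pipeline. The dag_id, schedule, and task callables are placeholders rather than real jobs.

```python
# A minimal Airflow DAG sketch: two dependent tasks on a daily schedule,
# with retries for resilience. All names here are illustrative placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    """Pull the day's raw data (placeholder)."""

def transform():
    """Clean and aggregate the extracted data (placeholder)."""

with DAG(
    dag_id="daily_orders_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    extract_task >> transform_task
```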
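And for the MapReduce bullet, a classic word-count sketch using the mrjob library; mrjob is an assumption made here for brevity, and raw Hadoop Streaming or Java MapReduce would follow the same mapper/reducer structure.

```python
# A classic MapReduce word count sketched with mrjob (an assumption for
# illustration): the mapper emits (word, 1) pairs, Hadoop shuffles by key,
# and the reducer sums the counts for each word.
from mrjob.job import MRJob

class WordCount(MRJob):
    def mapper(self, _, line):
        # Emit one (word, 1) pair per token in the input line.
        for word in line.split():
            yield word.lower(), 1

    def reducer(self, word, counts):
        # Sum the shuffled counts for each word.
        yield word, sum(counts)

if __name__ == "__main__":
    WordCount.run()
```

Such a job can be run locally (for example, `python word_count.py input.txt`) or against a cluster with mrjob's `-r hadoop` runner.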
Qualifications:
Bachelor's degree or engineering degree in one of the following areas:
- Computer Systems
- Computer Science
- Information Technology
- Software Engineering
- Applied Mathematics
- Statistics
- or related field
OR at least 4 years of verifiable experience in roles related to:
- Data engineering
- Big data development
- Large-scale data processing
- Data pipeline development
- Use of technologies such as Hadoop, Spark, Kafka, and Hive
- English at a conversational level
Remote Work:
Yes
Employment Type:
Full-time