Should have a minimum of 7 years of experience in Data Engineering and Data Analytics platforms.
Should have a strong hands-on design and engineering background in AWS across a wide range of
AWS services, with demonstrable experience on large engagements.
Should be involved in requirements gathering and in transforming requirements into functional and
technical designs.
Maintain and optimize the data infrastructure required for accurate extraction, transformation, and
loading of data from a wide variety of data sources.
Design, build, and maintain batch or real-time data pipelines in production.
Develop ETL/ELT (extract, transform, load) data pipeline processes to extract and manipulate
data from multiple sources, as in the sketch below.
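By way of illustration only, a minimal PySpark ETL sketch; the S3 paths, column names, and schema are hypothetical assumptions, not details from this role description:

    # Minimal PySpark ETL sketch; all paths and columns are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("orders_etl").getOrCreate()

    # Extract: read raw CSV files from a hypothetical landing bucket.
    raw = spark.read.option("header", "true").csv("s3://example-landing/orders/")

    # Transform: type-cast, deduplicate, and derive a partition column.
    clean = (
        raw.withColumn("order_ts", F.to_timestamp("order_ts"))
           .withColumn("amount", F.col("amount").cast("double"))
           .dropDuplicates(["order_id"])
           .withColumn("order_date", F.to_date("order_ts"))
    )

    # Load: write partitioned Parquet to a hypothetical curated zone.
    clean.write.mode("overwrite").partitionBy("order_date").parquet("s3://example-curated/orders/")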
Automate data workflows such as data ingestion, aggregation, and ETL processing; should
have good experience with different types of data ingestion techniques: file-based, API-based,
and streaming data sources (OLTP, OLAP, ODS, etc.) as well as heterogeneous databases. An
API-based ingestion sketch follows.
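A sketch of simple API-based ingestion that lands raw JSON in S3; the endpoint, bucket, and key are assumptions:

    # API-to-S3 ingestion sketch; URL, bucket, and key are hypothetical.
    import json
    import boto3
    import requests

    def ingest_api_to_s3(url: str, bucket: str, key: str) -> None:
        """Pull JSON from an HTTP API and land it unmodified in S3."""
        payload = requests.get(url, timeout=30).json()
        boto3.client("s3").put_object(
            Bucket=bucket,
            Key=key,
            Body=json.dumps(payload).encode("utf-8"),
        )

    ingest_api_to_s3("https://api.example.com/orders", "example-landing", "orders/2024-01-01.json")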
Turn raw data in data warehouses into consumable datasets for both technical and non-
technical stakeholders.
Strong experience designing and implementing data lake, data warehouse, and data lakehouse
architectures.
Ensure data accuracy, integrity, privacy, security, and compliance through quality-control
procedures.
Monitor the performance of data systems and implement optimization strategies.
Leverage data controls to maintain data privacy, security, compliance, and quality for allocated
areas of ownership.
Experience with AWS tools (AWS S3, EC2, Athena, Redshift, Glue, EMR, Lambda, RDS, Kinesis,
DynamoDB, QuickSight, etc.); see the boto3 sketch below.
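As one small example of working with these services programmatically, a boto3 sketch that submits an Athena query; the database name, query, and results bucket are assumptions:

    # boto3 Athena sketch; database, query, and output bucket are hypothetical.
    import boto3

    athena = boto3.client("athena", region_name="us-east-1")

    response = athena.start_query_execution(
        QueryString="SELECT order_date, SUM(amount) AS revenue FROM orders GROUP BY order_date",
        QueryExecutionContext={"Database": "analytics_db"},  # hypothetical database
        ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},  # hypothetical bucket
    )
    print(response["QueryExecutionId"])  # poll get_query_execution with this id for status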
Strong experience with Python, SQL, PySpark, Scala, shell scripting, etc.
Strong experience with workflow management & orchestration tools (e.g., Airflow); a minimal
DAG sketch follows.
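A minimal Airflow DAG sketch, assuming Airflow 2.x; the DAG id, schedule, and task bodies are placeholders:

    # Minimal Airflow 2.x DAG sketch; ids, schedule, and callables are hypothetical.
    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def ingest():
        print("ingest raw files")  # placeholder for real ingestion logic

    def transform():
        print("run transformations")  # placeholder for real transform logic

    with DAG(
        dag_id="daily_orders_pipeline",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
        transform_task = PythonOperator(task_id="transform", python_callable=transform)
        ingest_task >> transform_task  # run transform only after ingest succeeds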
Should have solid experience with, and a good understanding of, data manipulation/wrangling techniques.
Demonstrable knowledge of applying Data Engineering best practices (coding practices, unit
testing, version control, code review); a unit-testing sketch follows.
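To illustrate the unit-testing practice, a pytest-style sketch for a small transformation helper; the function and test names are illustrative:

    # pytest-style unit tests for a hypothetical transformation helper.
    def normalize_amount(raw: str) -> float:
        """Strip currency symbols and thousands separators, then convert to float."""
        return float(raw.replace("$", "").replace(",", ""))

    def test_normalize_amount_strips_symbols():
        assert normalize_amount("$1,234.50") == 1234.5

    def test_normalize_amount_plain_number():
        assert normalize_amount("99") == 99.0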
Big data ecosystems: Cloudera/Hortonworks, AWS EMR, etc.
Snowflake Data Warehouse/Platform.
Streaming technologies and processing engines: Kinesis, Kafka, Pub/Sub, and Spark Streaming
(see the streaming sketch below).
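A Spark Structured Streaming sketch reading from Kafka; the broker, topic, and sink paths are assumptions, and the spark-sql-kafka connector package must be available to the Spark session:

    # Structured Streaming from Kafka; broker, topic, and paths are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("events_stream").getOrCreate()

    events = (
        spark.readStream.format("kafka")
             .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
             .option("subscribe", "events")                     # hypothetical topic
             .load()
             .selectExpr("CAST(value AS STRING) AS payload", "timestamp")
    )

    query = (
        events.writeStream.format("parquet")
              .option("path", "s3://example-streaming/events/")          # hypothetical sink
              .option("checkpointLocation", "s3://example-chk/events/")  # hypothetical checkpoint
              .trigger(processingTime="1 minute")
              .start()
    )
    query.awaitTermination()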
Experience working with CI/CD technologies: Git, Jenkins, Spinnaker, Ansible, etc.
Experience building and deploying solutions to AWS Cloud.
Good experience with NoSQL databases such as DynamoDB, Redis, Cassandra, MongoDB, or
Neo4j (a DynamoDB sketch follows).
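For instance, a tiny boto3 DynamoDB sketch; the table name and key schema are assumptions:

    # boto3 DynamoDB sketch; table name and attributes are hypothetical.
    import boto3

    table = boto3.resource("dynamodb").Table("orders")  # table assumed to exist, keyed on order_id
    table.put_item(Item={"order_id": "o-123", "amount": "42.50"})
    item = table.get_item(Key={"order_id": "o-123"})["Item"]
    print(item)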
Experience working with large datasets and distributed computing (e.g.,
Hive/Hadoop/Spark/Presto/MapReduce).
Good to have: working knowledge of data visualization tools such as Tableau, Amazon QuickSight,
Power BI, QlikView, etc.
Experience in the insurance domain is preferred.