What awaits you / Job Profile
- Providing estimates for requirements analysis and developing as per the requirements.
- Developing and maintaining data pipelines and ETL (Extract, Transform, Load) processes to extract data efficiently and reliably from various sources, transform it into a usable format, and load it into the appropriate data repositories.
- Creating and maintaining logical and physical data models that align with the organization's data architecture and business needs. This includes defining data schemas, tables, relationships, and indexing strategies for optimal data retrieval and analysis.
- Collaborating with cross-functional teams and stakeholders to ensure data security, privacy, and compliance with regulations.
- Collaborating with downstream application teams to understand their needs, and building and optimizing data storage accordingly.
- Working closely with other stakeholders and the business to understand data requirements and translate them into technical solutions.
- Familiarity with Agile methodologies and prior experience working with Agile teams using Scrum/Kanban.
- Leading technical discussions with customers to find the best possible solutions.
- Proactively identifying and implementing opportunities to automate tasks and develop reusable frameworks.
- Optimizing data pipelines to improve performance and reduce cost while ensuring high data quality within the data lake.
- Monitoring services and jobs for cost and performance, ensuring continual operation of data pipelines and fixing defects.
- Constantly looking for opportunities to optimize data pipelines and improve performance.
What should you bring along
Must have:
- Hands-on expertise of 4-5 years with AWS services such as S3, Lambda, Glue, Athena, RDS, Step Functions, SNS, SQS, API Gateway, security (access and role permissions), and logging and monitoring services.
- Good hands-on knowledge of Python, Spark, Hive, Unix, and the AWS CLI.
- Prior experience working with streaming solutions such as Kafka.
- Prior experience implementing storage formats such as Delta Lake / Iceberg.
- Excellent knowledge of data modeling and ETL pipeline design.
- Strong knowledge of working with different databases such as MySQL and Oracle, including writing complex queries.
- Strong experience working with continuous integration and deployment processes.
Nice to have:
- Hands-on experience with Terraform, Git, GitHub Actions, CI/CD pipelines, and Amazon Q.
Must-have technical skills: PySpark, AWS, SQL, Kafka, Glue, IAM, S3, Lambda, Step Functions, Athena
Good-to-have technical skills: Terraform, Git, GitHub Actions, CI/CD pipelines, AI
Required Experience: Senior IC