- Design, develop, and maintain data pipelines and ETL processes using PySpark or Scala.
- Work extensively with Python to build scalable data solutions.
- Apply strong SQL skills for querying and data transformation.
- Develop and manage data workflows in Azure Data Factory (ADF).
- Implement and optimize data solutions on cloud platforms such as Azure or AWS.
- Ensure data integrity, performance, and security across data pipelines.
- Collaborate with data scientists, analysts, and other stakeholders to support business needs.
- Lead and mentor a team of junior data engineers, where applicable.
Requirements
- 6 to 10 years of experience in data engineering or related roles.
- Strong hands-on experience with PySpark or Scala.
- Proficiency in Python for data processing and automation.
- Experience working with Azure Data Factory (ADF).
- Hands-on experience with cloud platforms (Azure or AWS).
- Strong understanding of ETL processes, data warehousing, and big data technologies.
- Experience in leading a team is a plus.
- Proficiency in multiple authentication mechanisms, including JWT and signature-based authentication.
- Understanding of data encryption, API security, and secure authentication.
- Strong expertise in both SQL and NoSQL databases.
- Experience with query optimization and indexing.
- CI/CD pipeline knowledge is mandatory.
- Must have experience with Azure or AWS, particularly in deployment.
- Experience with unit testing frameworks (Jest, Mocha, etc.).
- Knowledge of integration and end-to-end testing.
- Ability to write testable code and work with CI/CD pipelines for automated testing.
- Problem-solving mindset with the ability to optimize existing solutions.
- Must be able to provide accurate project timeline estimations.
- Expected to deliver high-quality, well-documented code within deadlines.