We are seeking a highly skilled and experienced Data Engineering Lead with a strong background in the retail domain and exceptional programming abilities. As a Lead, you will play a pivotal role in implementing and optimizing data architecture to support our retail business operations and analytics initiatives. Your expertise in Spark programming and optimization techniques, along with your familiarity with Databricks and CI/CD practices, will be instrumental in ensuring the efficient and effective management of our data ecosystem.
Important: the skills listed below should appear in the CV.
Data Platform (Data Engineers)
- Data Lake: AWS S3 / Azure / Dell Cloud Data Lake with Delta Lake format
- Data Warehouse: Databricks, IOMETE, Google BigQuery for analytical workloads
- Batch Processing / Stream Processing: batch and real-time processing of data pipelines
- Database: PostgreSQL for transactional data; NoSQL (e.g. MongoDB, Cassandra) for document storage
- Design and develop data models, data integration processes, and data pipelines to capture, transform, and load structured and unstructured data from various retail sources.
- Hands-on programming in Spark to develop and optimize data processing applications and analytics workflows.
- Apply optimization techniques to enhance the performance and efficiency of data processing and analytical tasks.
- Evaluate and implement appropriate tools and technologies, including Databricks, to streamline data operations and ensure scalability and reliability.
- Work closely with other team members to ensure data integrity, consistency, and accessibility across the organization.
- Define and enforce best practices for data governance and data management, including data quality, metadata management, and data security.
- Collaborate with DevOps teams to establish and maintain CI/CD pipelines for data engineering and analytics workflows.
- Peer review of team members' deliverables.
- Stay updated with the latest advancements and trends in the retail domain, data architecture, and programming languages to drive continuous improvement.
Requirements:
- At least 5 years of experience in the data engineering domain.
- Proven experience as a Senior Data Engineer / Lead, preferably within the retail industry.
- Strong programming skills with expertise in PySpark programming and optimization techniques.
- Hands-on experience with Databricks Delta Lake and its components for data processing and analytics.
- Hands-on experience in data modelling, data integration, and ETL/ELT processes.
- Experience in working with GitLab pipelines and an in-depth understanding of CI/CD pipeline designs.
- Experience with data governance, data quality, and metadata management.
- Strong analytical and problem-solving abilities with a detail-oriented mindset.
- Excellent communication and collaboration skills to work effectively with cross-functional teams.
- Ability to adapt to a fast-paced and evolving environment while managing multiple priorities.
- Good to have: experience with at least one cloud vendor (AWS / Azure / GCP).
- Good to have: experience with streaming technologies.
Required Experience:
Senior IC