Location: Pan India
Mode: Hybrid, General Shift (09:00 AM to 06:15 PM)
Experience: 6 Years
We are seeking a skilled Data Engineer with strong experience in building scalable data platforms and ETL pipelines. The ideal candidate will consolidate, process, and optimize large-scale datasets to enable analytics and real-time insights for retail business use cases.
Key Responsibilities
Design, develop, and maintain robust ETL/ELT pipelines for structured and unstructured data
Build and manage scalable data lakes, data warehouses, and databases
Integrate and consolidate data from multiple heterogeneous sources into a unified data model
Perform data cleansing, validation, enrichment, and standardization to ensure high data quality
Optimize data storage structures for efficient querying and analytics
Migrate, manage, and optimize data systems on cloud platforms (AWS)
Tune databases and data processing jobs for performance and cost efficiency
Implement data governance, integrity, privacy, and compliance policies (e.g., GDPR)
Develop pipelines and frameworks to support near real-time and real-time data processing
Automate data workflows and operational processes for reliability and scalability
Skills
ETL pipeline development
Big Data technologies
Python (PySpark)
SQL
Cloud platforms: AWS
Python libraries: Pandas, NumPy, Scikit-learn, Regex
Databases: Amazon Redshift, PostgreSQL
Experience in Retail domain data models and analytics
Exposure to data governance and compliance frameworks