Position Overview:
We are seeking an experienced Data Engineer to join our data team and help build the scalable data infrastructure that powers business intelligence, analytics, and machine learning initiatives. The ideal candidate will design, develop, and maintain robust, high-performance data pipelines and solutions while ensuring data quality, reliability, and accessibility across the organization, working with cutting-edge technologies such as Python, Microsoft Fabric, Snowflake, Dataiku, SQL Server, Oracle, and PostgreSQL.
Experience: 5 to 7 years
Location: Bengaluru
Employment Type: Full-time / Permanent
Key Responsibilities:
Data Pipeline Development
Design, build, and maintain highly efficient and scalable ETL/ELT pipelines to process large volumes of structured and unstructured data
Implement real-time and batch data processing solutions using modern data engineering tools (e.g., Fabric, Spark, Kafka, Airflow)
Optimize data workflows for performance, reliability, and cost-effectiveness
Monitor pipeline health and implement automated alerting and recovery mechanisms
Data Architecture & Infrastructure
Collaborate with architects to design and implement Data Warehouse, Data Lake, and Lakehouse solutions (e.g., Fabric, Snowflake)
Build and maintain cloud-based data infrastructure (Azure, AWS, or GCP)
Implement data governance frameworks and ensure compliance with data privacy regulations
Design schemas and data models for optimal query performance and scalability
Data Quality & Integration
Develop data validation frameworks and implement quality checks throughout pipelines
Integrate data from multiple sources, including APIs, databases, streaming platforms, and third-party services
Create and maintain data documentation, lineage tracking, and metadata management
Troubleshoot data quality issues and implement corrective measures
Collaboration & Support
Partner with data scientists, analysts, and business stakeholders to understand data requirements
Support analytics teams with data access, modeling, and optimization
Participate in code reviews and maintain high standards for data engineering practices
Mentor junior team members and contribute to team knowledge sharing
Preferred Skills
Knowledge of machine learning workflows and MLOps practices
Familiarity with data visualization tools (Tableau, Looker, Power BI)
Experience with stream processing and real-time analytics
Experience with data governance and compliance frameworks (GDPR, CCPA)
Contributions to open-source data engineering projects
Relevant cloud certifications (e.g., Microsoft Certified: Azure Data Engineer Associate, AWS Certified Data Engineer, Google Cloud Professional Data Engineer)
Specific experience or certifications in Microsoft Fabric, Dataiku, or Snowflake
Required Qualifications
Technical Skills
Programming Languages: Proficiency in Python
Cloud Platforms: Hands-on experience with Azure (Fabric, Synapse, Data Factory, Event Hubs)
Experience: At least 3 years in data engineering or related roles, with 5+ years of overall software engineering experience
Databases: Strong SQL skills and experience with both relational (Microsoft SQL Server, PostgreSQL, MySQL) and NoSQL (MongoDB, Cassandra) databases
Version Control: Proficiency with Git and collaborative development workflows
Proven track record of building production-grade data pipelines and solutions that handle large-scale data
Experience with containerization (Docker) and orchestration (Kubernetes) technologies is desirable
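To give candidates a flavor of the data-quality work described above (validation frameworks, quality checks throughout pipelines, rejecting bad records with reasons), here is a minimal sketch in plain Python. The record schema, field names, and function names below are illustrative assumptions, not part of the role or any specific system we use:

```python
# Minimal sketch of an ETL cleansing step with inline data-quality checks.
# REQUIRED_FIELDS and the order-record shape are hypothetical examples.

REQUIRED_FIELDS = {"order_id", "amount", "currency"}

def validate(record: dict) -> list[str]:
    """Return a list of data-quality problems for one record (empty = valid)."""
    problems = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if "amount" in record and not isinstance(record["amount"], (int, float)):
        problems.append("amount is not numeric")
    return problems

def clean_orders(raw_records: list[dict]):
    """Split raw input into valid records and rejects paired with reasons."""
    valid, rejected = [], []
    for record in raw_records:
        problems = validate(record)
        if problems:
            rejected.append((record, problems))
        else:
            valid.append(record)
    return valid, rejected

raw = [
    {"order_id": 1, "amount": 10.5, "currency": "USD"},
    {"order_id": 2, "amount": "ten"},  # bad type and missing currency
]
good, bad = clean_orders(raw)
print(len(good), len(bad))  # 1 1
```

In a production pipeline the same split would typically route rejects to a quarantine table or dead-letter queue for the troubleshooting and corrective measures mentioned above.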