Role | Data Engineer - Enterprise Data Platforms & Pipelines |
Visit our website to know more. Follow us on LinkedIn | Instagram | Facebook | X for exciting updates.
|
About the Unit / Unit Overview | This role supports DECO in designing, building, and operating scalable enterprise data platforms and data pipelines that enable analytics, machine learning, and AI use cases across the PDM domain. You will ensure reliable ingestion, transformation, semantic consistency, and availability of high-quality data across the business, production, logistics, and quality domains.
|
Location |
|
Experience: | Experience in Data Engineering, Backend Engineering, or Data Platform Development, preferably in AWS-based environments. |
Number of openings
|
|
What awaits you / Job Profile
| Data Engineering & Pipelines - Design, build, and maintain scalable batch and streaming data pipelines.
- Ingest data from heterogeneous sources (databases, APIs, files, events).
- Implement data transformation, enrichment, and validation logic.
- Ensure data quality, consistency, and reliability across the full data lifecycle.
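Purely as an illustration of the validation-and-enrichment duties above (not part of the original posting), a minimal Python sketch of a batch step that rejects invalid records and enriches valid ones; the field names are hypothetical:

```python
from datetime import datetime, timezone

REQUIRED_FIELDS = {"id", "source", "value"}  # hypothetical record schema


def transform(record: dict) -> dict:
    """Validate a raw record and enrich it with an ingestion timestamp."""
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        raise ValueError(f"record missing fields: {sorted(missing)}")
    enriched = dict(record)
    enriched["ingested_at"] = datetime.now(timezone.utc).isoformat()
    return enriched


def run_batch(records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split a batch into valid (transformed) and rejected records."""
    valid, rejected = [], []
    for rec in records:
        try:
            valid.append(transform(rec))
        except ValueError:
            rejected.append(rec)
    return valid, rejected
```

In a real pipeline the rejected records would typically be routed to a dead-letter location rather than silently collected.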
Data Platforms & Architecture - Develop and operate AWS-based data lake / lakehouse architectures.
- Use Amazon S3 as central data lake storage.
- Implement ETL/ELT pipelines using AWS Glue.
- Enable analytics and ad hoc queries via Amazon Athena.
- Work with structured and semi-structured data (SQL, JSON, Parquet, Avro).
- Optimize data models for analytics, reporting, and downstream ML / AI use cases.
- Collaborate closely with Data Scientists and ML Engineers to enable efficient model training and inference.
Streaming & Integration - Build and operate event-driven and streaming pipelines using Apache Kafka.
- Integrate enterprise systems (ERP, MES, PLM, logs, telemetry).
- Ensure robust error handling, retries, and operational monitoring.
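As an illustration of the retry behaviour mentioned above (again, not part of the posting itself), a generic Python sketch of retrying a transient failure with exponential backoff; in a streaming context this would wrap a consumer's poll-and-process step and emit metrics instead of sleeping silently:

```python
import time


def with_retries(fn, attempts=3, base_delay=0.01):
    """Call fn, retrying on failure with exponential backoff.

    A deliberately generic sketch: `fn`, `attempts`, and `base_delay`
    are illustrative names, not a real Kafka client API.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # exhausted retries: surface the error
            time.sleep(base_delay * (2 ** attempt))


# A flaky operation that succeeds on the third call, to exercise the retry loop.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"
```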
Cloud, MLOps & Operations - Deploy and operate data pipelines in cloud-native and containerized environments.
- Collaborate on CI/CD pipelines for data workloads.
- Monitor performance, costs, data freshness, and operational KPIs.
Data Modeling & Ontology Design - Design and maintain enterprise-grade data models for analytical and operational use cases.
- Develop conceptual, logical, and physical data models aligned with business domains.
- Define and manage domain models, canonical data models, and schemas.
- Design ontologies and semantic models to ensure consistent meaning, relationships, and interoperability of data.
- Collaborate with domain experts to translate business concepts into formal data and ontology structures.
- Ensure data models support analytics, ML, AI, and LLM-based use cases, including feature generation and knowledge-based reasoning.
- Maintain versioning, documentation, and governance of data models and ontologies.
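To illustrate the schema versioning and evolution concerns above (an illustration only, not the posting's text), a simplified Python check of whether a new schema version can still read data written under the old one. The rule here is a stricter simplification of Avro's actual resolution rules, and the dict-based schema shape is hypothetical:

```python
NO_DEFAULT = object()  # sentinel: field has no default value


def backward_compatible(old: dict, new: dict) -> bool:
    """Rule of thumb: every old field must survive with the same type,
    and any newly added field must carry a default.

    Schemas are plain dicts {name: (type, default)} — a simplified
    stand-in, not real Avro schemas or real Avro resolution.
    """
    for name, (typ, _default) in old.items():
        if name not in new or new[name][0] != typ:
            return False
    for name, (_typ, default) in new.items():
        if name not in old and default is NO_DEFAULT:
            return False
    return True
```

For example, adding an optional field with a default passes, while changing a field's type or dropping a field fails under this check.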
Delivery & Support model: Standard Business Hours, 5x9 (Mon-Fri, 08:00-17:00 CET/CEST, DST included). Ops activities take place in standard business hours; during India public holidays, at least one colleague must be on call.
|
What should you bring along
| Expected skill sets and experience that candidates should bring along. Key Qualifications and Skills - Strong experience in data engineering and backend development.
- Solid understanding of data modeling, data quality, and pipeline design.
- Experience operating enterprise-scale data platforms.
- Structured, reliability-focused mindset with strong ownership.
- Ability to collaborate across analytics, ML, and platform teams.
|
Must-have technical skills | - Python (advanced)
- SQL (advanced)
- Data pipeline development (batch & streaming)
- Apache Kafka (topics, partitions, consumer groups, schema evolution)
- AWS data technologies:
- Amazon S3
- AWS Glue
- Amazon Athena
- Amazon Redshift
- Data lake / lakehouse architectures
- Cloud-native data engineering
- CI/CD and automation for data workloads
- Data modeling (conceptual, logical, physical)
- Ontology modeling & semantic data modeling
- Domain-driven design (DDD) concepts
- Schema design and evolution (Avro, JSON Schema, Parquet)
- SQL-based data modeling for analytics
- Normalization and denormalization strategies
|
Good-to-have technical skills | - Kafka ecosystem (Kafka Connect, Schema Registry)
- Apache Spark / Flink on AWS
- Workflow orchestration (Airflow, Dagster)
- Infrastructure as Code (Terraform, CloudFormation)
- Observability & data quality tooling
- Automotive or enterprise-scale data domains
- Support for ML & AI use cases (feature stores, training data pipelines)
- Knowledge graphs and semantic technologies (RDF, OWL, SPARQL)
- Metadata management and data catalogs
- Master Data Management (MDM) concepts
- Semantic modeling to support LLM / AI / RAG use cases
- Data governance and data ownership concepts
|
Required Experience:
Manager