Job Description
Cloud-Native Data Engineering on AWS
- Strong hands-on expertise in AWS-native data services: S3, Glue (Schema Registry, Data Catalog), Step Functions, Lambda, Lake Formation, Athena, MSK/Kinesis, EMR (Spark), SageMaker (incl. Feature Store)
- Comfort designing and optimizing pipelines for both batch (Step Functions) and streaming (Kinesis/MSK) ingestion.
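The batch-versus-streaming distinction above can be illustrated with a minimal, vendor-neutral sketch. This is plain Python standing in for a Step Functions batch job and a Kinesis/MSK consumer; all function names here are hypothetical, not AWS APIs:

```python
from collections import deque

def batch_ingest(records, transform):
    """Batch style: process the full dataset in one scheduled run."""
    return [transform(r) for r in records]

def stream_ingest(source, transform, window=3):
    """Streaming style: consume records one at a time and emit results
    per micro-window as the window fills (Kinesis/MSK consumer shape)."""
    buf = deque()
    for record in source:
        buf.append(transform(record))
        if len(buf) == window:
            yield list(buf)
            buf.clear()
    if buf:  # flush the final partial window
        yield list(buf)

# Same transformation, both modes: uppercase event names
events = ["click", "view", "buy", "view", "click"]
print(batch_ingest(events, str.upper))
print(list(stream_ingest(iter(events), str.upper, window=2)))
```

The practical difference the sketch surfaces: batch sees the whole dataset at once, while streaming must decide windowing and late-flush behavior up front.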
Data Mesh & Distributed Architectures
- Deep understanding of data mesh principles, including domain-oriented ownership, treating data as a product, and federated governance models
- Experience enabling self-service platforms and decentralized ingestion and transformation workflows.
Data Contracts & Schema Management
- Advanced knowledge of schema enforcement, evolution, and validation (preferably AWS Glue Schema Registry with JSON Schema/Avro)
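As a concrete instance of the schema-evolution rule above, here is a minimal sketch of the Avro-style backward-compatibility check (new readers must still decode old records, so every added field needs a default). The field-dict shape is hypothetical, not the Glue Schema Registry API:

```python
def is_backward_compatible(old_fields, new_fields):
    """Avro-style rule of thumb: a new schema stays backward compatible
    only if every field it adds carries a default value.
    Fields are {name: {"type": ..., "default": ...?}} dicts."""
    for name, spec in new_fields.items():
        if name not in old_fields and "default" not in spec:
            return False
    return True

v1 = {"id": {"type": "long"}, "email": {"type": "string"}}
v2_ok = {**v1, "country": {"type": "string", "default": "unknown"}}
v2_bad = {**v1, "country": {"type": "string"}}  # no default: old records break

print(is_backward_compatible(v1, v2_ok))   # True
print(is_backward_compatible(v1, v2_bad))  # False
```

A registry configured for BACKWARD compatibility would reject the second evolution at publish time instead of letting consumers fail at read time.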
Data Transformation & Modelling
- Proficiency with the modern ELT/ETL stack: Spark (EMR), dbt, AWS Glue, and Python (pandas)
AI/ML Data Enablement
- Designing and supporting vector stores (OpenSearch), feature stores (SageMaker Feature Store), and integrating with MLOps/data pipelines for AI/semantic search and RAG-style workloads
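The retrieval step behind the semantic-search/RAG workloads mentioned above reduces to nearest-neighbor search over embeddings. A toy cosine-similarity ranking, standing in for an OpenSearch k-NN query (document IDs and 3-dimensional vectors are illustrative only):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(query_vec, store, k=2):
    """Rank documents in store ({doc_id: embedding}) by similarity to the
    query vector, as a vector store would at retrieval time."""
    return sorted(store, key=lambda d: cosine(query_vec, store[d]), reverse=True)[:k]

docs = {
    "invoice-faq": [0.9, 0.1, 0.0],
    "hr-policy":   [0.1, 0.9, 0.2],
    "api-guide":   [0.8, 0.2, 0.1],
}
print(top_k([1.0, 0.0, 0.0], docs, k=2))
```

In a real RAG pipeline the returned document IDs would be resolved to text chunks and passed to the model as context.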
Metadata Catalog and Lineage
- Familiarity with central cataloging, lineage solutions, and data discovery tools (Glue Data Catalog, Collibra, Atlan, Amundsen, etc.)
- Implementing end-to-end lineage, auditability, and governance processes.
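End-to-end lineage, at its core, is a graph walk: given a dataset, find every transitive upstream input for audit or impact analysis. A minimal sketch with a hypothetical dataset-to-inputs mapping:

```python
def upstream(lineage, dataset, seen=None):
    """Walk the lineage graph (dataset -> list of direct inputs) and return
    every transitive upstream source of the given dataset."""
    seen = set() if seen is None else seen
    for parent in lineage.get(dataset, []):
        if parent not in seen:
            seen.add(parent)
            upstream(lineage, parent, seen)
    return seen

# Toy lineage: a report built from a mart that joins two raw tables
lineage = {
    "sales_report": ["sales_mart"],
    "sales_mart": ["raw_orders", "raw_customers"],
}
print(sorted(upstream(lineage, "sales_report")))
```

Catalog tools such as those listed above maintain this graph automatically from job metadata; the traversal itself is what answers "what breaks if this table changes?"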
Security, Compliance, and Data Governance
- Design and implementation of data security: row-/column-level security (Lake Formation), KMS encryption, role-based access using AuthN/AuthZ standards (JWT/OIDC), and GDPR/SOC 2/ISO 27001-aligned policies
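The row-/column-level security model above can be sketched as a policy applied at read time: drop columns the role may not see, and filter rows by predicate. This is a plain-Python illustration of the concept, not the Lake Formation API; all names are hypothetical:

```python
def apply_policy(rows, allowed_columns, row_filter=lambda r: True):
    """Emulate cell-level filtering: keep only permitted columns, and only
    rows that pass the caller's row-level predicate."""
    return [
        {k: v for k, v in row.items() if k in allowed_columns}
        for row in rows if row_filter(row)
    ]

rows = [
    {"id": 1, "region": "EU", "email": "a@x.io"},
    {"id": 2, "region": "US", "email": "b@x.io"},
]
# An analyst role: no PII columns, EU rows only
print(apply_policy(rows, {"id", "region"}, lambda r: r["region"] == "EU"))
```

In Lake Formation the equivalent rules live in data filters attached to the table, enforced by the query engine rather than application code.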
Orchestration & Observability
- Experience with pipeline orchestration (AWS Step Functions, Apache Airflow/MWAA) and monitoring (CloudWatch, X-Ray) in large-scale environments.
APIs & Integration
- API design for both batch and real-time data delivery (REST and GraphQL endpoints for AI/reporting/BI consumption)
Job Responsibilities
- Design, build, and maintain ETL/ELT pipelines to extract, transform, and load data from various sources into cloud-based data platforms.
- Develop and manage data architectures, data lakes, and data warehouses on AWS (e.g., S3, Redshift, Glue, Athena).
- Collaborate with data scientists, analysts, and business stakeholders to ensure data accessibility, quality, and security.
- Optimize the performance of large-scale data systems and implement monitoring, logging, and alerting for pipelines.
- Work with both structured and unstructured data, ensuring reliability and scalability.
- Implement data governance, security, and compliance standards.
- Continuously improve data workflows by leveraging automation, CI/CD, and Infrastructure as Code (IaC).