Data Platform Architect

Job Location: Bangalore, India
Salary: Not Disclosed
Experience Required: 5 years
Posted: 5 hours ago
Vacancies: 1

Job Summary

Role Overview

  • We are looking for a seasoned Data Platform Architect to own the design and delivery of three foundational pillars: a resilient multi-cloud data platform; an enterprise MCP (Model Context Protocol) Server layer that connects AI workloads to governed data assets; and a high-throughput message bus capable of sustaining millions of events per second across distributed consumers.
  • This is a senior architecture role that combines deep hands-on engineering with cross-functional influence across data, AI, and infrastructure teams.
MUST HAVEs

  • Multi-Cloud Enablement
  • MCP Server Foundation
  • High-Throughput Message Bus

Key Responsibilities

1. Multi-Cloud Data Platform Enablement
  • Architect a cloud-agnostic data platform that operates seamlessly across AWS, Azure, and GCP with unified identity, governance, and cost controls
  • Define the reference architecture for lakehouse deployments (Delta Lake / Iceberg / Hudi) on each cloud, ensuring format interoperability and zero-lock-in data portability (see the sketch after this list)
  • Design cross-cloud data movement patterns, including replication, federation, and active-active topologies, using tools such as Debezium, Airbyte, and cloud-native transfer services
  • Establish a cloud-agnostic Unity Catalog or open metadata layer for consistent lineage, access control, and discoverability across all cloud zones
  • Drive FinOps practices: right-sizing compute, storage tiering, and reserved capacity planning across cloud providers
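
As a concrete illustration of the portability goal above, here is a minimal PySpark sketch (not the platform's actual code). It writes the same Delta table to whichever cloud's object store is selected, assuming pyspark and delta-spark are installed and the relevant storage connectors (hadoop-aws, hadoop-azure, gcs-connector) are on the classpath; the bucket and account names are hypothetical.

```python
# A minimal sketch: one logical bronze layer, three physical roots.
# Bucket/account names are hypothetical placeholders.
from pyspark.sql import DataFrame, SparkSession

LAKEHOUSE_ROOTS = {
    "aws": "s3a://acme-lakehouse/bronze",
    "azure": "abfss://bronze@acmelakehouse.dfs.core.windows.net",
    "gcp": "gs://acme-lakehouse/bronze",
}

spark = (
    SparkSession.builder.appName("portable-bronze-writer")
    # Enable Delta Lake's SQL extensions and catalog implementation.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

def write_bronze(df: DataFrame, cloud: str, table: str) -> None:
    """Append a DataFrame to the bronze layer of the chosen cloud."""
    # Only the object-store URI changes per provider; the Delta write
    # path is identical, which is what keeps pipeline code portable.
    df.write.format("delta").mode("append").save(
        f"{LAKEHOUSE_ROOTS[cloud]}/{table}"
    )
```

The point of the pattern is that switching clouds touches only the URI map, never the pipeline logic.
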
2. MCP Server Foundation

  • Architect and build the enterprise MCP Server layer that exposes governed data assets, query interfaces, and tool APIs to LLM-driven agents and copilots
  • Define the MCP resource taxonomy: which data assets surface as Resources, which operations become Tools, and which contextual feeds become Prompts
  • Implement authentication and authorization at the MCP boundary, ensuring AI agents operate within row-level, column-level, and dataset-level access policies (see the sketch after this list)
  • Design the MCP Server for multi-tenancy, supporting concurrent agent workloads with rate limiting, audit logging, and observability hooks
  • Collaborate with AI/ML teams to validate that MCP-served context materially reduces hallucination rates and improves retrieval grounding quality
  • Produce the MCP Server SDK integration guide for internal engineering teams building AI-powered applications
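
To make the Resource/Tool split and the policy boundary concrete, here is a minimal sketch using FastMCP from the official MCP Python SDK (pip install mcp). The dataset name, the ACL dictionary, and check_access are hypothetical placeholders standing in for the platform's real policy engine.

```python
# A minimal sketch of an MCP server enforcing a dataset-level policy
# at the boundary. ACL contents and the tool's return value are
# hypothetical placeholders.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("governed-data")

# Hypothetical dataset-level ACL; a production server would delegate
# to the platform's policy engine instead.
DATASET_ACL = {"sales.orders": {"analyst", "copilot"}}

def check_access(agent_role: str, dataset: str) -> bool:
    return agent_role in DATASET_ACL.get(dataset, set())

@mcp.resource("dataset://sales.orders/schema")
def orders_schema() -> str:
    """A governed data asset surfaced as an MCP Resource."""
    return "order_id BIGINT, customer_id BIGINT, amount DECIMAL(10,2)"

@mcp.tool()
def row_count(dataset: str, agent_role: str) -> int:
    """An operation surfaced as an MCP Tool, policy-checked on entry."""
    if not check_access(agent_role, dataset):
        raise PermissionError(f"{agent_role} may not read {dataset}")
    return 42_000  # placeholder; a real tool would query the lakehouse

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```
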
3. High-Throughput Message Bus Architecture

  • Design and own the enterprise message bus architecture, targeting sustained throughput of 1M events/sec with sub-50ms end-to-end latency at P99
  • Evaluate and select the appropriate messaging backbone (Apache Kafka, Confluent Platform, Redpanda, AWS Kinesis, or Azure Event Hubs) based on workload profiles
  • Define partitioning strategies, topic compaction policies, retention tiers, and tiered storage configurations aligned to IoT telemetry, CDC, and operational event patterns
  • Architect Schema Registry governance, including schema evolution contracts (Avro / Protobuf / JSON Schema) and compatibility enforcement pipelines
  • Design consumer group topologies for stream processing frameworks (Flink, Spark Structured Streaming, Delta Live Tables) and ensure back-pressure and offset management are production-grade
  • Integrate the message bus with the multi-cloud lakehouse as a bronze ingestion layer, enforcing idempotency and exactly-once delivery guarantees (see the sketch after this list)
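
As one illustration of the idempotency requirement in the last bullet, here is a short sketch using the confluent-kafka Python client (pip install confluent-kafka). The broker address and topic name are hypothetical; enable.idempotence and acks are standard librdkafka settings that make producer retries safe.

```python
# A minimal idempotent-producer sketch; broker and topic names are
# hypothetical placeholders.
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "broker:9092",
    # Idempotent producer: the broker de-duplicates retried batches,
    # one building block of exactly-once delivery into bronze.
    "enable.idempotence": True,
    "acks": "all",            # implied by idempotence; stated for clarity
    "compression.type": "lz4",
    "linger.ms": 5,           # small batching window for throughput
})

def on_delivery(err, msg):
    if err is not None:
        print(f"delivery failed: {err}")

# Keying by device id pins each device's events to one partition,
# preserving per-key ordering for downstream consumer groups.
for i in range(1000):
    producer.produce(
        "iot.telemetry",
        key=f"device-{i % 64}".encode(),
        value=b'{"temp_c": 21.5}',
        callback=on_delivery,
    )
    producer.poll(0)  # serve delivery callbacks

producer.flush()
```
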
4. Platform Governance & Engineering Excellence

  • Define and enforce platform-wide standards: naming conventions, tagging taxonomy, SLA tiers, DR objectives, and run-book templates
  • Champion Infrastructure-as-Code practices across Terraform, Pulumi, or Bicep for all cloud resources and data platform components (see the sketch after this list)
  • Lead architecture review boards (ARBs) and own the technical decision log (ADRs) for all major platform choices
  • Mentor senior data engineers and serve as the escalation point for platform-level production incidents
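
For the IaC bullet, a minimal Pulumi (Python) sketch is shown below; Pulumi is one of the tools named above, and the bucket name and tag keys are hypothetical, chosen only to illustrate a platform-wide tagging taxonomy (pip install pulumi pulumi-aws).

```python
# A minimal Pulumi (Python) sketch: every resource carries the same
# tagging taxonomy so cost allocation and ownership stay queryable.
# Bucket name and tag values are hypothetical placeholders.
import pulumi
from pulumi_aws import s3

STANDARD_TAGS = {
    "platform": "lakehouse",
    "data-domain": "sales",
    "sla-tier": "gold",
    "owner": "data-platform-team",
}

bronze_bucket = s3.Bucket(
    "bronze-sales",
    tags=STANDARD_TAGS,
)

pulumi.export("bronze_bucket_name", bronze_bucket.id)
```
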



Requirements

Required Qualifications

  • 10 years in data engineering, with at least 3 years in platform or solutions architecture roles
  • Hands-on experience architecting production lakehouse platforms on two or more of: AWS (S3, Glue, Athena, EMR, Kinesis), Azure (ADLS Gen2, Databricks, Event Hubs, Synapse), GCP (BigQuery, Dataflow, Pub/Sub)
  • Deep expertise in Apache Kafka or an equivalent message bus: cluster sizing, partition leadership, consumer lag management, and MirrorMaker 2 / replication topologies
  • Strong command of open table formats: Delta Lake, Apache Iceberg, or Apache Hudi, including time travel, merge-on-read vs. copy-on-write trade-offs, and OPTIMIZE / VACUUM strategies (see the sketch after this list)
  • Proficiency in Python and PySpark for platform automation, ingestion framework development, and schema validation pipelines
  • Demonstrated experience with metadata management: Apache Atlas, Unity Catalog, DataHub, or equivalent open metadata solutions
  • Familiarity with the MCP specification or equivalent AI tool-use protocols; experience building or integrating API layers consumed by LLM agents is a strong plus
  • Infrastructure-as-Code fluency (Terraform, Pulumi, or equivalent) and CI/CD pipeline design for data platform deployments
  • Strong written communication skills: ability to produce architecture decision records, RFP responses, and client-facing implementation guides
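
As a pointer to what the OPTIMIZE / VACUUM bullet refers to, here is a short sketch of routine Delta Lake table maintenance. It assumes `spark` is a Delta-enabled SparkSession (as in the earlier multi-cloud sketch); the table path and clustering column are hypothetical.

```python
# A minimal Delta maintenance sketch; assumes `spark` is a
# Delta-enabled SparkSession. Path and column are hypothetical.
table = "delta.`s3a://acme-lakehouse/bronze/orders`"

# Compact small files and co-locate rows by a common filter column.
spark.sql(f"OPTIMIZE {table} ZORDER BY (customer_id)")

# Remove files no longer referenced by the table and older than the
# retention window (the default is 7 days = 168 hours; shorter windows
# require disabling delta.retentionDurationCheck).
spark.sql(f"VACUUM {table} RETAIN 168 HOURS")
```
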

Preferred Qualifications

  • Experience with Delta Live Tables (DLT) in Databricks, including CDC pipeline design and Liquid Clustering optimization
  • Exposure to vector databases (Pinecone, Weaviate, pgvector) and RAG pipeline architecture for grounding LLMs in enterprise data
  • Familiarity with Redpanda or Confluent Cloud as managed Kafka alternatives and their cost/performance trade-offs at scale
  • Knowledge of data mesh operating models: domain ownership, data products, and federated governance
  • Experience in regulated industries (energy, manufacturing, IoT telemetry) where data quality, auditability, and retention policies are mission-critical
  • Cloud certifications: AWS Data Analytics Specialty, Azure Data Engineer Associate, GCP Professional Data Engineer, or Databricks Certified Data Engineer Professional
  • Prior consulting or multi-client engagement experience; comfort navigating multiple concurrent stakeholder environments



Required Skills:

Data Platform Architect


Required Education:

Master's degree
