Role Overview
- We are looking for a seasoned Data Platform Architect to own the design and delivery of three foundational pillars: a resilient multi-cloud data platform, an enterprise MCP (Model Context Protocol) Server layer that connects AI workloads to governed data assets, and a high-throughput message bus capable of sustaining millions of events per second across distributed consumers. This is a senior architecture role that combines deep hands-on engineering with cross-functional influence across data, AI, and infrastructure teams.
MUST HAVEs
- Multi-Cloud Enablement
- MCP Server Foundation
- High-Throughput Message Bus
Key Responsibilities
1. Multi-Cloud Data Platform Enablement
- Architect a cloud-agnostic data platform that operates seamlessly across AWS, Azure, and GCP with unified identity, governance, and cost controls
- Define the reference architecture for lakehouse deployments (Delta Lake / Iceberg / Hudi) on each cloud, ensuring format interoperability and zero-lock-in data portability
- Design cross-cloud data movement patterns, including replication, federation, and active-active topologies, using tools such as Debezium, Airbyte, and cloud-native transfer services
- Establish a cloud-agnostic Unity Catalog or open metadata layer for consistent lineage, access control, and discoverability across all cloud zones
- Drive FinOps practices: right-sizing compute, storage tiering, and reserved capacity planning across cloud providers
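To make the portability goal concrete, here is a minimal PySpark sketch of a single ingestion write path parameterized by cloud and table format, so moving between S3, ADLS, and GCS is a configuration change rather than a code change. The bucket names, landing path, and bronze_events table are illustrative assumptions, not an existing implementation.

    # Hypothetical sketch: one bronze write path, parameterized per cloud.
    # Storage roots and table names are illustrative assumptions.
    from pyspark.sql import SparkSession

    STORAGE_ROOTS = {
        "aws":   "s3a://acme-lakehouse/bronze",                  # assumed bucket
        "azure": "abfss://bronze@acmelake.dfs.core.windows.net", # assumed container
        "gcp":   "gs://acme-lakehouse/bronze",                   # assumed bucket
    }

    def write_bronze(df, cloud: str, table: str, fmt: str = "delta"):
        """Append a bronze table to whichever cloud the platform targets."""
        path = f"{STORAGE_ROOTS[cloud]}/{table}"
        df.write.format(fmt).mode("append").save(path)  # fmt: delta / iceberg / hudi

    spark = SparkSession.builder.appName("bronze-ingest").getOrCreate()
    events = spark.read.json("landing/events/")  # illustrative source
    write_bronze(events, cloud="aws", table="bronze_events")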
2. MCP Server Foundation
- Architect and build the enterprise MCP Server layer that exposes governed data assets, query interfaces, and tool APIs to LLM-driven agents and copilots
- Define the MCP resource taxonomy: which data assets surface as Resources, which operations become Tools, and which contextual feeds become Prompts
- Implement authentication and authorization at the MCP boundary, ensuring AI agents operate within row-level, column-level, and dataset-level access policies
- Design the MCP Server for multi-tenancy, supporting concurrent agent workloads with rate limiting, audit logging, and observability hooks
- Collaborate with AI/ML teams to validate that MCP-served context materially reduces hallucination rates and improves retrieval grounding quality
- Produce the MCP Server SDK integration guide for internal engineering teams building AI-powered applications
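For orientation on the Resource/Tool split described above, a toy sketch using the FastMCP helper from the official MCP Python SDK. The dataset URI, query tool, and stub payloads are hypothetical; a real server would wire the row- and column-level policies into every handler before returning data.

    # Toy MCP server sketch (mcp Python SDK, FastMCP helper).
    # Dataset names, URIs, and query logic are hypothetical placeholders.
    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("governed-data")

    @mcp.resource("dataset://sales/daily_revenue")
    def daily_revenue() -> str:
        """Surface a governed data asset as an MCP Resource."""
        return "date,revenue\n2024-01-01,10500\n"  # stub payload

    @mcp.tool()
    def run_query(sql: str) -> str:
        """Expose a governed operation as an MCP Tool.

        A production handler would enforce row/column-level access
        policies and audit-log the call before executing anything.
        """
        return f"(stub) would execute: {sql}"

    if __name__ == "__main__":
        mcp.run()  # defaults to the stdio transport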
3. High-Throughput Message Bus Architecture
- Design and own the enterprise message bus architecture targeting sustained throughput of 1M events/sec with sub-50ms end-to-end latency at P99
- Evaluate and select the appropriate messaging backbone (Apache Kafka, Confluent Platform, Redpanda, AWS Kinesis, or Azure Event Hubs) based on workload profiles
- Define partitioning strategies, topic compaction policies, retention tiers, and tiered storage configurations aligned to IoT telemetry, CDC, and operational event patterns
- Architect Schema Registry governance, including schema evolution contracts (Avro / Protobuf / JSON Schema) and compatibility enforcement pipelines
- Design consumer group topologies for stream processing frameworks (Flink, Spark Structured Streaming, Delta Live Tables) and ensure backpressure and offset management are production-grade
- Integrate the message bus with the multi-cloud lakehouse as a bronze ingestion layer, enforcing idempotency and exactly-once delivery guarantees
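As a concrete anchor for the idempotency bullet, a minimal confluent-kafka producer configured for idempotent, fully acknowledged writes. The broker address, topic, and keying scheme are assumptions; true end-to-end exactly-once additionally requires transactional producers coordinated with consumer offsets.

    # Minimal idempotent Kafka producer sketch (confluent-kafka client).
    # Broker address, topic name, and keying scheme are assumptions.
    from confluent_kafka import Producer

    producer = Producer({
        "bootstrap.servers": "broker-1:9092",  # assumed cluster endpoint
        "enable.idempotence": True,            # broker de-duplicates retries
        "acks": "all",                         # wait for the full ISR
        "linger.ms": 5,                        # small batching window
        "compression.type": "lz4",             # cheap throughput win
    })

    def on_delivery(err, msg):
        if err is not None:
            print(f"delivery failed: {err}")

    for i in range(1000):
        producer.produce(
            "telemetry.bronze",   # assumed topic
            key=str(i % 32),      # stable keys keep partition affinity
            value=f'{{"event_id": {i}}}',
            on_delivery=on_delivery,
        )
    producer.flush()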
4. Platform Governance & Engineering Excellence
- Define and enforce platform-wide standards: naming conventions, tagging taxonomy, SLA tiers, DR objectives, and run-book templates
- Champion Infrastructure-as-Code practices across Terraform, Pulumi, or Bicep for all cloud resources and data platform components
- Lead architecture review boards (ARBs) and own the technical decision log (ADRs) for all major platform choices
- Mentor senior data engineers and serve as the escalation point for platform-level production incidents
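One small illustration of codifying those standards with Pulumi's Python SDK (one of the IaC tools named above): a storage bucket whose name and tags come from shared platform conventions rather than ad hoc values. The org prefix, tag keys, and SLA taxonomy are assumptions for the sketch.

    # Pulumi (Python) sketch: naming and tagging standards enforced in code.
    # Org prefix, tag keys, and environment names are assumptions.
    import pulumi
    import pulumi_aws as aws

    STANDARD_TAGS = {
        "platform": "data-platform",
        "sla-tier": "gold",              # assumed SLA taxonomy
        "owner":    "data-platform-team",
    }

    def platform_name(component: str, env: str) -> str:
        """Apply the assumed naming convention: <org>-<env>-<component>."""
        return f"acme-{env}-{component}"

    bronze_bucket = aws.s3.Bucket(
        platform_name("bronze", "prod"),
        tags={**STANDARD_TAGS, "layer": "bronze"},
    )

    pulumi.export("bronze_bucket_name", bronze_bucket.id)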
Requirements
Required Qualifications
- 10 years in data engineering with at least 3 years in platform or solutions architecture roles
- Hands-on experience architecting production lakehouse platforms on two or more of: AWS (S3, Glue, Athena, EMR, Kinesis), Azure (ADLS Gen2, Databricks, Event Hubs, Synapse), GCP (BigQuery, Dataflow, Pub/Sub)
- Deep expertise in Apache Kafka or an equivalent message bus: cluster sizing, partition leadership, consumer lag management, and MirrorMaker 2 / replication topologies
- Strong command of open table formats: Delta Lake, Apache Iceberg, or Apache Hudi, including time travel, merge-on-read vs. copy-on-write trade-offs, and OPTIMIZE / VACUUM strategies
- Proficiency in Python and PySpark for platform automation, ingestion framework development, and schema validation pipelines
- Demonstrated experience with metadata management: Apache Atlas, Unity Catalog, DataHub, or equivalent open metadata solutions
- Familiarity with the MCP specification or equivalent AI tool-use protocols; experience building or integrating API layers consumed by LLM agents is a strong plus
- Infrastructure-as-Code fluency (Terraform, Pulumi, or equivalent) and CI/CD pipeline design for data platform deployments
- Strong written communication skills: ability to produce architecture decision records, RFP responses, and client-facing implementation guides
Preferred Qualifications
- Experience with Delta Live Tables (DLT) in Databricks, including CDC pipeline design and Liquid Clustering optimization
- Exposure to vector databases (Pinecone, Weaviate, pgvector) and RAG pipeline architecture for grounding LLMs in enterprise data
- Familiarity with Redpanda or Confluent Cloud as managed Kafka alternatives and their cost/performance trade-offs at scale
- Knowledge of data mesh operating models: domain ownership, data products, and federated governance
- Experience in regulated industries (energy, manufacturing, IoT telemetry) where data quality, auditability, and retention policies are mission-critical
- Cloud certifications: AWS Data Analytics Specialty, Azure Data Engineer Associate, GCP Professional Data Engineer, or Databricks Certified Data Engineer Professional
- Prior consulting or multi-client engagement experience; comfort navigating multiple concurrent stakeholder environments
Required Skills:
Data Platform Architect
Required Education:
Master's degree