AI Presales Architect

Mumbai Suburban - India

Monthly Salary: Not Disclosed

Posted on: 9 hours ago

Vacancies: 1 Vacancy

Job Summary

Senior AI Presales (Principal Consultant) (Full-Stack & Generative AI)

Location: Mumbai (only)

Type of Hire: Full-Time

Min Experience: 15 yrs

We are seeking a high-impact, strategic AI Presales Consultant to join our elite team. This is not a standard presales role. You will engage upstream with our most strategic clients, acting as their primary technical and strategic advisor on their end-to-end AI journey, from initial AI curiosity to a fully architected and scalable MLOps platform. You will design the "how" of their AI strategy.

Your mission is to position our entire full-stack AI portfolio, translating complex business challenges into fully architected solutions. You will be the expert who connects the business use case to the underlying supercomputing hardware, with a strong emphasis on our AI Platform. You will guide clients through the complexities of modern AI, from data pipelines and RAG architectures to model selection, inference optimization, and precise infrastructure sizing. If you are passionate about building the "factory" for AI, not just the product, this role is for you.

What We Don't Expect (Focus of the Role)

You are not expected to be a hardware specialist (e.g. designing server racks or comparing GPU silicon).
You are not expected to be a domain-specific data scientist (e.g. building the final fraud detection model or NLP algorithm).
Your focus is the platform that enables these two ends of the spectrum.

Key Responsibilities

Strategic Client Advisory: Lead executive-level "Art of the Possible" workshops and technical discovery sessions to understand a client's business goals, data readiness, and AI maturity.
Full-Stack Solution Architecture: Design holistic, end-to-end AI solutions that combine our supercomputing hardware, AI software platform, and MLOps capabilities to meet specific client needs.
Generative AI & LLM Expertise: Act as the subject matter expert on Generative AI. Architect and evangelize scalable data ingestion and preparation pipelines, specializing in Retrieval-Augmented Generation (RAG) frameworks.
Infrastructure Sizing & Performance Modelling: Analyse customer workloads (data volume, model complexity, training frequency, inference throughput) to accurately size the required platform infrastructure, including Kubernetes clusters, data storage, and software licenses. This includes calculating compute, storage, and network requirements based on key performance metrics such as model parameters, token performance (tokens/sec), desired latency, and concurrent user load.
Model & Software Consultation:
- Advise clients on AI model selection, comparing the trade-offs of open-source vs. proprietary LLMs, fine-tuning vs. foundation models, and model quantization.
- Position and demonstrate our proprietary AI software platform, MLOps tools, and libraries, integrating them into the client's ecosystem.
Inference Optimization: Design and architect robust, low-latency, and high-throughput inference solutions for complex AI models, including large-scale LLM serving.
User Experience (UX) Advocacy: Collaborate with client teams to define the end-user experience, ensuring the solution delivers tangible business value and a seamless interface for data scientists, analysts, and application users.
Sales Cycle Enablement: Own the technical narrative throughout the sales cycle. Build and deliver compelling presentations, custom demonstrations, and Proofs of Concept (PoCs). Lead the technical response to complex RFIs/RFPs.

Required Skills & Qualifications

Experience: 7 years in a customer-facing technical role (e.g. Presales, Solutions Architecture, AI Specialist, or Technical Consulting) with a proven track record of designing large-scale AI, ML, or HPC solutions.
Generative AI Expertise: Deep, hands-on understanding of LLM architectures. Must be able to architect, explain, and build PoCs for RAG pipelines, including vector databases (e.g. Milvus, Pinecone, Chroma), embedding models, and data ingestion strategies.
Critical Sizing & Hardware Acumen:
- Direct experience in sizing AI infrastructure. Must be able to perform "napkin math" and detailed calculations for GPU, CPU, memory, and network requirements.
- Must be able to fluently discuss performance metrics (tokens/second, latency, throughput, TFLOPS) and their relationship to hardware choice (e.g. NVIDIA H100 vs. A100, memory bandwidth, interconnects like NVLink/InfiniBand).
AI Platform & MLOps: Expertise in the AI software stack. Strong understanding of MLOps principles and tooling (Kubeflow, MLflow), Kubernetes (K8s) for AI workloads, and model serving platforms (NVIDIA Triton, KServe, or similar).
Model Landscape Knowledge: Strong, current knowledge of the AI model landscape (e.g. Llama family, Mistral, GPT-family foundation models). Ability to discuss fine-tuning techniques, quantization, and pruning.
Consultative & Communication Skills: Exceptional communication, whiteboarding, and presentation skills. Ability to translate executive-level business needs into detailed technical architecture and build a compelling C-level value proposition.
Education: Bachelor's or Master's degree in Computer Science, AI, Data Science, or a related engineering field.

Preferred Qualifications

Direct experience working for an AI hardware (GPU, CPU, Supercomputer) or major cloud AI platform provider.
Hands-on experience with parallel computing frameworks (CUDA, MPI).
Experience in scientific computing, research, or other HPC domains.
Active contributor to the AI/ML community (e.g. publications, conference talks, open-source projects).
