- Design, develop, and manage robust ETL/ELT pipelines to ingest, transform, and load data from diverse sources.
- Build and maintain data workflows using tools like Apache Airflow for orchestration and scheduling.
- Integrate data from Segment, CRM systems, marketing platforms, and product usage logs.
- Develop Snowflake data models to support real-time and batch analytics.
- Enable LLM-powered applications by building data pipelines for Retrieval-Augmented Generation (RAG), vector databases, and prompt optimization.
- Support AI data pipelines for fine-tuning, embeddings, and context enrichment.
- Work with AI/ML engineers to ensure training data is accurate, relevant, and well-structured.
- Create and manage interactive, data-rich dashboards using Tableau and Power BI.
- Work with business stakeholders to define KPIs and build real-time reports for marketing, sales, and operations teams.
- Optimize data models and queries to improve dashboard performance and usability.
- Build tech-driven marketing pipelines that automate customer segmentation, campaign analytics, and attribution modeling.
- Integrate event streams, web/app analytics, and user behavior tracking to drive personalization strategies.
- Collaborate with growth and marketing teams to enable data-backed decision-making.
- Ensure data quality, integrity, and compliance across platforms.
- Implement monitoring, logging, and alerting for critical pipelines.
- Optimize query performance, resource usage, and cost efficiency in Snowflake and other platforms.
- Maintain documentation of data sources, transformations, and dependencies.
Requirements
- Experience: 10 years in data engineering with a focus on ETL, analytics, and data infrastructure.
- ETL & Orchestration Tools: Expertise in Apache Airflow, dbt, Talend, or Fivetran.
- Cloud Data Platforms: Strong experience with Snowflake, including schema design, performance tuning, and cost optimization.
- Marketing & Analytics Integration: Experience with Segment, attribution models, and product analytics.
- AI/LLM Experience: Familiarity with RAG architectures, vector stores (Pinecone, Weaviate, FAISS), and LLM data pipelines.
- Dashboards: Hands-on experience with Power BI and Tableau for business reporting and dashboarding.
- Scripting & Querying: Proficiency in SQL, Python, and working with APIs.
- Version Control & CI/CD: Familiarity with Git, data CI/CD workflows, and data testing strategies.
Preferred Certifications
- Snowflake SnowPro Core Certification
- Microsoft Certified: Data Analyst Associate (Power BI)
- Tableau Desktop Specialist
- AWS Data Engineer Certifications
Architecting AI-Powered Systems
- Design, develop, and optimize AI-driven applications, integrating LLMs, Retrieval-Augmented Generation (RAG), NLP, and Generative AI.
- Implement AI/ML pipelines, working with OpenAI, Anthropic (Claude), Google Gemini, Mistral, and Meta AI.
- Architect scalable microservices with AI-powered automation, chatbots, and recommendation systems.
Cloud & DevOps Leadership
- Oversee AWS-based cloud architecture, including EC2, EKS (Kubernetes), Lambda, RDS, IAM, and security best practices.
- Implement Infrastructure as Code (IaC) using Terraform and AWS CloudFormation for scalable deployments.
- Ensure high availability, fault tolerance, and auto-scaling of cloud services.
- Lead containerization strategies with Docker and Kubernetes (K8s), optimizing workloads in AWS EKS.
Full-Stack System Design & API Development
- Architect and oversee development across React.js, Next.js, Flutter, Flutter Web, and Python-based APIs.
- Ensure secure and efficient API development in Python (FastAPI, Flask), Node.js, and Laravel.
- Optimize database performance using Vitess (MySQL sharding), PostgreSQL, Redis, and Snowflake.
Monitoring, Security, and Performance Optimization
- Set up Grafana, Prometheus, and OpenTelemetry for observability and real-time system monitoring.
- Implement security best practices, ensuring compliance with SOC 2, ISO 27001, and GDPR.
- Manage zero-downtime deployments using blue-green deployments, canary releases, and feature flags.
- Drive performance optimizations across frontend, backend, and AI workloads.
Cross-Team Collaboration & Leadership
- Work closely with AI engineers, DevOps specialists, full-stack developers, and product teams to design scalable systems.
- Lead Agile practices, including Scrum, sprint planning, and code reviews.
- Mentor engineers and establish architectural best practices for long-term product scalability.
Requirements
- Technical Skills & Experience: 10+ years of experience in full-stack development, cloud architecture, and AI-driven applications.
- Expertise in AI & ML systems: RAG, LLM fine-tuning, LangChain, Hugging Face, and OpenAI APIs.
- Cloud & DevOps: Deep expertise in AWS services, Terraform, Kubernetes (EKS), Docker, and CI/CD pipelines.
- Full-Stack Development: Experience in React.js, Next.js, Flutter, Flutter Web, and Python (FastAPI, Flask, Django).
- Database Expertise: MySQL (Vitess), PostgreSQL, Redis, Snowflake, and data warehousing solutions.
- Monitoring & Security: Strong knowledge of Grafana, Prometheus, OpenTelemetry, IAM policies, and cloud security.