Senior Full Stack AI Engineer
Job Summary
About the Role
If you're the kind of engineer who gets excited about building AI systems that actually work in production, not just demos, this might be the role for you.
We're growing our team and looking for a Senior Full Stack AI Engineer who can move across the full stack with confidence: from crafting clean, responsive frontends, to architecting backend systems that scale, to wiring together LLMs, vector databases, and inference pipelines that power real products used by real people.
This isn't a role where you'll be handed a neat spec and told to execute. You'll help shape the architecture, make meaningful technical decisions, and work with people who care deeply about doing things right.
Role Details
- Experience: 5 years in professional software engineering
- Employment: Full-Time
- Location: Remote-first
- Focus: AI/LLM systems, Full Stack development, Cloud infrastructure
What You'll Be Working On
Day to day, you'll be involved in a mix of the following:
- Designing and building scalable AI-powered web applications end to end
- Creating clean, fast frontends
- Writing solid backend services and APIs with FastAPI, built for performance and reliability
- Building RAG pipelines, AI agents, and LLM orchestration systems that work at scale
- Architecting semantic search and vector retrieval systems that return meaningful results
- Setting up distributed, event-driven systems with proper queue and caching layers
- Deploying and managing cloud infrastructure, including GPU workloads for AI inference
- Profiling and optimizing systems for speed, cost, and resilience
- Contributing to architectural decisions and helping the team level up technically
What We're Looking For
We care more about what you can do than how perfectly your resume matches a checklist. That said, here are the areas where you'll need to be genuinely strong:
Frontend
- TypeScript
- Tailwind CSS
- Redux / Zustand
- Server-Side Rendering (SSR)
- Static Site Generation (SSG)
- WebSocket Integration
- Performance Optimization
- CDN & Asset Delivery
Backend & Systems
- Python & FastAPI
- RESTful APIs
- Async Programming
- Microservices
- PostgreSQL / MongoDB
- Redis
- RabbitMQ / Kafka
- Celery / BullMQ
- Event-Driven Architecture
- Distributed Systems
- API Gateway Design
- Auth & RBAC
AI & LLM Engineering
- LangChain / LangGraph
- LlamaIndex
- RAG Pipelines
- Prompt Engineering
- AI Agent Frameworks
- Embeddings & Semantic Search
- Streaming LLM Integrations
- Context & Memory Management
- OpenAI / Claude / Llama
- Mistral / Gemini / Open-Source LLMs
Model Fine-Tuning & Inference
- Hugging Face Transformers
- LoRA / QLoRA / PEFT
- Model Quantization
- GPU Inference Optimization
- vLLM / Ollama / TGI
- Batch Inference Pipelines
- CUDA Fundamentals
Vector Databases
- Pinecone
- Weaviate
- Qdrant
- ChromaDB
- FAISS
- Milvus
Cloud Infrastructure & DevOps
- Docker & Kubernetes
- AWS / GCP / Azure
- CI/CD Pipelines
- Nginx & Linux Admin
- GPU Infrastructure
- Cloudflare / CloudFront
- S3 / GCS Object Storage
- Prometheus / Grafana
- ELK Stack
- Terraform / Ansible
Nice to Have (But Not Required)
These aren't dealbreakers, but they'll definitely get our attention:
- Experience with OCR, computer vision, or multimodal AI
- Hands-on work with YOLO or real-time inference systems
- Background in high-concurrency or real-time platforms
- Familiarity with web scraping, document extraction, or data ingestion pipelines
- Experience running AI infrastructure at enterprise scale