Data Engineer
Years of Experience – 3-4 Years
Location – Hybrid Pune
Shift Timings – 02:30 PM to 11:30 AM (Mon-Fri)
Department: Information Technology
You’ll join a fast-moving, innovation-driven team solving real-world problems using the latest in Python-based data stacks, scalable APIs, modern databases, and cloud-native infrastructure.
If you enjoy solving complex data challenges and working with emerging AI technologies, the Data Engineer role at Bold Penguin is for you. You’ll collaborate with our engineering and analytics teams to build intelligent data pipelines and AI-driven systems that enhance how small business owners shop, quote, and bind commercial insurance. Your work will directly influence end-user experiences in production environments.
Responsibilities
Collaborate with Senior Data Engineers, Data Analysts, and AI specialists to connect, extract, and transform data from diverse structured and unstructured sources.
Design and optimize large-scale data processing pipelines to support both traditional analytics and AI-driven workloads.
Build APIs and microservices that power intelligent systems leveraging LLMs, embeddings, and AI agents.
Implement and maintain vector databases and semantic search pipelines for retrieval-augmented generation (RAG) and contextual chat experiences.
Contribute to prompt engineering, model evaluation, and orchestration workflows that enable autonomous AI agents to interact with data and tools.
Integrate AI frameworks and orchestration tools (e.g., LangChain, LlamaIndex, CrewAI, or similar) into production data services.
Continuously improve reliability and scalability by identifying enhancements to processing, parallelization, and search pipelines.
Troubleshoot and resolve production issues to exceed SLAs.
Qualifications
Bachelor’s degree in Computer Science, Computer Engineering, Data Science, or a related field.
2 years of experience in software or data engineering roles.
Strong proficiency in Python and one other language (Java or Go), with experience developing APIs in a microservices architecture.
Strong understanding of data modeling and database technologies, including PostgreSQL and MongoDB.
Experience building or integrating with AI/LLM-powered applications, including prompt templates, embeddings, and RAG pipelines.
Hands-on experience with AI agent frameworks (e.g., LangChain, CrewAI, OpenDevin, or AutoGen) and vector databases (e.g., Pinecone, FAISS, Weaviate, or Chroma).
Working knowledge of CI/CD pipelines and cloud infrastructure (AWS or CNCF toolkits).
Excellent written and verbal communication skills, with the ability to document and present complex systems clearly.
(Preferred) Experience with data orchestration frameworks like Apache Airflow or Prefect.
(Preferred) Familiarity with insurance or financial data ecosystems.