2026MSEMT2-VM Data Engineer Sr Python Specialist
Job Summary
We are seeking a Data Backbone Engineer / Architect to design and build robust, scalable data systems that integrate data from multiple sources into a cohesive backbone. This role combines strong data engineering fundamentals with AI/ML and LLM integration, enabling intelligent, context-aware data.

Responsibilities
- Design and build a centralized data backbone/platform integrating data from diverse sources (APIs, databases, files, streaming systems)
- Develop and maintain scalable data pipelines using Python for ingestion, transformation, and processing
- Create and manage data models and database schemas optimized for performance and scalability
- Design and maintain data dependency graphs / lineage systems to track data flow and relationships
- Ensure data consistency, integrity, and quality across systems
- Collaborate with domain experts to translate business logic into robust data models and AI-ready datasets
- Build reusable frameworks for data orchestration and workflow management
- Integrate AI/ML pipelines into the data backbone for training, inference, and feature engineering
- Design and manage feature stores for machine learning models
- Enable LLM-based applications by structuring and curating high-quality datasets for retrieval (RAG pipelines, embeddings, vector databases)
- Implement pipelines for data preprocessing, labeling, and augmentation for ML use cases
- Optimize data storage, querying, and processing performance
- Implement monitoring, logging, and alerting for both data and ML pipelines
- Support model lifecycle workflows (training, evaluation, deployment, versioning)
- Incorporate feedback loops for continuous model and data improvement
- Document data architecture, flows, dependencies, and AI pipeline integrations clearly

Required Skills & Qualifications
- Strong proficiency in Python for data engineering (Pandas, NumPy, PySpark, etc.)
- Solid experience in data modeling and database design (relational and/or NoSQL)
- Hands-on experience with SQL and performance tuning
- Experience building ETL/ELT pipelines
- Strong understanding of data structures, dependency graphs (DAGs), and workflow orchestration (Airflow-like systems)
- Experience working with heterogeneous data sources (structured, semi-structured, unstructured)
- Good understanding of data validation and quality frameworks
- Hands-on experience with ML frameworks (e.g. scikit-learn, TensorFlow, PyTorch)
- Understanding of feature engineering and model evaluation techniques
- Experience with LLM ecosystems (e.g. embeddings, prompt engineering, vector databases like Pinecone/FAISS, RAG architectures)
- Familiarity with LLM orchestration frameworks (e.g. LangChain, LlamaIndex, or similar)
- Knowledge of model deployment and serving (APIs, batch/real-time inference)
- Strong problem-solving and analytical skills
Qualifications :
B.E
Additional Information :
6
Remote Work :
No
Employment Type :
Full-time
About Company
Bosch first started in Vietnam with a representative office in 1994. Bosch has its main office in Ho Chi Minh City, with branch offices in Hanoi and Da Nang, and a Powertrain Solutions plant in Dong Nai province to manufacture pushbelts for continuously variable transmissions (CVT) ...