Our client is a fast-growing Property Tech AI company
About the role
They are seeking a versatile Data & AI Engineer to build, deploy, and maintain end-to-end data pipelines for downstream Gen AI applications. You'll design data models and transformations and build scalable ETL/ELT workflows, learning fast while working in the AI agent space.
Key Responsibilities
Data Modeling & Pipeline Development
- Automate data ingestion from diverse sources (databases, APIs, files, SharePoint/document management tools, URLs). Most files are expected to be unstructured documents in varied formats, containing tables, charts, process flows, schedules, construction layouts/drawings, etc.
- Own the chunking, embedding, and indexing strategy for all unstructured and structured data, so downstream RAG/agent systems can retrieve it efficiently (see the ingestion sketch after this list)
- Build, test, and maintain robust ETL/ELT workflows using Spark (batch and streaming)
- Define and implement logical/physical data models and schemas; develop schema-mapping and data-dictionary artifacts for cross-system consistency
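For illustration, here is a minimal sketch of what a chunk-embed-index ingestion step might look like. Everything in it is an assumption for the example: the fixed-size overlapping chunker is one common baseline rather than the client's actual strategy, `embed_batch` stands in for whatever embedding endpoint the stack uses (e.g., an Azure OpenAI deployment), and a plain dict stands in for a real vector index.

```python
# Minimal sketch of a chunk -> embed -> index step for RAG ingestion.
# Assumptions: `embed_batch` stands in for a real embeddings endpoint,
# and the dict `store` stands in for a real vector index.
from typing import Callable, Dict, List

def chunk_text(text: str, size: int = 800, overlap: int = 100) -> List[str]:
    """Split text into fixed-size character windows with overlap, so content
    cut at a chunk boundary still appears intact in a neighbouring chunk."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text), 1), step)]

def index_document(
    doc_id: str,
    text: str,
    embed_batch: Callable[[List[str]], List[List[float]]],
    store: Dict[str, dict],
) -> None:
    """Chunk a document, embed each chunk, and record (vector, text, metadata)
    under a stable key so retrieval can cite its source."""
    chunks = chunk_text(text)
    for pos, (chunk, vector) in enumerate(zip(chunks, embed_batch(chunks))):
        store[f"{doc_id}:{pos}"] = {
            "vector": vector,
            "text": chunk,
            "source": doc_id,
            "chunk_no": pos,
        }
```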
Gen AI Integration
- Instrument data pipelines to surface real-time context into LLM prompts
- Implement prompt engineering and RAG for varied workflows within the real estate/construction industry vertical (see the retrieval sketch below)
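As a companion to the ingestion sketch above, the fragment below shows one way the retrieval half might surface context into an LLM prompt: rank stored chunks by cosine similarity to a query embedding and splice the top hits into the input. The `store` layout follows the earlier sketch, and the prompt template is purely illustrative.

```python
# Sketch of the retrieval half of RAG: rank stored chunks by cosine
# similarity to a query embedding and splice the top hits into a prompt.
# The `store` layout follows the ingestion sketch above; the template
# is illustrative, not the client's actual prompt.
import math
from typing import Dict, List

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def build_prompt(question: str, query_vec: List[float],
                 store: Dict[str, dict], k: int = 3) -> str:
    """Return an LLM prompt with the k most relevant chunks as context."""
    ranked = sorted(store.values(),
                    key=lambda rec: cosine(query_vec, rec["vector"]),
                    reverse=True)
    context = "\n---\n".join(rec["text"] for rec in ranked[:k])
    return ("Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}")
```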
Observability & Governance
- Implement monitoring, alerting, and logging (data quality, latency, errors)
- Apply access controls and data privacy safeguards (e.g., Unity Catalog, IAM)
CI/CD & Automation
- Develop automated testing, versioning, and deployment (Azure DevOps, GitHub Actions, Prefect/Airflow)
- Maintain reproducible environments with infrastructure as code (Terraform, ARM templates)
Required Skills & Experience
- 5 years in data engineering or a similar role, with at least 12-24 months of experience building pipelines for unstructured data extraction, including document processing with OCR, cloud-native solutions, and chunking/indexing for downstream consumption by RAG/Gen AI applications.
- Proficiency in Python: dlt for ETL/ELT pipelines, DuckDB or equivalent tools for in-process analytical work, and DVC for managing large files efficiently (a minimal dlt/DuckDB sketch follows this list).
- Solid SQL skills and experience designing and scaling relational databases. Familiarity with non-relational, column-based databases is preferred.
- Familiarity with an orchestration tool; Prefect is preferred, but others (e.g., Azure Data Factory) are also considered.
- Proficiency with the Azure ecosystem; you should have worked with Azure services in production.
- Familiarity with RAG indexing, chunking, and storage across file types for efficient retrieval.
- Strong DevOps/Git workflows and CI/CD (CircleCI/Azure DevOps)
- Experience deploying ML artifacts using MLflow, Docker, or Kubernetes is good to have.
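To make the dlt/DuckDB expectation concrete, here is a minimal sketch of a dlt pipeline loading rows into a local DuckDB file and querying it in-process. It assumes `pip install "dlt[duckdb]"`; the pipeline, dataset, and table names are illustrative, and dlt's default behavior is to write a `<pipeline_name>.duckdb` file in the working directory.

```python
# Minimal dlt -> DuckDB sketch; assumes `pip install "dlt[duckdb]"`.
# Pipeline, dataset, and table names here are illustrative only.
import dlt
import duckdb

rows = [
    {"property_id": 1, "doc_type": "floor_plan"},
    {"property_id": 2, "doc_type": "schedule"},
]

pipeline = dlt.pipeline(
    pipeline_name="ingest_demo",
    destination="duckdb",        # dlt writes ingest_demo.duckdb by default
    dataset_name="raw_docs",
)
load_info = pipeline.run(rows, table_name="documents")
print(load_info)

# DuckDB lets you analyze the loaded data in-process, no server needed.
con = duckdb.connect("ingest_demo.duckdb")
print(con.sql("SELECT doc_type, count(*) FROM raw_docs.documents GROUP BY 1"))
```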
Bonus skillsets:
- Experience with computer-vision-based extraction or with building ML models for production
- Knowledge of agentic AI system design: memory, tools, context, orchestration
- Knowledge of data governance, privacy laws (GDPR), and enterprise security patterns
They are an early-stage startup, so you are expected to wear many hats and work outside your comfort zone, with real and direct impact in production.
Why our client
- Fast-growing, revenue-generating proptech startup
- Flat, no-BS environment with high autonomy for the right talent
- Steep learning opportunities in real-world enterprise production use cases
- Remote work with quarterly meet-ups
- Multi-market, multi-cultural client exposure