Data Engineer

Hyred


Job Location:

Ho Chi Minh City - Vietnam

Monthly Salary: $2,500 - $3,500
Posted on: 8 hours ago
Vacancies: 1 Vacancy

Job Summary

Our client is a fast-growing Property Tech AI company.

About the role

They are seeking a versatile Data & AI Engineer to build, deploy, and maintain end-to-end data pipelines for downstream Gen AI applications. You'll design data models and transformations and build scalable ETL/ELT workflows, while learning fast and working in the AI agent space.

Key Responsibilities

Data Modeling & Pipeline development

  • Automate data ingestion from diverse sources (databases, APIs, files, SharePoint/document management tools, URLs). Most files are expected to be unstructured documents in varied file formats, containing tables, charts, process flows, schedules, construction layouts/drawings, etc.
  • Own the chunking, embedding, and indexing strategy for all unstructured and structured data, enabling efficient retrieval by downstream RAG/agent systems
  • Build, test, and maintain robust ETL/ELT workflows using Spark (batch and streaming)
  • Define and implement logical/physical data models and schemas; develop schema-mapping and data-dictionary artifacts for cross-system consistency

Gen AI Integration

  • Instrument data pipelines to surface real-time context into LLM prompts
  • Implement prompt engineering and RAG for varied workflows within the RE/Construction industry vertical

Observability & Governance

  • Implement monitoring, alerting, and logging (data quality, latency, errors)
  • Apply access controls and data privacy safeguards (e.g., Unity Catalog, IAM)

CI/CD & Automation

  • Develop automated testing, versioning, and deployment (Azure DevOps, GitHub Actions, Prefect/Airflow)
  • Maintain reproducible environments with infrastructure as code (Terraform, ARM templates)

Required Skills & Experience

  • 5 years in Data Engineering or a similar role, with at least 12-24 months of exposure to building pipelines for unstructured data extraction, including document processing with OCR, cloud-native solutions, and chunking/indexing for downstream consumption by RAG/Gen AI applications.
  • Proficiency in Python: dlt for ETL/ELT pipelines, DuckDB or equivalent tools for in-process analytical work, and DVC for managing large files efficiently.
  • Solid SQL skills and experience designing and scaling relational databases. Familiarity with non-relational, column-based databases is preferred.
  • Familiarity with Prefect or similar orchestrators (e.g., Azure Data Factory) is preferred
  • Proficiency with the Azure ecosystem; should have worked with Azure services in production.
  • Familiarity with RAG: indexing, chunking, and storage across file types for efficient retrieval.
  • Strong DevOps/Git workflows and CI/CD (CircleCI/Azure DevOps)
  • Experience deploying ML artifacts using MLflow, Docker, or Kubernetes is a plus.

Bonus skillsets:

  • Experience with computer-vision-based extraction, or experience building ML models for production
  • Knowledge of agentic AI system design: memory, tools, context orchestration
  • Knowledge of data governance, privacy laws (GDPR), and enterprise security patterns

They are an early-stage startup, so you are expected to wear many hats and work on things outside your comfort zone, with real and direct impact in production.

Why our client

  • Fast-growing, revenue-generating proptech startup
  • Flat, no-BS environment with high autonomy for the right talent
  • Steep learning opportunities in real-world enterprise production use cases
  • Remote work with quarterly meet-ups
  • Multi-market, multi-cultural client exposure

Key Skills

  • Apache Hive
  • S3
  • Hadoop
  • Redshift
  • Spark
  • AWS
  • Apache Pig
  • NoSQL
  • Big Data
  • Data Warehouse
  • Kafka
  • Scala