Data Engineer, Data Enablement with AI for AI

CAST Software

Job Location: Meudon, France

Monthly Salary: Not Disclosed
Posted on: 30+ days ago
Vacancies: 1 Vacancy

Job Summary

CAST, a software company based in Meudon, is the market leader in Software Intelligence.

Working at CAST R&D means being an important part of a highly talented, fast-paced, multicultural, Agile team.

Overview

We're building the foundation to ground AI with AAA Software Intelligence (Aggregated, Accurate, and Augmented) sourced from real-world software and technology projects. This role goes beyond manual curation: it's about using AI to empower AI. You will leverage LLMs, embeddings, and NLP tools to clean, enrich, and validate data, enabling AI systems and autonomous agents to rely on it for training and contextual understanding.

Responsibilities

  • Aggregate and structure data from software ecosystems (codebases, APIs, tickets, documentation, architecture specs).
  • Apply LLMs, embeddings, and NLP tools to automate data cleaning, entity extraction, metadata tagging, and semantic annotation.
  • Build and maintain semantic pipelines for LLM fine-tuning and RAG (Retrieval-Augmented Generation).
  • Organize datasets into formats suitable for Agent-to-Agent (A2A) interactions: APIs, vector DBs, knowledge graphs, etc.
  • Collaborate with AI teams to evolve schemas, prompts, labeling strategies, and evaluation data.
  • Ensure strong data lineage, reproducibility, and version control.

Requirements

  • 3 years in data engineering, ML data ops, or structured data curation.
  • Proficient in Python with strong data pipeline skills (Pandas, PyArrow, regex, Airflow).
  • Experience with LLMs or NLP tools (e.g. Hugging Face, spaCy, LangChain).
  • Ability to use AI to clean, enrich, classify, and organize technical content.
  • Strong understanding of tokenization, chunking, and model input preparation.
  • Experience working with software project data: Git repos, APIs, technical documentation, etc.

Bonus Skills

  • Knowledge of vector DBs (FAISS, Qdrant, Weaviate) or knowledge graphs (Neo4j, RDF, SPARQL).

Key Skills

  • Apache Hive
  • S3
  • Hadoop
  • Redshift
  • Spark
  • AWS
  • Apache Pig
  • NoSQL
  • Big Data
  • Data Warehouse
  • Kafka
  • Scala

About Company

Instant insight into your applications. Whenever you need to know, improve, transform, control.
