Data Architect Knowledge Engineer – Euro-BioImaging (AI4Access)

Job Location: Heidelberg, Germany

Monthly Salary: Not Disclosed
Posted on: Yesterday
Vacancies: 1 Vacancy

Job Summary

Euro-BioImaging ERIC is a European research infrastructure providing life scientists open access to advanced imaging technologies, expertise, training and data services through almost 300 imaging facilities distributed among 40 Nodes across Europe. The Euro-BioImaging Hub is distributed across three sites: the legal Seat (Finland), the Med-Hub (Italy) and the Bio-Hub hosted by EMBL in Heidelberg.

The Euro-BioImaging Access Portal is the infrastructure interface that enables users to browse, request and access imaging and image data technologies from an extensive and rich service portfolio. It is also the platform through which Euro-BioImaging and the service-providing facilities manage the incoming access requests and user [...]. The AI4Access project will develop a Research Navigator (an LLM application) integrated into the Euro-BioImaging Access Portal to guide researchers to the right services, workflows and resources across the infrastructure. A key foundation is a structured, maintained, ontology-aligned knowledge base that can be queried reliably by the Navigator.

Your role

We are looking for a technically strong colleague who will design, set up and maintain the database and data ingestion pipeline underlying the AI4Access ontology-driven knowledge base; work with the Euro-BioImaging Bio-Hub team and community on the development of relevant ontologies and knowledge graphs; and develop robust technical interfaces between the knowledge base and the Research Navigator, together with the Euro-BioImaging Seat team, and with the Euro-BioImaging Nodes' own information and management systems.

Responsibilities include but are not limited to:

  • Design and implement an ontology-aligned data model, its technical representation (database schema, versioning, documentation) and the minimum viable service.
  • Work with the Bio-Hub imaging specialists and Node-facing colleagues to translate real service descriptions into a maintainable, structured form via community-agreed ontologies and/or controlled vocabularies.
  • Support integration of semantic and metadata elements (controlled vocabularies, identifiers/PIDs, provenance) together with domain experts.
  • Develop and maintain automated ETL workflows to handle data ingestion pipelines (data collection, inputs, annotations, transformations, validation/QA rules, repeatable updates) adapted to a diverse range of data and data sources.
  • Implement robust APIs and access layers optimized for LLM applications to query services, capabilities and constraints with high reliability and transparency.
  • Collaborate with the Seat team on integration testing, release planning and technical documentation.

You have

  • We are looking for a motivated, structured and hands-on colleague who enjoys building reliable systems that will be used by a broad scientific community.
  • Degree (MSc / PhD or equivalent experience) in a relevant field (data/computer science, bioinformatics, information systems or similar).
  • Familiarity with metadata, ontologies, controlled vocabularies and persistent identifiers.
  • Advanced Python programming skills and proven experience designing and maintaining production-ready databases for structured content (e.g. Postgres/graph stores) and building APIs for downstream applications.
  • Experience with CI/CD, data pipelines (ingestion, transformation, validation) and data quality practices, including experience with version control frameworks (GitHub/GitLab).
  • Ability to work effectively with multiple stakeholders (technical and non-technical) in an international setting.
  • Fluency in written and spoken English.

You may also have

  • Experience in bioimaging or life sciences data and metadata handling
  • Knowledge graph and semantic web experience, working with Linked Data standards (RDF, SPARQL, SHACL) and/or hybrid retrieval approaches (GraphRAG) used in AI applications.
  • Experience with graph databases (Neo4j)
  • Experience with research infrastructures, FAIR data practices or management of service catalogues in scientific environments.
  • Hands-on experience with containerization (Docker) and orchestration (Kubernetes).

Contract length: 3 years

Salary: Grade 5-6, depending on relevant experience; monthly salary from 4.031 EUR after tax and before the 13% EMBL social security deductions, plus financial allowances based on family circumstances.


Required Experience:

Staff IC

About Company

With 29 member states, laboratories at six locations across Europe and thousands of scientists and engineers working together, the European Molecular Biology Laboratory is a powerhouse of biological expertise. The intergovernmental organisation, headquartered in Heidelberg, was founde ...