At Johnson & Johnson, we believe health is everything. Our strength in healthcare innovation empowers us to build a world where complex diseases are prevented, treated, and cured, where treatments are smarter and less invasive, and solutions are personal. Through our expertise in Innovative Medicine and MedTech, we are uniquely positioned to innovate across the full spectrum of healthcare solutions today to deliver the breakthroughs of tomorrow, and profoundly impact health for humanity.
Job Function:
Data Analytics & Computational Sciences
Job Sub Function:
Data Science
Job Category:
Scientific/Technology
All Job Posting Locations:
Cornellà de Llobregat, Barcelona, Spain; Madrid, Spain
Job Description:
Johnson & Johnson Innovative Medicine (J&J IM), a pharmaceutical company of Johnson & Johnson, is recruiting for a Vector Data Engineer. The primary location for this position is Barcelona, Spain; the secondary location is Madrid, Spain. This is a hybrid role.
Our expertise in Innovative Medicine is informed and inspired by patients whose insights fuel our science-based advancements. Visionaries like you work in teams that save lives by developing the medicines of tomorrow.
Join us in developing treatments, finding cures, and pioneering the path from lab to life while championing patients every step of the way.
Position Summary:
The Principal Vector Data Engineer is a technical and strategic leader operating at the intersection of AI, digital health, and therapeutic R&D. This role leads the development of multimodal vector embedding pipelines and foundation model architectures that support longitudinal data integration, disease progression modeling, and digital biomarker discovery across Neuroscience, Oncology, and Immunology. The successful candidate will guide enterprise-scale vectorization efforts while ensuring compliance with clinical, regulatory, and GxP data standards.
Key Responsibilities:
Technical Leadership
Lead the design, development, and optimization of vector embedding models for diverse biomedical modalities, including clinical, regulatory, imaging (MRI, PET), and digital health data.
Architect scalable, compliant embedding pipelines using modern vector database technologies (FAISS, Pinecone, Weaviate, Milvus, Chroma, etc.).
Establish robust quality-control frameworks for mobile-captured images and convert pixel-level data into high-fidelity vector representations.
Drive the adaptation of state-of-the-art academic methods into production-ready, GxP-aware foundation models.
Oversee multimodal data integration efforts to enable semantic search, retrieval-augmented analysis, and clinical insight generation.
Cross-Functional & Regulatory Leadership
Collaborate with data scientists, clinicians, engineering teams, and regulatory/QA partners to ensure models and data pipelines align with GxP, clinical governance, and documentation standards.
Contribute to digital biomarker discovery and predictive modeling for neurodegenerative, neuropsychiatric, oncologic, and immunologic conditions.
Mentor junior engineers and contribute to technical roadmap planning, architectural reviews, and AI strategy development.
Qualifications:
MS/PhD in Computer Science, Electrical Engineering, Biomedical Engineering, or a related discipline.
3 years of experience in multimodal ML, vector representation learning, biomedical signal processing, or large-scale embedding systems.
Expertise in Python, PyTorch/TensorFlow, Hugging Face, and multimodal embedding architectures (CLIP, MedCLIP, BioBERT, TimeSformer, etc.).
Hands-on experience with vector indexing/search systems (FAISS, Pinecone, Weaviate, Milvus, Qdrant, Chroma).
Familiarity with sentence-transformers, LangChain, or LlamaIndex for semantic search and RAG workflows.
Understanding of clinical trial data structures, longitudinal monitoring, GxP system requirements, and compliant data lifecycle management.
Strategic Impact:
Enterprise biomedical data transformed into vectorized, interoperable assets powering scientific AI and semantic intelligence.
Improved data governance, lineage, and GxP alignment across foundation models and vector pipelines.
Accelerated discovery of digital biomarkers and predictive patterns across therapeutic areas.
Scalable vector infrastructure enabling next-generation clinical and translational AI research.
#JRDDS #JNJDataScience
Required Skills:
Preferred Skills:
Advanced Analytics, Coaching, Critical Thinking, Data Analysis, Data Privacy Standards, Data Quality, Data Reporting, Data Savvy, Data Science, Data Visualization, Digital Fluency, Econometric Models, Organizing, Process Improvements, Strategic Thinking, Technical Credibility, Workflow Analysis
Required Experience:
Staff IC
About Johnson & Johnson
At Johnson & Johnson, we believe good health is the foundation of vibrant lives, thriving communities and forward progress. That’s why for more than 130 years, we have aimed to keep people well at every age and every stage of life. Today, as the world’s larges ...