Vision Research Intern-1

Centific


Job Location:

Redmond - USA

Hourly Salary: $30 - $50
Posted on: 30+ days ago
Vacancies: 1 Vacancy

Job Summary

About Centific

Centific is a frontier AI data foundry that curates diverse, high-quality data using our purpose-built technology platforms to empower the Magnificent Seven and our enterprise clients with safe, scalable AI deployment. Our team includes more than 150 PhDs and data scientists, along with more than 4,000 AI practitioners and engineers. We harness the power of an integrated solution ecosystem, comprising industry-leading partnerships and 1.8 million vertical domain experts in more than 230 markets, to create contextual multilingual pre-trained datasets; fine-tuned industry-specific LLMs; and RAG pipelines supported by vector databases. Our zero-distance innovation solutions for GenAI can reduce GenAI costs by up to 80% and bring solutions to market 50% faster.

Our mission is to bridge the gap between AI creators and industry leaders by bringing best practices in GenAI to unicorn innovators and enterprise customers. We aim to help these organizations unlock significant business value by deploying GenAI at scale, helping to ensure they stay at the forefront of technological advancement and maintain a competitive edge in their respective markets.

About Job

Internship: Vision AI / VLM / Physical AI (Ph.D. Research Intern)

Company: Centific

Location: Seattle WA (or Remote)

Type: Full-time Internship

Hours: 40

Build the Future of Perception & Embodied Intelligence

Are you pushing the frontier of computer vision, multimodal large models, and embodied/physical AI, and do you have the publications to show it? Join us to translate cutting-edge research into production systems that perceive, reason, and act in the real world.

The Mission

We are building state-of-the-art Vision AI across 2D/3D perception, egocentric/360° understanding, and multimodal reasoning. As a Ph.D. Research Intern, you will own high-leverage experiments, from paper to prototype to deployable module in our platform.

What You'll Do

  • Advance Visual Perception: Build and fine-tune models for detection, tracking, segmentation (2D/3D), pose & activity recognition, and scene understanding (incl. 360° and multi-view).
  • Multimodal Reasoning with VLMs: Train/evaluate vision-language models (VLMs) for grounding, dense captioning, temporal QA, and tool use; design retrieval-augmented and agentic loops for perception-action tasks.
  • Physical AI & Embodiment: Prototype perception-in-the-loop policies that close the gap from pixels to actions (simulation and real data). Integrate with planners and task graphs for manipulation, navigation, or safety workflows.
  • Data & Evaluation at Scale: Curate datasets, author high-signal evaluation protocols/KPIs, and run ablations that make irreproducible results impossible.
  • Systems & Deployment: Package research into reliable services on a modern stack (Kubernetes, Docker, Ray, FastAPI) with profiling, telemetry, and CI for reproducible science (a minimal serving sketch follows this list).
  • Agentic Workflows: Orchestrate multi-agent pipelines (e.g., LangGraph-style graphs) that combine perception, reasoning, simulation, and code generation to self-check and self-correct.
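
As a taste of the Systems & Deployment work, here is a minimal sketch of wrapping a perception model in a FastAPI service. It is illustrative only: the stand-in torchvision detector, the /detect route, and the 0.5 confidence threshold are our assumptions, not Centific's actual stack or API.

```python
# Minimal sketch: a detector behind a FastAPI endpoint.
# The torchvision model, route name, and threshold are illustrative stand-ins.
import io

import torch
from fastapi import FastAPI, File, UploadFile
from PIL import Image
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.transforms.functional import to_tensor

app = FastAPI()
model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()  # stand-in perception model

@app.post("/detect")
async def detect(file: UploadFile = File(...)):
    image = Image.open(io.BytesIO(await file.read())).convert("RGB")
    with torch.inference_mode():
        pred = model([to_tensor(image)])[0]  # torchvision expects a list of CHW tensors
    keep = pred["scores"] > 0.5  # simple confidence filter
    return {
        "boxes": pred["boxes"][keep].tolist(),
        "labels": pred["labels"][keep].tolist(),
        "scores": pred["scores"][keep].tolist(),
    }
```

Run locally with `uvicorn service:app`; in production a service like this would sit behind the Kubernetes/Docker/Ray layers named above.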

Example Problems You Might Tackle

  • Long-horizon video understanding (events, activities, causality) from egocentric or 360° video.
  • 3D scene grounding: linking language queries to objects, affordances, and trajectories.
  • Fast, privacy-preserving perception for on-device or edge inference.
  • Robust multimodal evaluation: temporal consistency, open-set detection, uncertainty (see the toy metric sketch after this list).
  • Vision-conditioned policy evaluation in sim (Isaac/MuJoCo) with sim2real stress tests.
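
For the temporal-consistency bullet above, a toy sketch of one possible metric: how much per-frame class probabilities flicker between adjacent frames. The metric, its scaling, and the random inputs are illustrative assumptions, not a published evaluation protocol.

```python
# Toy metric: penalize frame-to-frame flicker in per-frame class probabilities.
# Purely illustrative; not a standard benchmark definition.
import torch

def temporal_consistency(frame_probs: torch.Tensor) -> float:
    """frame_probs: (T, C) per-frame class probabilities for one clip."""
    # Mean L1 drift between consecutive frames; L1 distance on a simplex
    # lies in [0, 2], so the score below lands in [0, 1], with
    # 1.0 = perfectly stable predictions and lower = more flicker.
    drift = (frame_probs[1:] - frame_probs[:-1]).abs().sum(dim=-1).mean()
    return float(1.0 - 0.5 * drift)

probs = torch.softmax(torch.randn(16, 10), dim=-1)  # 16 frames, 10 classes
print(f"temporal consistency: {temporal_consistency(probs):.3f}")
```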

Minimum Qualifications

  • Ph.D. student in CS/EE/Robotics (or a related field) actively publishing in CV/ML/Robotics (e.g., CVPR/ICCV/ECCV, NeurIPS/ICML/ICLR, CoRL/RSS).
  • Strong PyTorch (or JAX) and Python; comfort with CUDA profiling and mixed-precision training (a short sketch follows this list).
  • Demonstrated research in computer vision and at least one of: VLMs (e.g., LLaVA-style video-language models), embodied/physical AI, or 3D perception.
  • Proven ability to move from paper to code to ablation to result, with rigorous experiment tracking.
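
On the mixed-precision point above, a minimal sketch of one training step with torch.amp; the tiny linear model and random batch are placeholders for a real perception model and dataloader.

```python
# Minimal mixed-precision training step with torch.amp.
# The linear model and random batch are placeholders.
import torch
from torch import nn

use_amp = torch.cuda.is_available()
device = "cuda" if use_amp else "cpu"

model = nn.Linear(512, 10).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.amp.GradScaler(enabled=use_amp)  # no-op when AMP is disabled

x = torch.randn(32, 512, device=device)
y = torch.randint(0, 10, (32,), device=device)

optimizer.zero_grad(set_to_none=True)
with torch.autocast(device_type=device, enabled=use_amp):
    loss = nn.functional.cross_entropy(model(x), y)  # forward runs in fp16 on GPU
scaler.scale(loss).backward()  # scale loss to avoid fp16 gradient underflow
scaler.step(optimizer)         # unscales grads, skips the step on inf/nan
scaler.update()
```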

Preferred Qualifications

  • Experience with video models (e.g., TimeSformer/MViT/VideoMAE), diffusion, 3D GS/NeRF pipelines, or SLAM/scene reconstruction.
  • Prior work on multimodal grounding (referring expressions, spatial language, affordances) or temporal reasoning.
  • Familiarity with ROS 2, DeepStream/TAO, or edge inference optimizations (TensorRT, ONNX).
  • Scalable training: Ray, distributed data loaders, sharded checkpoints.
  • Strong software craft: testing, linting, profiling, containers, and reproducibility.
  • Public code artifacts (GitHub) and first-author publications or strong open-source impact.

Our Stack (you'll touch a subset)

  • Modeling: PyTorch, torchvision/Lightning, Hugging Face, OpenMMLab, xFormers
  • Perception: YOLO/Detectron/MMDet, SAM/Mask2Former, CLIP-style backbones (see the sketch after this list), optical flow
  • VLM / LMM: vision encoders, LLMs, RAG for video, toolformer/agent loops
  • 3D / Sim: Open3D, PyTorch3D, Isaac/MuJoCo, COLMAP/SLAM, NeRF/3DGS
  • Systems: Python, FastAPI, Ray, Kubernetes, Docker, Triton/TensorRT, Weights & Biases
  • Pipelines: LangGraph-like orchestration, data versioning, artifact stores
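
As one concrete example of the "CLIP-style backbones" entry, a sketch of zero-shot frame labeling through Hugging Face; the openai/clip-vit-base-patch32 checkpoint, the gray placeholder image, and the prompt set are arbitrary choices for illustration.

```python
# Sketch: zero-shot labeling with a CLIP-style backbone via Hugging Face.
# Checkpoint, placeholder image, and prompts are arbitrary illustrative choices.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

name = "openai/clip-vit-base-patch32"
model = CLIPModel.from_pretrained(name)
processor = CLIPProcessor.from_pretrained(name)

image = Image.new("RGB", (224, 224), "gray")  # stand-in for a real video frame
prompts = ["a photo of a forklift", "a photo of a pallet", "an empty aisle"]

inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.inference_mode():
    logits = model(**inputs).logits_per_image  # image-text similarity scores
for prompt, p in zip(prompts, logits.softmax(dim=-1).squeeze(0).tolist()):
    print(f"{p:.3f}  {prompt}")
```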

What Success Looks Like

  • A publishable or open-sourced outcome (with company approval), or a production-ready module that measurably moves a product KPI (latency, accuracy, robustness).
  • Clean, reproducible code with documented ablations and an evaluation report that a teammate can rerun end-to-end.
  • A demo that clearly communicates capabilities, limits, and next steps.

Why Centific

  • Real impact: Your research ships, powering core features in our MVPs and products.
  • Mentorship: Work closely with our Principal Architect and senior engineers/researchers.
  • Velocity & Rigor: We balance top-tier research practices with pragmatic product focus.

Rate: $30 - $50 per hour

How to Apply

Email your CV, publication list/Google Scholar profile, and GitHub (or artifacts/videos) to with the subject line:

Vision AI / VLM / Physical AI Ph.D. Research Intern.

Centific is an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, ancestry, citizenship status, age, mental or physical disability, medical condition, sex (including pregnancy), gender identity or expression, sexual orientation, marital status, familial status, veteran status, or any other characteristic protected by applicable law. We consider qualified applicants regardless of criminal histories, consistent with legal requirements.


Required Experience:

Intern


Key Skills

  • Robotics
  • Machine Learning
  • Python
  • AI
  • C/C++
  • Data Collection
  • Research Experience
  • Signal Processing
  • Natural Language Processing
  • Computer Vision
  • Deep Learning
  • TensorFlow