Research Engineer, Search/IR

Firecrawl

Job Location:

San Francisco, CA - USA

Annual Salary: $180,000 - $290,000
Posted on: 4 days ago
Vacancies: 1 Vacancy

Job Summary

Research Engineer (Focused on Search/IR)

You'll own the search and information retrieval systems at the core of Firecrawl: the infrastructure that determines how we find, rank, index, and serve web content at scale. Retrieval quality is Firecrawl's deepest moat. As AI agents increasingly depend on multi-step search and enrichment, the gap between good retrieval and great retrieval compounds. You're the person who closes that gap and widens it against every competitor. This is a full-stack search role where you'll build and operate everything from ingestion pipelines to serving layers. If you've built search indexes at massive scale and care deeply about ranking quality, freshness, and retrieval speed, this is the role.

Salary Range: $180,000-$290,000/year (Range shown is for U.S.-based employees. Compensation outside the U.S. is adjusted fairly based on your country's cost of living.)

Equity Range: Up to 0.15%

Location: San Francisco, CA or Remote (Americas, UTC-3 to UTC-10)

Job Type: Full-Time

Experience: 3 years building search/IR systems at scale

Visa: US Citizenship/Visa required for SF; N/A for Remote

About Firecrawl

Firecrawl is the easiest way to extract data from the web. Developers use us to reliably convert URLs into LLM-ready markdown or structured data with a single API. In just a year, we've hit 8 figures in ARR and 100k GitHub stars by building the fastest way for developers to get LLM-ready data.

We're a small, fast-moving, technical team building essential infrastructure that superintelligence will use to gather data on the web. We ship fast and deep.

What You'll Do

Build and operate search indexes at massive scale. Design, build, and maintain the indexing infrastructure that powers Firecrawl's core product. You'll handle billions of documents and care about every millisecond of latency and every byte of storage.

Own the full stack from ingestion to serving. You don't just build one piece; you own the entire pipeline: ingestion, processing, indexing, ranking, query understanding, and serving. When something breaks at 3am, you know where to look because you built it.

Solve ranking, relevance, and query understanding. Make sure the right content surfaces for the right queries. You'll build and iterate on ranking models, relevance scoring, and query parsing systems that directly impact product quality.

Tackle freshness, dedup, and incremental indexing. The web changes constantly. You'll build systems that keep our index fresh without re-crawling everything, deduplicate content intelligently, and handle incremental updates at scale without rebuilding from scratch.

Run experiments and ship results to production. You design experiments, measure results rigorously, and ship winners to production fast. You don't need someone to tell you what to try next; you have a backlog of ideas and the judgment to prioritize them.

Collaborate closely with the team. Work directly with the RL-focused Research Engineer and the engineering team to connect search/IR improvements with model training and the broader product roadmap.

What We're Looking For

Has built search indexes at massive scale. Not a tutorial project: real indexes serving real traffic with real latency requirements. You've dealt with the hard problems: sharding strategies, index compaction, schema evolution, and the operational complexity of keeping billions of documents queryable and fast.

Hands-on with ranking, relevance, and query understanding. You've built or meaningfully improved ranking systems. You understand BM25, learned ranking, embedding-based retrieval, and when to use which. You can reason about relevance tradeoffs, and you've shipped ranking changes that moved metrics in production.

Owns the full stack: ingestion, index, serving. You're not a specialist who only touches one layer. You've built and operated the entire search pipeline, from how documents enter the system to how results get served. You understand the dependencies between layers and make good architectural decisions because you see the whole picture.

Has solved freshness, dedup, and incremental indexing problems. You know that building the initial index is the easy part. Keeping it accurate, fresh, and deduplicated at scale is where the real engineering lives. You've built systems that handle continuous updates without full rebuilds, and you've debugged the subtle correctness issues that come with incremental processing.

Self-directed experimenter who ships without handholding. You generate your own hypotheses, design your own experiments, and ship your own code. You don't wait for a roadmap or a sprint planning meeting. You see what needs to improve, you try something, you measure it, and you ship it if it works.

Backgrounds that tend to do well: Search engineers at companies with large-scale indexes (web search, e-commerce, document search). IR researchers who've shipped their work to production. Infrastructure engineers who've built and operated real-time indexing pipelines. Engineers from Elasticsearch, Algolia, Vespa, or similar search infrastructure teams who got frustrated that they could only tune the knobs and wanted to build the engine.

What We're NOT Looking For

Search users, not search builders. If your experience is configuring Elasticsearch or tuning Solr queries but you haven't built search infrastructure from scratch, this isn't the right role. We need someone who builds the engine.

Researchers who don't ship. If your best search/IR work lives in a paper and you've never deployed a ranking model to production, this isn't it. Every experiment here ends with code running in prod.

Engineers who only work on one layer. If you only do indexing, or only do ranking, or only do serving, and you're not interested in owning the full stack, you'll be frustrated here. We need someone who sees the whole pipeline and can work anywhere in it.

People who need clean infrastructure to be productive. The systems you'll work on are evolving fast. If you need everything to be perfectly abstracted and well-documented before you can contribute, you'll stall. We need someone who can build and improve infrastructure while shipping on it.

A Note On Pace

We operate at an absurd level of urgency because the window for what we're building won't stay open forever. If that excites you, keep reading. If it doesn't, no hard feelings, but this role probably isn't for you.

Benefits & Perks

Available to all employees

  • Salary that makes sense: $180,000-$290,000/year, based on impact, not tenure

  • Own a piece: Up to 0.15% equity in what you're helping build

  • Generous PTO: 15 days mandatory; anything after 24 days, just ask (holidays excluded). Take the time you need to recharge

  • Parental leave: 12 weeks fully paid, for moms and dads

  • Wellness stipend: $100/month for the gym, therapy, massages, or whatever keeps you human

  • Learning & Development: Expense up to $1,000/year toward anything that helps you grow professionally

  • Team offsites: A change of scenery, minus the trust falls

  • Sabbatical: 3 paid months off after 4 years; do something fun and new

Available to US-based full-time employees

  • Full coverage, no red tape: Medical, dental, and vision (100% for employees, 50% for spouse/kids). No weird loopholes, just care that works

  • Life & Disability insurance: Employer-paid short-term disability, long-term disability, and life insurance coverage for life's curveballs

  • Supplemental options: Optional accident, critical illness, hospital indemnity, and voluntary life insurance for extra peace of mind

  • Doctegrity telehealth: Talk to a doctor from your couch

  • 401(k) plan: Retirement might be a ways off, but future-you will thank you

  • Pre-tax benefits: Access to FSAs and commuter benefits (US-only) to help your wallet out a bit

  • Pet insurance: Because fur babies are family too

Available to SF-based employees

  • SF HQ perks: Snacks, drinks, team lunches, intense ping pong, and peak startup energy

  • E-Bike transportation: A loaner electric bike to get you around the city, on us

Interview Process

Application Review: Send us your work and a quick note on why this excites you. Show us what you've built: search systems, indexing pipelines, ranking improvements. We care about what you've shipped, not where you went to school.

Intro Chat (20 min): A quick conversation to get to know each other before we go deep. We'll talk about what you've been working on, what drew you to Firecrawl, and what you're looking for in your next role. Time for your questions too.

Technical Deep Dive (60 min): Go deep on the search/IR systems you've built, covering architecture decisions, scale challenges, ranking approaches, and production tradeoffs. We'll explore a live problem: how you'd approach a real search/indexing challenge at Firecrawl's scale. We're looking for depth across the full stack, production instincts, and the ability to reason about tradeoffs under constraints.

Founder Chat (30 min): Culture, pace, ownership, and how you like to work. Time for your questions too.

Paid Work Trial (1-2 weeks): Tackle a real search/IR problem with production implications. We evaluate on technical depth, experimentation rigor, and how fast you ship something meaningful.

Decision: We move fast after the trial.

If you've built search systems at scale and want to work on one of the most interesting web data problems in AI infrastructure, this is your shot.

Apply now.


Required Experience:

IC

About Company

The web crawling, scraping, and search API for AI. Built for scale. Firecrawl delivers the entire internet to AI agents and builders. Clean, structured, and ready to reason with.
