Mid-level AI Engineer / Data Engineer

Job Location:

Philadelphia, PA - USA

Monthly Salary: Not Disclosed
Posted on: 9 hours ago
Vacancies: 1 Vacancy

Job Summary

About the Role
Recruiters must run checklists; keywords underlined.
We are building a platform that converts unstructured financial data (emails, corporate actions, index announcements) into high-quality structured datasets used by financial institutions.
This is not a typical LLM wrapper role.
You will work on systems that:

  • Extract data from noisy, inconsistent sources
  • Validate and reconcile outputs across multiple inputs
  • Ensure correctness, traceability, and auditability

The challenge is not just applying LLMs; it's making them reliable in production for financial workflows.

What You'll Work On

  • Designing pipelines that process high-volume financial documents (batch and near real-time)
  • Building LLM-powered extraction workflows (classification, parsing, summarization)
  • Implementing validation layers (rule-based and model-based) to reduce hallucinations
  • Developing retrieval systems using embeddings and vector search
  • Architecting end-to-end systems: ingestion, processing, storage, serving
  • Ensuring data quality, observability, and fault tolerance
  • Collaborating with product to turn messy data into usable financial intelligence

Core Requirements

  • Strong Python and backend/data engineering experience
  • Experience building production data pipelines (ETL, streaming, or async systems)
  • Solid understanding of distributed systems and failure modes
  • Experience working with LLM-based systems in production:
    • Prompt design
    • Output validation
    • Retry/fallback strategies
    • Evaluation and monitoring
  • Experience with data storage systems (SQL, NoSQL)
  • Familiarity with cloud infrastructure (AWS or similar)

Preferred Experience

  • Experience with RAG / vector search systems
  • Background in financial data or capital markets
  • Experience with streaming systems (Kafka, etc.)
  • Experience building multi-step or agent-style workflows

What Makes This Role Interesting

  • Work on high-accuracy AI systems where correctness matters
  • Solve real problems around:
    • LLM reliability and hallucination mitigation
    • Data consistency across conflicting sources
    • Real-time vs correctness tradeoffs
  • Build systems used in financial decision-making workflows
  • High ownership over core architecture in an early-stage environment

Nice to Know (but not required)

  • Experience with orchestration tools (Airflow, etc.)
  • Exposure to evaluation frameworks for LLMs
  • Experience working with large-scale document processing

Tech Stack (Representative, not exhaustive)

  • Python, APIs, async processing
  • LLM APIs, embeddings
  • SQL / NoSQL databases
  • Cloud infrastructure (AWS)
  • Data pipelines and streaming systems
  • Vector Databases

Recruiter Checklist

  • Has 6-8 years of software development/engineering with AI and data engineering experience
  • Has worked in investment management or investment banking, processing financial market data pipelines, RAG, and vector databases
  • Is fluent with Python, API development, and streaming systems like Kafka or similar
  • Preferred: people who have worked at BlackRock, Fidelity Investments, Vanguard, State Street Global Advisors, E*TRADE, Charles Schwab, etc.
at Vanguard Group, an investment management company that deals with mutual funds, index funds, ETFs, etc. Candidates must come from this business domain or they won't understand what to do.