Staff Machine Learning Engineer, Data Collections AI & ML

PitchBook Data


Job Location:

Seattle, WA - USA

Annual Salary: $260,000 - $325,000
Posted on: 8 hours ago
Vacancies: 1 Vacancy

Job Summary

At PitchBook, a Morningstar company, we are always looking forward. We continue to innovate, evolve, and invest in ourselves to bring out the best in everyone. We're deeply collaborative and thrive on the excitement, energy, and fun that reverberates throughout the company.

Our extensive learning programs and mentorship opportunities help us create a culture of curiosity that pushes us to always find new solutions and better ways of doing things. The combination of a rapidly evolving industry and our high ambitions means there's going to be some ambiguity along the way, but we excel when we challenge ourselves. We're willing to take risks, fail fast, and do it all over again in the pursuit of excellence.

If you have a good attitude and are willing to roll up your sleeves to get things done, PitchBook is the place for you.

About the Role:

The Data Collection AI/ML team builds intelligent systems that scale and improve PitchBook's data extraction, enrichment, and validation processes. The team applies advanced ML, including classification, entity/relationship extraction, LLM-based parsing, OCR, and anomaly detection, to ensure high accuracy, coverage, and timeliness of our proprietary datasets.

The Staff MLE role is a force multiplier for the team, partnering with technical leadership to set best practices and design reusable ML architectures that support rapid innovation and operational excellence.

As a Staff Machine Learning Engineer on the Data Collection AI/ML team, you will serve as the senior technical expert responsible for designing, architecting, and deploying advanced AI and machine learning systems that power PitchBook's data collection, extraction, and enrichment workflows. You will play a pivotal role in elevating the technical bar of the organization by setting engineering standards, driving architectural decisions, and supporting teams to build scalable, production-grade ML systems.

Your work will focus on automating and enhancing PitchBook's ingestion and data quality pipelines across a wide variety of structured and unstructured sources, drawing from domain areas such as document understanding, OCR, natural language processing, entity resolution, multimodal modeling, retrieval systems, and LLM-driven extraction. You will collaborate closely with Engineering, Product, and Data Operations partners to translate business requirements into robust, high-impact AI solutions.

This role is ideal for someone who thrives as a deeply technical IC and wants to push the boundaries of document AI and data extraction technology, shape long-term architectural direction, and materially influence the future of data automation at PitchBook.

In addition to driving product impact, this role offers an opportunity to shape PitchBook's growing presence and technical reputation in the AI and ML space. We are looking for individuals who are active contributors to the broader AI community through peer-reviewed research, technical publications, or open-source initiatives. Candidates who have authored conference papers or patents and who are excited to explore the frontiers of generative AI, LLMs, and applied NLP will be well-positioned to help us both advance our internal capabilities and deepen trust with our customers through thought leadership.

Primary Job Responsibilities:

  • Serve as the key technical leader shaping system design, ML architectures, model lifecycles, and scalable infrastructure for data extraction, document understanding, and structured data enrichment
  • Architect reusable frameworks and services for LLM-powered extraction, entity recognition and resolution models, and multimodal document processing
  • Partner with engineering leaders to ensure our systems meet the highest standards of reliability, performance, and cost efficiency
  • Design and build state-of-the-art ML models using transformers, LLMs, generative models, graph-based approaches, and OCR/Document AI frameworks
  • Identify opportunities to advance automation and accuracy across our ingestion stack, including entity linking, relationship inference, classification, and anomaly detection
  • Translate emerging research into practical, production-ready capabilities
  • Contribute to PitchBook's growing technical reputation through experimentation, publication, or open-source work
  • Work closely with Product, Engineering, and Data Operations to ensure AI systems integrate smoothly into human-in-the-loop workflows and downstream pipelines
  • Provide technical expertise during prioritization discussions, roadmap planning, and long-term strategic design
  • Elevate engineering excellence through code reviews, design reviews, and technical guidance for ML engineers and scientists
  • Act as a multiplier by shaping best practices for experimentation, model evaluation, responsible AI, and scalable ML engineering
  • Guide teams across the organization toward cohesive, reusable, and standards-aligned architectures
  • Own the lifecycle of mission-critical ML systems, from data preparation to deployment, monitoring, and continuous improvement
  • Ensure strong standards for model governance, explainability, and data integrity across the AI/ML stack
  • Partner with ML Ops and Platform Engineering teams, along with other partner engineering groups, to maintain high availability, reliability, and robustness for production ML systems

Skills and Qualifications:

  • Bachelor's or Master's degree in Computer Science, Mathematics, Data Science, or a related technical discipline (Master's degree preferred)
  • 8 years of experience in machine learning, data science, or AI-focused engineering, with at least 4 years of experience leading technical teams
  • Proven success delivering AI-driven data extraction, enrichment, or document understanding systems at scale. Hands-on experience with parameter-efficient fine-tuning methods and expertise in document classification optimization preferred
  • Deep expertise in natural language processing, document AI, OCR, entity resolution, and large-scale data automation
  • Strong understanding of modern ML frameworks and infrastructure (e.g., PyTorch, TensorFlow, Hugging Face, LangChain, MLflow)
  • Demonstrated ability to define and execute multi-year AI roadmaps with measurable business impact
  • Strong knowledge of cloud-native architecture, distributed computing, and scalable model deployment
  • Excellent communication, collaboration, and influencing skills, including experience presenting to executive and cross-functional leadership
  • A track record of fostering technical excellence and innovation across global, multidisciplinary teams
  • Experience in fintech, data platforms, or large-scale information extraction systems preferred
  • Contributions to the AI/ML research community (e.g., publications, patents, or open-source projects) are strongly preferred

Benefits and Compensation at PitchBook:

Physical Health

  • Comprehensive health benefits
  • Additional medical wellness incentives
  • STD, LTD, AD&D, and life insurance

Emotional Health

  • Paid sabbatical program after four years
  • Paid family and paternity leave
  • Annual educational stipend
  • Ability to apply for tuition reimbursement
  • CFA exam stipend
  • Robust training programs on industry and soft skills
  • Employee assistance program
  • Generous allotment of vacation days, sick days, and volunteer days

Social Health

  • Matching gifts program
  • Employee resource groups
  • Subsidized emergency childcare
  • Dependent Care FSA
  • Company-wide events
  • Employee referral bonus program
  • Quarterly team building events

Financial Health

  • 401k match
  • Shared ownership employee stock program
  • Monthly transportation stipend

*Please be aware the above PitchBook benefit and perk offerings are subject to corresponding plan and policy documents and may change during the course of your employment.

Compensation

  • Annual base salary: $260,000-$325,000
  • Target annual bonus percentage: 20%

Working Conditions:

At the heart of our company is a belief in the power of in-person collaboration. Being together in the office fuels our creativity, strengthens our connections, and drives the innovation that sets us apart. Our culture is built on spontaneous moments (those hallway conversations, whiteboard brainstorms, and shared celebrations in each of our global offices) that simply can't be replicated remotely. This role is expected to be in the office 5 days a week.

The job conditions for this position are in a standard office setting. Employees in this position use a PC and phone on an ongoing basis throughout the day. Limited corporate travel may be required to remote offices or other business meetings and events.

Life At PB:

We are consistently recognized as a Best Place to Work, and our culture is at the heart of our success. It's our fundamental belief that people do and create great things and that people are the cornerstone of prosperity. We believe that proactively seeking out different points of view, listening to others, learning, and reflecting on what we've heard creates a sense of belonging within PitchBook and strengthens the PitchBook community.

We are excited to get to know you and your background. Concerned that you might not meet every requirement? We encourage you to still apply, as you might be the right candidate for the role or other roles at PitchBook.



Required Experience:

Staff IC


Key Skills

  • Computer Science
  • Docker
  • Kubernetes
  • Python
  • VMware
  • C/C++
  • Go
  • System Architecture
  • gRPC
  • OS Kernels
  • Perl
  • Distributed Systems

About Company


PitchBook provides readers with the best private market data through the PitchBook Platform, a suite of award-winning software applications. Learn more now!
