The Opportunity:
Join Gloo's AI Research & Data Science team and turn frontier LLM research into evidence-backed product decisions that help people grow. You'll own the end-to-end experimentation loop, from designing causal tests and faith-aligned evaluation metrics to automating dashboards that keep model quality, bias, and cost in check. Pastors, parents, and everyday seekers will trust the insights we deliver because you'll make sure the data behind them is rock-solid.
Your day-to-day blends hands-on analytics with strategic impact:
- Morning: run a power analysis on yesterday's live-traffic experiment, then pull terabytes of telemetry from Snowflake to fine-tune a hallucination detector in PyTorch.
- Mid-day: pair with an ML engineer to wire your new faith-alignment score into the CI pipeline so every model checkpoint is auto-graded before it hits staging.
- Afternoon: present a causal-impact report to product and design, translating statistical nuance into a clear go/no-go decision for next week's launch.
You won't do it alone. You'll collaborate daily with research scientists, backend engineers, PMs, and Design who care as deeply about human flourishing as they do about clean code. Together you'll ship world-class eval suites, trusted data pipelines, and self-service insights that let others build on our work.
If you're ready to trade big-tech bureaucracy for autonomy, mission, and the chance to invent what doesn't exist yet, we'd love to meet you. This hybrid role is based in Pittsburgh, PA or Palo Alto, CA, with quarterly summits in Boulder and the freedom to see your ideas move from prototype to production, fast.
What You'll Do:
Research & Technical Excellence
- Design gold-standard evaluation pipelines. Build offline and online test harnesses that quantify accuracy, hallucination, bias, latency, cost, and faith alignment for every new model checkpoint, then light them up in CI so bad pushes never reach staging.
- Invent metrics that matter. Translate human flourishing into measurable signals (e.g., uplift scores, pastoral-fit indices) and validate them with causal inference, inter-rater reliability, and power analyses.
- Champion testability at data scale. Instrument data pipelines to emit lineage, privacy flags, and drift stats; automate regression and shadow-traffic tests so research moves fast without breaking trust.
- Advance the craft. Publish internal whitepapers, run journal clubs on causal ML and LLM evaluation, and mentor engineers in statistical thinking and experiment design.
Delivery
- Ship decision-ready insights. Own the end-to-end loop, from query in Snowflake or Weaviate, through feature engineering in PyTorch, to a Looker dashboard or Slack bot that PMs use daily.
- Accelerate cycles, not risk. Template experiment frameworks (A/B, switchback, Bayesian bandits) so product teams can launch tests in hours; automate cleanup, analysis, and stop/keep/scale recommendations.
- Raise the performance ceiling. Hunt bottlenecks (slow ETL, expensive inference, noisy labels) and knock them down with better schemas, caching, or active-learning pipelines.
Strategic Clarity
- Align analytics with mission. Partner with product and research to frame how new metrics, datasets, or safety guardrails power coaching, discipleship, and broader human-flourishing outcomes.
- Model trade-offs. Quantify the ROI of additional GPU spend, bigger context windows, or stricter privacy filters, then brief execs with clear yes / later / never options.
- Codify best practice. Keep living docs on metrics definitions, experiment checklists, and data-governance rules so every team can self-serve without reinventing the wheel.
Collaboration
- Default to cross-functional. Pair with infra on scalable data stores, with security on PII governance, and with UX on in-product instrumentation that captures the right events the first time.
- Create productive tension. Ask the hard statistical questions early, surface risks, and drive disagree-and-commit clarity so launches stay both fast and safe.
Leadership & Influence
- Mentor over manage. Coach junior analysts and engineers; review code, notebooks, and dashboards to raise the team's statistical bar.
- Lead from the data. When the numbers disagree with intuition, speak up, candidly but constructively, then rally the group around evidence-based solutions.
- Institutionalize learning. Run retros on every major experiment, celebrate lessons (wins or fails), and bake micro-improvements into the next cycle.
In short: you'll turn terabytes of raw signals into trustworthy metrics, experiments, and narratives that keep our AI honest and our mission on track, ensuring every insight we ship genuinely helps someone flourish.
What We Are Looking For:
- Advanced degree or demonstrable equivalent (peer-reviewed research or high-impact data-science products).
- 5 years in product or research data-science roles designing experiments and shipping insights.
- Expert SQL and Python (Pandas, NumPy); own the end-to-end flow from ETL to analysis to dashboard.
- Deep fluency in causal inference, A/B and switchback testing, and metric design.
- Proven ability to evaluate or partner on large-model analytics; familiarity with LLM eval best practices.
- Skilled communicator who turns statistical nuance into decisive recommendations.
Preferred Qualifications
- Hands-on LLM evaluation or bias-audit work; prompt analysis tooling.
- Modern MLOps familiarity (Ray, Airflow, Kubernetes) and GPU cost telemetry.
- Publications at NeurIPS/ICML/KDD or open-source repos > 500 stars.
- Prior work in mission-driven or faith-aligned settings.
Job Location:
- Hybrid in Sewickley, PA
- Hybrid in Palo Alto, CA
Compensation:
- Sewickley, PA - $125,000 - $175,000
- Palo Alto, CA - $175,000 - $225,000
Our Team Members Enjoy:
- Competitive compensation and discretionary performance bonus commensurate with experience
- Flexible PTO policy and state-compliant sick leave to support your well-being
- Medical, Dental, and Vision plans with up to 90% coverage for employees
- Generous employer HSA contributions for HDHP elections
- Employer-sponsored 401k program with a 2% employer match
- Learning & Development stipend available after 6 months of employment
- Paid Parental Leave
- A dynamic, talented team dedicated to changing the world and building an incredible business
- Onsite and virtual social events to keep us connected in our hybrid work environment
Applicants must be currently authorized to work in the United States on a full-time basis. At this time, Gloo is only able to consider candidates who are U.S. Citizens or U.S. Permanent Residents.
Gloo is committed to providing an inclusive and accessible experience for all candidates. If you require a reasonable accommodation during the application or interview process, please contact us to let us know how we can support you.
Job is posted until filled.
Required Experience:
Senior IC