
LLM Performance Researcher

Work from home • Full-time role • Hiring

Full-time • San Francisco

At Endeavor, we’re rebuilding ERP from first principles for $1B+ manufacturing and distribution companies. These companies run on PDFs, spreadsheets, and semi-structured chaos — and we’re building LLM-powered systems to parse, match, and reason through all of it with human-level reliability.

We’re looking for a researcher with deep experience in LLM performance on document tasks — especially extraction, entity linking, and record matching. You’ve likely published papers on it. You’ve probably run head-to-head evals on OpenAI, Claude, and open-source models. You’re fluent in both academic benchmarks and in the weird, grimy failure modes that only show up in production.

Your work will directly improve the core performance of our agentic ERP. You’ll prototype new techniques, run structured evals, improve few-shot and tool-augmented performance, and help shape how LLMs interface with structured business systems.

What You’ll Do
  • Design and run experiments to improve extraction, normalization, and matching across real-world documents

  • Evaluate LLM performance on noisy, multi-format inputs like scanned PDFs, OCR output, and Excel sheets

  • Improve model accuracy and reliability in the face of rare formats, abbreviations, bad formatting, and domain-specific vocab

  • Build and own our eval infrastructure for matching, linking, extraction, and schema alignment tasks

  • Work with the Applied AI Researcher and Backend Engineers to deploy improvements into production

  • Contribute to long-term strategy around fine-tuning, retrieval augmentation, tool use, or structured memory (if and when needed)

You Might Be a Fit If You
  • Have deep experience with document understanding and information extraction using LLMs

  • Have worked on schema alignment, record linking, or entity resolution at scale

  • Have published papers on LLM performance (e.g. extraction, evals, few-shot prompting, matching)

  • Understand both academic benchmarks and real-world weirdness

  • Know how to make evals meaningful, tight, and fast to iterate on

  • Want to work in a setting where research turns into production code fast

  • Have a PhD or equivalent research background in NLP, ML, or similar (but we care more about what you’ve done than what your title says)

Bonus Points
  • Experience with post-OCR workflows or noisy doc normalization

  • Deep intuition for failure modes in enterprise-scale matching/linking systems

  • Obsession with eval quality and reproducibility

  • Comfort implementing papers and benchmarking models at scale

  • Past work in procurement, invoicing, logistics, or any doc-heavy vertical
