The LLM Data Company
“The fox knows many things, but the hedgehog knows one big thing.”
ISAIAH BERLIN
Every model purports to be a generalist, yet its vocational training shows when it is pushed.
Scaling RL task data is tedious, and each lab is making its own tradeoffs about what to prioritize. Multi-objective training then forces tradeoffs at the policy level: optimizing for SOTA on SWE-Bench implies instruction following to the point of sycophancy, which is dangerous in domains like mental health, where the opposite is required. Frontier labs are focused on scaling verifiable domains like coding and tool use, at the expense of critical, non-verifiable domains like medicine, law, finance, and long-form writing.
These conflicting priorities will force models to speciate, and each lab will choose its place on the Pareto frontier between coding, tool use, and the rest. The LLM Data Company is training specific models for the jagged frontier and closing the verification gap in non-verifiable domains.
RESEARCH
Read our latest work: