The LLM Data Company

Teaching models to learn and play at scale.

“With different training, the same ostensive teaching of these words would have effected a quite different understanding.”

LUDWIG WITTGENSTEIN

Pre-training teaches by exposure. The model is taught these words trillions of times, and in minimizing prediction error, it acquires a latent sense of syntax, causation, analogy, semantics and, soon after, physics, medicine, sports, and so on. What emerges are the largest computational artifacts ever built: tools of thought.

In post-training, the training objective is reoriented. We are no longer approximating the distribution of the pre-training corpus and minimizing loss against ground truth. Instead, we teach the model to achieve. We imbue agency, and as such, imitation is replaced by decision. Words become actions, and actions are taken to maximize reward within an environment.

Each environment defines a compact world of goals, constraints, and feedback, what Wittgenstein would call a form of life, in which decisions respond to the rules of the game. Most commercial and academic domains of consequence, such as medicine, finance, law, and philosophy, are difficult to verify and require careful reward engineering.
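As a rough illustration of an environment as goals, constraints, and feedback, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the `Criterion` and `Environment` classes, the word-limit constraint, and the keyword checks standing in for a real grader are hypothetical and do not describe how any actual environment or product is built.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Criterion:
    """One weighted rubric item. `check` is a hypothetical stand-in for a grader."""
    description: str
    weight: float
    check: Callable[[str], bool]


@dataclass
class Environment:
    """A toy world: a goal, one constraint, and rubric-based feedback."""
    goal: str                 # what the model is asked to achieve
    max_words: int            # constraint on the action (assumed for illustration)
    rubric: List[Criterion]   # feedback: weighted criteria the action is scored on

    def reward(self, action: str) -> float:
        # An action that breaks the constraint earns nothing.
        if len(action.split()) > self.max_words:
            return 0.0
        total = sum(c.weight for c in self.rubric)
        earned = sum(c.weight for c in self.rubric if c.check(action))
        # Reward is the fraction of rubric weight satisfied, in [0, 1].
        return earned / total


# Hypothetical medical-advice environment with crude keyword checks.
env = Environment(
    goal="Advise a patient reporting chest pain.",
    max_words=50,
    rubric=[
        Criterion("recommends urgent evaluation", 2.0,
                  lambda a: "emergency" in a.lower()),
        Criterion("avoids a definitive diagnosis", 1.0,
                  lambda a: "you have" not in a.lower()),
    ],
)
```

Calling `env.reward(answer)` on a candidate answer returns the share of rubric weight it satisfies. In a domain that is hard to verify, the keyword checks would be replaced by whatever grading process is judged to track real outcomes, which is where the reward engineering effort goes.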

The LLM Data Company designs worlds for non-verifiable domains. We build environments where the relation between decision and outcome is designed to mimic real-world measures of success. Intelligence, in our view, is not a measure of what is learned but a measure of what is achieved.

The LLM Data Company

Teaching models to learn and play at scale.

Contact

Careers

Github

Legal

Disclaimers
