The LLM Data Company
“With different training, the same ostensive teaching of these words would have effected a quite different understanding.”
LUDWIG WITTGENSTEIN
Pre-training teaches by exposure. The model is taught these words trillions of times, and in minimizing prediction error, it acquires a latent sense of syntax, causation, analogy, semantics and, soon after, physics, medicine, sports, and so on. What emerges are the largest computational artifacts ever built: tools of thought.
In post-training, the training objective is reoriented. We are no longer approximating the distribution of the pre-training corpus and minimizing loss against ground truth. Instead, we teach the model to achieve. We imbue agency, and as such, imitation is replaced by decision. Words become actions, and actions are taken to maximize reward within an environment.
Each environment defines a compact world of goals, constraints, and feedback, what Wittgenstein would call a form of life, in which decisions respond to the rules of the game. Most commercial and academic domains of consequence, such as medicine, finance, law, philosophy, and so on, are difficult to verify and require careful reward engineering.
The LLM Data Company designs worlds for non-verifiable domains. We build environments where the relation between decision and outcome is designed to mimic real world measures of success. Intelligence, in our view, is not a measure of what is learned but a measure of what is achieved.

