#scaling-laws
2 total
- Training Compute-Optimal Large Language Models: What Chinchilla Changed
The Chinchilla paper — why most large models were undertrained, and how to spend your compute budget wisely, with real Python code examples
- Scaling Laws for Neural Language Models: The Mathematics of Scale
Why bigger models are predictably better, with real Python code examples