An introduction to the principles of scaling laws relating model size to performance
References
Posts
Scaling Laws for LLMs: From GPT-3 to o3
Papers
Scaling Laws for Neural Language Models
Distillation Scaling Laws
Towards Greater Leverage: Scaling Laws for Efficient Mixture-of-Experts Language Models