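An introduction to the architecture and training methods of the Llama family of models.
Key Llama techniques: grouped-query attention (GQA), RMSNorm, SwiGLU FFN, and rotary position embeddings (RoPE).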
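To make two of the listed components concrete, here is a minimal pure-Python sketch of RMSNorm and a SwiGLU feed-forward layer applied to a single vector. The function names (`rms_norm`, `silu`, `matvec`, `swiglu_ffn`), the toy weights, and the `eps` value are illustrative assumptions, not taken from any Llama codebase; real implementations run on batched tensors.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """Divide a vector by its root-mean-square, then apply a learned per-channel gain."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]

def silu(z):
    """SiLU (swish) activation: z * sigmoid(z)."""
    return z / (1.0 + math.exp(-z))

def matvec(rows, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in rows]

def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU FFN without biases: w_down @ (silu(w_gate @ x) * (w_up @ x))."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)

# Toy sizes (assumed for illustration): model dimension 4, hidden dimension 2.
x = [1.0, -2.0, 3.0, 0.5]
normed = rms_norm(x, [1.0] * 4)
w_gate = [[0.1, 0.2, -0.1, 0.0], [0.0, -0.3, 0.2, 0.1]]
w_up = [[0.3, 0.0, 0.1, -0.2], [0.2, 0.1, 0.0, 0.4]]
w_down = [[0.5, -0.5], [0.1, 0.2], [0.0, 0.3], [-0.2, 0.4]]
out = swiglu_ffn(normed, w_gate, w_up, w_down)
```

Unlike LayerNorm, RMSNorm subtracts no mean and has no bias term, so the normalized vector has a mean square of approximately 1 before the gain is applied.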
References
Posts
Llama 2 and FlashAttention 2 - by Sebastian Raschka, PhD
Llama 4: The Challenges of Creating a Frontier-Level LLM
Papers
Llama: LLaMA: Open and Efficient Foundation Language Models
Llama 2: Llama 2: Open Foundation and Fine-Tuned Chat Models
Llama 3: The Llama 3 Herd of Models
Llama 3.1: https://ai.meta.com/blog/meta-llama-3-1/
Llama 4: The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation