An introduction to the architecture and training methods of the Gemma family of models.
Key Gemma techniques: Sliding Window Attention, RMSNorm
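This page only names the two techniques, so here is a minimal sketch of what each one computes. It uses the generic textbook formulations in plain Python and should not be read as Gemma's actual implementation. The function names `rms_norm` and `sliding_window_mask`, the `eps` default, and the window size in the usage lines are all illustrative assumptions.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Generic RMSNorm: divide x by its root mean square, then apply a per-dimension gain.

    Unlike LayerNorm, no mean is subtracted and no bias is added.
    `eps` is an assumed small constant for numerical stability.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def sliding_window_mask(seq_len, window):
    """Causal sliding-window mask.

    mask[i][j] is True when query position i may attend to key position j,
    i.e. j is not in the future and lies within the last `window` positions
    (counting position i itself).
    """
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Usage: normalize a toy vector and build a mask for 4 tokens with a window of 2.
normed = rms_norm([3.0, 4.0], weight=[1.0, 1.0])
mask = sliding_window_mask(seq_len=4, window=2)
```

With a window smaller than the sequence length, each token attends to a bounded number of recent tokens, so attention cost per token stays constant instead of growing with context length.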
References
Papers
Gemma: Gemma: Open Models Based on Gemini Research and Technology
Gemma 2: Gemma 2: Improving Open Language Models at a Practical Size
Gemma 3: Gemma 3 Technical Report
Gemma 3n: Introducing Gemma 3n: The developer guide (Google Developers Blog)