Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron* Louis Martin† Kevin Stone† Peter Albert Amjad Almahairi Yasmine Babaei Nikolay Bashlykov Soumya Batra Prajjwal Bhargava Shruti Bhosale Dan Bikel Lukas Blecher Cristian Canton Ferrer Moya Chen Guillem Cucurull David Esiobu Jude Fernandes Jeremy Fu Wenying Fu Brian Fuller Cynthia Gao Vedanuj Goswami Naman Goyal Anthony Hartshorn Saghar Hosseini Rui Hou Hakan Inan Marcin Kardas Viktor Kerkez Madian Khabsa Isabel Kloumann Artem Korenev Punit Singh Koura Marie-Anne Lachaux Thibaut Lavril Jenya Lee Diana Liskovich Yinghai Lu Yuning Mao Xavier Martinet Todor Mihaylov Pushkar Mishra Igor Molybog Yixin Nie Andrew Poulton Jeremy Reizenstein Rashi Rungta Kalyan Saladi Alan Schelten Ruan Silva Eric Michael Smith Ranjan Subramanian Xiaoqing Ellen Tan Binh Tang Ross Taylor Adina Williams Jian Xiang Kuan Puxin Xu Zheng Yan Iliyan Zarov Yuchen Zhang Angela Fan Melanie Kambadur Sharan Narang Aurelien Rodriguez Robert Stojnic Sergey Edunov Thomas Scialom*
Abstract
In this work, we develop and release Llama 2, a family of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned models, called Llama 2-Chat, are optimized for dialogue use cases. These models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations of helpfulness and safety, they may be a suitable substitute for closed-source models. We provide a detailed description of our approach to fine-tuning and safety improvements for Llama 2-Chat, in order to enable the community to build on our work and to contribute to the responsible development of LLMs.