3.1.1. 常用¶

1. GPT-2 - OpenAI发布的大规模无监督语言模型,能生成人类语言: https://github.com/openai/gpt-2
1. BERT - Google发布的大规模语言表示模型,广泛用于NLP任务中: https://github.com/google-research/bert
1. ELMo - 语言模型,能学习上下文相关的语义表示: https://github.com/allenai/allennlp/blob/master/tutorials/how_to/elmo.md
1. ULMFiT - 通用语言表征模型,首次证明了通过微调语言模型来进行迁移学习的效果: https://github.com/fastai/fastai/blob/master/courses/dl2/lesson4.ipynb
1. Transformer - Google发布的神经网络模型,应用于机器翻译与语言理解: https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/models/transformer.py
1. XLNet - Google Brain和CMU发布的新型语言表示模型,能建模双向上下文: https://github.com/zihangdai/xlnet
1. RoBERTa - Facebook发布的强大的预训练语言模型,是BERT的强化版: https://github.com/pytorch/fairseq/tree/master/examples/roberta
1. T5 - Google的统一文本到文本框架,建立在Transformer之上,可用于多任务学习: https://github.com/google-research/text-to-text-transfer-transformer

开源大型语言模型排行榜2023:

Rank   Model     Elo Level  Description
 vicuna-13b       1169   LLaMA user dialog tuning assistant for LMSYS
 koala-13b        1082   BAIR academic research conversation model
 oast-pythia-12b  1065   LAION open chatbot for everyone
 alpaca-13b       1008   LLaMA instruction following micro-adjustment model in Stanford demo
 chatglm-6b       985    ClearHK double language conversation model
 fastchat-t5-3b   951    LMSYS fine-tuning conversational agent from FLAN-T5
 dolly-v2-12      944    Databricks directive optimization large language model
 llama-13b        932    Meta efficient base language model
 stablem-tuned-alpha-7b   858      Stability AI language model

基于 LLaMA 的 Alpaca 和 Vicuna
基于 Pythia 的 OpenAssistant 和 Dolly

演进¶

https://img.zhaoweiguo.com/uPic/2023/08/uPV2MJ.png — 语言大模型树进化树(from: https://arxiv.org/pdf/2304.13712.pdf)¶

https://img.zhaoweiguo.com/uPic/2023/08/qd2Chk.png — GPT Assistant training pipeline。训练 ChatGPT 需要经过预训练（Pretraining）、有监督的微调 (Supervised Fine-Tuning)、奖励模型 (Reward Modeling) 以及强化学习 (Reinforcement Learing) 四个步骤。而 ChatGLM 仅使用了预训练（Pretraining）、有监督的微调 (Supervised Fine-Tuning) 这两个步骤。（虽然 ChatGLM-6B 官网介绍说使用了和 ChatGPT 相似的技术，针对中文问答和对话进行了优化。经过约 1T 标识符的中英双语训练，辅以监督微调、反馈自助、人类反馈强化学习等技术的加持，但是没有相关的一些技术说明，比如：奖励模型从哪里来的、人工反馈的数据从哪里来等等。因此，存疑）¶

https://img.zhaoweiguo.com/uPic/2023/08/qNdPzf.png — 微软的 2023 年大会上，Andrej Karpathy 给出了该问题的答案。Andrej Karpathy 提到如下的榜单中：前三个是属于 RLHF 模型，而包括 ChatGLM 在内剩余的都是 SFT 模型或 Base 模型¶

CV:

VIT
Stable Diffusion
LayoutLM

Audio:

Whisper
XLS-R

CLIP¶

CLIP (Contrastive Language-Image Pre-Training)是一种对抗式多模态预训练模型,由OpenAI提出,主要用于图像和文本的联合表示学习。

其主要特点和创新点包括:

- 提出了一种对抗式学习框架,让图像和文本embedding在联合嵌入空间里相似的表示聚集,不同的表示分离开来。
- 构建了大规模图像-文本数据集进行预训练,包含400万张图像和120万个文本描述。
- 设计了浅层的图像模型(Vision Transformer)和文本模型(Transformers),通过大量数据预训练获得通用的多模态表示。
- 无需微调,预训练模型可以直接迁移到下游视觉和语言任务,达到或超过专业化模型的效果。
- 在多个视觉(图像分类、目标检测等)和语言(问答、摘要等)任务上都取得了state-of-the-art的结果。

CLIP的创新之处在于证明了大规模预训练的有效性,只需要简单的模型和大量数据就可以学习非常强大的联合嵌入表示。它为多模态领域打开了新的研究方向,也启发了后续一系列相关的预训练模型。CLIP被广泛应用于多模态任务中,是这个领域的重要基础技术之一。

资源¶

An awesome & curated list of best LLMOps tools for developers: https://github.com/tensorchord/Awesome-LLMOps
Awesome series for Large Language Model(LLM)s: https://github.com/KennethanCeyer/awesome-llm
📋 A list of open LLMs available for commercial use: https://github.com/eugeneyan/open-llms
transformers 系列的介绍以及在下游任务中的使用: https://www.cnblogs.com/dongxiong/p/12763923.html

LocalAI¶

https://localai.io/basics/getting_started/
https://github.com/go-skynet/LocalAI
Self-hosted, community-driven, local OpenAI compatible API. Drop-in replacement for OpenAI running LLMs on consumer-grade hardware. Free Open Source OpenAI alternative. No GPU required. Runs ggml, gguf, GPTQ, onnx, TF compatible models: llama, llama2, gpt4all, rwkv, whisper, vicuna, koala, cerebras, falcon, dolly, starcoder, and many others