# Papers
## General
- PipeDream: Fast and Efficient Pipeline Parallel DNN Training. arXiv:1806.03377, 2018.
- Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour. CoRR, abs/1706.02677, 2017.
- GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism. CoRR, abs/1811.06965, 2018.
- Adam: A Method for Stochastic Optimization. arXiv:1412.6980, 2014.
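The Adam entry above is essentially a single update rule, so a minimal NumPy sketch of it may help; the hyperparameter defaults follow the paper, while the function name and toy usage below are illustrative only.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (arXiv:1412.6980): exponential moving averages of the
    gradient and its square, bias-corrected, then an element-wise scaled step."""
    m = beta1 * m + (1 - beta1) * grad          # first moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# toy usage: minimize f(x) = x^2, whose gradient is 2x
theta, m, v = np.array([5.0]), np.zeros(1), np.zeros(1)
for t in range(1, 1001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.1)
print(theta)  # approaches 0
```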
## Agents
- ReAct: Synergizing Reasoning and Acting in Language Models
- Chat with the Environment
- Reflexion: Language Agents with Verbal Reinforcement Learning
- TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs
- Generative Agents: Interactive Simulacra of Human Behavior
- ChatDev: Communicative Agents for Software Development
- MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework
- AgentSims: An Open-Source Sandbox for Large Language Model Evaluation
- AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
- AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors
- Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models
- Data Interpreter: An LLM Agent For Data Science
- Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence
- ADAS: Automated Design of Agentic Systems
- SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning
- AFlow: Automating Agentic Workflow Generation
- FACT: Examining the Effectiveness of Iterative Context Rewriting for Multi-fact Retrieval
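The ReAct entry above (and, loosely, Reflexion-style agents) is built around a thought/action/observation loop; below is a rough sketch of that loop, assuming placeholder `llm` and `tools` callables rather than any particular agent framework's API.

```python
import re

def react_loop(question, llm, tools, max_steps=5):
    """ReAct-style loop: the model emits Thought/Action lines, we run the named
    tool, append its Observation, and repeat until a Final Answer appears.
    `llm` is any callable prompt -> text; `tools` maps tool names to callables."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)                 # expected to end with an Action or Final Answer
        transcript += step + "\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:")[-1].strip()
        match = re.search(r"Action:\s*(\w+)\[(.*)\]", step)
        if match:
            name, arg = match.group(1), match.group(2)
            observation = tools.get(name, lambda a: f"unknown tool: {name}")(arg)
            transcript += f"Observation: {observation}\n"
    return transcript                          # no answer produced; return the trace
```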
## Large Model Fine-Tuning
- Prefix-Tuning: Optimizing Continuous Prompts for Generation
- P-Tuning: GPT Understands, Too
- Prompt Tuning: The Power of Scale for Parameter-Efficient Prompt Tuning
- LoRA: Low-Rank Adaptation of Large Language Models
- QLoRA: Efficient Finetuning of Quantized LLMs
- Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
- DoRA: Weight-Decomposed Low-Rank Adaptation
- LoRA+: Efficient Low Rank Adaptation of Large Models
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
- LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
- 2305.20050_Let’s Verify Step by Step
- Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
- Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective
- 2203.02155_Training language models to follow instructions with human feedback (InstructGPT)
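For the LoRA family above (LoRA, QLoRA, DoRA, LoRA+), a minimal PyTorch sketch of the core low-rank update, y = W x + (alpha/r) · B A x with W frozen; this illustrates the idea only and is not the PEFT library's implementation (the class name here is made up).

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained Linear plus a trainable low-rank update B @ A."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():       # freeze the pretrained weights
            p.requires_grad = False
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at start
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(2, 768))               # only lora_A / lora_B receive gradients
```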
## Distributed Models
- General
- 1701.06538_MoE: Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
- 1806.03377_PipeDream: Fast and Efficient Pipeline Parallel DNN Training
- 1811.06965_GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
- 1909.08053_Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
- 19xx_PipeDream: Generalized Pipeline Parallelism for DNN Training
- 2006.15704_PyTorch Distributed: Experiences on Accelerating Data Parallel Training
- 2006.16668_GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
- 2006.09503_PipeDream-2BW: Memory-Efficient Pipeline-Parallel DNN Training
- 2104.04473_Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM
- 2205.14135_FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
- 2307.08691_FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
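Several entries above (GPipe, the PipeDream papers) revolve around micro-batch pipeline schedules; the sketch below only simulates which stage handles which micro-batch at each clock tick in a GPipe-style forward pass, not an actual distributed runtime.

```python
def pipeline_schedule(num_stages: int, num_microbatches: int):
    """GPipe-style forward schedule: stage s works on micro-batch t - s at tick t,
    so stages overlap once the pipeline is full."""
    ticks = []
    for t in range(num_stages + num_microbatches - 1):
        work = {}
        for s in range(num_stages):
            mb = t - s
            if 0 <= mb < num_microbatches:
                work[f"stage{s}"] = f"mb{mb}"
        ticks.append(work)
    return ticks

for t, work in enumerate(pipeline_schedule(num_stages=4, num_microbatches=8)):
    print(t, work)
# idle "bubble" slots shrink as num_microbatches grows relative to num_stages,
# which is the efficiency argument made in the pipeline-parallel papers above.
```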
## NLP LLM
- GPT1: Improving Language Understanding by Generative Pre-Training
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- GPT2: Language Models are Unsupervised Multitask Learners
- CPM: A Large-scale Generative Chinese Pre-trained Language Model
- LLaMA: Open and Efficient Foundation Language Models
- Llama 2: Open Foundation and Fine-Tuned Chat Models
- Qwen Technical Report
- DeepSeek-Coder: When the Large Language Model Meets Programming – The Rise of Code Intelligence
- MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies
- DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
- ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools
## MoE LLM
## Vision LLM
- 1506.02640_You Only Look Once: Unified, Real-Time Object Detection
- 1612.08242_YOLO9000: Better, Faster, Stronger
- 1804.02767_YOLOv3: An Incremental Improvement
- 2004.10934_YOLOv4: Optimal Speed and Accuracy of Object Detection
- 2207.02696_YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
- Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
- 2304.08485_Visual Instruction Tuning
- Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond
- 2402.13616_YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
- DeepSeek-VL: Towards Real-World Vision-Language Understanding
- 2405.14458_YOLOv10: Real-Time End-to-End Object Detection
## Multimodal LLM
## LLM Reinforcement Learning
## LLM Safety
## Datasets & Data Distillation
## Framework
- 1712.05889_Ray: A Distributed Framework for Emerging AI Applications
- 1910.02054_DeepSpeed_ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
- PyTorch: An Imperative Style, High-Performance Deep Learning Library
- Transformers: State-of-the-Art Natural Language Processing
- 2210.XX_Ray v2 Architecture
- 2309.06180_Efficient Memory Management for Large Language Model Serving with PagedAttention
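For the PagedAttention entry above, a toy sketch of the block-table idea: the KV cache is allocated in fixed-size blocks from a shared pool, so memory grows with the actual sequence length instead of being reserved up front. This is not vLLM's real data structure; the class and constant names are illustrative.

```python
BLOCK_SIZE = 16  # tokens per KV-cache block (illustrative)

class PagedKVCache:
    """Toy block-table bookkeeping in the spirit of PagedAttention (arXiv:2309.06180)."""
    def __init__(self, num_blocks: int):
        self.free_blocks = list(range(num_blocks))  # shared pool of physical blocks
        self.block_tables = {}                      # seq_id -> list of physical block ids
        self.lengths = {}                           # seq_id -> tokens written so far

    def append_token(self, seq_id: int):
        table = self.block_tables.setdefault(seq_id, [])
        length = self.lengths.get(seq_id, 0)
        if length % BLOCK_SIZE == 0:                # current block is full (or first token)
            table.append(self.free_blocks.pop())    # grab a new block on demand
        self.lengths[seq_id] = length + 1

    def free(self, seq_id: int):
        """Sequence finished: return its blocks to the pool for reuse."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)

cache = PagedKVCache(num_blocks=64)
for _ in range(40):
    cache.append_token(seq_id=0)
print(cache.block_tables[0])                        # 3 blocks cover 40 tokens at block size 16
```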
## ML
- WebGPT: Browser-assisted question-answering with human feedback
- Teaching language models to support answers with verified quotes
- FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
- Evaluating Verifiability in Generative Search Engines
- Citation: A Key to Building Responsible and Accountable Large Language Models
- HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution
- Enabling Large Language Models to Generate Text with Citations
## RAG
- 2005.11401_Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
- 2312.10997_Retrieval-Augmented Generation for Large Language Models: A Survey
- 2401.15884_CRAG: Corrective Retrieval Augmented Generation
- 2403.14403_Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity
- 2404.16130_From Local to Global: A Graph RAG Approach to Query-Focused Summarization
- 2405.16506_GRAG: Graph Retrieval-Augmented Generation
- GraphRAG official documentation
- 2406.13213_Multi-Meta-RAG: Improving RAG for Multi-Hop Queries using Database Filtering with LLM-Extracted Metadata
- 2410.10450_KBLaM: Knowledge Base augmented Language Model
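Most of the RAG entries above share a retrieve-then-read backbone, sketched minimally below; `toy_embed` and the `generate` lambda are stand-ins for a real embedding model and LLM client.

```python
import numpy as np

def toy_embed(text, dim=64):
    """Stand-in hashed bag-of-words embedder; swap in a real embedding model."""
    vec = np.zeros(dim)
    for tok in text.lower().split():
        vec[hash(tok) % dim] += 1.0
    return vec

def cosine_top_k(query_vec, doc_vecs, k=2):
    """Rank documents by cosine similarity to the query, return top-k indices."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    return np.argsort(d @ q)[::-1][:k]

def rag_answer(question, docs, embed, generate, k=2):
    """Retrieve-then-read: embed, pick top-k passages, stuff them into the prompt."""
    doc_vecs = np.stack([embed(d) for d in docs])
    idx = cosine_top_k(embed(question), doc_vecs, k)
    context = "\n\n".join(docs[i] for i in idx)
    prompt = (f"Answer using only the context below.\n\n"
              f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
    return generate(prompt)

docs = [
    "Corrective RAG re-retrieves when the first results look irrelevant.",
    "GraphRAG summarizes communities of an entity graph built from the corpus.",
    "Adaptive-RAG chooses a retrieval strategy based on question complexity.",
]
# `generate` would normally call an LLM; here it just returns the prompt for inspection
print(rag_answer("What does corrective RAG do?", docs, toy_embed, generate=lambda p: p))
```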
## Tools
## Mobile Business
## AGI
## Others
Highlighting the top ML papers every week: https://github.com/dair-ai/ML-Papers-of-the-Week