AI Agent
General-Purpose Agents
- 2210.03629_ReAct
- 2303.08268_Chat-with-the-Environment
- 2303.11366_Reflexion: Language Agents with Verbal Reinforcement Learning
- 2303.16434_TaskMatrix.AI
- 2304.03442_Generative-Agents
- 2307.07924_ChatDev: Communicative Agents for Software Development
- 2308.00352_MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework
- 2308.04026_AgentSims: An Open-Source Sandbox for Large Language Model Evaluation
- 2308.08155_AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
- 2308.10848_AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors
- 2310.06117_Step-Back: Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models
- 2312.04511_LLMCompiler: An LLM Compiler for Parallel Function Calling
- 2402.18679_MetaGPT_DI: Data Interpreter: An LLM Agent For Data Science
- 2407.07061_IoA: Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence
- 2408.08435_ADAS: Automated Design of Agentic Systems
- 2410.10762_AFlow: Automating Agentic Workflow Generation
- 2410.17238_SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning
- 2410.21012_FACT: Examining the Effectiveness of Iterative Context Rewriting for Multi-fact Retrieval
- 2504.01990_Advances and Challenges in Foundation Agents
- 2506.12508_AgentOrchestra: A Hierarchical Multi-Agent Framework for General-Purpose Task Solving
- 2510.08842_Maple: A Multi-agent System for Portable Deep Learning across Clusters
DeepResearch
Vision Agents & AIOS
- 2108.03353_Screen2Words: Automatic Mobile UI Summarization with Multimodal Learning
- 2209.08199_ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots
- 2212.06817_RT-1: Robotics Transformer for Real-World Control at Scale
- 2312.13771_AppAgent: Multimodal Agents as Smartphone Users
- 2401.10935_SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
- 2402.04615_ScreenAI: A Vision-Language Model for UI and Infographics Understanding
- 2402.07939_UFO: A UI-Focused Agent for Windows OS Interaction
- 2403.16971_AIOS: LLM Agent Operating System
- 2406.01014_Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration
- 2411.00820_AutoGLM: Autonomous Foundation Agents for GUIs
- 2411.02059_TableGPT2: A Large Multimodal Model with Tabular Data Integration
- 2501.11733_Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks
- 2501.12326_UI-TARS: Pioneering Automated GUI Interaction with Native Agents
- 2502.14282_PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC
- 2504.14603_UFO2: The Desktop AgentOS
- 2508.04037_SEA: Self-Evolution Agent with Step-wise Reward for Computer Use