The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery¶

https://arxiv.org/abs/2408.06292
作者：Chris Lu， Cong Lu， Robert Tjarko Lange， Jakob Foerster， Jeff Clune和 David Ha
组织: Sakana AI，FLAIR，牛津大学，英属哥伦比亚大学，加拿大 Vector 研究所，加拿大CIFAR AI Chair
One of the grand challenges of artificial general intelligence is developing agents capable of conducting scientific research and discovering new knowledge. While frontier models have already been used as aides to human scientists, e.g. for brainstorming ideas, writing code, or prediction tasks, they still conduct only a small part of the scientific process. This paper presents the first comprehensive framework for fully automatic scientific discovery, enabling frontier large language models to perform research independently and communicate their findings. We introduce The AI Scientist, which generates novel research ideas, writes code, executes experiments, visualizes results, describes its findings by writing a full scientific paper, and then runs a simulated review process for evaluation. In principle, this process can be repeated to iteratively develop ideas in an open-ended fashion, acting like the human scientific community. We demonstrate its versatility by applying it to three distinct subfields of machine learning: diffusion modeling, transformer-based language modeling, and learning dynamics. Each idea is implemented and developed into a full paper at a cost of less than $15 per paper. To evaluate the generated papers, we design and validate an automated reviewer, which we show achieves near-human performance in evaluating paper scores. The AI Scientist can produce papers that exceed the acceptance threshold at a top machine learning conference as judged by our automated reviewer. This approach signifies the beginning of a new era in scientific discovery in machine learning: bringing the transformative benefits of AI agents to the entire research process of AI itself, and taking us closer to a world where endless affordable creativity and innovation can be unleashed on the world’s most challenging problems. Our code is open-sourced at this https URL
通用人工智能的重大挑战之一是开发能够进行科学研究和发现新知识的代理。虽然前沿模型已经被用作人类科学家的助手，例如用于头脑风暴、编写代码或预测任务，但它们仍然只执行科学过程的一小部分。本文提出了第一个全自动科学发现的综合框架，使前沿大型语言模型能够独立进行研究并交流其发现。我们介绍了 The AI Scientist，它产生新颖的研究思路、编写代码、执行实验、可视化结果、通过撰写完整的科学论文来描述其发现，然后运行模拟审查流程进行评估。原则上，可以重复此过程，以开放式方式迭代开发想法，就像人类科学界一样。我们通过将其应用于机器学习的三个不同的子领域来展示它的多功能性：扩散建模、基于 transformer 的语言建模和学习动力学。每个想法都被实施并发展成一篇完整的论文，每篇论文的成本不到 15 美元。为了评估生成的论文，我们设计并验证了一个自动审稿人，我们表明它在评估论文分数方面达到了接近人类的表现。AI 科学家可以撰写超过顶级机器学习会议接受阈值的论文，由我们的自动审稿人评判。这种方法标志着机器学习科学发现新时代的开始：将 AI 代理的变革性优势引入 AI 本身的整个研究过程，并让我们更接近一个可以释放无穷无尽、负担得起的创造力和创新来解决世界上最具挑战性的问题的世界。
https://github.com/SakanaAI/AI-Scientist

https://img.zhaoweiguo.com/uPic/2024/08/Hg4vtK.png

AI Scientist 有 4 个主要流程:

1. Idea Generation.
    创意生成。
    给定一个起始模板，AI 科学家首先“头脑风暴”一组不同的新研究方向。
    我们为 The AI Scientist 提供了希望 AI Scientist 进一步探索的现有主题的起始代码“模板”。
    然后，AI 科学家可以自由探索任何可能的研究方向。
2. Experimental Iteration.
    实验迭代。
    给定一个想法和一个模板，AI Scientist 的第二阶段首先执行提议的实验，然后获取并生成绘图以可视化其结果。
    它记下了每个图包含的内容，使保存的图表和实验注释能够提供撰写论文所需的所有信息。
3. Paper Write-up.
    论文写作。
    最后，The AI Scientist 以 LaTeX 中标准机器学习会议的风格，对其进展进行了简明扼要且内容丰富的文章。
    它使用 Semantic Scholar 自主查找要引用的相关论文。
4. Automated Paper Reviewing.
    自动论文审阅。
    这项工作的一个关键方面是开发一个自动化LLM的审稿人，能够以接近人类的准确性评估生成的论文。
    生成的评论可用于改进项目或作为对后代的反馈，以进行开放式构思。这实现了一个持续的反馈循环，使 The AI Scientist 能够迭代地改进其研究成果。