2406.13213_Multi-Meta-RAG: Improving RAG for Multi-Hop Queries using Database Filtering with LLM-Extracted Metadata¶

https://arxiv.org/abs/2406.13213
作者：米哈伊洛·波利亚科夫和纳迪亚·什瓦伊
组织：基辅国立大学莫希拉学院
The retrieval-augmented generation (RAG) enables retrieval of relevant information from an external knowledge source and allows large language models (LLMs) to answer queries over previously unseen document collections. However, it was demonstrated that traditional RAG applications perform poorly in answering multi-hop questions, which require retrieving and reasoning over multiple elements of supporting evidence. We introduce a new method called Multi-Meta-RAG, which uses database filtering with LLM-extracted metadata to improve the RAG selection of the relevant documents from various sources, relevant to the question. While database filtering is specific to a set of questions from a particular domain and format, we found out that Multi-Meta-RAG greatly improves the results on the MultiHop-RAG benchmark. The code is available at this https URL.
检索增强生成（RAG）支持从外部知识源检索相关信息，并允许大型语言模型（LLMs）回答对以前未可见的文档集合的查询。然而，研究表明，传统的 RAG 应用程序在回答多跳问题方面表现不佳，这些问题需要检索和推理支持证据的多个元素。我们引入了一种称为 Multi-Meta-RAG 的新方法，该方法使用数据库过滤和LLM提取的元数据来改进 RAG 从各种来源中选择与问题相关的相关文档。当使用数据库过滤来自特定领域和格式的一组问题，我们发现 Multi-Meta-RAG 极大地改善了 MultiHop-RAG 基准测试的结果。

https://img.zhaoweiguo.com/uPic/2024/08/r67U8G.png — A naive RAG implementation for MultiHop-RAG queries. RAG selects chunks from articles not asked in the example query, which leads to LLM giving a wrong response.¶

https://img.zhaoweiguo.com/uPic/2024/08/jPLScr.png — Multi-Meta-RAG: an improved RAG with database filtering using metadata. Metadata is extracted via secondary LLM. With filtering, we can ensure top-K chunks are always from relevant sources with better chances of getting correct overall responses.¶

主页

索引

模块索引

搜索页面