常用 #### * `Faiss `_: Faiss 是一个Facebook开发的开源向量索引库,它提供了一些非常快速和高效的近似最近邻搜索算法。虽然它本身不是一个数据库系统,但可以与其他数据库集成,以支持向量检索。 * Weaviate: https://weaviate.io/ * `Annoy `_: Annoy 是一个小巧而快速的库,专门用于近似最近邻搜索。虽然它不是一个数据库,但你可以将其用于构建自定义向量索引。 * Pinecone: https://www.pinecone.io/ - Langchain AI Handbook: https://www.pinecone.io/learn/langchain/ Chroma ====== * Chroma(Chroma is the open-source embedding database. Chroma makes it easy to build LLM apps by making knowledge, facts, and skills pluggable for LLMs): https://docs.trychroma.com/ * quickstart:: # 1. Install pip install chromadb # 2. Get the Chroma Client import chromadb chroma_client = chromadb.Client() # 3. Create a collection collection = chroma_client.create_collection(name="my_collection") # 4. Add some text documents to the collection collection.add( documents=["This is a document", "This is another document"], metadatas=[{"source": "my_source"}, {"source": "my_source"}], ids=["id1", "id2"] ) # or collection.add( embeddings=[[1.2, 2.3, 4.5], [6.7, 8.2, 9.2]], documents=["This is a document", "This is another document"], metadatas=[{"source": "my_source"}, {"source": "my_source"}], ids=["id1", "id2"] ) # 5. Query the collection results = collection.query( query_texts=["This is a query document"], n_results=2 )