Commit 78b5f44

Create 6.RecursiveReferenceRAG.md
1 parent f6348cc commit 78b5f44

1 file changed: 88 additions, 0 deletions

This example implementation uses Python and LangChain to handle document references in a RAG architecture:

```python
import os

from langchain.document_loaders import TextLoader
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.llms import HuggingFaceHub


class DocumentReferenceRAG:
    def __init__(self, documents):
        # Copy the list so the caller's list is not mutated later.
        self.documents = list(documents)
        self.embeddings = HuggingFaceEmbeddings()
        self.vectorstore = Chroma.from_documents(self.documents, self.embeddings)
        # HuggingFaceHub reads the API token from HUGGINGFACEHUB_API_TOKEN.
        self.llm = HuggingFaceHub(repo_id="google/flan-t5-xl")
        self.qa = RetrievalQA.from_chain_type(
            llm=self.llm,
            chain_type="stuff",
            retriever=self.vectorstore.as_retriever(),
            # Needed so the chain output includes the retrieved documents.
            return_source_documents=True,
        )

    def answer_question(self, question, max_recursion_depth=3):
        result = self._recursive_answer(question, max_recursion_depth)
        return result["result"]

    def _recursive_answer(self, question, max_recursion_depth, processed_docs=None):
        if processed_docs is None:
            # Documents already in the store never need to be fetched again.
            processed_docs = {doc.metadata.get("source") for doc in self.documents}

        # With return_source_documents=True the chain returns a dict with
        # "result" and "source_documents" keys, so it is called directly
        # rather than through run(), which expects a single output.
        result = self.qa({"query": question})
        if max_recursion_depth <= 0:
            return result

        new_docs = []
        for doc in result["source_documents"]:
            refs = doc.metadata.get("referenced_docs", [])
            # Assumption: stores that accept only scalar metadata values may
            # hold the links as one comma-separated string.
            if isinstance(refs, str):
                refs = [ref.strip() for ref in refs.split(",") if ref.strip()]
            for ref_doc_link in refs:
                if ref_doc_link in processed_docs:
                    continue
                processed_docs.add(ref_doc_link)
                ref_doc = self._retrieve_document(ref_doc_link)
                if ref_doc:
                    new_docs.append(ref_doc)

        if not new_docs:
            return result

        # Index only the new documents. The retriever reads from the same
        # store, so the chain does not have to be rebuilt.
        self.documents.extend(new_docs)
        self.vectorstore.add_documents(new_docs)
        return self._recursive_answer(question, max_recursion_depth - 1, processed_docs)

    def _retrieve_document(self, doc_link):
        # Placeholder: replace with retrieval logic for your storage backend.
        # Here the link is treated as the path of a local text file.
        if not os.path.exists(doc_link):
            return None
        return TextLoader(doc_link).load()[0]


# Example usage
docs = [TextLoader(f"doc{i}.txt").load()[0] for i in range(1, 6)]

rag = DocumentReferenceRAG(docs)
question = "What is the relationship between document 1 and document 3?"
answer = rag.answer_question(question)
print(answer)
```

In this example:

1. The `DocumentReferenceRAG` class handles the recursive retrieval and processing of documents.

2. The `__init__` method initializes the necessary components:
   - Stores the initial set of documents
   - Creates the embedding model with `HuggingFaceEmbeddings`
   - Indexes the documents in a Chroma vector store
   - Sets up the LLM (`HuggingFaceHub`) and the `RetrievalQA` chain

3. The `answer_question` method takes a question and an optional maximum recursion depth. It calls the `_recursive_answer` method to generate the answer.

4. The `_recursive_answer` method implements the recursive retrieval process:
   - Generates an initial answer using the `RetrievalQA` chain
   - Checks whether the retrieved source documents list other documents under the `referenced_docs` key of their metadata
   - If so, loads the referenced documents that have not been processed yet using the `_retrieve_document` method
   - Adds the retrieved documents to the document collection and the vector store
   - Repeats the process until no new referenced documents are found or the maximum recursion depth is reached

5. The `_retrieve_document` method is a placeholder for the actual document retrieval logic. In this example, it loads the document from a file using LangChain's `TextLoader`.

6. In the example usage, five documents are loaded, and the `DocumentReferenceRAG` class is instantiated with them.

7. A question is asked, and the `answer_question` method is called to generate the final answer, taking the referenced documents into account.
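
The class reads references from a `referenced_docs` metadata entry, but the example does not show how that entry gets filled in. Below is a minimal sketch of one way to do it. It assumes references appear in the text as `[[ref: path]]` markers, which is a hypothetical convention and not something the original defines. The helper names `extract_references` and `build_metadata` are likewise illustrative, and a plain dict stands in for a LangChain document's metadata.

```python
import re

# Hypothetical marker format (an assumption): [[ref: some/path.txt]]
REF_PATTERN = re.compile(r"\[\[ref:\s*([^\]]+?)\s*\]\]")


def extract_references(text):
    """Return referenced paths in order of first appearance, without duplicates."""
    seen = []
    for match in REF_PATTERN.finditer(text):
        link = match.group(1)
        if link not in seen:
            seen.append(link)
    return seen


def build_metadata(source, text):
    """Build a metadata dict with the keys the class above looks for."""
    metadata = {"source": source}
    refs = extract_references(text)
    if refs:
        metadata["referenced_docs"] = refs
    return metadata
```

If the vector store accepts only scalar metadata values, the list could be stored as a single comma-separated string instead (`",".join(refs)`) and split again when it is read back.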

This implementation demonstrates how to extend a RAG architecture to handle document references using LangChain. The recursive retrieval process pulls documents that are referenced from retrieved documents into the index, up to the configured recursion depth, so that they can contribute to the answer. It relies on each document carrying its references in the `referenced_docs` metadata entry.
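
The depth-limited traversal itself can be illustrated without LangChain. The sketch below assumes the references are available as a plain dict that maps each document name to the names it references. The `references` dict and the `collect_referenced` helper are illustrative and not part of the original class.

```python
def collect_referenced(start_docs, references, max_depth=3):
    """Return the documents reachable from start_docs within max_depth hops."""
    visited = set(start_docs)
    frontier = list(start_docs)
    for _ in range(max_depth):
        next_frontier = []
        for doc in frontier:
            for ref in references.get(doc, []):
                # The visited set keeps reference cycles from looping forever.
                if ref not in visited:
                    visited.add(ref)
                    next_frontier.append(ref)
        if not next_frontier:
            break  # nothing new was discovered at this depth
        frontier = next_frontier
    return visited


# Illustrative reference graph with a cycle (doc3 points back to doc1).
references = {
    "doc1": ["doc2"],
    "doc2": ["doc3"],
    "doc3": ["doc1", "doc4"],
    "doc4": ["doc5"],
}
```

With `max_depth=2`, starting from `doc1`, only `doc2` and `doc3` are added. `doc4` and `doc5` are reached only with a larger depth, which is the trade-off the `max_recursion_depth` parameter controls.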

Remember to customize the `_retrieve_document` method to match your document storage and retrieval mechanism. You may also want to add error handling, for example for references that cannot be loaded, and avoid re-indexing documents that are already in the vector store.
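
One possible shape for that error handling is sketched below with the standard library only. The `SimpleDocument` class is a stand-in for LangChain's `Document`, and `safe_retrieve` is a hypothetical helper name. Both are assumptions made for illustration.

```python
import logging
from dataclasses import dataclass, field
from pathlib import Path

logger = logging.getLogger(__name__)


@dataclass
class SimpleDocument:
    """Stand-in for a LangChain Document: text plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def safe_retrieve(doc_link, base_dir="."):
    """Load a referenced text file, returning None instead of raising."""
    path = Path(base_dir) / doc_link
    try:
        text = path.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError) as exc:
        # A broken reference should not abort the whole question.
        logger.warning("Could not load referenced document %s: %s", doc_link, exc)
        return None
    return SimpleDocument(page_content=text, metadata={"source": doc_link})
```

Returning `None` fits the `if ref_doc:` check in `_recursive_answer`, so a missing or unreadable reference is skipped rather than stopping the recursion.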
