Commit 9adb41d

Create RAG boilerplate using langchain

1 parent eb60836 commit 9adb41d
File tree: 2 files changed (+82, −0 lines)

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
from langchain_community.llms import Ollama
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import ChatPromptTemplate
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain


def create_RAG_model(input_file, llm):
    # Create the LLM (Large Language Model) from the given Ollama model name
    llm = Ollama(model=llm)
    # Define the model used to embed the info
    embeddings = OllamaEmbeddings(model="nomic-embed-text")
    # Load the PDF
    loader = PyPDFLoader(input_file)
    doc = loader.load()
    # Split the text and embed it into the vector DB
    text_splitter = RecursiveCharacterTextSplitter()
    split = text_splitter.split_documents(doc)
    vector_store = FAISS.from_documents(split, embeddings)

    # Prompt generation: giving the LLM character and purpose
    prompt = ChatPromptTemplate.from_template(
        """
        Answer the following question based only on the given context.

        <context>
        {context}
        </context>

        Question: {input}
        """
    )
    # Link the LLM, the vector DB and the prompt
    docs_chain = create_stuff_documents_chain(llm, prompt)
    retriever = vector_store.as_retriever()
    retrieval_chain = create_retrieval_chain(retriever, docs_chain)
    return retrieval_chain


# Using the retrieval chain
# Example:

'''
chain = create_RAG_model("your_file_here.pdf", "mistral")
output = chain.invoke({"input": "What is the purpose of RAG?"})
print(output["answer"])
'''
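The splitting step above delegates to LangChain's `RecursiveCharacterTextSplitter`. A dependency-free toy version (a simplified sketch, not LangChain's actual algorithm, which also prefers to break on paragraph and sentence boundaries) shows the core idea of fixed-size chunks with overlap:

```python
def split_text(text, chunk_size=40, overlap=10):
    # Naive fixed-size splitter with overlap. Adjacent chunks share
    # `overlap` characters so a sentence cut at a boundary still appears
    # with some surrounding context in the next chunk.
    chunks = []
    start = 0
    step = chunk_size - overlap  # advance less than chunk_size to overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks

sample = "RAG retrieves relevant chunks of a document and feeds them to an LLM."
for chunk in split_text(sample):
    print(repr(chunk))
```

Each printed chunk is at most 40 characters, and the last 10 characters of one chunk reappear at the start of the next.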
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
# RAG boilerplate in Python using LangChain #

### About RAG ###
Hi there! This is a RAG (Retrieval Augmented Generation) boilerplate/template in Python. RAG is an amazing technique that links input sources (PDFs in this case) to LLMs (Large Language Models), so the LLMs gain the ability to answer questions they initially knew nothing about.

### How RAG works ###
RAG works by splitting the input file(s) into semantically related chunks and embedding these chunks into a vector database. A vector database stores each chunk as a numerical vector, so semantically similar pieces of text end up close together and can be looked up quickly (of course, machines are more fluent in math). When the user submits a prompt or query, it is embedded the same way, and the chunks whose vectors lie closest to the query's vector are returned; because the embedding captures meaning, those are the most relevant chunks.
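The retrieval step described above can be sketched in plain Python: embed everything as vectors and return the chunk nearest the query. The tiny 3-dimensional vectors below are made-up stand-ins for real embeddings (models like nomic-embed-text produce hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    # Similarity of two embedding vectors; 1.0 means identical direction
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" of two document chunks (values are made up)
chunks = {
    "RAG links documents to LLMs": [0.9, 0.1, 0.0],
    "Bananas are yellow":          [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # pretend embedding of "What is RAG?"

# Retrieve the chunk whose vector is closest to the query vector
best = max(chunks, key=lambda text: cosine_similarity(query, chunks[text]))
print(best)  # → RAG links documents to LLMs
```

This is what the vector store does for you at scale: FAISS indexes the vectors so the nearest-neighbor search stays fast even with millions of chunks.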

### Installation and use ###
1) Install dependencies

```
pip3 install langchain-community
pip3 install langchain-core
pip3 install langchain-text-splitters
pip3 install langchain
```

(excuse me if you find any missing dependencies)
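Equivalently, the four packages above can go into a ```requirements.txt``` (a minimal sketch of the same dependency list):

```
langchain-community
langchain-core
langchain-text-splitters
langchain
```

and be installed in one command with ```pip3 install -r requirements.txt```.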

2) Use the template

Refer to the commented code at the end of the ```RAG_boilerplate.py``` file and modify it to suit your needs (remember to uncomment the code block).

3) Download a PDF input file and execute

Download a PDF to implement RAG on (and specify its filename in the code), then run ```python3 RAG_boilerplate.py```.

4) Happy learning!
