145 questions
-2
votes
0
answers
67
views
Facing issues while developing natural-language T-SQL generation [closed]
I am developing an NL2SQL model using defog/sqlcoder-7b. While working with this model, I am facing issues where it is not able to generate complex SQL queries, especially queries ...
1
vote
0
answers
36
views
Is there a way in MCP to stream an LLM response chunk by chunk back to the client?
I'm using FastMCP in Python to implement an MCP server. Currently I'm running into a problem streaming the generated tokens from the LLM. I don't want to wait for the completed response ...
0
votes
0
answers
84
views
Recitation issues with Gemini File Search API
I’m using the Gemini File Search API, but quite often the model responds with an error or warning about "recitation", and the answer gets cut off or completely withheld.
I’m not entirely sure what ...
-1
votes
0
answers
119
views
LangChain ≥1.0.0: What is the replacement for create_react_agent / create_tool_calling_agent in 1.1.3?
I am trying out LangChain for my project, but most of the documentation and blogs use classic version LangChain 1.1.3 and I am confused about what the replacement is for the agent creation APIs.
What worked ...
Tooling
0
votes
0
replies
62
views
How to use SelfQueryRetriever in recent versions of LangChain?
I'm trying to use metadata in RAG systems using LangChain. I see a lot of tutorials using SelfQueryRetriever, but it appears that this was deprecated in recent versions. Is this correct? I couldn't ...
Advice
2
votes
2
replies
78
views
RAG with Pinecone + GPT-5 for generating new math problems: incoherent outputs, mixed chunks, and lack of originality
I’m building a tool that generates new mathematics exam problems using an internal database of past problems.
My current setup uses a RAG pipeline, Pinecone as the vector database, and GPT-5 as the ...
Best practices
1
vote
2
replies
142
views
Regarding RAG for telephony with Deepgram
I'm building a voice-based calling system where users can create AI agents that make outbound phone calls.
The agent uses Deepgram for real-time transcription and ElevenLabs/Cartesia for speech ...
Advice
0
votes
1
replies
59
views
How can I group transcribed phrases into meaningful chunks without using complex models?
I have a large set of phrases obtained via Azure Fast Transcription, and I need to group them into coherent semantic chunks (to use later in a RAG pipeline).
Initially, I tried grouping phrases based ...
0
votes
0
answers
26
views
How to exclude metadata from embedding?
I'm using LlamaIndex 0.14.7. I would like to embed document text without concatenating metadata, because I put a long text in metadata. Here's my code:
table_vec_store: SimpleVectorStore = ...
0
votes
0
answers
60
views
LangChain RAG is not retrieving any documents
This is my embedding code, which I run once only:
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")
vector_store = MongoDBAtlasVectorSearch.from_connection_string(
...
1
vote
1
answer
153
views
Why does answer_relevancy return NaN when evaluating RAG with Ragas?
I’m trying to evaluate my Retrieval-Augmented Generation (RAG) pipeline using Ragas.
Here’s a complete version of my code:
"""# RAG Evaluation"""
from datasets import ...
0
votes
1
answer
77
views
Chroma not accepting lists in PersistentClient collection?
My objective is to do keyword filtering in Chroma. I have a field called keywords with a list of strings and I want to filter with it, but Chroma won't let me add lists as a field.
I checked my Chroma ...
1
vote
0
answers
54
views
Why does my LangChain RAG chatbot sometimes miss relevant chunks in semantic search?
I built a RAG chatbot using LangChain + ChromaDB + OpenAI embeddings. The pipeline works, but sometimes the chatbot doesn’t return the most relevant PDF content, even though it exists in the vector DB. ...
1
vote
0
answers
75
views
RAG Chatbot does not answer paraphrased questions
I built a RAG chatbot in Python with LangChain, using FAISS for the vector store.
The data is stored as JSON.
The chatbot sometimes refuses to answer when a question is rephrased.
Here are two ...
0
votes
0
answers
31
views
RAG Pipeline Memory Leak - Vector Embeddings Not Releasing After Context Switch in Memo AI
Question:
I'm building a memory-augmented AI system using RAG with persistent vector storage, but facing memory leaks and context contamination between sessions.
Problem:
Vector embeddings aren't ...