Newest 'rag' Questions

145 questions

-2 votes

0 answers

67 views

Facing Issue while developing natural language T-SQL generation [closed]

I am developing an NL2SQL model, and for this I am using defog/sqlcoder-7b. While working with this model, I am facing issues where it is not able to generate complex SQL queries, especially queries ...

Mihir Kansara

asked Dec 22 at 6:34

1 vote

0 answers

36 views

Is there a way in MCP to stream an LLM response chunk by chunk back to the client?

I'm using FastMCP in Python to implement an MCP server. I have run into a problem with streaming the generated tokens from the LLM. I don't want to wait for the completed response ...

Daniel

asked Dec 18 at 10:40

0 votes

0 answers

84 views

Recitation issues with Gemini File Search API

I’m using the Gemini File Search API, but quite often the model responds with an error or warning about "recitation", and the answer gets cut off or completely withheld. I’m not entirely sure what ...

Amin Mokadem

asked Dec 16 at 12:17

-1 votes

0 answers

119 views

LangChain ≥1.0.0: What is the replacement for create_react_agent / create_tool_calling_agent in 1.1.3?

I am trying out LangChain for my project, but most of the documentation and blogs use classic version LangChain 1.1.3 and I am confused about what the replacement is for agent creation APIs. What worked ...

Mandar Mawale

asked Dec 14 at 10:50

Tooling

0 votes

0 replies

62 views

How to use SelfQueryRetriever in the recent versions of LangChain?

I'm trying to use metadata in RAG systems using LangChain. I see a lot of tutorials using SelfQueryRetriever, but it appears that this was deprecated in recent versions. Is this correct? I couldn't ...

Augusto Firmo

asked Dec 12 at 20:49

Advice

2 votes

2 replies

78 views

RAG with Pinecone + GPT-5 for generating new math problems: incoherent outputs, mixed chunks, and lack of originality

I’m building a tool that generates new mathematics exam problems using an internal database of past problems. My current setup uses a RAG pipeline, Pinecone as the vector database, and GPT-5 as the ...

Marc-Loïc Abena

asked Nov 29 at 19:22

Best practices

1 vote

2 replies

142 views

Regarding RAG for telephony with Deepgram

I'm building a voice-based calling system where users can create AI agents that make outbound phone calls. The agent uses Deepgram for real-time transcription and ElevenLabs/Cartesia for speech ...

Sarthak Sahu

asked Nov 15 at 9:35

Advice

0 votes

1 reply

59 views

How can I group transcribed phrases into meaningful chunks without using complex models?

I have a large set of phrases obtained via Azure Fast Transcription, and I need to group them into coherent semantic chunks (to use later in a RAG pipeline). Initially, I tried grouping phrases based ...

Daniel

asked Nov 6 at 10:18

0 votes

0 answers

26 views

How to exclude metadata from embedding?

I'm using LlamaIndex 0.14.7. I would like to embed document text without concatenating metadata, because I put a long text in metadata. Here's my code: table_vec_store: SimpleVectorStore = ...

Trams

asked Nov 6 at 9:31

0 votes

0 answers

60 views

LangChain RAG is not retrieving any document

This is my embedding code, which I run once only: embeddings = OpenAIEmbeddings(model="text-embedding-3-large") vector_store = MongoDBAtlasVectorSearch.from_connection_string( ...

Mingruifu Lin

asked Oct 29 at 17:00

1 vote

1 answer

153 views

Why does answer_relevancy return NaN when evaluating RAG with Ragas?

I’m trying to evaluate my Retrieval-Augmented Generation (RAG) pipeline using Ragas. Here’s a complete version of my code: """# RAG Evaluation""" from datasets import ...

Chandima

asked Sep 25 at 8:51

0 votes

1 answer

77 views

Chroma not accepting lists in persistentClient collection?

My objective is to do keyword filtering in Chroma. I have a field called keywords with a list of strings and I want to filter with it, but Chroma won't let me add lists as a field. I checked my Chroma ...

Elena López-Negrete Burón

asked Sep 23 at 10:37

1 vote

0 answers

54 views

Why does my LangChain RAG chatbot sometimes miss relevant chunks in semantic search?

I built a RAG chatbot using LangChain + ChromaDB + OpenAI embeddings. The pipeline works, but sometimes the chatbot doesn’t return the most relevant PDF content, even though it exists in the vector DB. ...

Naitik Mittal

asked Sep 21 at 15:20

1 vote

0 answers

75 views

RAG Chatbot does not answer paraphrased questions

I built a RAG chatbot in Python with LangChain, using FAISS for the vector store. The data is stored as JSON. The chatbot sometimes refuses to answer when a question is rephrased. Here are two ...

SoftwareEngineer

asked Sep 20 at 13:54

0 votes

0 answers

31 views

RAG Pipeline Memory Leak - Vector Embeddings Not Releasing After Context Switch in Memo AI

Question: I'm building a memory-augmented AI system using RAG with persistent vector storage, but facing memory leaks and context contamination between sessions. Problem: Vector embeddings aren't ...

TensorMind

asked Sep 18 at 8:20
