49 questions
4 votes · 2 answers · 752 views
No module named 'llama_models.cli.model' error while downloading Llama 3.1 8B
I'm trying to install the Llama 3.1 8B model by following the instructions in the llama-models GitHub README. When I run the command:
llama-model download --source meta --model-id CHOSEN_MODEL_ID
(...
0 votes · 0 answers · 55 views
Running Ollama on a local computer and prompting from a Jupyter notebook - does the model recall prior prompts as if it were the same chat?
I am running some tests using Ollama on my local computer with Llama 3.2, which consist of prompting a task against a document.
I read that after reaching the maximum context, I should restart the ...
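A point worth making for questions like this one: Ollama's chat API is stateless, so the model only "remembers" the messages you resend with each request. A minimal sketch of keeping the history yourself (`call_model` is a hypothetical stand-in for something like `ollama.chat(model=..., messages=...)`):

```python
# Minimal sketch: Ollama's chat endpoint keeps no state between calls, so
# "memory" comes only from resending the full message history every time.
# `call_model` is a hypothetical stand-in for e.g. ollama.chat(...).

def make_chat(call_model):
    history = []  # grows with every turn; resent in full on each call

    def ask(user_text):
        history.append({"role": "user", "content": user_text})
        reply = call_model(history)          # model sees the whole history
        history.append({"role": "assistant", "content": reply})
        return reply

    return ask

# Stub "model" that just reports how many messages it was given:
ask = make_chat(lambda msgs: f"seen {len(msgs)} messages")
print(ask("first prompt"))   # seen 1 messages
print(ask("second prompt"))  # seen 3 messages (history was resent)
```

Restarting the model after hitting the context limit is a separate concern; even without restarting, dropping or summarizing the oldest entries in `history` is the usual way to stay under the window.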
0 votes · 0 answers · 50 views
Custom NER to extract header, request and response from API document
I'm trying to extract API integration parameters like Authorization headers, query params, and request body fields from API documentation. This is essentially a custom NER task.
I’ve experimented with ...
0 votes · 1 answer · 149 views
LLM-Agent: Tool calling problem after conversion from HuggingFace to Ollama for llama stack
I am using llama stack (https://llama-stack.readthedocs.io/en/latest/) with Ollama as the model provider.
At first I used tool calling from models directly downloaded from Ollama. ...
0 votes · 0 answers · 99 views
How to implement context-aware tool routing with local models like Ollama?
I'm using a locally hosted model (llama3.2) with Ollama and trying to replicate functionality similar to LangChain's bind_tools (creating and running tools with the LLM) for tool calling.
This is my model service
...
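For readers landing here: bind_tools can be approximated by hand with a registry of callables plus a dispatcher over the tool calls the model emits. The tool-call shape below (`{"function": {"name": ..., "arguments": {...}}}`) mirrors what Ollama's chat API returns as I understand it, but treat that as an assumption to verify against your client version:

```python
# Hand-rolled substitute for bind_tools: register callables, then dispatch
# on the tool calls the model emits. The tool-call dict shape is assumed to
# mirror Ollama's chat response format.

TOOLS = {}

def tool(fn):
    """Register a function under its own name so the model can call it."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def get_weather(city: str) -> str:
    return f"sunny in {city}"  # stand-in for a real lookup

def run_tool_calls(tool_calls):
    results = []
    for call in tool_calls:
        name = call["function"]["name"]
        args = call["function"]["arguments"]
        if name not in TOOLS:
            results.append(f"unknown tool: {name}")
            continue
        results.append(TOOLS[name](**args))
    return results

fake_calls = [{"function": {"name": "get_weather",
                            "arguments": {"city": "Oslo"}}}]
print(run_tool_calls(fake_calls))  # ['sunny in Oslo']
```

The "context-aware routing" part then reduces to deciding which subset of `TOOLS` to advertise to the model on each request, based on the conversation so far.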
1 vote · 0 answers · 239 views
Multi MCP Tool Servers Issue with llama-3-3-70b-instruct
I'm following the code from these links:
https://github.com/jalr4ever/Tiny-OAI-MCP-Agent/blob/main/mcp_client.py
https://github.com/philschmid/mcp-openai-gemini-llama-example/blob/master/...
0 votes · 1 answer · 135 views
WASM LlamaEdge won't use GPU: fix the problem or change tools?
So I'm trying to toss together a little demo that is essentially: 1) generate some text live and save to a file (I've got this working), 2) have a local instance of an LLM running (Llama3 in this case)...
0 votes · 0 answers · 596 views
Passing the correct context to the model via the Ollama API
I am teaching myself LLM programming by developing a RAG application. I am running Llama 3.2 on my laptop using Ollama, with a mix of SQLite and LangChain.
I can pass a context to the LLM along ...
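Worth noting for this kind of RAG question: the Ollama API keeps no state between calls, so "passing context" usually just means inlining the retrieved chunks into the prompt (or a system message) on every request. A minimal sketch, with the prompt wording being my own illustrative choice:

```python
# Minimal RAG prompt assembly: inline retrieved chunks into the prompt on
# every request, since /api/generate itself is stateless. The template text
# here is illustrative, not canonical.

def build_rag_prompt(question, chunks):
    context = "\n\n".join(f"[{i+1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_rag_prompt(
    "When was the warehouse opened?",
    ["The warehouse opened in 1998.", "It was expanded in 2005."],
)
# then send as {"model": "llama3.2", "prompt": prompt} to /api/generate
```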
0 votes · 0 answers · 30 views
Encountering a problem while fine-tuning Llama 3.1 on a custom dataset with LoRA
I am learning to fine-tune Llama 3.1 on a custom dataset. I have converted my dataset to a Hugging Face dataset. Evaluating directly with the model gives an accuracy of 80%. Now when I am trying to fine ...
0 votes · 0 answers · 350 views
Repetition Issues in Llama Models (3:8B, 3:70B, 3.1, 3.2)
I'm extracting Inputs, Outputs, and Summaries from large legacy codebases (COBOL, RPG), but facing repetition issues, especially when generating bullet points. Summaries work fine, but sections like ...
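For repetition issues like this, Ollama exposes sampling options that are often more effective than prompt tweaks. `repeat_penalty` and `repeat_last_n` are real Ollama option names; the values below are illustrative starting points, not recommendations:

```python
# Sampling options that commonly tame repetition in Ollama-served Llama
# models. Values are illustrative starting points to tune, not a recipe.

options = {
    "repeat_penalty": 1.2,   # penalize tokens generated recently
    "repeat_last_n": 256,    # how far back the penalty window looks
    "temperature": 0.3,      # lower temperature also reduces rambling
}
# e.g. ollama.generate(model="llama3:8b", prompt=..., options=options)
```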
0 votes · 1 answer · 136 views
llama3.1 - Results from tool ignored
I am communicating with Ollama (llama3.1b) and have it respond with a tool call that I can resolve. However, I am struggling with the final call to Ollama that would resolve the original question. I ...
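The usual fix for "results from tool ignored" is message plumbing: after resolving the tool call, append the result as a role-"tool" message and send the whole conversation back for a second chat call. The field names below follow Ollama's chat format as I understand it; verify them against your client version:

```python
# Feed a resolved tool result back to the model: keep the assistant turn
# that contained the tool call, then append the result under role "tool".
# Message field names are assumptions based on Ollama's chat format.

def append_tool_result(messages, assistant_msg, tool_name, result):
    messages = messages + [assistant_msg]   # keep the tool-calling turn
    messages.append({"role": "tool", "tool_name": tool_name,
                     "content": str(result)})
    return messages

msgs = [{"role": "user", "content": "What is 2+2?"}]
assistant = {"role": "assistant", "content": "",
             "tool_calls": [{"function": {"name": "add",
                                          "arguments": {"a": 2, "b": 2}}}]}
msgs = append_tool_result(msgs, assistant, "add", 4)
# second call: ollama.chat(model="llama3.1", messages=msgs) -> final answer
```

If the second call omits either the assistant turn or the tool message, the model has no way to connect the result to the original question, which matches the symptom described.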
1 vote · 1 answer · 454 views
Unable to get llama3 to serve a JSON response on a local Ollama installation using Jupyter notebook
On a Windows 11 machine, I am trying to get a JSON response from the llama3 model on my local Ollama installation in a Jupyter notebook, but it does not work.
Steps I tried:
This below snippet works
...
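One detail that often resolves this class of question: Ollama supports constrained JSON output via the `format` field on the request body; asking for JSON in the prompt alone is frequently not enough. A sketch of the payload for `/api/generate` (the prompt text is illustrative):

```python
# Request JSON-constrained output from Ollama via the "format" field on
# /api/generate. Prompt text is illustrative; "format" and "stream" are
# real fields of the Ollama HTTP API.
import json

payload = {
    "model": "llama3",
    "prompt": "List three colors as JSON with key 'colors'.",
    "format": "json",   # constrains the response body to valid JSON
    "stream": False,    # return one complete response object
}
body = json.dumps(payload)
# requests.post("http://localhost:11434/api/generate", data=body)
# then json.loads(response.json()["response"]) should parse cleanly
```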
0 votes · 1 answer · 226 views
llama3 responding with only function calls?
I am trying to make Llama3 Instruct use function calls from tools. It does work, but now it is answering only with function calls! If I ask something like "who are you?" or "what is an Apple device?" it ...
1 vote · 0 answers · 3k views
How can I accurately count tokens for Llama3/DeepSeek r1 prompts when Groq API reports "Request too large"?
I'm integrating the Groq API in my Flask application to classify social media posts using a model based on DeepSeek r1 (e.g., deepseek-r1-distill-llama-70b). I build a prompt by combining multiple ...
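Without the model's exact tokenizer, client-side counting can only be an estimate; a rough rule of thumb for Llama-family tokenizers is about 4 characters per token (an approximation, not the real BPE). A hedged sketch of budgeting a prompt before sending it, with the limit and safety margin being illustrative values:

```python
# Rough client-side token budgeting before hitting the Groq API. The
# 4-chars-per-token ratio is a heuristic approximation of Llama-family
# tokenizers, and the 6000-token limit here is illustrative.

def estimate_tokens(text, chars_per_token=4):
    return max(1, len(text) // chars_per_token)

def fits_budget(prompt, limit=6000, safety=0.9):
    # leave headroom, since the heuristic undercounts for code and non-English
    return estimate_tokens(prompt) <= int(limit * safety)

print(estimate_tokens("a" * 8000))  # 2000
print(fits_budget("a" * 8000))      # True  (2000 <= 5400)
print(fits_budget("a" * 30000))     # False (7500 >  5400)
```

For exact counts, loading the model's own tokenizer (e.g. via `transformers.AutoTokenizer`) is the reliable route; the heuristic above is only for cheap pre-flight checks.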
0 votes · 0 answers · 143 views
How does the batch option work in the transformers pipeline?
I have a collection of news articles and I want to produce some new (unbiased) news articles using meta-llama/Meta-Llama-3-8B-Instruct. The articles are in a Hugging Face Dataset, and to feed the ...