1 vote
1 answer
70 views

Description: I am creating a local LLM app, which works OK, but every response stops at about 170 words, which equals 256 tokens. The models I have tried so far are Meta-Llama-3.1-8B-Instruct-Q8_0.gguf and ...
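If the app sits on llama-cpp-python (a guess from the GGUF file), a ~170-word cutoff is typically the default max_tokens cap rather than a model limit. A minimal sketch, with a hypothetical prompt and the model path taken from the excerpt:

    from llama_cpp import Llama

    # n_ctx sets the context window; the library default is small.
    llm = Llama(model_path="Meta-Llama-3.1-8B-Instruct-Q8_0.gguf", n_ctx=4096)

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Explain KV caching."}],  # hypothetical prompt
        # max_tokens caps the generated tokens; many front-ends default to 256,
        # which matches a ~170-word cutoff. None or a value <= 0 means
        # "generate until the context is full".
        max_tokens=1024,
    )
    print(out["choices"][0]["message"]["content"])
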
4 votes
2 answers
738 views

I'm trying to install the LLaMA 3.1 8B model by following the instructions in the llamamodel GitHub README. When I run the command: llama-model download --source meta --model-id CHOSEN_MODEL_ID (...
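The CLI error above is cut off, so the exact failure is unclear; as a sketch of an alternative route to the same weights, assuming the license for the gated Hugging Face repo has been accepted and a token is configured:

    from huggingface_hub import snapshot_download

    # Downloads the whole model repo into the local HF cache and returns its path.
    # Requires an HF token (huggingface-cli login or HF_TOKEN) with access to the
    # gated meta-llama repo.
    path = snapshot_download(repo_id="meta-llama/Llama-3.1-8B-Instruct")
    print(path)
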
0 votes
0 answers
112 views

When the program starts to initialize the pipeline object, an unexpected error is thrown: [rank0]: Traceback (most recent call last): [rank0]: File "/root/anaconda3/envs/polar/lib/python3.12/site-...
0 votes
0 answers
55 views

I am doing some tests using Ollama on a local computer with Llama 3.2; the tests consist of prompting a task against a document. I read that after having reached the maximum context, I should restart the ...
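For reference, assuming the official ollama Python client: each request already sends the full message history, so "resetting" the context just means starting a fresh messages list, and the window itself can be raised per request via num_ctx. A sketch:

    import ollama

    resp = ollama.chat(
        model="llama3.2",
        messages=[{"role": "user", "content": "Summarize this document: ..."}],  # hypothetical prompt
        options={"num_ctx": 8192},  # per-request context window; Ollama's default is much smaller
    )
    print(resp["message"]["content"])
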
0 votes
0 answers
51 views

I am using the llama-8b-llava model. I have made some modifications to the model, which are non-structural and do not introduce any parameters. During the model loading process, I used the torch....
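Since the modifications are non-structural and add no parameters, the checkpoint keys should still line up with the model. A minimal sketch of a load that surfaces any mismatch, assuming the weights were saved as a plain state dict (the path and builder below are hypothetical):

    import torch

    model = build_modified_llava()  # hypothetical: however the modified llama-8b-llava is constructed

    # Load onto CPU first to avoid GPU memory spikes during loading.
    state_dict = torch.load("llava_weights.pt", map_location="cpu")  # hypothetical path

    # strict=False reports key mismatches instead of raising, which confirms
    # whether the modifications really introduced no new parameters.
    missing, unexpected = model.load_state_dict(state_dict, strict=False)
    print("missing keys:", missing)
    print("unexpected keys:", unexpected)
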
1 vote
1 answer
166 views

I have the following imports for a Python file that's meant to become a multi-LLM agent. I wanted to use llama_index, and I found a nice video from Tech with Tim which explains everything very well. I ...
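Import errors here are often a version issue: tutorials recorded before llama-index 0.10 use flat llama_index imports, while current releases moved everything under llama_index.core with integrations as separate pip packages. A sketch of the current style, with a hypothetical data folder and local LLM choice:

    from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
    from llama_index.llms.ollama import Ollama  # pip install llama-index-llms-ollama
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding  # pip install llama-index-embeddings-huggingface

    Settings.llm = Ollama(model="llama3.2")  # hypothetical local LLM
    Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")  # local embeddings

    docs = SimpleDirectoryReader("data").load_data()  # "data" is a hypothetical folder
    index = VectorStoreIndex.from_documents(docs)
    print(index.as_query_engine().query("What do these documents cover?"))
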
1 vote
0 answers
120 views

I’ve been working on fine-tuning LLaMA 2–7B using QLoRA with bitsandbytes 4-bit quantization and ran into a weird issue. I did adaptive pretraining on Arabic data with a custom tokenizer (vocab size ~...
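A frequent failure mode with a custom tokenizer is the embedding matrix staying at the base 32k vocab. A minimal sketch of the 4-bit load plus resize, assuming Hugging Face transformers and placeholder paths:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    bnb = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    tok = AutoTokenizer.from_pretrained("path/to/custom-arabic-tokenizer")  # hypothetical
    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-2-7b-hf", quantization_config=bnb, device_map="auto"
    )

    # The base checkpoint ships with a 32k vocab; after swapping in a larger
    # tokenizer, embed_tokens and lm_head must be resized to the new vocab size,
    # and under LoRA the resized modules typically also belong in modules_to_save.
    model.resize_token_embeddings(len(tok))
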
1 vote
0 answers
221 views

I am trying to set up local, high-speed NLP but am failing to install the arm64 version of llama-cpp-python. Even when I run CMAKE_ARGS="-DLLAMA_METAL=on -DLLAMA_METAL_EMBED_LIBRARY=on" \ ...
2 votes
1 answer
184 views

I'm studying the llama-cookbook repo, in particular their fine-tuning example. This example uses the LlamaForCausalLM model and samsum_dataset (input: dialog, output: summary). Now, looking at how they ...
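For orientation, the usual supervised fine-tuning pattern for a causal LM (which the cookbook's samsum example follows in spirit, though its exact prompt template may differ) concatenates prompt and target and masks the prompt tokens with -100 so only the summary is scored by the loss:

    IGNORE_INDEX = -100  # labels with this value are skipped by the cross-entropy loss

    def build_example(tokenizer, dialog: str, summary: str):
        prompt = f"Summarize this dialog:\n{dialog}\n---\nSummary:\n"  # illustrative template
        prompt_ids = tokenizer.encode(prompt, add_special_tokens=False)
        answer_ids = tokenizer.encode(summary + tokenizer.eos_token, add_special_tokens=False)

        input_ids = prompt_ids + answer_ids
        # Mask the prompt: the model is only penalized on the summary tokens.
        labels = [IGNORE_INDEX] * len(prompt_ids) + answer_ids
        return {"input_ids": input_ids, "labels": labels}
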
0 votes
0 answers
61 views

I wanted to make a web app that uses llama-index to answer queries over specific documents using RAG. I have set up the Llama3.2-1B-Instruct LLM locally and am using it to create indexes of the ...
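One practical note for the web-app case: build and persist the index once, then have the server reload it at startup instead of re-embedding the documents on every request. A sketch, assuming a recent llama-index, a hypothetical storage directory, and Settings.llm / Settings.embed_model configured as in the app:

    from llama_index.core import StorageContext, load_index_from_storage

    # One-time build step, run elsewhere:
    #   index.storage_context.persist(persist_dir="./storage")

    storage = StorageContext.from_defaults(persist_dir="./storage")  # hypothetical dir
    index = load_index_from_storage(storage)
    query_engine = index.as_query_engine()
    print(query_engine.query("What do the documents say about X?"))  # hypothetical query
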
0 votes
0 answers
116 views

I use the following command to compile an executable file for Android: cmake \ -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \ -DANDROID_ABI=arm64-v8a \ -...
1 vote
0 answers
173 views

I'm working with a locally hosted Hugging Face transformers model (mistral-7b, llama2-13b, etc.), using the pipeline interface on a GPU server (A100). Sometimes inference takes much longer than ...
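transformers does ship a time-based stopping criterion that bounds a single generate() call; a sketch of using it through the pipeline interface, with an example model id:

    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model="mistralai/Mistral-7B-Instruct-v0.2",  # example model id
        device_map="auto",
    )

    # max_time (seconds) is a standard generate() kwarg: generation stops once
    # the budget is spent, so a single call cannot run unbounded.
    out = pipe("Explain the KV cache briefly.", max_new_tokens=256, max_time=30.0)
    print(out[0]["generated_text"])
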
2 votes
1 answer
91 views

I have a long chunk of text that I need to process using a transformer. I would then like to have users ask different questions about it (all questions are independent; they don't relate to each other)...
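Assuming a Hugging Face causal LM and a recent transformers version, the shared text can be prefilled once and its KV cache deep-copied per question, so each question only pays for its own tokens. A sketch:

    import copy
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache

    model_id = "meta-llama/Llama-3.2-1B-Instruct"  # example model
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )

    long_text = open("document.txt").read()  # the shared long context (hypothetical file)
    questions = ["What is the main claim?"]  # independent user questions

    # Prefill the shared document once and keep its KV cache.
    ctx_ids = tok(long_text, return_tensors="pt").input_ids.to(model.device)
    with torch.no_grad():
        cache = model(ctx_ids, past_key_values=DynamicCache(), use_cache=True).past_key_values

    for q in questions:
        q_ids = tok(q, return_tensors="pt").input_ids.to(model.device)
        full = torch.cat([ctx_ids, q_ids], dim=-1)
        # Deep-copy so one question's generation doesn't mutate the shared prefix cache.
        out = model.generate(full, past_key_values=copy.deepcopy(cache), max_new_tokens=128)
        print(tok.decode(out[0, full.shape[-1]:], skip_special_tokens=True))
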
1 vote
1 answer
244 views

I am experimenting with Llama-3.2-1B-Instruct for learning purposes. When I try to implement a simple rewrite task with Hugging Face transformers, I get a weird result when the model does not ...
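With instruct checkpoints, "weird" output on simple tasks is very often raw text being fed in without the chat template. A sketch, assuming transformers and an invented rewrite task:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "meta-llama/Llama-3.2-1B-Instruct"
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    messages = [
        {"role": "system", "content": "Rewrite the user's text in formal English."},  # hypothetical task
        {"role": "user", "content": "gonna be late, traffic is nuts"},
    ]
    # apply_chat_template adds the special tokens the instruct model was trained on;
    # add_generation_prompt=True ends the prompt where the assistant should answer.
    ids = tok.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(ids, max_new_tokens=64)
    print(tok.decode(out[0, ids.shape[-1]:], skip_special_tokens=True))
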
1 vote
1 answer
88 views

My goal is to create a chatbot specialized in answering questions related to diabetes. I am new to fine-tuning and have a couple of questions before I begin. My question is about the dataset format and ...
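On the format question: a widely used layout is one conversation per JSONL line using the messages schema, which trainers such as TRL's SFTTrainer can pair with the model's chat template. An illustrative record (the content is invented for the example):

    import json

    example = {
        "messages": [
            {"role": "system", "content": "You answer questions about diabetes care."},
            {"role": "user", "content": "What is a normal fasting blood glucose range?"},
            {"role": "assistant", "content": "For most adults without diabetes, roughly 70-99 mg/dL; confirm personal targets with a clinician."},
        ]
    }
    # One training example per line (JSONL).
    with open("train.jsonl", "a") as f:
        f.write(json.dumps(example) + "\n")
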
