Improve DLL/shared library stability by replacing static initializers #16717

gianmichelebertoni started this conversation in Ideas

Hi llama.cpp team,

I'm writing to propose a couple of small changes to improve the library's stability when it's loaded as a DLL/shared library, particularly from non-C++ environments like the JVM (via JNA/JNI) or perhaps Python (via ctypes).

The Problem: Unreliable Static Initialization in Foreign Hosts

When a C++ DLL is loaded into a "foreign" process (like java.exe), the C++ runtime's automatic static-initialization mechanism can be unreliable. This is a known issue in complex plugin architectures.

The result is that global static objects can be left in an uninitialized state. For llama.cpp, this can lead to crashes when, for example, a static std::mutex is used before its constructor has been called. I have verified this behavior by loading the library from a JVM environment, which results in a memory access violation.
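To make the failure window concrete, here's a minimal host-side sketch of the dynamic-load path a JVM or Python host takes under the hood. llama_backend_init is from the public llama.cpp C API; the library file names are assumptions that depend on your build:

    #include <cstdio>
    #ifdef _WIN32
    #include <windows.h>
    #else
    #include <dlfcn.h>
    #endif

    int main() {
    #ifdef _WIN32
        // Static initializers in the DLL run inside LoadLibrary, under the loader lock.
        HMODULE lib = LoadLibraryA("llama.dll");
        if (!lib) { std::fprintf(stderr, "load failed\n"); return 1; }
        auto backend_init = (void (*)(void)) GetProcAddress(lib, "llama_backend_init");
    #else
        // On POSIX the equivalent happens inside dlopen.
        void * lib = dlopen("libllama.so", RTLD_NOW);
        if (!lib) { std::fprintf(stderr, "%s\n", dlerror()); return 1; }
        auto backend_init = (void (*)(void)) dlsym(lib, "llama_backend_init");
    #endif
        // With lazy initialization, globals would instead be constructed here,
        // on the first API call, in a well-defined context.
        if (backend_init) backend_init();
        return 0;
    }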

Proposed Solutions

I've identified a couple of areas where the problematic implicit static initialization can be replaced with more robust, explicit patterns that work reliably in any environment.

Case 1: Lazy, Thread-Safe Initialization with std::call_once

Location: ggml/src/ggml-cuda/ggml-cuda.cu lines 3848-3853
Current Code:

    static bool initialized = false;
    {
        static std::mutex mutex;
        std::lock_guard<std::mutex> lock(mutex);
        if (!initialized) {
            // rest of the code
            initialized = true;
        }
    }

Proposed Change:

    {
        static std::once_flag init_flag;
        std::call_once(init_flag, []() {
            // same code
        });
    }

Rationale: This pattern moves initialization from the unreliable library load time to a thread-safe, on-demand call the first time the resource is needed. It completely bypasses the static initialization problem.
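For reference, here's a self-contained sketch of that pattern; register_devices is a hypothetical stand-in for the elided initialization body:

    #include <mutex>

    static void register_devices() {
        // ... one-time device enumeration / registration ...
    }

    void ensure_initialized() {
        // std::once_flag has a constexpr constructor, so it is constant-initialized
        // (zero-initialized) and needs no dynamic initialization at load time.
        static std::once_flag init_flag;
        std::call_once(init_flag, register_devices);  // runs exactly once, thread-safe
    }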

Case 2: Safely Initializing a Global Lock

A static std::mutex is a common point of failure. A more robust approach is to use a function-local static, which is guaranteed to be initialized in a thread-safe manner since C++11 (the "Construct on First Use Idiom").
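In its most generic form the idiom looks like the sketch below; the pointer is intentionally leaked so the lock also survives destruction-order problems at process exit:

    #include <mutex>

    std::mutex & global_lock() {
        // Constructed on the first call; since C++11 this is guaranteed thread-safe.
        // The leak is deliberate: the mutex stays valid for any late callers at exit.
        static std::mutex * m = new std::mutex();
        return *m;
    }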

Location: ggml/src/ggml-threading.cpp
Current Code:

std::mutex ggml_critical_section_mutex;

void ggml_critical_section_start() {
    ggml_critical_section_mutex.lock();
}

void ggml_critical_section_end(void) {
    ggml_critical_section_mutex.unlock();
}

Proposed Change:

#ifndef _WIN32
namespace {
    std::mutex * get_critical_section_mutex() {
        static std::mutex * mtx = new std::mutex();
        return mtx;
    }
} // namespace

void ggml_critical_section_start() {
    get_critical_section_mutex()->lock();
}

void ggml_critical_section_end() {
    get_critical_section_mutex()->unlock();
}
#else
#include <windows.h>

namespace {
    struct CriticalSectionWrapper {
        CRITICAL_SECTION cs;
        CriticalSectionWrapper()  { InitializeCriticalSection(&cs); }
        ~CriticalSectionWrapper() { DeleteCriticalSection(&cs); }
    };

    CriticalSectionWrapper * get_cs_wrapper() {
        static CriticalSectionWrapper * wrapper = new CriticalSectionWrapper();
        return wrapper;
    }
} // namespace

void ggml_critical_section_start() {
    EnterCriticalSection(&get_cs_wrapper()->cs);
}

void ggml_critical_section_end() {
    LeaveCriticalSection(&get_cs_wrapper()->cs);
}
#endif

Rationale: This modern C++ idiom ensures the mutex is constructed correctly, exactly once, the first time it's needed. It's clean, efficient, and completely avoids the global static initialization issue. For older compilers, std::call_once could also be used to initialize a static pointer to the mutex (see the sketch below). The Windows-specific path works around a known bug in the CRTs bundled with, for example, recent JVMs; the fix shipped with VS2022 17.10.
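A sketch of that fallback, which relies only on std::once_flag being constant-initialized (its constructor is constexpr, so nothing has to run at load time):

    #include <mutex>

    static std::mutex *   g_critical_mutex = nullptr;
    static std::once_flag g_critical_once;  // constant-initialized, no constructor needed

    static std::mutex & critical_section_mutex() {
        std::call_once(g_critical_once, [] { g_critical_mutex = new std::mutex(); });
        return *g_critical_mutex;
    }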

DeepWiki problem analysis:

You're experiencing crashes due to the static initialization order fiasco (SIOF) across DLL boundaries, specifically with static std::mutex objects in ggml-threading.cpp and ggml-cuda.cu.
Problem Analysis

The crashes occur at these locations:

ggml-threading.cpp: the global ggml_critical_section_mutex (ggml-threading.cpp:4) is a static mutex that may not be initialized by the time it is accessed across DLL boundaries (ggml-threading.cpp:6-8).

ggml-cuda.cu: multiple static mutexes are used in backend registration functions:

- ggml_cuda_lock (ggml-cuda.cu:526)
- static mutex in ggml_backend_cuda_buffer_type() (ggml-cuda.cu:734-735)
- static mutex in ggml_backend_cuda_split_buffer_type() (ggml-cuda.cu:1051-1052)
- static mutex in ggml_backend_cuda_reg() (ggml-cuda.cu:3851-3852)

Similar patterns exist in other backends, such as CANN (ggml-cann.cpp:2972-2973), SYCL (ggml-sycl.cpp:4575-4576), and OpenCL (ggml-opencl.cpp:4022-4025).

Thanks for considering these suggestions. I believe they would make llama.cpp even more robust and versatile.

Best regards,

GM


Replies: 1 comment 1 reply


The proposed changes look good to me.


Thank you for considering my contribution. My suggestions are based on fairly limited use of llama.cpp's features, but they solve the crashes I've encountered so far. I don't rule out that there are other places, in the complex environment my use case represents, where I'll need to intervene as I progress. If so, I'll report them, along with the solutions I find, as I've done here.

