Tracking issue for thread safety in llama.cpp: the global inference lock can be removed once the upstream issue is resolved. https://github.com/ggerganov/llama.cpp/issues/3960
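For reference, a minimal sketch of the workaround pattern: a single process-wide mutex serializing every call into llama.cpp until upstream thread safety lands. The names here (`run_inference`, `run_inference_unlocked`, `g_inference_mutex`) are illustrative assumptions, not the project's actual code, and the body of the unlocked call is a stand-in for the real llama.cpp invocation (e.g. decoding on a context).

```cpp
// Sketch only: global lock serializing all llama.cpp inference calls.
#include <iostream>
#include <mutex>
#include <string>

namespace {
// llama.cpp contexts must not be used concurrently today, so every
// inference call funnels through this single process-wide mutex.
std::mutex g_inference_mutex;
}  // namespace

// Stand-in for the real call into llama.cpp (e.g. decoding a batch on a
// context). Kept separate so the locking policy lives in one place.
std::string run_inference_unlocked(const std::string& prompt) {
    return "<output for: " + prompt + ">";
}

// Public entry point: holds the global lock for the duration of the call.
// Once https://github.com/ggerganov/llama.cpp/issues/3960 is resolved,
// this lock (and the wrapper) can be dropped in favor of per-context use.
std::string run_inference(const std::string& prompt) {
    std::lock_guard<std::mutex> lock(g_inference_mutex);
    return run_inference_unlocked(prompt);
}

int main() {
    // Callers on any thread go through run_inference(); the mutex ensures
    // only one inference is in flight at a time.
    std::cout << run_inference("hello") << "\n";
    return 0;
}
```

The trade-off is throughput: with the global lock, concurrent requests are serialized, which is why removing it is tied to the upstream thread-safety work.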