I have tried everything; at this point I think I am doing something wrong, or I have discovered a very strange bug. I was thinking of posting on their GitHub, but I am not sure whether I am simply making a very stupid error.

In a fresh conda environment set up with Python 3.12, I used

```
export LLAMA_CUBLAS=1
```

Then I copied this:

```
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
```

It runs without complaint, creating a working llama-cpp-python install, but without CUDA support. I know that CUDA is working in WSL because nvidia-smi shows CUDA version 12.
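One thing worth ruling out: as far as I understand, nvidia-smi working only proves the Windows driver is passed through to WSL; compiling the CUDA backend also needs the CUDA *toolkit* (nvcc) installed inside the WSL distro itself. A minimal check (a sketch, assuming nvcc would be on PATH if installed):

```python
import shutil
import subprocess

# nvidia-smi working only shows the driver passthrough from Windows;
# pip needs the CUDA toolkit (nvcc) inside WSL to compile cuBLAS support.
nvcc = shutil.which("nvcc")
print("nvcc found at:", nvcc)
if nvcc:
    print(subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout)
```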

I have tried setting up multiple environments, I have tried removing and reinstalling, and I have tried backends other than CUDA that also don't work, so something seems to be off with the backend part, but I don't know what. My best guess is that I am doing something very basic wrong, like not setting the environment variable correctly.
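To check that guess, I can print what the build process would actually see (a minimal sketch; note that export only affects the current shell, so a different terminal or conda activation would not have the variables):

```python
import os

# If either of these prints None, the pip build never saw the flags,
# which would explain a CPU-only install.
print("LLAMA_CUBLAS:", os.environ.get("LLAMA_CUBLAS"))
print("CMAKE_ARGS:", os.environ.get("CMAKE_ARGS"))
```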

When I reinstalled I used this option:

```
pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir
```
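For completeness, here is the same reinstall driven from Python, passing the flags through the environment so pip cannot miss them; FORCE_CMAKE=1 is, as far as I know, the documented way to force llama-cpp-python to build from source (a sketch, not guaranteed to match every version):

```python
import os
import subprocess
import sys

# Rebuild from source with the CUDA flag visible to the build,
# and --verbose so the cmake output (and any CUDA detection) is shown.
env = dict(os.environ, CMAKE_ARGS="-DLLAMA_CUBLAS=on", FORCE_CMAKE="1")
subprocess.check_call(
    [sys.executable, "-m", "pip", "install", "llama-cpp-python",
     "--force-reinstall", "--upgrade", "--no-cache-dir", "--verbose"],
    env=env,
)
```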

Also, it simply does not create the llama_cpp_cuda folder, so the fix described in "llama-cpp-python not using NVIDIA GPU CUDA" on Stack Overflow does not seem to apply here.
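To see what pip actually installed, I can inspect the package directory and, on versions that expose it, ask the build directly whether it supports GPU offload (a sketch; llama_supports_gpu_offload may not exist in older releases, hence the hasattr guard):

```python
import os

import llama_cpp

# Where the package lives and which shared libraries were built/bundled.
pkg_dir = os.path.dirname(llama_cpp.__file__)
print(pkg_dir)
print([f for f in os.listdir(pkg_dir) if f.endswith(".so")])

# Newer versions expose this binding; it returns False on a CPU-only build.
if hasattr(llama_cpp, "llama_supports_gpu_offload"):
    print("GPU offload:", llama_cpp.llama_supports_gpu_offload())
```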

Hardware:

- Ryzen 5800H
- RTX 3060
- 16 GB of DDR4 RAM
- WSL2 Ubuntu

To test it I run the following code and watch the GPU memory usage, which stays at about 0:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="/mnt/d/Maschine learning/llm models/llama_2_7b/llama27bchat.Q4_K_M.gguf",
    n_gpu_layers=20,
    n_threads=6,
    n_ctx=3584,
    n_batch=521,
    verbose=True,
)

output = llm(
    "Q: Name the planets in the solar system? A: ",
    max_tokens=32,
    stop=["Q:", "\n"],
    echo=True,
)
```
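Instead of eyeballing nvidia-smi in another terminal, the VRAM usage can be polled right after the model loads; if the 20 layers were actually offloaded, this should be well above zero (a sketch, assuming nvidia-smi is on PATH inside WSL). With verbose=True the load log should also mention the CUDA device and report BLAS = 1, if I am reading llama.cpp's output right.

```python
import subprocess

# Query used VRAM directly; a CUDA build with n_gpu_layers=20 should
# have pushed several GB of the 7B model onto the RTX 3060 by now.
print(subprocess.run(
    ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader"],
    capture_output=True, text=True,
).stdout.strip())
```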

So any help or idea about what could be going on here would be greatly appreciated, because I am out of ideas. Thank you very much :)