Replies: 1 comment
The TensorRT-LLM docs are for an old version, so you can't follow them. I'm trying to install tensorrt-llm 0.21.0 on Windows, and here is why I'm failing: NVIDIA has more or less given up on Windows.
I'm trying to build the wheel myself, but the most likely result is failure... I'll switch to Docker.
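For anyone else taking the Docker route, a minimal sketch of pulling and running NVIDIA's prebuilt TensorRT-LLM container (the image name and tag are my assumption here; check the current TensorRT-LLM docs for the exact one, and note that on Windows this requires Docker Desktop with WSL2 GPU support):
REM Pull NVIDIA's prebuilt TensorRT-LLM image (name/tag assumed; verify against the docs)
docker pull nvcr.io/nvidia/tensorrt-llm/release
REM Start an interactive container with GPU access
docker run --rm -it --gpus all nvcr.io/nvidia/tensorrt-llm/release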
How do I actually use TensorRT-LLM to run a model?
I have the latest update to text-generation-webui, running on Windows 11.
I converted my own model on my GPU, which is now stored as a ".engine" file.
This shows up in my Model list; I choose the "TensorRT-LLM" model loader and get the following errors:
Okay, I guess TensorRT isn't actually installed...
So I open the command line in the text-generation-webui environment by running "cmd_windows.bat".
I then follow the TensorRT-LLM installation instructions linked from text-generation-webui, which for the Windows installation are here: https://nvidia.github.io/TensorRT-LLM/installation/windows.html
Now it's a bit of a mess, because the text-generation-webui environment has Python 3.11 installed, but the TensorRT-LLM instructions say that you must have Python 3.10. I do also have 3.10 installed, but not in this environment.
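For reference, a quick way to confirm the mismatch is to check which interpreter and wheel tags pip will actually use inside that environment (run from the cmd_windows.bat prompt):
REM Show which Python the portable environment uses
python --version
REM List the wheel tags this pip accepts; a tensorrt_llm wheel must match one of them
pip debug --verbose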
I have used the full TensorRT package via a git clone of the NVIDIA project (not the LLM scripts used here) in my Windows system environment; that's how I built the LLM engine to begin with.
But I can't get this tensorrt_llm package to install for text-generation-webui.
In any case, inside of the "cmd_windows.bat" environment, running:
pip install tensorrt_llm==0.17.0.post1 --extra-index-url https://download.pytorch.org/whl/ --extra-index-url https://pypi.nvidia.com
always results in the same errors. I have tried different versions as well: tensorrt-llm==0.16.0 and tensorrt-llm==0.10.8.
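For what it's worth, here is a sketch of what installing into a separate Python 3.10 environment would look like, given the version requirement above (the env name trtllm310 is made up, conda is assumed to be available, and whether a Windows wheel actually exists for a given version is a separate question):
REM Create and activate a separate Python 3.10 environment (name is hypothetical)
conda create -n trtllm310 python=3.10
conda activate trtllm310
REM Same install command, now resolved against Python 3.10 wheel tags
pip install tensorrt_llm==0.17.0.post1 --extra-index-url https://download.pytorch.org/whl/ --extra-index-url https://pypi.nvidia.com
Note that this installs outside the webui environment, so the TensorRT-LLM loader inside text-generation-webui still wouldn't see it.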

I don't use Linux or GitHub very much, and although I can program decently, the readme saying that "[TensorRT-LLM] is supported via its own [Dockerfile]" isn't really that helpful to me. I see that the Dockerfile essentially just installs various libraries and packages. I did directly try the command from the Docker script for version 10.0, but it gives this error.
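In case it helps to spell out the Dockerfile route: the usual pattern is to build an image from that Dockerfile and then run it with GPU access (the Dockerfile path and image tag below are guesses; check the repo for the actual location):
REM Build an image from the repo's TensorRT-LLM Dockerfile (path is hypothetical)
docker build -t textgen-trtllm -f docker/TensorRT-LLM/Dockerfile .
REM Run it with GPU access and the webui's default port published
docker run --rm -it --gpus all -p 7860:7860 textgen-trtllm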
Help?
Am I doing something stupid?
Thanks!