Multimodal Tutorial

Getting started

1. Find a multimodal model

GGUF models with vision capabilities are uploaded to Hugging Face alongside an mmproj file.

For instance, unsloth/gemma-3-4b-it-GGUF has this:

[Screenshot: the repository file list, including the GGUF quantizations and an mmproj-F16.gguf file]

2. Download the model to user_data/models

As an example, download

https://huggingface.co/unsloth/gemma-3-4b-it-GGUF/resolve/main/gemma-3-4b-it-Q4_K_S.gguf?download=true

to your text-generation-webui/user_data/models folder.
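
If you prefer to script the download, a minimal Python sketch using the huggingface_hub library could look like this (it assumes you run it from the directory that contains text-generation-webui):

```python
from huggingface_hub import hf_hub_download

# Download the quantized GGUF model into the web UI's models folder
hf_hub_download(
    repo_id="unsloth/gemma-3-4b-it-GGUF",
    filename="gemma-3-4b-it-Q4_K_S.gguf",
    local_dir="text-generation-webui/user_data/models",
)
```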

3. Download the associated mmproj file to user_data/mmproj

Then download

https://huggingface.co/unsloth/gemma-3-4b-it-GGUF/resolve/main/mmproj-F16.gguf?download=true

to your text-generation-webui/user_data/mmproj folder. Rename it to mmproj-gemma-3-4b-it-F16.gguf so it is easy to identify later.
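
The mmproj file can be fetched and renamed the same way; here is a sketch under the same assumptions as above:

```python
from pathlib import Path
from huggingface_hub import hf_hub_download

# Download the mmproj file into the web UI's mmproj folder
path = Path(hf_hub_download(
    repo_id="unsloth/gemma-3-4b-it-GGUF",
    filename="mmproj-F16.gguf",
    local_dir="text-generation-webui/user_data/mmproj",
))

# Give the file a recognizable name
path.rename(path.with_name("mmproj-gemma-3-4b-it-F16.gguf"))
```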

4. Load the model

  1. Launch the web UI
  2. Navigate to the Model tab
  3. Select the GGUF model in the Model dropdown:
     [Screenshot: the Model dropdown]
  4. Select the mmproj file in the Multimodal (vision) menu:
     [Screenshot: the Multimodal (vision) menu]
  5. Click "Load"

5. Send a message with an image

Select your image by clicking on the 📎 icon and send your message:

[Screenshot: a chat message with an image attached]

The model will reply with a description of the image contents:

[Screenshot: the model's reply describing the image]

Multimodal with ExLlamaV3

Multimodal also works with the ExLlamaV3 loader (the non-HF one).

No additional files are necessary: just load a multimodal EXL3 model and send an image.

Examples of models that you can use:

Multimodal API examples

On the page below, you can find some ready-to-use examples:

Multimodal/vision (llama.cpp and ExLlamaV3)
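
As a rough idea of what those examples look like, here is a minimal Python sketch that sends an image through the OpenAI-compatible API. It assumes the web UI was started with the --api flag, that the server is listening on the default port 5000, and that example.jpg is a placeholder path to a local image:

```python
import base64
import requests

# Assumes the web UI was started with the --api flag
# (OpenAI-compatible server on http://127.0.0.1:5000 by default)
URL = "http://127.0.0.1:5000/v1/chat/completions"

# Encode a local image as base64 so it can be embedded in the request
with open("example.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        }
    ],
    "max_tokens": 300,
}

response = requests.post(URL, json=payload)
print(response.json()["choices"][0]["message"]["content"])
```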
