This repository was archived by the owner on Jun 5, 2025. It is now read-only.

Add FIM functionality for VLLM provider #132

Merged
merged 1 commit into main on Nov 29, 2024

Conversation

Contributor

@aponcedeleonch commented Nov 29, 2024

This enables FIM completion for VLLM. Tested with Continue:

```json
"tabAutocompleteModel": {
       "title": "CodeGate - Stacklok Hosted",
       "provider": "openai",
       "model": "Qwen/Qwen2.5-Coder-14B",
       "apiKey": "$token",
       "apiBase": "http://localhost:8989/vllm"
},
```

Comment on lines +19 to +21
```python
completion_handler = LiteLLmShim(
    stream_generator=sse_stream_generator, fim_completion_func=atext_completion
)
```
Contributor Author

This is to tell litellm to use atext_completion instead of acompletion when the request is FIM. Continue gives us a prompt instead of messages, but atext_completion is able to handle it.
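
For context, a minimal sketch of how a shim like this might dispatch between the two litellm entry points; the class and method names below are illustrative, not CodeGate's actual implementation:

```python
import litellm


class LiteLLmShimSketch:
    """Illustrative only: route FIM requests to atext_completion, chat to acompletion."""

    def __init__(self, stream_generator,
                 completion_func=litellm.acompletion,
                 fim_completion_func=litellm.atext_completion):
        self._stream_generator = stream_generator
        self._completion_func = completion_func
        self._fim_completion_func = fim_completion_func

    async def execute(self, request: dict, is_fim_request: bool):
        if is_fim_request:
            # Continue sends FIM requests with a plain "prompt" field,
            # which atext_completion accepts directly.
            return await self._fim_completion_func(**request)
        # Chat requests carry a "messages" list with roles instead.
        return await self._completion_func(**request)
```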

Contributor Author

A downside of using atext_completion is that it returns a TextCompletionResponse instead of a ModelResponse like our regular acompletion. I checked and the parameters of both objects are almost identical; hopefully it doesn't cause too many problems.

Would that have an impact on the pipeline processing, @jhrozek?

Contributor

I will do some testing. The pipeline does expect a ModelResponse; maybe we could normalize TextCompletionResponse into ModelResponse.

Contributor

OK, I had another look and the biggest difference between atext_completion and acompletion in liteLLM is that acompletion receives a conversation with multiple prompts and roles, while atext_completion just receives a single prompt.

If this is OK for you then let's go ahead. I'm not sure if FIM will use any system prompts or such.
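
Roughly, the two call shapes differ like this (a sketch only; the model string, api_base, api_key, and FIM tokens are placeholders, with the FIM prompt assuming Qwen2.5-Coder's fill-in-the-middle tokens):

```python
import asyncio
import litellm


async def main():
    # Chat-style: acompletion receives a conversation with roles.
    chat = await litellm.acompletion(
        model="openai/Qwen/Qwen2.5-Coder-14B",   # placeholder provider/model string
        api_base="http://localhost:8000/v1",     # placeholder vLLM endpoint
        api_key="dummy",                         # placeholder key
        messages=[{"role": "user", "content": "Write a hello-world in Python."}],
    )
    print(chat.choices[0].message.content)

    # FIM-style: atext_completion just receives a prompt string, which is what
    # Continue sends for tab autocompletion.
    fim = await litellm.atext_completion(
        model="openai/Qwen/Qwen2.5-Coder-14B",
        api_base="http://localhost:8000/v1",
        api_key="dummy",
        prompt="<|fim_prefix|>def add(a, b):\n    return <|fim_suffix|>\n<|fim_middle|>",
    )
    print(fim.choices[0].text)


asyncio.run(main())
```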

Contributor Author

I will merge and attempt to do the normalization of TextCompletionResponse to ModelResponse in a separate PR. I agree it's better if we try to keep things as normalized as possible.
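
For reference, one possible shape of that normalization, sketched against the OpenAI-style completion and chat schemas; this is not the follow-up PR's actual code:

```python
from litellm import ModelResponse


def text_completion_to_model_response(text_resp) -> ModelResponse:
    """Map a TextCompletionResponse-like object onto the chat-style ModelResponse
    the pipeline expects. Sketch only: field names follow the OpenAI schemas."""
    return ModelResponse(
        id=text_resp.id,
        created=text_resp.created,
        model=text_resp.model,
        object="chat.completion",
        choices=[
            {
                "index": choice.index,
                "finish_reason": choice.finish_reason,
                # The completed text becomes the assistant message content.
                "message": {"role": "assistant", "content": choice.text},
            }
            for choice in text_resp.choices
        ],
        usage=getattr(text_resp, "usage", None),
    )
```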

@aponcedeleonch aponcedeleonch merged commit de91231 into main Nov 29, 2024
2 checks passed