-
Notifications
You must be signed in to change notification settings - Fork 1.2k
feat: Add /v1/embeddings endpoint to batches API #3384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: Add /v1/embeddings endpoint to batches API #3384
Conversation
cc: @mattf can you take a look at this! Thanks! |
4037309
to
0a5f8ab
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking good. two requests -
- remove the special case
- add a list input to the e2e test
else: # /v1/embeddings | ||
required_params = [ | ||
("model", str, "a string"), | ||
("input", str, "a string or array of strings"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
str
-> (str, list)
@@ -449,6 +454,18 @@ async def _validate_input(self, batch: BatchObject) -> tuple[list[BatchError], l | |||
) | |||
) | |||
valid = False | |||
elif param == "input" and batch.endpoint == "/v1/embeddings": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can remove this and use (str, list)
above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue with (str, list) is, it causes a mypy type error:
llama_stack/providers/inline/batches/reference/batches.py:443: error: List item 1 has incompatible type "tuple[str, tuple[type[str], type[list[_T]]], str]"; expected "tuple[str, ABCMeta, str]" [list-item]
mypy expects a single type in the tuple, not a tuple of types. Which is why I've implemented it using str as the base type with a special case for validation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mypy infers the type for required_params
the first time it appears -
required_params = [
("model", str, "a string"),
("prompt", str, "a string"), # TODO: allow prompt to be a list of strings??
]
effectively making it required_params: list[tuple[str, type, str]]
so the later required_params = [(..., (x, y), ...)]
doesn't match the inferred type.
you can fix this by providing the type so mypy doesn't infer it, something like -
required_params: tuple[str, type | tuple, str] = [
("model", str, "a string"),
("prompt", str, "a string"), # TODO: allow prompt to be a list of strings??
]
This PR extends the Llama Stack Batches API to support the /v1/embeddings endpoint, enabling efficient batch processing of embedding requests alongside the existing /v1/chat/completions and /v1/completions support. Signed-off-by: Varsha Prasad Narsing <[email protected]>
0a5f8ab
to
d156008
Compare
What does this PR do?
This PR extends the Llama Stack Batches API to support the /v1/embeddings endpoint, enabling efficient batch processing of embedding requests alongside the existing /v1/chat/completions and /v1/completions support.
Closes: #3145
Test Plan