Commit c09e7f3

Author: zoey
feat: server-side sampling capability (#173)
This pull request introduces significant enhancements to the Hermes server and its client interactions, focusing on adding sampling request capabilities, improving error handling, and extending interactivity in the Mix commands. Below is a categorized summary of the most important changes:

### Sampling Request Support

* Added a new callback `handle_sampling_response/3` to process responses from sampling requests, including detailed documentation and examples (`lib/hermes/server.ex`).
* Implemented `send_sampling_request/3` for asynchronously sending sampling requests to clients, supporting options like `system_prompt` and `max_tokens` (`lib/hermes/server.ex`).
* Introduced helper functions in `Hermes.Server.Base` to manage sampling requests, including timeout handling, validation of client capabilities, and response/error processing (`lib/hermes/server/base.ex`). [[1]](diffhunk://#diff-1146a134b004c3f80ca5063ac01c480f7a3364c39d4ce152f102e24db0f18673R867-R1026) [[2]](diffhunk://#diff-1146a134b004c3f80ca5063ac01c480f7a3364c39d4ce152f102e24db0f18673R229-R248) [[3]](diffhunk://#diff-1146a134b004c3f80ca5063ac01c480f7a3364c39d4ce152f102e24db0f18673R317-R325)

### Server State and Error Handling

* Updated the server state structure to include `server_requests` for tracking active sampling requests (`lib/hermes/server/base.ex`). [[1]](diffhunk://#diff-1146a134b004c3f80ca5063ac01c480f7a3364c39d4ce152f102e24db0f18673L34-R43) [[2]](diffhunk://#diff-1146a134b004c3f80ca5063ac01c480f7a3364c39d4ce152f102e24db0f18673L124-R134)
* Added robust error handling for invalid requests and unexpected responses, improving server resilience (`lib/hermes/server/base.ex`). [[1]](diffhunk://#diff-1146a134b004c3f80ca5063ac01c480f7a3364c39d4ce152f102e24db0f18673R397-R405) [[2]](diffhunk://#diff-1146a134b004c3f80ca5063ac01c480f7a3364c39d4ce152f102e24db0f18673R229-R248)

### Mix Command Enhancements

* Enhanced `call_tool` and `get_prompt` commands to accept arguments in JSON or file paths, providing greater flexibility (`lib/mix/interactive/commands.ex`). [[1]](diffhunk://#diff-f3574853ac8b62e805132df55e7d657b6617d3faa4443a69f8aadf584e4667faL96-R128) [[2]](diffhunk://#diff-f3574853ac8b62e805132df55e7d657b6617d3faa4443a69f8aadf584e4667faL146-R190)
* Introduced support for custom timeouts in all server-interacting commands, improving usability for long-running operations (`lib/mix/interactive/commands.ex`). [[1]](diffhunk://#diff-f3574853ac8b62e805132df55e7d657b6617d3faa4443a69f8aadf584e4667faR74-R90) [[2]](diffhunk://#diff-f3574853ac8b62e805132df55e7d657b6617d3faa4443a69f8aadf584e4667faR143-R145)

### Workflow Adjustment

* Simplified the test execution command in the CI workflow by removing the `--trace` flag, potentially improving test performance (`.github/workflows/ci.yml`).

### Miscellaneous Improvements

* Added a new alias `Hermes.MCP.ID` in `Hermes.Server.Base` for generating request IDs, streamlining request tracking (`lib/hermes/server/base.ex`).
* Updated the transport layer to support routing responses and errors to the appropriate server handlers (`lib/hermes/server/transport/streamable_http.ex`).
1 parent da617a6 commit c09e7f3
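
The description above mentions a `server_requests` state field, request IDs, and timeout handling, but the `lib/hermes/server/base.ex` changes are not shown on this page. As a rough illustration of what tracking pending requests by ID can look like, here is a minimal sketch. The module `PendingRequestsSketch`, its function names, and the entry shape are all hypothetical and are not taken from Hermes.

```elixir
defmodule PendingRequestsSketch do
  @moduledoc false
  # Hypothetical stand-in for a `server_requests`-style map. Not Hermes code.

  def new, do: %{}

  # Remember an outgoing request together with its metadata and a deadline.
  def track(pending, id, metadata, now_ms, timeout_ms) do
    Map.put(pending, id, %{metadata: metadata, deadline: now_ms + timeout_ms})
  end

  # Match an incoming response to its request. Unknown IDs are reported
  # as an error tuple instead of raising.
  def resolve(pending, id) do
    case Map.pop(pending, id) do
      {nil, pending} -> {:error, :unknown_request, pending}
      {entry, pending} -> {:ok, entry.metadata, pending}
    end
  end

  # Drop every entry whose deadline has passed and return the expired IDs.
  def expire(pending, now_ms) do
    {expired, live} =
      Enum.split_with(pending, fn {_id, entry} -> entry.deadline <= now_ms end)

    {Enum.map(expired, &elem(&1, 0)), Map.new(live)}
  end
end

pending =
  PendingRequestsSketch.new()
  |> PendingRequestsSketch.track("req_1", %{request_type: :greeting}, 0, 30_000)
  |> PendingRequestsSketch.track("req_2", %{}, 0, 5_000)

# A response for a tracked ID yields the metadata stored at send time.
{:ok, meta, pending} = PendingRequestsSketch.resolve(pending, "req_1")

# A second response for the same ID is unexpected and is reported as such.
{:error, :unknown_request, pending} = PendingRequestsSketch.resolve(pending, "req_1")

# At t = 10s only "req_2" (5s timeout) has expired.
{expired, pending} = PendingRequestsSketch.expire(pending, 10_000)

IO.inspect({meta, expired, pending})
```

Keeping the metadata next to the deadline lets a response handler recover the caller's context from the request ID alone, which is the role the commit assigns to the `:metadata` option.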

10 files changed, +612 -58 lines changed

.github/workflows/ci.yml

Lines changed: 1 addition & 1 deletion
@@ -182,4 +182,4 @@ jobs:
         run: mix clean
 
       - name: Run tests
-        run: mix test --trace --warnings-as-errors
+        run: mix test --warnings-as-errors

lib/hermes/server.ex

Lines changed: 95 additions & 1 deletion
@@ -325,6 +325,43 @@ defmodule Hermes.Server do
   """
   @callback terminate(reason :: term, Frame.t()) :: term
 
+  @doc """
+  Handles the response from a sampling/createMessage request sent to the client.
+
+  This callback is invoked when the client responds to a sampling request initiated
+  by the server. The response contains the generated message from the client's LLM.
+
+  ## Parameters
+
+    * `response` - The response from the client containing:
+      * `"role"` - The role of the generated message (typically "assistant")
+      * `"content"` - The content object with type and data
+      * `"model"` - The model used for generation
+      * `"stopReason"` - Why generation stopped (e.g., "endTurn")
+    * `request_id` - The ID of the original request for correlation
+    * `frame` - The current server frame
+
+  ## Returns
+
+    * `{:noreply, frame}` - Continue processing
+    * `{:stop, reason, frame}` - Stop the server
+
+  ## Examples
+
+      def handle_sampling_response(response, request_id, frame) do
+        %{"content" => %{"text" => text}} = response
+        # Process the generated text...
+        {:noreply, frame}
+      end
+  """
+  @callback handle_sampling_response(
+              response :: map(),
+              request_id :: String.t(),
+              Frame.t()
+            ) ::
+              {:noreply, Frame.t()}
+              | {:stop, reason :: term(), Frame.t()}
+
   @optional_callbacks handle_notification: 2,
                       handle_info: 2,
                       handle_call: 3,
@@ -334,7 +371,8 @@ defmodule Hermes.Server do
                       handle_resource_read: 2,
                       handle_prompt_get: 3,
                       handle_request: 2,
-                      init: 2
+                      init: 2,
+                      handle_sampling_response: 3
 
   @doc """
   Checks if the MCP session has been initialized.
@@ -743,4 +781,60 @@ defmodule Hermes.Server do
     send(server, {:send_notification, method, params})
     :ok
   end
+
+  # Sampling Request Functions
+
+  @doc """
+  Sends a sampling/createMessage request to the client.
+
+  This function is used when the server needs the client to generate a message
+  using its language model. The client must have declared the sampling capability
+  during initialization.
+
+  Note: This is an asynchronous operation. The response will be delivered to your
+  `handle_sampling_response/3` callback.
+
+  ## Parameters
+
+    * `server` - The server process
+    * `messages` - List of message objects with role and content
+    * `opts` - Optional parameters:
+      * `:model_preferences` - Hints about model selection
+      * `:system_prompt` - System prompt to guide generation
+      * `:max_tokens` - Maximum tokens to generate
+      * `:metadata` - Any metadata to attach for correlation in the callback
+
+  ## Returns
+
+    * `:ok` - Request queued for sending
+
+  ## Examples
+
+      messages = [
+        %{"role" => "user", "content" => %{"type" => "text", "text" => "Hello"}}
+      ]
+
+      :ok = Hermes.Server.send_sampling_request(self(), messages,
+        system_prompt: "You are a helpful assistant",
+        max_tokens: 100,
+        metadata: %{request_type: :greeting}
+      )
+  """
+  @spec send_sampling_request(GenServer.server(), list(map()), keyword()) :: :ok
+  def send_sampling_request(server, messages, opts \\ []) when is_list(messages) do
+    params = %{"messages" => messages}
+
+    params =
+      opts
+      |> Keyword.take([:model_preferences, :system_prompt, :max_tokens])
+      |> Enum.reduce(params, fn
+        {:model_preferences, prefs}, acc -> Map.put(acc, "modelPreferences", prefs)
+        {:system_prompt, prompt}, acc -> Map.put(acc, "systemPrompt", prompt)
+        {:max_tokens, max}, acc -> Map.put(acc, "maxTokens", max)
+      end)
+
+    metadata = Keyword.get(opts, :metadata, %{})
+    send(server, {:send_sampling_request, params, metadata})
+    :ok
+  end
 end
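
The option handling in `send_sampling_request/3` can be tried in isolation. The sketch below copies the same `Keyword.take/2` plus `Enum.reduce/3` idea into a standalone module so the resulting wire parameters can be inspected without a running server. `SamplingParamsSketch`, `build/2`, and the `:temperature` option are hypothetical names used only for illustration. The mapping is a lookup table here rather than the multi-clause function in the diff.

```elixir
defmodule SamplingParamsSketch do
  @moduledoc false
  # Hypothetical helper that mirrors the option handling shown in the diff.
  # It is not part of Hermes.

  # Keyword option => camelCase key placed in the request params.
  @wire_keys %{
    model_preferences: "modelPreferences",
    system_prompt: "systemPrompt",
    max_tokens: "maxTokens"
  }

  @spec build([map()], keyword()) :: {map(), map()}
  def build(messages, opts \\ []) when is_list(messages) do
    params =
      opts
      # Only the three known options are kept; anything else is dropped.
      |> Keyword.take(Map.keys(@wire_keys))
      |> Enum.reduce(%{"messages" => messages}, fn {key, value}, acc ->
        Map.put(acc, Map.fetch!(@wire_keys, key), value)
      end)

    # Metadata is returned separately and never enters the params map.
    {params, Keyword.get(opts, :metadata, %{})}
  end
end

messages = [
  %{"role" => "user", "content" => %{"type" => "text", "text" => "Hello"}}
]

{params, metadata} =
  SamplingParamsSketch.build(messages,
    system_prompt: "You are a helpful assistant",
    max_tokens: 100,
    metadata: %{request_type: :greeting},
    # Hypothetical unsupported option, included to show that it is dropped.
    temperature: 0.2
  )

IO.inspect(params, label: "params")
IO.inspect(metadata, label: "metadata")
```

Two properties of the committed function follow from this shape: `:metadata` travels alongside the params in the `{:send_sampling_request, params, metadata}` message rather than inside them, and any option outside the three listed keys is silently ignored.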
