Commit 7a06d80

crickman and moonbox3 authored
ADR Agents - Assistant V2 Support (#7215)
### Motivation and Context

OpenAI has released the _Assistants V2_ API. This builds on top of the V1 _assistant_ concept, but also invalidates certain V1 features. In addition, the _dotnet_ API that supports _Assistant V2_ features diverges entirely from the `Azure.AI.OpenAI.Assistants` SDK that is currently in use.

### Description

ADR describes changes to `Microsoft.SemanticKernel.Agents.OpenAI` project / package.

Changes prototyped in this PR: #7126

PR completion is gated by the V2 connector migration work due to versioning constraints.

> ADR Reviewed on July 18th, 2024

### Contribution Checklist

- [X] The code builds clean without any errors or warnings
- [X] The PR follows the [SK Contribution Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md) and the [pre-submission formatting script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts) raises no violations
- [X] All unit tests pass, and I have added new tests where possible
- [X] I didn't break anyone 😄

---------

Co-authored-by: Evan Mattson <[email protected]>
1 parent 93df57e commit 7a06d80

11 files changed: +332 -0 lines changed
Lines changed: 184 additions & 0 deletions
@@ -0,0 +1,184 @@
1+
2+
# Agent Framework - Assistant V2 Migration
3+
4+
## Context and Problem Statement
5+
6+
OpenAI has released the _Assistants V2_ API. This builds on top of the V1 _assistant_ concept, but also invalidates certain V1 features. In addition, the _dotnet_ API that supports _Assistant V2_ features diverges entirely from the `Azure.AI.OpenAI.Assistants` SDK that is currently in use.
7+
8+
### Open Issues
9+
- **Streaming:** To be addressed as a discrete feature
10+
11+
12+
## Design
13+
14+
Migrating to the Assistant V2 API is a breaking change to the existing package due to:
15+
- Underlying capability differences (e.g. `file-search` vs `retrieval`)
16+
- Underlying V2 SDK is version incompatible with V1 (`OpenAI` and `Azure.AI.OpenAI`)
17+
18+
### Agent Implementation
19+
20+
The `OpenAIAssistantAgent` is roughly equivalent to its V1 form save for:
21+
22+
- Supports options for _assistant_, _thread_, and _run_
23+
- Agent definition shifts to `Definition` property
24+
- Convenience methods for producing an OpenAI client
25+
26+
Previously, the agent definition was exposed via direct properties such as:
27+
28+
- `FileIds`
29+
- `Metadata`
30+
31+
This has all been shifted and expanded upon via the `Definition` property which is of the same type (`OpenAIAssistantDefinition`) utilized to create and query an assistant.
32+
33+
<p align="center">
34+
<kbd><img src="diagrams/assistant-agent.png" style="width: 720pt;"></kbd>
35+
</p>
36+
37+
The following table describes the purpose of the diagrammed methods on the `OpenAIAssistantAgent`.
38+
39+
|Method Name|Description|
40+
---|---
41+
**Create**|Create a new assistant agent
42+
**ListDefinitions**|List existing assistant definitions
43+
**Retrieve**|Retrieve an existing assistant
44+
**CreateThread**|Create an assistant thread
45+
**DeleteThread**|Delete an assistant thread
46+
**AddChatMessage**|Add a message to an assistant thread
47+
**GetThreadMessages**|Retrieve all messages from an assistant thread
48+
**Delete**|Delete the assistant agent's definition (puts agent into a terminal state)
49+
**Invoke**|Invoke the assistant agent (no chat)
50+
**GetChannelKeys**|Inherited from `Agent`
51+
**CreateChannel**|Inherited from `Agent`
52+
53+
54+
### Class Inventory
55+
This section provides an overview / inventory of all the public surface area described in this ADR.
56+
57+
|Class Name|Description|
58+
---|---
59+
**OpenAIAssistantAgent**|An `Agent` based on the Open AI Assistant API
60+
**OpenAIAssistantChannel**|An `AgentChannel` for `OpenAIAssistantAgent` (associated with a _thread-id_).
61+
**OpenAIAssistantDefinition**|All of the metadata / definition for an Open AI Assistant. Unable to use the _Open AI API_ model due to implementation constraints (constructor not public).
62+
**OpenAIAssistantExecutionOptions**|Options that affect the _run_, but defined globally for the agent/assistant.
63+
**OpenAIAssistantInvocationOptions**|Options bound to a discrete run, used for direct (no chat) invocation.
64+
**OpenAIThreadCreationOptions**|Options for creating a thread that take precedence over assistant definition, when specified.
65+
**OpenAIServiceConfiguration**|Describes the service connection and used to create the `OpenAIClient`
66+
67+
68+
### Run Processing
69+
70+
The heart of supporting an _assistant_ agent is creating and processing a `Run`.
71+
72+
A `Run` is effectively a discrete _assistant_ interaction on a `Thread` (or conversation).
73+
74+
- https://platform.openai.com/docs/api-reference/runs
75+
- https://platform.openai.com/docs/api-reference/run-steps
76+
77+
This `Run` processing is implemented as internal logic within the _OpenAI Agent Framework_ that is outlined here:
78+
79+
Initiate processing using:
80+
81+
- `agent` -> `OpenAIAssistantAgent`
82+
- `client` -> `AssistantClient`
83+
- `threadid` -> `string`
84+
- `options` -> `OpenAIAssistantInvocationOptions` (optional)
85+
86+
87+
Perform processing:
88+
89+
- Verify `agent` not deleted
90+
- Define `RunCreationOptions`
91+
- Create the `run` (based on `threadid` and `agent.Id`)
92+
- Process the run:
93+
94+
do
95+
96+
- Poll `run` status until it is not _queued_, _in-progress_, or _cancelling_
97+
- Throw if `run` status is _expired_, _failed_, or _cancelled_
98+
- Query `steps` for `run`
99+
100+
- if `run` status is _requires-action_
101+
102+
- process function `steps`
103+
104+
- post function results
105+
106+
- foreach (`step` is completed)
107+
108+
- if (`step` is tool-call) generate and yield tool content
109+
110+
- else if (`step` is message) generate and yield message content
111+
112+
while (`run` status is not completed)
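
The loop above can be sketched in a language-neutral way. The following Python is only an illustration of the control flow, not the framework's implementation: `FakeClient`, `Step`, and `process_run` are hypothetical names, the scripted client stands in for the remote service, and the `seen` set (which prevents a step from being yielded twice) is an assumption that the outline does not spell out.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple

POLLING = {"queued", "in_progress", "cancelling"}
TERMINAL_ERRORS = {"expired", "failed", "cancelled"}


@dataclass
class Step:
    id: str
    kind: str            # "tool_call" or "message"
    content: str
    completed: bool = True


class FakeClient:
    """Hypothetical stand-in for the remote service: replays a scripted run."""

    def __init__(self, script: List[Tuple[str, List[Step]]]):
        self._script = list(script)
        self._steps: List[Step] = []
        self.posted: List[str] = []

    def get_status(self) -> str:
        status, self._steps = self._script.pop(0)
        return status

    def get_steps(self) -> List[Step]:
        return self._steps

    def submit_tool_outputs(self, outputs: List[str]) -> None:
        self.posted.extend(outputs)


def process_run(client: FakeClient, agent_deleted: bool = False) -> Iterator[Tuple[str, str]]:
    if agent_deleted:
        raise RuntimeError("agent has been deleted")
    seen = set()  # assumption: remember yielded steps so none is emitted twice
    while True:
        status = client.get_status()
        while status in POLLING:          # real code would wait between polls
            status = client.get_status()
        if status in TERMINAL_ERRORS:
            raise RuntimeError(f"run ended with status: {status}")
        steps = client.get_steps()
        if status == "requires_action":
            # Process the pending function steps and post their results.
            pending = [s for s in steps if s.kind == "tool_call" and not s.completed]
            client.submit_tool_outputs([f"result of {s.content}" for s in pending])
        for step in steps:
            if step.completed and step.id not in seen:
                seen.add(step.id)
                yield (step.kind, step.content)
        if status == "completed":
            break


client = FakeClient([
    ("queued", []),
    ("requires_action", [Step("s1", "tool_call", "get_weather", completed=False)]),
    ("in_progress", []),
    ("completed", [Step("s1", "tool_call", "get_weather"), Step("s2", "message", "Sunny")]),
])
events = list(process_run(client))
```

In this scripted run the function result is posted once while the run is in _requires-action_, and the tool-call and message content are yielded only after the corresponding steps are reported as completed.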
113+
114+
115+
### Vector Store Support
116+
117+
_Vector Store_ support is required in order to enable usage of the `file-search` tool.
118+
119+
In alignment with V2 streaming of the `FileClient`, the caller may also directly target `VectorStoreClient` from the _OpenAI SDK_.
120+
121+
122+
### Definition / Options Classes
123+
124+
Specific configuration/options classes are introduced to support the ability to define assistant behavior at each of the supported articulation points (i.e. _assistant_, _thread_, & _run_).
125+
126+
|Class|Purpose|
127+
|---|---|
128+
|`OpenAIAssistantDefinition`|Definition of the assistant. Used when creating a new assistant, inspecting an assistant-agent instance, or querying assistant definitions.|
129+
|`OpenAIAssistantExecutionOptions`|Options that affect run execution, defined within assistant scope.|
130+
|`OpenAIAssistantInvocationOptions`|Run level options that take precedence over assistant definition, when specified.|
131+
|`OpenAIAssistantToolCallBehavior`|Informs tool-call behavior for the associated scope: assistant or run.|
132+
|`OpenAIThreadCreationOptions`|Thread scoped options that take precedence over assistant definition, when specified.|
133+
|`OpenAIServiceConfiguration`|Informs which service to target, and how.|
134+
135+
136+
#### Assistant Definition
137+
138+
The `OpenAIAssistantDefinition` was previously used only when enumerating a list of stored agents. It has been evolved to also be used as input for creating an agent and is exposed as a discrete property on the `OpenAIAssistantAgent` instance.
139+
140+
This includes optional `ExecutionOptions` which define default _run_ behavior. Since these execution options are not part of the remote assistant definition, they are persisted in the assistant metadata for when an existing agent is retrieved. `OpenAIAssistantToolCallBehavior` is included as part of the _execution options_ and modeled in alignment with the `ToolCallBehavior` associated with _AI Connectors_.
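
As a rough illustration of persisting execution options in assistant metadata, the Python sketch below serializes the options to a JSON string stored under a single metadata key and reads them back on retrieval. The key name `__run_options`, the JSON encoding, and the helper names are assumptions made for this example; the ADR does not specify the storage format. The field names mirror the `OpenAIAssistantExecutionOptions` diagram.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, Optional

OPTIONS_KEY = "__run_options"  # hypothetical metadata key, not taken from the ADR


@dataclass
class ExecutionOptions:
    max_completion_tokens: Optional[int] = None
    max_prompt_tokens: Optional[int] = None
    parallel_tool_calls_enabled: Optional[bool] = None
    truncation_message_count: Optional[int] = None


def to_metadata(options: ExecutionOptions,
                metadata: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the metadata with the options stored as one string value."""
    merged = dict(metadata or {})
    specified = {k: v for k, v in asdict(options).items() if v is not None}
    merged[OPTIONS_KEY] = json.dumps(specified)
    return merged


def from_metadata(metadata: Dict[str, str]) -> Optional[ExecutionOptions]:
    """Rebuild the options when an existing assistant is retrieved."""
    raw = metadata.get(OPTIONS_KEY)
    return ExecutionOptions(**json.loads(raw)) if raw else None


stored = to_metadata(ExecutionOptions(max_prompt_tokens=1024), {"owner": "docs"})
restored = from_metadata(stored)
```

Because metadata values are strings, the options travel as one serialized value next to any caller-supplied entries, and an assistant created without options simply has no such key.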
141+
142+
> Note: Manual function calling isn't currently supported for `OpenAIAssistantAgent` or `AgentChat` and is planned to be addressed as an enhancement. When this support is introduced, `OpenAIAssistantToolCallBehavior` will determine the function calling behavior (also in alignment with the `ToolCallBehavior` associated with _AI Connectors_).
143+
144+
**Alternative (Future?)**
145+
146+
A pending change has been authored that introduces `FunctionChoiceBehavior` as a property of the base / abstract `PromptExecutionSettings`. Once realized, it may make sense to evaluate integrating this pattern for `OpenAIAssistantAgent`. This may also imply an inheritance relationship with `PromptExecutionSettings` for both `OpenAIAssistantExecutionOptions` and `OpenAIAssistantInvocationOptions` (next section).
147+
148+
**DECISION**: Do not support `tool_choice` until the `FunctionChoiceBehavior` is realized.
149+
150+
<p align="center">
151+
<kbd><img src="diagrams/assistant-definition.png" style="width: 500pt;"></kbd>
152+
</p>
153+
154+
155+
#### Assistant Invocation Options
156+
157+
When invoking an `OpenAIAssistantAgent` directly (no-chat), definitions that apply only to a discrete run may be specified. These definitions are expressed as `OpenAIAssistantInvocationOptions` and take precedence over any corresponding assistant or thread definition.
158+
159+
> Note: These definitions are also impacted by the `ToolCallBehavior` / `FunctionChoiceBehavior` quandary.
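
The precedence rule can be pictured as a "first specified value wins" lookup across scopes. The Python sketch below is an assumption about how such resolution might be expressed, not the framework's code: `resolve` is a hypothetical helper, and the dictionaries stand in for the option classes.

```python
from typing import Any, Mapping, Optional


def resolve(name: str, *scopes: Optional[Mapping[str, Any]]) -> Any:
    """Return the first value that is specified, scanning scopes in priority order."""
    for scope in scopes:
        if scope is not None and scope.get(name) is not None:
            return scope[name]
    return None


# Stand-ins for the assistant definition and the per-run invocation options.
assistant = {"temperature": 0.7, "max_prompt_tokens": 1024, "enable_json_response": True}
invocation = {"temperature": 0.2, "enable_json_response": False}

temperature = resolve("temperature", invocation, assistant)
max_prompt_tokens = resolve("max_prompt_tokens", invocation, assistant)
json_response = resolve("enable_json_response", invocation, assistant)
```

Testing against `None` rather than truthiness matters here: a run-level `False` is a specified value and must still override an assistant-level `True`, which is why the option properties are modeled as nullable.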
160+
161+
<p align="center">
162+
<kbd><img src="diagrams/assistant-invocationsettings.png" style="width: 370pt;"></kbd>
163+
</p>
164+
165+
166+
#### Thread Creation Options
167+
168+
When invoking an `OpenAIAssistantAgent` directly (no-chat), a thread must be explicitly managed. When doing so, thread specific options may be specified. These options are defined as `OpenAIThreadCreationOptions` and take precedence over any corresponding assistant definition.
169+
170+
<p align="center">
171+
<kbd><img src="diagrams/assistant-threadcreationsettings.png" style="width: 132pt;"></kbd>
172+
</p>
173+
174+
175+
#### Service Configuration
176+
177+
The `OpenAIServiceConfiguration` defines how to connect to a specific remote service, whether it be OpenAI, Azure, or proxy. This eliminates the need to define multiple overloads for each call site that results in a connection to the remote API service (i.e. create a _client_).
178+
179+
> Note: This was previously named `OpenAIAssistantConfiguration`, but is not necessarily assistant specific.
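
To illustrate why a single configuration object removes the need for per-service overloads, here is a small Python sketch. It loosely mirrors the factory methods in the service configuration diagram, but every name in it (`ServiceConfiguration`, `for_azure_openai`, `for_openai`, `describe_client`) is hypothetical, the Azure endpoint is a placeholder, and no real client is created.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ServiceType(Enum):
    AZURE_OPENAI = "AzureOpenAI"
    OPENAI = "OpenAI"


@dataclass(frozen=True)
class ServiceConfiguration:
    service_type: ServiceType
    api_key: Optional[str] = None
    endpoint: Optional[str] = None

    @classmethod
    def for_azure_openai(cls, api_key: str, endpoint: str) -> "ServiceConfiguration":
        return cls(ServiceType.AZURE_OPENAI, api_key, endpoint)

    @classmethod
    def for_openai(cls, api_key: str, endpoint: Optional[str] = None) -> "ServiceConfiguration":
        # An explicit endpoint covers the proxy case.
        return cls(ServiceType.OPENAI, api_key, endpoint)


def describe_client(config: ServiceConfiguration) -> str:
    """One call site accepts any configuration, so it needs no per-service overloads."""
    target = config.endpoint or "default endpoint"
    return f"{config.service_type.value} client -> {target}"


azure = describe_client(
    ServiceConfiguration.for_azure_openai("key", "https://example.openai.azure.com"))
openai = describe_client(ServiceConfiguration.for_openai("key"))
```

Each call site that needs a connection takes one `config` argument, and the choice between services is made once, where the configuration is constructed.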
180+
181+
<p align="center">
182+
<kbd><img src="diagrams/assistant-serviceconfig.png" style="width: 520pt;"></kbd>
183+
</p>
184+
Lines changed: 44 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,44 @@
1+
classDiagram
2+
3+
KernelAgent
4+
OpenAIAssistantDefinition
5+
OpenAIAssistantDefinition --> OpenAIExecutionOptions
6+
OpenAIExecutionOptions
7+
OpenAIExecutionOptions --> AssistantToolCallBehavior
8+
OpenAIServiceConfiguration
9+
OpenAIAssistantInvocationOptions
10+
OpenAIAssistantInvocationOptions --> AssistantToolCallBehavior
11+
OpenAIThreadCreationOptions
12+
13+
KernelAgent <|-- OpenAIAssistantAgent
14+
OpenAIAssistantAgent ..> OpenAIServiceConfiguration
15+
OpenAIAssistantAgent -- OpenAIAssistantChannel
16+
OpenAIAssistantAgent --> OpenAIAssistantDefinition
17+
OpenAIAssistantAgent ..> OpenAIAssistantInvocationOptions
18+
OpenAIAssistantAgent ..> OpenAIThreadCreationOptions
19+
class OpenAIAssistantAgent {
20+
+OpenAIAssistantDefinition Definition
21+
+bool IsDeleted
22+
+RunPollingConfiguration Polling
23+
+Task~OpenAIAssistantAgent~ Create(Kernel kernel, OpenAIServiceConfiguration config, OpenAIAssistantDefinition definition)$
24+
+AsyncEnumerable~OpenAIAssistantDefinition~ ListDefinitions(OpenAIServiceConfiguration config)$
25+
+Task~OpenAIAssistantAgent~ Retrieve(Kernel kernel, OpenAIServiceConfiguration config, string id)$
26+
+Task~string~ CreateThread()
27+
+Task~string~ CreateThread(OpenAIThreadCreationOptions? Options)
28+
+Task~bool~ DeleteThread(string threadId)
29+
+Task AddChatMessage(string threadId, ChatMessageContent message)
30+
+AsyncEnumerable~ChatMessageContent~ GetThreadMessages(string threadId)
31+
+Task~bool~ Delete()
32+
+AsyncEnumerable~ChatMessageContent~ Invoke(string threadId)
33+
+AsyncEnumerable~ChatMessageContent~ Invoke(string threadId, OpenAIAssistantInvocationOptions? Options)
34+
#AsyncEnumerable~string~ GetChannelKeys()
35+
#Task~AgentChannel~ CreateChannel()
36+
}
37+
38+
OpenAIAssistantChannel ..> OpenAIAssistantAgent
39+
class OpenAIAssistantChannel {
40+
#Task Receive(IReadOnlyList~ChatMessageContent~ history)
41+
#AsyncEnumerable~ChatMessageContent~ Invoke(OpenAIAssistantAgent agent)
42+
#AsyncEnumerable~ChatMessageContent~ GetHistory()
43+
}
44+
Lines changed: 33 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,33 @@
1+
classDiagram
2+
3+
OpenAIAssistantDefinition --> OpenAIAssistantExecutionOptions
4+
class OpenAIAssistantDefinition {
5+
string ModelName
6+
string? Description
7+
string Id
8+
string? Instructions
9+
string? Name
10+
List~string~? CodeInterpreterFileIds
11+
bool EnableCodeInterpreter
12+
bool EnableJsonResponse
13+
Dictionary~string, string~? Metadata
14+
float? Temperature
15+
float? TopP
16+
string? VectorStoreId
17+
OpenAIAssistantExecutionOptions? ExecutionOptions
18+
}
19+
20+
OpenAIAssistantExecutionOptions --> OpenAIAssistantToolCallBehavior
21+
class OpenAIAssistantExecutionOptions {
22+
int? MaxCompletionTokens
23+
int? MaxPromptTokens
24+
bool? ParallelToolCallsEnabled
25+
int? TruncationMessageCount
26+
OpenAIAssistantToolCallBehavior? ToolCallBehavior
27+
}
28+
29+
class OpenAIAssistantToolCallBehavior {
30+
AssistantToolCallBehavior RequireCodeInterpreter()$
31+
AssistantToolCallBehavior RequireFunction(KernelFunction function)$
32+
AssistantToolCallBehavior RequireFileSearch()$
33+
}
Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,23 @@
1+
classDiagram
2+
3+
OpenAIAssistantInvocationOptions --> OpenAIAssistantToolCallBehavior
4+
class OpenAIAssistantInvocationOptions {
5+
string? ModelName
6+
bool? EnableCodeInterpreter
7+
bool? EnableFileSearch
8+
bool? EnableJsonResponse
9+
int? MaxCompletionTokens
10+
int? MaxPromptTokens
11+
bool? ParallelToolCallsEnabled
12+
int? TruncationMessageCount
13+
float? Temperature
14+
float? TopP
15+
Dictionary~string, string~? Metadata
16+
OpenAIAssistantToolCallBehavior? ToolCallBehavior
17+
}
18+
19+
class OpenAIAssistantToolCallBehavior {
20+
AssistantToolCallBehavior RequireCodeInterpreter()$
21+
AssistantToolCallBehavior RequireFunction(KernelFunction function)$
22+
AssistantToolCallBehavior RequireFileSearch()$
23+
}
Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,40 @@
1+
classDiagram
2+
3+
OpenAIClientFactory ..> OpenAIServiceConfiguration
4+
class OpenAIClientFactory {
5+
<<internal>>
6+
}
7+
8+
OpenAIServiceConfiguration --> OpenAIServiceType
9+
class OpenAIServiceConfiguration {
10+
OpenAIServiceConfiguration ForAzureOpenAI(string? apiKey, Uri? endpoint, HttpClient httpClient)$
11+
OpenAIServiceConfiguration ForAzureOpenAI(TokenCredential credential, Uri? endpoint, HttpClient httpClient)$
12+
OpenAIServiceConfiguration OpenAI(string? apiKey, Uri? endpoint, HttpClient httpClient)$
13+
-string? ApiKey
14+
-TokenCredential? TokenCredential
15+
-Uri? Endpoint
16+
-HttpClient? HttpClient
17+
-OpenAIServiceType ServiceType
18+
}
19+
20+
OpenAIServiceConfigurationExtensions ..> OpenAIServiceConfiguration
21+
OpenAIServiceConfigurationExtensions ..> FileClient
22+
OpenAIServiceConfigurationExtensions ..> VectorStoreClient
23+
class OpenAIServiceConfigurationExtensions {
24+
+FileClient CreateFileClient(this OpenAIServiceConfiguration config)$
25+
+VectorStoreClient CreateVectorStoreClient(this OpenAIServiceConfiguration config)$
26+
}
27+
28+
class OpenAIServiceType {
29+
<<enumeration>>
30+
AzureOpenAI
31+
OpenAI
32+
}
33+
34+
class FileClient {
35+
<<OpenAI>>
36+
}
37+
38+
class VectorStoreClient {
39+
<<OpenAI>>
40+
}
Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
classDiagram
2+
3+
class OpenAIThreadCreationOptions {
4+
List~string~? CodeInterpreterFileIds
5+
IReadOnlyList~ChatMessageContent~? Messages
6+
string? VectorStoreId
7+
Dictionary~string, string~? Metadata
8+
}
