Skip to content

Conversation

@vishal1997
Copy link
Contributor

@vishal1997 vishal1997 commented Feb 28, 2025

Provide a description of what has been changed

  1. This PR introduces Managed Identity (MI) support for the Azure Cosmos DB external scaler and updates the Demo project to use MI-based access for autoscaling
  2. Included detailed steps for configuring and deploying the Demo project with MI-based access.

Manual testing and validation:

  1. Used the Demo app in the repo to generate data in cosmos DB, used MSI to access cosmos DB.
  2. Used the Demo app in the repo to process the data generated in the first step, used MSI to access the cosmosDB.
  3. Created the scale object having the clientId of MSI as properties, validated the pods count increased to 2 from 0 and back to 0 after processing all the data.

The manual testing was performed on AKS cluster, details to setup the cluster with MSI is added in the Readme file of the demo app.

Original PR: #78

Checklist

  • Commits are signed with Developer Certificate of Origin (DCO)
  • A PR is opened to update the documentation on our docs repo

Fixes #

Vighnesh Shenoy and others added 7 commits February 28, 2025 17:33
- name: CosmosDbConfig__LeaseContainerId
value: "OrderProcessorLeases"
- name: CosmosDbConfig__ProcessorName
value: "OrderProcessor"
Copy link

@raorugan raorugan Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to share outcome of this test - order-processor with user-assigned and system assigned? Where was this tested on AKS or ACA?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the details in the description, this was tested on AKS cluster

@vishal1997 vishal1997 force-pushed the cosmosdb_mi_changes branch from 9d715cb to 757902d Compare March 3, 2025 09:59
selector:
matchLabels:
app: cosmosdb-scaler
azure.workload.identity/use: "true" # Optional: Use Workload Identity for authentication. Refer https://azure.github.io/azure-workload-identity/docs/topics/service-account-labels-and-annotations.html for more details.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to enable workload identity by default in the scaler deployment?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, do we want to keep serviceAccountName as sa in scaler deployment? It might not clear on what this service account refers to until looking at the OrderProcessor Scaler Demo file.

We can probably use a different Scaler Deploy.yaml file for the Demo?

// Prioritize credential based connections
if (!string.IsNullOrWhiteSpace(metadata.Endpoint))
{
return _factory.GetCosmosClient(metadata.Endpoint, useCredentials: true, clientId: metadata.ClientId);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does passing null in clientId argument switch to authenticating system MI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the clientId here is null, the azure.workload.identity/client-id annotation in the service account will be picked as the default client ID during the client initialization.
This change does not support system managed identity, reason is we are using federated credentials for workload identity federation, which cannot be created for system assigned identity.
ref: https://azure.github.io/azure-workload-identity/docs/topics/federated-identity-credential.html

public CosmosClient GetCosmosClient(string endpointOrConnection, bool useCredentials, string clientId)
{
return _cosmosClientCache.GetOrAdd(connection, CreateCosmosClient);
return _cosmosClientCache.GetOrAdd((endpointOrConnection, clientId), CreateCosmosClient(endpointOrConnection, useCredentials, clientId));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to confirm, will clientId:null cause any issues in adding to dictionary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change is tested with null value.

/// </summary>
/// <param name="clientId">ClientId of the identity to be used. System identity is used if this is null. </param>
/// <returns></returns>
private TokenCredential GetChainedCredential(string clientId)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we validate change with both user-assigned and system MI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please refer this comment #80 (comment)

apiVersion: v1
kind: ServiceAccount
metadata:
name: <service-account-name>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this file here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a template for scaled-object using mi

Copy link
Collaborator

@JatinSanghvi JatinSanghvi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added first set of review comments.

README.md Outdated
# Managed Identity properties
endpoint: <endpoint> # Optional. Account endpoint of the CosmosDB account containing the monitored container.
leaseEndpoint: <lease-endpoint> # Optional. Account endpoint of the CosmosDB account containing the lease container.
clientId: <client-Id> # Optional. ClientId of the identity to be used. System assigned identity is used, if this is null.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace with `Client ID' here and in the description below. Should we say, 'if this is not provided'?

{
if (useCredentials)
{
var credential = GetChainedCredential(clientId);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since CosmosClientOptions is being inlined, should be do the same for TokenCredential?

{
var options = new DefaultAzureCredentialOptions
{
ManagedIdentityClientId = clientId,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my understanding, can clientId be null? Will that force it to use system assigned MSI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please refer this comment: #80 (comment)

public static string _applicationName = "cosmosdb-order-generator";

public static async Task Main(string[] args)
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was it necessary to kill user interactivity and validation of input values? Can you explain the reason here. Let me know if this needs to be discussed offline.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this requires a drastic change, and in my opinion, it loses the charm of demo, we can think of keeping the order generator project connection string based only, as this was meant to be called locally, from demoing person's computer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention here was that during the demo, we can keep generating new orders rapidly. I think the new demo would require updating the deployment YAML file and kubectl apply ing it, which will slow it down. The order processor is the main application that resembles user workload that requires KEDA scaling and the generator is there to have a complete end-to-end demo-able sample.

@vishal1997 vishal1997 force-pushed the cosmosdb_mi_changes branch from 11cb334 to edb6a0e Compare March 17, 2025 04:37
@villanisaac-kr
Copy link

Is there any ETA for this feature to be completed (Support for Workload identity)?

@ni-bhandari
Copy link
Contributor

@villanisaac-kr currently working on completing this PR - #86. It has been approved by a reviewer, hoping to get it merged soon.

@villanisaac-kr
Copy link

@villanisaac-kr currently working on completing this PR - #86. It has been approved by a reviewer, hoping to get it merged soon.

Awesome thanks 🥇

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants