
Conversation

@mittalrishabh
Member

Created RFC for region level isolation

Signed-off-by: rishabh_mittal <[email protected]>
@mittalrishabh mittalrishabh changed the title region level isolation [RFC] region level isolation Dec 11, 2025
rishabh_mittal added 3 commits December 11, 2025 09:41
Signed-off-by: rishabh_mittal <[email protected]>
Signed-off-by: rishabh_mittal <[email protected]>
Signed-off-by: rishabh_mittal <[email protected]>
These functionalities are missing in the current implementation.

1. **Region-level fairness**: Hot regions (with hot keys or large scans) should be deprioritized to prevent resource monopolization within a tenant
3. **Traffic Moderation**: In a multi-tenant SOA environment, setting correct rate limits is challenging - limits that are too tight reject valid traffic, while limits that are too loose allow overload. Instead of hard rate limits, implement adaptive traffic moderation that responds to sudden spikes on hot regions by gracefully deprioritizing rather than outright rejecting requests
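
The moderation goal above can be sketched as follows. This is an illustrative sketch only (the enum, function, and thresholds are not from the RFC): rather than rejecting requests past a hard rate limit, rising region load maps to a lower scheduling priority.

```rust
// Illustrative sketch: instead of a hard rate limit that rejects requests
// above a threshold, adaptive moderation keeps accepting work but degrades
// its scheduling priority as the region's load rises.
#[derive(Debug, PartialEq)]
enum Priority {
    High,
    Medium,
    Low,
}

// `load` is the region's observed utilization relative to its fair share
// (1.0 = exactly at its share). The thresholds here are hypothetical.
fn moderate(load: f64) -> Priority {
    if load < 1.0 {
        Priority::High
    } else if load < 2.0 {
        Priority::Medium
    } else {
        Priority::Low // deprioritized, never outright rejected
    }
}
```

The key property is that `moderate` never returns a "reject" outcome: a spike on a hot region degrades gracefully instead of bouncing valid traffic.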
Contributor

nit: 3rd when it is 2nd

This point sounds more like a non-goal. Move it to the alternative approaches section?

Separately, do you want to add a goal around not interfering with region splits?

Member Author

This is a goal. If we don't have it, the system will become overloaded after a split.


### Traffic moderation and split/scatter

Currently, split/scatter is non-deterministic when a node is overloaded: it depends on how many requests on the region succeed. With this design, hot regions accumulate high VT and get deprioritized, which slows down split decisions that are based on served QPS.
Contributor

If the split depends on succeeded requests, then deprioritization of those requests will delay the split. Should it use scheduled/dropped QPS instead of succeeded?

Member Author

Yes, we can use scheduled/dropped QPS. I will modify the design.
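
A sketch of that change, with hypothetical names: the split heuristic would consume offered load (succeeded plus dropped) rather than only served QPS, so deprioritizing a hot region does not hide its load from the split checker.

```rust
// Hypothetical sketch: track both served and dropped requests per region so
// the split heuristic sees the full offered load even when traffic
// moderation is shedding requests from a hot region.
#[derive(Default)]
struct RegionQps {
    succeeded: u64,
    dropped: u64,
}

impl RegionQps {
    fn record(&mut self, was_dropped: bool) {
        if was_dropped {
            self.dropped += 1;
        } else {
            self.succeeded += 1;
        }
    }

    // Signal fed to the split checker: served + dropped, not just served.
    fn offered(&self) -> u64 {
        self.succeeded + self.dropped
    }
}
```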

Comment on lines +97 to +104
### Background Task Demotion

Background tasks (GC, compaction, statistics) use LOW `group_priority` regardless of their resource group's configured priority:
```
group_priority = LOW // instead of resource group's configured priority
```

This ensures foreground traffic is always prioritized over background.
Contributor

Is the proposal here to create a virtual RG for background tasks with low priority, to schedule them relative to other traffic as well?

Member Author

Tagged you below in the implementation.


When queue is full:
1. Calculate priority of incoming task
2. Compare with lowest priority task in queue
Contributor

That is probably hard to implement efficiently. Will it require one more priority queue with inverse priority? Or does the current SkipMap implementation allow popping from both ends efficiently?

Member Author

Yes, the current SkipMap allows popping from both ends. It is O(log N).
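
The eviction path under discussion can be sketched with std's `BTreeMap` as a stand-in for the SkipMap (both support O(log n) removal at either end). All names here are illustrative, not from the implementation:

```rust
use std::collections::BTreeMap;

// Illustrative bounded priority queue: keys are priorities (higher = more
// urgent), values are task ids. When full, an incoming task replaces the
// current lowest-priority entry only if it has strictly higher priority.
// A real implementation would key by (priority, seq) so equal priorities
// don't collide; this sketch assumes distinct priorities and capacity >= 1.
struct BoundedQueue {
    tasks: BTreeMap<u64, &'static str>,
    capacity: usize,
}

impl BoundedQueue {
    fn new(capacity: usize) -> Self {
        Self { tasks: BTreeMap::new(), capacity }
    }

    // Returns the id of the task that was rejected or evicted, if any.
    fn push(&mut self, priority: u64, id: &'static str) -> Option<&'static str> {
        if self.tasks.len() < self.capacity {
            self.tasks.insert(priority, id);
            return None;
        }
        // pop_first() removes the lowest key in O(log n); a SkipMap offers
        // the same operation at both ends.
        let (lowest, lowest_id) = self.tasks.pop_first().expect("non-empty");
        if priority > lowest {
            self.tasks.insert(priority, id);
            Some(lowest_id) // evicted the lowest-priority task
        } else {
            self.tasks.insert(lowest, lowest_id);
            Some(id) // incoming task is the lowest; reject it
        }
    }
}
```

So no second queue with inverse priority is needed; one ordered map serves both the dispatcher (pop highest) and the admission check (inspect or pop lowest).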

Comment on lines +376 to +378
2. **Shared region fairness issues**: When multiple resource groups access the same region, two fairness problems arise:
- **Innocent tenant penalized**: Tenant A's heavy usage increases the region's VT, penalizing Tenant B's requests to that region even though Tenant B didn't cause the hotness
- **Hot region stays hot**: If Tenant A and B alternate requests to a shared region, each tenant's group_vt stays low (they're taking turns), so the region never gets properly deprioritized despite being continuously hot
Contributor

Did you consider having a region tracker per group? What are the trade-offs of that approach?

Member Author

What is a region tracker per group?

Contributor

The current design tracks CPU per region across all tenants, which results in the highlighted issues. I'm asking if you considered tracking it per tenant.

Member Author

In the existing design, if a region is split into r1 and r2, they will share the same VT when CPU utilization is more than 80%. Having a region tracker per group would complicate this design.
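
That inheritance rule can be sketched as follows, assuming a hypothetical per-store region-to-VT map (the 80% threshold is from the comment above; everything else is illustrative):

```rust
use std::collections::HashMap;

// Illustrative sketch: on split, child regions inherit the parent's virtual
// time (VT) when the parent was hot (CPU above the 80% threshold), so a hot
// region cannot shed its deprioritization simply by splitting.
const HOT_CPU_THRESHOLD: f64 = 0.80;

fn split_vt(
    vt: &mut HashMap<u64, f64>, // region id -> virtual time
    parent: u64,
    children: [u64; 2],
    parent_cpu: f64, // fraction of a core, 0.0..=1.0
) {
    let parent_vt = vt.remove(&parent).unwrap_or(0.0);
    let inherited = if parent_cpu > HOT_CPU_THRESHOLD { parent_vt } else { 0.0 };
    for child in children {
        vt.insert(child, inherited);
    }
}
```

Tracking VT per (group, region) pair instead would mean this inheritance has to be applied per group on every split, which is the complication the comment refers to.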


```
// 3. Use LOW priority for background tasks
let group_priority = if metadata.is_background() {
    LOW
} else {
    // else branch reconstructed from the surrounding text:
    // fall back to the resource group's configured priority
    metadata.group_priority()
};
```
Member Author

@Tema this is how priority is decided for background tasks.
