-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Add PR category detection for UI tests #33176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Update android_ui_tests, android_ui_tests_coreclr, ios_ui_tests_mono, ios_ui_tests_nativeaot, winui_ui_tests, and mac_ui_tests stages - Add dependency on discover_ui_test_categories stage - Add stage variables to capture detected categories from discovery - Add Calculate test filter step to each job that: - For PRs with detected categories: filters to only matching categories - For non-PR builds or when no categories detected: runs all categories - Uses _NO_MATCHING_CATEGORY_ filter when no categories match (runs 0 tests) - Update testFilter parameter to use EffectiveTestFilter variable
Remove stage-level variables and use direct stage dependency expression
in job-level variables instead. This ensures the detected categories
are properly accessible within each matrix job.
Changed from:
stage:
variables:
detectedCategories: $[ stageDep... ]
jobs:
- job:
variables:
DETECTED_CATEGORIES: $(detectedCategories)
To:
stage:
jobs:
- job:
variables:
DETECTED_CATEGORIES: $[ stageDep... ]
The testFilter parameter uses $(CATEGORYGROUP) which is already wired up correctly in the ui-tests-steps.yml template. Instead of creating a new EffectiveTestFilter variable and trying to pass it, simply overwrite the existing CATEGORYGROUP variable with the filtered value. This leverages the existing parameter passing mechanism that already works.
Matrix variables are readonly in Azure Pipelines and cannot be overwritten. Create a new EFFECTIVE_FILTER variable instead and use that for testFilter.
Adopted the approach from PR #32935: - Pass CATEGORYGROUP, DETECTED_CATEGORIES via env block - Move intersection logic into ui-tests-steps.yml PowerShell - Remove duplicate 'Calculate test filter' steps from all stages - Read variables from environment instead of pipeline expressions This simplifies the pipeline by consolidating the filtering logic in one place rather than repeating it in every test stage.
When there are no matching categories for a PR, exit with success immediately instead of running tests with a fake filter. This truly skips the test execution rather than running the test framework with no tests matching.
- Added early 'Check if category should run' step in ui-tests-steps.yml - Set SHOULD_RUN_TESTS variable based on category match - Added condition to skip: artifact downloads, clean bot, KVM enable, notification center disable, defaults modify, .NET install, PATH add, screen resolution, node install, Appium install, test execution This saves time for PR builds where only specific test categories are modified, by skipping setup for non-matching category groups.
Skip expensive provisioning steps (JDK, Android SDK, .NET SDK, Xcode, simulator runtimes) when SHOULD_RUN_TESTS is False. This saves significant build time when a UI test category doesn't match the detected categories from the PR changes. Uses a single early check step that sets SKIP_PROVISIONING variable, which is then checked by only the expensive steps.
6a983cf to
f060cc5
Compare
- Add discover_ui_test_categories dependency to CV2 and CARV2 stages - Add DETECTED_CATEGORIES variable to CV2 and CARV2 jobs - Update early check to use testFilter parameter as fallback when CATEGORYGROUP is not set (for non-matrix stages like CV2/CARV2) Now CV2 will only run if CollectionView is detected in PR changes, and CARV2 will only run if CarouselView is detected.
When CATEGORYGROUP variable is not defined (CV2/CARV2 stages without matrix), Azure Pipelines passes the literal string '$(CATEGORYGROUP)' instead of empty. Now we check for both empty AND the literal unexpanded variable string before falling back to testFilter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces automatic UI test category detection for pull requests to optimize test execution time. When a PR adds new UI tests with [Category(...)] attributes, the pipeline detects these categories and runs only the relevant test matrix jobs, while non-matching jobs complete quickly with 0 tests.
Key Changes:
- Adds a discovery stage that parses git diffs to detect new test categories in PRs
- Implements early filtering logic to skip provisioning and test execution for non-matching category groups
- Updates all test stages to depend on category discovery and use detected categories for filtering
- Includes temporary dummy tests in ButtonUITests and LabelUITests to validate the detection mechanism
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| eng/scripts/detect-ui-test-categories.ps1 | New PowerShell script that parses git diffs to extract Category attributes from added test code |
| eng/pipelines/common/ui-tests.yml | Adds discovery stage and updates test stage dependencies to use detected categories |
| eng/pipelines/common/ui-tests-steps.yml | Implements early category check and dynamic test filter calculation based on detected categories |
| eng/pipelines/common/provision.yml | Adds skip logic to bypass provisioning when no tests will run for a category group |
| src/Controls/tests/TestCases.Shared.Tests/Tests/ButtonUITests.cs | Adds dummy test to validate Button category detection |
| src/Controls/tests/TestCases.Shared.Tests/Tests/LabelUITests.cs | Adds dummy test to validate Label category detection |
| try { | ||
| git fetch origin "${targetBranch}" --no-tags --prune --depth=200 | Out-Null | ||
| } catch { | ||
| Write-Warning "Failed to fetch origin/${targetBranch}: $($_.Exception.Message). Falling back to running all categories." | ||
| return | ||
| } |
Copilot
AI
Dec 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The script uses Write-Warning but then returns early, which means detection silently falls back to running all categories. This could mask real problems like git configuration issues or network failures. Consider logging more explicitly what fallback behavior will occur, or consider whether some failures (like git fetch failures) should fail the pipeline rather than silently fall back.
| Write-Warning "Unrecognized category expression '$rawValue'. Falling back to running all categories." | ||
| return |
Copilot
AI
Dec 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When an unrecognized category expression is found (line 76), the script returns early and falls back to running all categories. This is overly conservative - if a developer adds a new test with a malformed category attribute, it would be better to fail the pipeline or at least log more prominently. The current behavior means malformed category attributes will go unnoticed and cause all tests to run, defeating the purpose of the optimization.
| Write-Warning "Unrecognized category expression '$rawValue'. Falling back to running all categories." | |
| return | |
| $message = "Unrecognized category expression '$rawValue'. Expected formats: UITestCategories.<Name>, nameof(UITestCategories.<Name>), or a quoted category string. Failing category detection." | |
| Write-Error $message | |
| throw $message |
| return | ||
| } | ||
|
|
||
| $categoryPattern = '^[+]{1}\s*\[Category\((?<value>[^\)]*)\)\]' |
Copilot
AI
Dec 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regex pattern uses a single + quantifier which requires exactly one + character. In git diff output, added lines are prefixed with a single +, but the pattern ^[+]{1} is more verbose than necessary. The pattern could be simplified to ^\+ or ^[+] for better readability, though the current pattern is functionally correct.
| $categoryPattern = '^[+]{1}\s*\[Category\((?<value>[^\)]*)\)\]' | |
| $categoryPattern = '^\+\s*\[Category\((?<value>[^\)]*)\)\]' |
| foreach ($cat in $categoryList) { | ||
| $cat = $cat.Trim() | ||
| foreach ($det in $detectedList) { | ||
| $det = $det.Trim() | ||
| if ($cat -eq $det) { | ||
| $matchingCategories += $cat | ||
| break | ||
| } | ||
| } |
Copilot
AI
Dec 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The category comparison uses case-sensitive string equality ($cat -eq $det). However, the HashSet in the PowerShell detection script (detect-ui-test-categories.ps1 line 57) uses case-insensitive comparison ([System.StringComparer]::OrdinalIgnoreCase). This inconsistency could cause matching failures if category names differ in case. Use case-insensitive comparison here as well with -ieq or .ToLower() for consistency.
| vmImage: ubuntu-latest | ||
| steps: | ||
| - checkout: self | ||
| fetchDepth: 0 |
Copilot
AI
Dec 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The checkout step uses fetchDepth: 0 which fetches the entire git history. This is quite expensive and can slow down the discovery job significantly for repositories with long history. Consider using a more reasonable depth (e.g., fetchDepth: 200) since the script already uses --depth=200 when fetching the target branch at line 30. This would be consistent and still provide enough history for most PR comparisons.
| fetchDepth: 0 | |
| fetchDepth: 200 |
| condition: | | ||
| or( | ||
| ne(variables['Build.Reason'], 'PullRequest'), | ||
| in(dependencies.discover_ui_test_categories.result, 'Succeeded', 'Skipped') | ||
| ) |
Copilot
AI
Dec 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition uses in(dependencies.discover_ui_test_categories.result, 'Succeeded', 'Skipped') which allows the build to proceed even if discovery was skipped. However, for non-PR builds, the discovery stage has condition: eq(variables['Build.Reason'], 'PullRequest') which means it will be skipped, not produce any output variables. The test stages depend on these output variables (DETECTED_CATEGORIES), so they need to handle the case where these variables are empty or undefined. Verify that the downstream logic correctly handles undefined DETECTED_CATEGORIES for non-PR builds.
| condition: | | |
| or( | |
| ne(variables['Build.Reason'], 'PullRequest'), | |
| in(dependencies.discover_ui_test_categories.result, 'Succeeded', 'Skipped') | |
| ) | |
| condition: succeeded() |
| foreach ($cat in $categoryList) { | ||
| $cat = $cat.Trim() | ||
| foreach ($det in $detectedList) { | ||
| $det = $det.Trim() | ||
| if ($cat -eq $det) { | ||
| $hasMatch = $true | ||
| Write-Host "Match found: '$cat'" | ||
| break | ||
| } | ||
| } | ||
| if ($hasMatch) { break } | ||
| } |
Copilot
AI
Dec 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The category comparison at lines 52 and 224 uses case-sensitive string equality ($cat -eq $det). However, the HashSet used for collecting categories in the PowerShell detection script uses case-insensitive comparison (line 57: [System.StringComparer]::OrdinalIgnoreCase). This inconsistency could cause matching failures if category names differ in case between the detection script output and the matrix CATEGORYGROUP values. Use case-insensitive comparison here as well with -ieq or .ToLower() for consistency.
| $testFilter = "" | ||
| $testConfigrationArgs = "${{ parameters.testConfigurationArgs }}" | ||
| "${{ parameters.testFilter }}".Split(",") | ForEach { | ||
| $testFilter += "TestCategory=" + $_ + "|" | ||
| # Get test filter from environment variable (passed from matrix via env block) | ||
| $testFilterParam = $env:CATEGORY_GROUP | ||
| $detectedCategories = $env:DETECTED_CATEGORIES | ||
| $isPR = $env:BUILD_REASON -eq "PullRequest" | ||
| Write-Host "Category Group from matrix: '$testFilterParam'" | ||
| Write-Host "Detected Categories: '$detectedCategories'" | ||
| # Fallback to parameter for non-matrix builds | ||
| if (-not $testFilterParam) { | ||
| $testFilterParam = "${{ parameters.testFilter }}" | ||
| Write-Host "Using testFilter parameter: '$testFilterParam'" | ||
| } | ||
| # For PRs with detected categories, use only matching categories | ||
| if ($isPR -and -not [string]::IsNullOrWhiteSpace($detectedCategories) -and -not [string]::IsNullOrWhiteSpace($testFilterParam)) { | ||
| $categoryList = $testFilterParam -split "," | ||
| $detectedList = $detectedCategories -split "," | ||
| $matchingCategories = @() | ||
| foreach ($cat in $categoryList) { | ||
| $cat = $cat.Trim() | ||
| foreach ($det in $detectedList) { | ||
| $det = $det.Trim() | ||
| if ($cat -eq $det) { | ||
| $matchingCategories += $cat | ||
| break | ||
| } | ||
| } | ||
| } | ||
| $testFilterParam = $matchingCategories -join "," | ||
| Write-Host "Running matching categories: $testFilterParam" | ||
| } | ||
| Write-Host "Final test filter: '$testFilterParam'" | ||
| if ($testFilterParam) { | ||
| $testFilterParam.Split(",") | ForEach { | ||
| $testFilter += "TestCategory=" + $_ + "|" | ||
| } | ||
| $testFilter = $testFilter.TrimEnd("|") | ||
| } | ||
| $testFilter = $testFilter.TrimEnd("|") | ||
Copilot
AI
Dec 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is duplicated category matching logic between the early check step (lines 20-69) and the test filter calculation step (lines 200-233). Both steps perform the same category matching operation with identical logic. Consider extracting this into a shared function or removing one of the checks. Since the early check already determines whether tests should run and sets SHOULD_RUN_TESTS, the test filter calculation step could be simplified to only handle the filter string construction when SHOULD_RUN_TESTS is True.
| - pwsh: | | ||
| $testFilterParam = $env:CATEGORY_GROUP | ||
| $testFilterFallback = $env:TEST_FILTER_PARAM | ||
| $detectedCategories = $env:DETECTED_CATEGORIES | ||
| $isPR = $env:BUILD_REASON -eq "PullRequest" | ||
| Write-Host "=== Early Category Check ===" | ||
| Write-Host "Build Reason: $env:BUILD_REASON" | ||
| Write-Host "Category Group (from matrix): '$testFilterParam'" | ||
| Write-Host "Test Filter (from parameter): '$testFilterFallback'" | ||
| Write-Host "Detected Categories: '$detectedCategories'" | ||
| # Use testFilter parameter as fallback when CATEGORYGROUP is not set or not expanded | ||
| # When CATEGORYGROUP variable doesn't exist, it shows as literal '$(CATEGORYGROUP)' | ||
| $categoryGroupNotSet = [string]::IsNullOrWhiteSpace($testFilterParam) -or $testFilterParam -eq '$(CATEGORYGROUP)' | ||
| if ($categoryGroupNotSet -and -not [string]::IsNullOrWhiteSpace($testFilterFallback)) { | ||
| $testFilterParam = $testFilterFallback | ||
| Write-Host "Using testFilter parameter as category: '$testFilterParam'" | ||
| } | ||
| $shouldRun = $true | ||
| # For PRs with detected categories, check if this category group has any matches | ||
| if ($isPR -and -not [string]::IsNullOrWhiteSpace($detectedCategories) -and -not [string]::IsNullOrWhiteSpace($testFilterParam)) { | ||
| $categoryList = $testFilterParam -split "," | ||
| $detectedList = $detectedCategories -split "," | ||
| $hasMatch = $false | ||
| foreach ($cat in $categoryList) { | ||
| $cat = $cat.Trim() | ||
| foreach ($det in $detectedList) { | ||
| $det = $det.Trim() | ||
| if ($cat -eq $det) { | ||
| $hasMatch = $true | ||
| Write-Host "Match found: '$cat'" | ||
| break | ||
| } | ||
| } | ||
| if ($hasMatch) { break } | ||
| } | ||
| if (-not $hasMatch) { | ||
| $shouldRun = $false | ||
| Write-Host "##[warning]No matching categories - SKIPPING all steps for this job" | ||
| Write-Host "Category group '$testFilterParam' does not contain any detected categories: $detectedCategories" | ||
| } | ||
| } | ||
| Write-Host "Should run tests: $shouldRun" | ||
| Write-Host "##vso[task.setvariable variable=SHOULD_RUN_TESTS]$shouldRun" |
Copilot
AI
Dec 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The early check step could skip the job entirely if SHOULD_RUN_TESTS is False, but instead it sets a variable and forces all subsequent steps to check this condition. Azure Pipelines supports job-level conditions that could terminate the job earlier. Consider using a job-level condition expression or setting the job result to "Skipped" when no matching categories are found, rather than requiring every subsequent step to check SHOULD_RUN_TESTS.
Note
Are you waiting for the changes in this PR to be merged?
It would be very helpful if you could test the resulting artifacts from this PR and let us know in a comment if this change resolves your issue. Thank you!
Description of Change
This PR adds automatic category detection for UI tests in pull requests. When a PR adds new UI tests with
[Category(...)]attributes, only the corresponding test categories will run (though all matrix jobs still spin up, non-matching ones complete quickly with 0 tests).How it works:
Discovery stage (
discover_ui_test_categories) - Only runs for PRs:detect-ui-test-categories.ps1script[Category(UITestCategories.X)]patternsTest stages - All stages now depend on discovery:
_NO_MATCHING_CATEGORY_as filter for non-matching jobs (runs 0 tests but completes quickly)Files changed:
eng/scripts/detect-ui-test-categories.ps1- New PowerShell script for category detectioneng/pipelines/common/ui-tests.yml- Updated pipeline with discovery stage and filtering logicTest Validation
This PR includes a dummy Button test to validate the detection is working. The discovery stage should detect "Button" category, and only the Button matrix jobs should run actual tests.
Issues Fixed
Reduces UI test run time for PRs by only running relevant test categories.