coverage: `llvm-cov` expects column numbers to be bytes, not code points #119033

Zalathar · 2023-12-17T01:10:25Z

Normally the compiler emits column numbers as a 1-based number of Unicode code points.

But when we embed coverage mappings for `-Cinstrument-coverage`, those mappings will ultimately be read by the `llvm-cov` tool. That tool assumes that column numbers are 1-based numbers of bytes, and relies on that assumption when slicing up source code to apply highlighting (in HTML reports, and in text-based reports with colour).

For the very common case of all-ASCII source code, bytes and code points are the same, so the difference isn't noticeable. But for code that contains non-ASCII characters, emitting column numbers as code points will result in `llvm-cov` slicing strings in the wrong places, producing mangled output or fatal errors.

(See taiki-e/cargo-llvm-cov#275 as an example of what can go wrong.)
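To make the mismatch concrete, here is a minimal standalone sketch. It is not the code from this PR: the helper `char_col_to_byte_col` and the sample lines are hypothetical, chosen only to show how a 1-based code-point column differs from a 1-based byte column once a line contains a multi-byte character, and how a byte-oriented consumer would slice the wrong text.

```rust
/// Converts a 1-based column counted in Unicode code points into a
/// 1-based column counted in UTF-8 bytes, for one line of source text.
/// Returns `None` for column 0 or a column past the end of the line.
/// (Hypothetical helper for illustration; not the compiler's actual code.)
fn char_col_to_byte_col(line: &str, char_col: usize) -> Option<usize> {
    let chars_before = char_col.checked_sub(1)?;
    // The column just past the last character is allowed, because a span
    // may end there.
    if chars_before == line.chars().count() {
        return Some(line.len() + 1);
    }
    line.char_indices()
        .nth(chars_before)
        .map(|(byte_offset, _)| byte_offset + 1)
}

fn main() {
    // 'é' occupies two bytes in UTF-8, so everything after it is shifted
    // by one byte relative to its code-point position.
    let line = "let s = \"héllo\"; foo();";
    let byte_offset = line.find("foo").expect("sample line contains `foo`");
    let char_col = line[..byte_offset].chars().count() + 1;
    let byte_col = byte_offset + 1;
    println!("code-point column: {char_col}, byte column: {byte_col}");

    // A consumer that treats the code-point column as a byte column
    // highlights the wrong three bytes.
    let wrong = &line[char_col - 1..char_col + 2];
    let right = &line[byte_col - 1..byte_col + 2];
    println!("misread as bytes: {wrong:?}, intended: {right:?}");

    assert_eq!(char_col_to_byte_col(line, char_col), Some(byte_col));

    // Worse, a code-point column can land in the middle of a character.
    // Code-point column 2 of "é=1" is '=', but byte column 2 is the second
    // byte of 'é', where slicing a `&str` would panic.
    let short = "é=1";
    assert!(!short.is_char_boundary(2 - 1));
}
```

For all-ASCII lines the helper returns its input unchanged, which is why the problem only shows up in source files containing non-ASCII characters.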

Zalathar · 2024-01-06T01:34:04Z

#119034 has landed, so now I can think about making this ready for review.

rustbot · 2024-01-06T01:50:06Z

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

rustbot · 2024-01-06T01:50:47Z

Error: Parsing assign command in comment failed: ...'' | error: specify user to assign to at >| ''...

Please file an issue on GitHub at triagebot if there's a problem with this bot, or reach out on #t-infra on Zulip.

Zalathar · 2024-01-06T01:51:17Z

r? compiler

compiler/rustc_mir_transform/src/coverage/mod.rs

davidtwco

r=me when CI passes

oli-obk · 2024-01-08T11:33:30Z

@bors delegate+

bors · 2024-01-08T11:33:33Z

✌️ @Zalathar, you can now approve this pull request!

If @oli-obk told you to "r=me" after making some further change, please make that change, then do @bors r=@oli-obk

Zalathar · 2024-01-08T11:52:44Z

@bors r=davidtwco

bors · 2024-01-08T11:52:47Z

📌 Commit 6971e93 has been approved by davidtwco

It is now in the queue for this repository.

Zalathar · 2024-03-01T11:56:08Z

@rustbot label +relnotes

This is a user-visible bug fix for taiki-e/cargo-llvm-cov#275.

Suggested summary: “Fix coverage instrumentation/reports for non-ASCII source code”

rustbot added the S-waiting-on-review, T-compiler, and A-code-coverage labels Dec 17, 2023

Zalathar mentioned this pull request Dec 17, 2023

Allow coverage tests to ignore test modes, and to enable color in coverage reports #119034

Merged

Swatinem approved these changes Dec 17, 2023

Zalathar force-pushed the unicode branch from ae031ef to f14cf71 Compare December 28, 2023 03:20

Zalathar force-pushed the unicode branch from f14cf71 to 349a9c1 Compare January 6, 2024 01:32

Zalathar marked this pull request as ready for review January 6, 2024 01:50

rustbot assigned davidtwco Jan 6, 2024

Zalathar mentioned this pull request Jan 6, 2024

coverage: Never emit improperly-ordered coverage regions #119460

Merged

oli-obk reviewed Jan 8, 2024

compiler/rustc_mir_transform/src/coverage/mod.rs (outdated)

Zalathar added 2 commits January 8, 2024 21:43

coverage: Test for column numbers involving non-ASCII characters

585a285

coverage: Allow make_code_region to fail

88f5759

Zalathar force-pushed the unicode branch from 349a9c1 to 4060418 Compare January 8, 2024 10:49

coverage: llvm-cov expects column numbers to be bytes, not code points

6971e93

Zalathar force-pushed the unicode branch from 4060418 to 6971e93 Compare January 8, 2024 10:58

davidtwco approved these changes Jan 8, 2024

bors removed the S-waiting-on-review label Jan 8, 2024

bors added the S-waiting-on-bors label Jan 8, 2024

This was referenced Jan 8, 2024

Rollup of 8 pull requests #119737

Closed

Rollup of 8 pull requests #119745

Closed

Rollup of 11 pull requests #119747

Closed

Rollup of 10 pull requests #119754

Merged

bors merged commit 70e3f8d into rust-lang:master Jan 9, 2024

rustbot added this to the 1.77.0 milestone Jan 9, 2024

Zalathar deleted the unicode branch January 9, 2024 02:24

Zalathar mentioned this pull request Jan 9, 2024

html output break utf-8 taiki-e/cargo-llvm-cov#275

Closed

rustbot added the relnotes label Mar 1, 2024

Zalathar mentioned this pull request Mar 17, 2024

coverage: Replace color terminal tests with HTML output tests #122631

Closed
