Convert read file content directly to strings #1180

Open: wants to merge 5 commits into base: main

@jakebailey (Member) commented Jun 13, 2025

I don't like unsafe, but in this case I believe it to be reasonable.

When a filesystem read returns bytes, the underlying filesystem must never touch those bytes again. Otherwise, the caller would experience data races or be unable to safely modify the data. In effect, the FS is guaranteed to no longer "own" that data, and the caller can do whatever it wants with it.

Our FS abstraction works with strings, but the underlying FSs all provide []byte. Given the above, we can skip the usual []byte-to-string copy and "safely" use unsafe.String to convert the bytes directly.
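As a rough illustration of the conversion described here, the sketch below reinterprets a freshly read byte slice as a string without copying. The names bytesToString and readFileString are hypothetical and are not taken from this PR; the sketch assumes Go 1.20 or later, where unsafe.String and unsafe.SliceData are available.

```go
package main

import (
	"fmt"
	"os"
	"unsafe"
)

// bytesToString (hypothetical name) reinterprets b as a string without
// copying. The resulting string shares memory with b, so the caller must
// never modify b after this call.
func bytesToString(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// readFileString (hypothetical name) reads a file and hands the bytes over
// to the returned string. The slice from os.ReadFile is freshly allocated
// and not retained by anyone else, which is the ownership assumption the
// conversion relies on.
func readFileString(path string) (string, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return "", err
	}
	return bytesToString(b), nil
}

func main() {
	b := []byte("const x = 1;")
	s := bytesToString(b)
	fmt.Println(s, len(s) == len(b))
}
```

The conversion is only sound while nothing writes to the slice afterwards: Go strings are assumed immutable, so a later write through the original slice would change the string's contents behind the runtime's back.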

This cuts read time and allocated bytes by roughly 50% on non-trivial files. It's not much in absolute terms, but for larger files like checker.ts, it means going from a roughly 1 ms read to a roughly 0.5 ms read with MapFS (and from about 0.83 ms to about 0.32 ms with the OS FS).

I'm not totally sure if we should do this (other stuff is more expensive), but this is an improvement, so...

goos: linux
goarch: amd64
pkg: github.com/microsoft/typescript-go/internal/vfs
cpu: Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz
                             │   old.txt    │               new.txt               │
                             │    sec/op    │   sec/op     vs base                │
ReadFile/MapFS_small-20         296.9n ± 0%   277.5n ± 1%   -6.52% (p=0.000 n=10)
ReadFile/OS_small-20            4.657µ ± 0%   4.723µ ± 1%   +1.41% (p=0.000 n=10)
ReadFile/MapFS_checker.ts-20   1067.8µ ± 6%   498.6µ ± 3%  -53.31% (p=0.000 n=10)
ReadFile/OS_checker.ts-20       833.4µ ± 7%   315.8µ ± 2%  -62.11% (p=0.000 n=10)
geomean                         33.30µ        21.31µ       -36.00%
                             │   old.txt    │               new.txt                │
                             │     B/op     │     B/op      vs base                │
ReadFile/MapFS_small-20          152.0 ± 0%     136.0 ± 0%  -10.53% (p=0.000 n=10)
ReadFile/OS_small-20            1016.0 ± 0%    1000.0 ± 0%   -1.57% (p=0.000 n=10)
ReadFile/MapFS_checker.ts-20   5.906Mi ± 0%   2.953Mi ± 0%  -50.00% (p=0.000 n=10)
ReadFile/OS_checker.ts-20      5.907Mi ± 0%   2.954Mi ± 0%  -50.00% (p=0.000 n=10)
geomean                        48.18Ki        33.00Ki       -31.50%
                             │  old.txt   │              new.txt               │
                             │ allocs/op  │ allocs/op   vs base                │
ReadFile/MapFS_small-20        5.000 ± 0%   4.000 ± 0%  -20.00% (p=0.000 n=10)
ReadFile/OS_small-20           11.00 ± 0%   10.00 ± 0%   -9.09% (p=0.000 n=10)
ReadFile/MapFS_checker.ts-20   5.000 ± 0%   4.000 ± 0%  -20.00% (p=0.000 n=10)
ReadFile/OS_checker.ts-20      11.00 ± 0%   10.00 ± 0%   -9.09% (p=0.000 n=10)
geomean                        7.416        6.325       -14.72%

@Copilot Copilot AI review requested due to automatic review settings June 13, 2025 16:55
@Copilot Copilot AI (Contributor) left a comment

Pull Request Overview

This PR optimizes vfs.ReadFile by avoiding unnecessary copies when converting file bytes to strings, and adds benchmarks to measure the improvement. Key changes include:

  • Using unsafe.String to convert the byte slice from the filesystem directly into a Go string.
  • Refactoring decodeBytes (and decodeUtf16) to operate on strings instead of byte slices.
  • Adding a benchmark (BenchmarkReadFile) to compare the old vs. new performance and exposing TypeScriptSubmoduleExists as a public helper.
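To make the second bullet more concrete, the following is a hypothetical sketch of what decoding on strings rather than byte slices can look like: it checks for a byte order mark on the string itself and only allocates when UTF-16 input has to be transcoded. The names decodeText and decodeUTF16 and the exact BOM handling are assumptions for illustration; they are not the repository's decodeBytes or decodeUtf16.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf16"
)

// decodeText (hypothetical) strips a UTF-8 BOM or transcodes UTF-16 input,
// working on the string directly so plain UTF-8 needs no extra copy.
func decodeText(s string) string {
	switch {
	case strings.HasPrefix(s, "\xEF\xBB\xBF"): // UTF-8 BOM
		return s[3:] // slicing a string does not copy
	case strings.HasPrefix(s, "\xFF\xFE"): // UTF-16 little-endian BOM
		return decodeUTF16(s[2:], true)
	case strings.HasPrefix(s, "\xFE\xFF"): // UTF-16 big-endian BOM
		return decodeUTF16(s[2:], false)
	}
	return s
}

// decodeUTF16 (hypothetical) assembles 16-bit code units from byte pairs and
// decodes them to UTF-8. A trailing odd byte is ignored.
func decodeUTF16(s string, littleEndian bool) string {
	units := make([]uint16, 0, len(s)/2)
	for i := 0; i+1 < len(s); i += 2 {
		lo, hi := uint16(s[i]), uint16(s[i+1])
		if !littleEndian {
			lo, hi = hi, lo
		}
		units = append(units, hi<<8|lo)
	}
	return string(utf16.Decode(units))
}

func main() {
	fmt.Printf("%q\n", decodeText("\xEF\xBB\xBFlet a = 1"))
	fmt.Printf("%q\n", decodeText("\xFF\xFEh\x00i\x00"))
}
```

Because indexing and slicing a string never copy, the common UTF-8 path returns a view of the original data, which fits with the zero-copy conversion on the read side.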

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| internal/vfs/vfs_test.go | Added BenchmarkReadFile to measure read performance gains. |
| internal/vfs/internal/internal.go | Switched ReadFile to use unsafe.String and updated decoding functions to use strings. |
| internal/repo/paths.go | Introduced public TypeScriptSubmoduleExists wrapper. |

Comments suppressed due to low confidence (1)

internal/repo/paths.go:49

  • Exported function TypeScriptSubmoduleExists lacks a doc comment; consider adding a comment to describe its behavior.
func TypeScriptSubmoduleExists() bool {

@jakebailey

This comment was marked as outdated.

@jakebailey jakebailey marked this pull request as draft June 13, 2025 17:03
@jakebailey jakebailey marked this pull request as ready for review June 13, 2025 17:09