Transitive PyPI resolution consumes a large amount of CPU #2327

@spencerschrock

Description

Scorecard upgraded to osv-scanner v2 this past week and saw a huge increase in resource consumption (at least 5-10x, but it could be another order of magnitude higher, as our replica count went from 14 -> 1 worker due to resource-related evictions).

Profiling shows 98% of our time is spent resolving PyPI transitive dependencies. I didn't have time to dig into root causes, but does the profile give you any hints as to whether the inefficiency is in depsdev, osv-scalibr, or here (e.g. if there are repeated calls to depsdev being made)?

Top cumulative pprof functions
Showing top 50 nodes out of 126
      flat  flat%   sum%        cum   cum%
         0     0%     0%    205.33s 98.53%  deps.dev/util/resolve/pypi.(*resolution).resolve
         0     0%     0%    205.33s 98.53%  deps.dev/util/resolve/pypi.(*resolver).Resolve
         0     0%     0%    205.33s 98.53%  github.com/google/osv-scalibr.Scanner.Scan
         0     0%     0%    205.33s 98.53%  github.com/google/osv-scalibr/extractor/filesystem.(*walkContext).handleFile
         0     0%     0%    205.33s 98.53%  github.com/google/osv-scalibr/extractor/filesystem.(*walkContext).runExtractor
         0     0%     0%    205.33s 98.53%  github.com/google/osv-scalibr/extractor/filesystem.Run
         0     0%     0%    205.33s 98.53%  github.com/google/osv-scalibr/extractor/filesystem.RunFS
         0     0%     0%    205.33s 98.53%  github.com/google/osv-scalibr/extractor/filesystem.runOnScanRoot
         0     0%     0%    205.33s 98.53%  github.com/google/osv-scalibr/extractor/filesystem.walkIndividualPaths
         0     0%     0%    205.33s 98.53%  github.com/google/osv-scalibr/extractor/filesystem/internal.WalkDirUnsorted
         0     0%     0%    205.33s 98.53%  github.com/google/osv-scalibr/extractor/filesystem/internal.walkDirUnsorted
         0     0%     0%    205.33s 98.53%  github.com/google/osv-scalibr/extractor/filesystem/language/python/requirementsnet.Extractor.Extract
         0     0%     0%    205.33s 98.53%  github.com/google/osv-scanner/v2/internal/scalibrextract/language/python/requirementsenhancable.(*Extractor).Extract
         0     0%     0%    205.33s 98.53%  github.com/google/osv-scanner/v2/pkg/osvscanner.DoScan
         0     0%     0%    205.33s 98.53%  github.com/google/osv-scanner/v2/pkg/osvscanner.scan
         0     0%     0%    205.33s 98.53%  github.com/ossf/scorecard/v5/checker.(*Runner).Run
         0     0%     0%    205.33s 98.53%  github.com/ossf/scorecard/v5/checks.Vulnerabilities
         0     0%     0%    205.33s 98.53%  github.com/ossf/scorecard/v5/checks/raw.Vulnerabilities
         0     0%     0%    205.33s 98.53%  github.com/ossf/scorecard/v5/clients.osvClient.ListUnfixedVulnerabilities
         0     0%     0%    205.33s 98.53%  github.com/ossf/scorecard/v5/pkg/scorecard.runEnabledChecks.func1
     0.01s 0.0048% 0.0048%    205.22s 98.48%  deps.dev/util/resolve/pypi.(*resolution).attemptToPinCriterion
     0.02s 0.0096% 0.014%    205.17s 98.45%  deps.dev/util/resolve/pypi.(*resolution).getCriteriaToUpdate
     0.06s 0.029% 0.043%    195.28s 93.71%  deps.dev/util/resolve/pypi.(*resolution).mergeIntoCriterion
     0.05s 0.024% 0.067%    195.13s 93.64%  deps.dev/util/resolve/pypi.(*provider).findMatches
     0.07s 0.034%   0.1%    194.54s 93.35%  deps.dev/util/resolve/pypi.(*provider).matchingVersions
     0.02s 0.0096%  0.11%    193.65s 92.93%  github.com/google/osv-scalibr/clients/resolution.(*OverrideClient).MatchingVersions
         0     0%  0.11%    193.58s 92.89%  github.com/google/osv-scalibr/clients/resolution.(*PyPIRegistryClient).MatchingVersions
     0.25s  0.12%  0.23%    186.88s 89.68%  github.com/google/osv-scalibr/clients/resolution.(*PyPIRegistryClient).Versions
     0.04s 0.019%  0.25%    180.27s 86.51%  github.com/google/osv-scalibr/clients/datasource.(*PyPIRegistryAPIClient).GetIndex
     0.12s 0.058%  0.31%    179.28s 86.03%  encoding/json.Unmarshal
     0.16s 0.077%  0.38%    115.78s 55.56%  encoding/json.(*decodeState).unmarshal
     6.04s  2.90%  3.28%    115.76s 55.55%  encoding/json.(*decodeState).object
     1.23s  0.59%  3.87%    115.76s 55.55%  encoding/json.(*decodeState).value
     0.39s  0.19%  4.06%    114.84s 55.11%  encoding/json.(*decodeState).array
    40.52s 19.44% 23.50%     65.81s 31.58%  encoding/json.checkValid
     1.66s   0.8% 24.30%     33.06s 15.86%  encoding/json.(*decodeState).literalStore
    22.31s 10.71% 35.01%     22.68s 10.88%  encoding/json.stateInString
    12.47s  5.98% 40.99%     19.26s  9.24%  encoding/json.(*decodeState).skip
    16.56s  7.95% 48.94%     16.89s  8.10%  encoding/json.unquoteBytes
     1.50s  0.72% 49.66%     16.53s  7.93%  runtime.mallocgc
    13.19s  6.33% 55.99%     15.88s  7.62%  encoding/json.(*decodeState).rescanLiteral
     0.11s 0.053% 56.04%     15.03s  7.21%  deps.dev/util/semver.System.Parse
     0.42s   0.2% 56.24%     14.41s  6.91%  deps.dev/util/semver.System.parse
     2.04s  0.98% 57.22%     11.51s  5.52%  runtime.mapaccess1_faststr
     2.20s  1.06% 58.28%     11.29s  5.42%  encoding/json.indirect
     0.08s 0.038% 58.31%     11.10s  5.33%  deps.dev/util/semver.(*versionParser).version
         0     0% 58.31%     11.09s  5.32%  slices.SortFunc[go.shape.[]string,go.shape.string] (inline)
     0.04s 0.019% 58.33%     11.08s  5.32%  slices.pdqsortCmpFunc[go.shape.string]
     0.27s  0.13% 58.46%     11.02s  5.29%  deps.dev/util/semver.(*versionParser).pep440Version (inline)
     0.05s 0.024% 58.49%     10.85s  5.21%  github.com/google/osv-scalibr/clients/resolution.(*PyPIRegistryClient).Versions.func1

[Image: pprof graph]
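To make the "repeated calls" hypothesis concrete: the profile shows most of the time under GetIndex going to encoding/json.Unmarshal, which would be consistent with the same package index being decoded many times during one resolution. I haven't confirmed that. The sketch below shows what memoizing decoded responses per package could look like. All names in it (indexResponse, fetchFunc, cachedIndex) are hypothetical stand-ins, not the osv-scalibr or deps.dev API, and the payload shape is simplified.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// indexResponse is a hypothetical, trimmed-down stand-in for a package index payload.
type indexResponse struct {
	Name     string   `json:"name"`
	Versions []string `json:"versions"`
}

// fetchFunc is a hypothetical hook returning the raw JSON body for a package.
type fetchFunc func(pkg string) ([]byte, error)

// cachedIndex memoizes decoded index responses by package name.
type cachedIndex struct {
	mu      sync.Mutex
	fetch   fetchFunc
	entries map[string]indexResponse
	decodes int // how many times json.Unmarshal actually ran
}

func newCachedIndex(fetch fetchFunc) *cachedIndex {
	return &cachedIndex{fetch: fetch, entries: make(map[string]indexResponse)}
}

// Get returns the decoded index for pkg, fetching and decoding only on a cache miss.
// The lock is held across the fetch for simplicity, which serializes callers.
func (c *cachedIndex) Get(pkg string) (indexResponse, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if resp, ok := c.entries[pkg]; ok {
		return resp, nil
	}
	body, err := c.fetch(pkg)
	if err != nil {
		return indexResponse{}, err
	}
	var resp indexResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return indexResponse{}, err
	}
	c.decodes++
	c.entries[pkg] = resp
	return resp, nil
}

func main() {
	// Fake fetcher standing in for a network call to the registry.
	fetch := func(pkg string) ([]byte, error) {
		return []byte(fmt.Sprintf(`{"name":%q,"versions":["1.0.0","1.1.0"]}`, pkg)), nil
	}
	idx := newCachedIndex(fetch)
	for i := 0; i < 1000; i++ {
		if _, err := idx.Get("requests"); err != nil {
			panic(err)
		}
	}
	fmt.Println("decodes:", idx.decodes) // prints "decodes: 1"
}
```

A counter like decodes could also be dropped into the real code path temporarily to check whether the same package is in fact being decoded more than once per scan.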

We've temporarily disabled transitive scanning through your experimental scan actions, so we at least have a temporary solution:

ExperimentalScannerActions: osvscanner.ExperimentalScannerActions{
	TransitiveScanningActions: osvscanner.TransitiveScanningActions{
		Disabled: true,
	},
},
