This repository was archived by the owner on Jun 5, 2025. It is now read-only.

feat(vector-db): add cve_packages table #1243

Merged: 1 commit, Mar 7, 2025

Conversation

samuv commented Mar 6, 2025

This PR introduces a new table, cve_packages, that stores package-version pairs with at least one vulnerability classified as high or critical. The data is stored in the S3 bucket alongside the malicious, deprecated, and archived package data, so that it is ready when we export the dependency list. The data is generated by a GitHub Action, which is currently triggered manually.

Changes

Added a new table cve_packages with the following fields:

  • name (TEXT, NOT NULL) – package name
  • version (TEXT, NOT NULL) – package version
  • type (TEXT, NOT NULL) – package type (e.g., npm, pypi)

Created indexes on name, on (name, version), and on (name, version, type) to optimize queries.
Modified import_packages.py to process and insert package versions with high/critical vulnerabilities into cve_packages.
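
For illustration, the changes listed above could be sketched as follows. This is a minimal sketch, not the code from this PR: it assumes a SQLite database, the index names are hypothetical, and the input record shape (a dict with a `severity` key) is an assumption made for the example. Only the table name, the column names, and the indexed column combinations come from the description.

```python
import sqlite3

# Column names and NOT NULL constraints follow the PR description.
# The index names are hypothetical; the PR text does not state them.
SCHEMA = """
CREATE TABLE IF NOT EXISTS cve_packages (
    name    TEXT NOT NULL,
    version TEXT NOT NULL,
    type    TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_cve_name
    ON cve_packages (name);
CREATE INDEX IF NOT EXISTS idx_cve_name_version
    ON cve_packages (name, version);
CREATE INDEX IF NOT EXISTS idx_cve_name_version_type
    ON cve_packages (name, version, type);
"""

# Only these severities qualify a package version for the table.
FLAGGED_SEVERITIES = {"high", "critical"}


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the cve_packages table and its three indexes."""
    conn.executescript(SCHEMA)


def insert_flagged(conn: sqlite3.Connection, records: list[dict]) -> int:
    """Insert records whose severity is high or critical.

    Assumed record shape (hypothetical):
    {"name": ..., "version": ..., "type": ..., "severity": ...}
    Returns the number of rows inserted.
    """
    rows = [
        (r["name"], r["version"], r["type"])
        for r in records
        if str(r.get("severity", "")).lower() in FLAGGED_SEVERITIES
    ]
    conn.executemany(
        "INSERT INTO cve_packages (name, version, type) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)
```

Records with a lower severity, or with no severity at all, are skipped rather than stored, which keeps the table limited to the package versions the export step needs to flag.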

samuv force-pushed the add-cve-to-vector-db branch from 72c1e48 to 901cbbe on March 6, 2025 15:00
aponcedeleonch left a comment

Looks good! Should we calculate the embeddings for the packages with CVEs? Or is that left for a future PR?

samuv commented Mar 7, 2025

> Looks good! Should we calculate the embeddings for the packages with CVEs? Or is that left for a future PR?

For packages with CVEs, we realized that embeddings aren't needed, since the lookup is just a straightforward SELECT query on the package name and version.
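
As a rough illustration of the lookup described in this reply, an exact-match check might look like the sketch below. It assumes a SQLite connection and the cve_packages columns from the PR description; the function name `has_high_or_critical_cve` and the optional `pkg_type` filter are hypothetical and not taken from the repository.

```python
import sqlite3
from typing import Optional


def has_high_or_critical_cve(
    conn: sqlite3.Connection,
    name: str,
    version: str,
    pkg_type: Optional[str] = None,
) -> bool:
    """Return True if the exact name/version pair is present in cve_packages.

    This is a plain equality lookup, so no embedding or similarity
    search is involved. Pass pkg_type (e.g. "npm", "pypi") to also
    match on the package type.
    """
    query = "SELECT 1 FROM cve_packages WHERE name = ? AND version = ?"
    params = [name, version]
    if pkg_type is not None:
        query += " AND type = ?"
        params.append(pkg_type)
    query += " LIMIT 1"
    return conn.execute(query, params).fetchone() is not None
```

Because the match is exact, a query for a version that is not stored returns False even when other versions of the same package are in the table.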

aponcedeleonch merged commit 43de72a into stacklok:main on Mar 7, 2025
5 checks passed