Predicting the OWASP Top 10 by looking at CVE data

Disclaimer: This project is a learning exercise, not an official OWASP forecast. It uses public data (CVEs → CWEs → OWASP categories) to explore trends. I’m not a data scientist; feedback and PRs welcome!

What it does

The repository contains a couple of Python scripts that:

Extract CWE identifiers for each CVE
Map CWEs → OWASP Top 10 categories
Produce summary tables and plots

Getting the data

Option A — Download CVE data from the NVD website

Visit https://nvd.nist.gov/vuln/data-feeds.
In the CVE JSON section (JSON 2.0 feeds), download the yearly feeds for 2022, 2023, 2024, 2025 (and/or "recent"/"modified" as needed).

Option B — Command line

Before downloading, check the exact filenames on the NVD page above, then run:

# Example shapes only — confirm filenames on the NVD page first!
# curl -O https://.../CVE-2022.json.gz
# curl -O https://.../CVE-2023.json.gz
# curl -O https://.../CVE-2024.json.gz
# curl -O https://.../CVE-2025.json.gz
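If you would rather script the download in Python, the following is a minimal sketch using only the standard library. The feed_filename pattern is an assumption inferred from the nvdcve-2.0-2025.json name used later in this README, and base_url is deliberately left as a parameter: copy it from the NVD data feeds page rather than trusting a hard-coded value. The helper names are hypothetical and not part of this repository.

```python
import gzip
import shutil
import urllib.request
from pathlib import Path


def feed_filename(year: int) -> str:
    """Assumed naming pattern for a yearly JSON 2.0 feed archive."""
    return f"nvdcve-2.0-{year}.json.gz"


def download_feed(base_url: str, year: int, dest_dir: Path) -> Path:
    """Download one yearly archive; base_url must be copied from the NVD page."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / feed_filename(year)
    url = f"{base_url.rstrip('/')}/{archive.name}"
    urllib.request.urlretrieve(url, str(archive))
    return archive


def decompress(archive: Path) -> Path:
    """Unpack name.json.gz to name.json next to the archive."""
    target = archive.with_suffix("")  # drops only the trailing ".gz"
    with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

Calling decompress(download_feed(base_url, year, Path("."))) for each of 2022 to 2025 would leave the plain .json files in the current directory, ready for the mapping step.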

Map CVE → OWASP categories

The CWE → OWASP category mapping was copied from the CWE website.

Run cwe.py once for each downloaded JSON file, replacing the input and output filenames in this line:

process_nvd_file("nvdcve-2.0-2025.json", "mapped_output_2025.json")
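For readers who want to see the shape of this step, here is a rough sketch of what a mapping pass over a JSON 2.0 feed could look like. It is not the code in cwe.py. It assumes the feed has a top-level "vulnerabilities" array whose items hold a "cve" object with "id" and "weaknesses" fields, and it uses a tiny illustrative slice of the CWE → OWASP table. The function names and the output row layout are hypothetical.

```python
import json

# Illustrative subset only; the real table is assumed to be much larger.
CWE_TO_OWASP = {
    "CWE-22": "A01: Broken Access Control",
    "CWE-327": "A02: Cryptographic Failures",
    "CWE-79": "A03: Injection",
    "CWE-89": "A03: Injection",
    "CWE-287": "A07: Identification & Authentication Failures",
    "CWE-502": "A08: Software & Data Integrity Failures",
}


def extract_cwes(cve: dict) -> list[str]:
    """Collect distinct CWE IDs from one record's "weaknesses" array."""
    found = []
    for weakness in cve.get("weaknesses", []):
        for desc in weakness.get("description", []):
            value = desc.get("value", "")
            # Skip placeholders such as "NVD-CWE-noinfo" and "NVD-CWE-Other".
            if value.startswith("CWE-") and value not in found:
                found.append(value)
    return found


def map_feed(feed: dict) -> list[dict]:
    """Return one row per CVE that has at least one mapped CWE."""
    rows = []
    for item in feed.get("vulnerabilities", []):
        cve = item.get("cve", {})
        cwes = extract_cwes(cve)
        categories = sorted({CWE_TO_OWASP[c] for c in cwes if c in CWE_TO_OWASP})
        if categories:
            rows.append({"id": cve.get("id"), "cwes": cwes, "owasp": categories})
    return rows


def process_file_sketch(in_path: str, out_path: str) -> None:
    """Hypothetical stand-in for process_nvd_file: read a feed, write mapped rows."""
    with open(in_path, encoding="utf-8") as fh:
        feed = json.load(fh)
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump(map_feed(feed), fh, indent=2)
```

CVEs with no CWE, or only CWEs outside the table, are dropped in this sketch, which is one reason the mapped counts are lower than the raw CVE counts.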

Analysis & Visualization

The cwe_total.py file aggregates counts by OWASP category and plots a graph.

Ensure all the mapped_output_<year>.json files (for example, mapped_output_2025.json) are in the same directory.

Run cwe_total.py to get a data table and a visualization of the yearly CVE occurrences of each category.
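The aggregation itself can be sketched in a few lines of standard-library Python. This is not the code in cwe_total.py. It assumes each mapped_output_<year>.json file is a JSON list of rows with an "owasp" list of category names, which is a hypothetical layout, and it prints a tab-separated table instead of plotting.

```python
import json
import re
from collections import Counter
from pathlib import Path


def count_categories(rows: list[dict]) -> Counter:
    """Count how many CVE rows mention each OWASP category."""
    counts = Counter()
    for row in rows:
        # set() so a CVE counts once per category even if it is listed twice.
        for category in set(row.get("owasp", [])):
            counts[category] += 1
    return counts


def aggregate(directory: Path) -> dict[int, Counter]:
    """Build {year: Counter} from every mapped_output_<year>.json in directory."""
    table = {}
    for path in sorted(directory.glob("mapped_output_*.json")):
        match = re.fullmatch(r"mapped_output_(\d{4})\.json", path.name)
        if match is None:
            continue  # ignore files that do not follow the naming pattern
        with open(path, encoding="utf-8") as fh:
            table[int(match.group(1))] = count_categories(json.load(fh))
    return table


def print_table(table: dict[int, Counter]) -> None:
    """Print one tab-separated row per year, one column per category."""
    categories = sorted({c for counts in table.values() for c in counts})
    print("\t".join(["Year"] + categories))
    for year in sorted(table):
        print("\t".join([str(year)] + [str(table[year][c]) for c in categories]))
```

Running print_table(aggregate(Path("."))) in the directory that holds the mapped files would print a year-by-category table in the same orientation as the one shown in this README.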

This is what the CVE data from 2022 through July 2025 looks like (chart: owasp A01-A08 chart.png):

Here’s a table with all the data:

Year	A01: Broken Access Control	A02: Cryptographic Failures	A03: Injection	A04: Insecure Design	A05: Security Misconfiguration	A06: Vulnerable & Outdated Components	A07: Identification & Authentication Failures	A08: Software & Data Integrity Failures
2022	4004	370	6496	1217	151	1	1233	334
2023	5498	411	8846	1480	178	1	1357	468
2024	7182	447	13280	1922	163	4	1430	584
2025	4314	209	7563	1056	90	2	774	418
Totals	20998	1437	36185	5675	582	8	4794	1804
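To make the table easier to compare across categories, the totals row can be turned into shares and a ranking. The sketch below copies the Totals row from the table above. The helper names are hypothetical, and because 2025 covers only part of the year, the shares describe this sample rather than a full four-year window.

```python
# Totals row copied from the table above (2022 through July 2025).
TOTALS = {
    "A01": 20998,
    "A02": 1437,
    "A03": 36185,
    "A04": 5675,
    "A05": 582,
    "A06": 8,
    "A07": 4794,
    "A08": 1804,
}


def shares(totals: dict[str, int]) -> dict[str, float]:
    """Return each category's fraction of all mapped CVE counts."""
    grand_total = sum(totals.values())
    return {name: count / grand_total for name, count in totals.items()}


def ranking(totals: dict[str, int]) -> list[str]:
    """Return category names ordered from most to fewest mapped CVEs."""
    return sorted(totals, key=totals.get, reverse=True)


for name in ranking(TOTALS):
    print(f"{name}\t{TOTALS[name]}\t{shares(TOTALS)[name]:.1%}")
```

In this data, Injection (A03) and Broken Access Control (A01) dominate, while A06 barely registers, which likely reflects how CWEs map to that category rather than how rare outdated components are in practice.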

Notes, assumptions, caveats

Coverage bias: CVEs/CWEs underrepresent some classes (e.g., business logic flaws), and this mapping covers only web application risks.
Many‑to‑many mapping: Some CVEs have multiple CWEs; some CWEs could map to multiple OWASP categories. Document your rule (e.g., pick the most directly related to app‑layer risk).
Not the OWASP process: OWASP also uses partner telemetry, incident data, and community surveys. This project looks at public CVE data only.
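One way to document a many-to-many rule is to write it down as code. The sketch below shows a fixed priority list used as a tie-break, so that each CVE is counted under exactly one category. The order in PRIORITY is purely hypothetical and is not the rule used to produce the table above; substitute whatever order you can justify.

```python
from collections import Counter

# Hypothetical priority order: earlier entries win when a CVE maps to several categories.
PRIORITY = ["A03", "A01", "A07", "A02", "A08", "A04", "A05", "A06"]


def pick_primary(categories, priority=PRIORITY):
    """Return the highest-priority category present, or None if none match."""
    for candidate in priority:
        if candidate in categories:
            return candidate
    return None


def count_primary(rows):
    """Count each CVE once, under its primary category only."""
    counts = Counter()
    for categories in rows:
        primary = pick_primary(categories)
        if primary is not None:
            counts[primary] += 1
    return counts
```

With this rule the per-category counts sum to the number of mapped CVEs, whereas counting every matching category lets a single CVE appear in several columns.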

Attribution

CVE® is a registered trademark of The MITRE Corporation.
Data courtesy of the National Vulnerability Database (NVD).

Repository: sohanmaheshwar/cve-owasp-mapping

Folders and files

Latest commit

History

Repository files navigation

Predicting the OWASP Top 10 by looking at CVE data

What it does

Getting the data

Option A — Download CVE data from the NVD website

Option B — Command line

Map CVE → OWASP categories

Analysis & Visualization

Notes, assumptions, caveats

Attribution

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages