A Python-based data analysis project using Jupyter Notebook to explore the NCRB Crime in India 2020 dataset. This project covers data cleaning, exploratory data analysis (EDA), and data visualization using Pandas, Matplotlib, and Seaborn.
This project analyzes crime trends across India using official NCRB data from 2020. The goals are to:
- Clean and preprocess real-world data
- Explore state-wise and crime-wise statistics
- Visualize key insights using bar charts, heatmaps, and more
- Understand regional trends and anomalies
The Jupyter Notebook walks through the entire analysis pipeline, from data import to insight.
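The goals listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`State/UT`, `Crime Head`, `Cases`), the `clean` helper, and the toy rows are all assumptions made for the example and do not come from the NCRB files.

```python
import pandas as pd

# Toy placeholder rows, NOT real NCRB figures; column names are assumed.
raw = pd.DataFrame(
    {
        " State/UT ": ["State A", "State A", "State B", "State B", "Total"],
        "Crime Head": ["Theft", "Assault", "Theft", "Assault", "All"],
        "Cases": ["1,200", "300", "450", None, "1,950"],
    }
)


def clean(df):
    """Hypothetical cleaning step: tidy headers, parse counts, drop summary rows."""
    out = df.copy()
    # Normalize headers: trim whitespace, lowercase, collapse separators to "_".
    out.columns = (
        out.columns.str.strip()
        .str.lower()
        .str.replace(r"[^a-z0-9]+", "_", regex=True)
    )
    # Remove thousands separators and coerce to numbers; unparseable values become NaN.
    out["cases"] = pd.to_numeric(
        out["cases"].str.replace(",", "", regex=False), errors="coerce"
    )
    # Drop the aggregate "Total" row and rows with no usable count.
    out = out[out["state_ut"] != "Total"].dropna(subset=["cases"])
    return out.reset_index(drop=True)


tidy = clean(raw)

# State-wise totals, largest first (suitable input for a bar chart).
state_totals = tidy.groupby("state_ut")["cases"].sum().sort_values(ascending=False)

# State x crime matrix (suitable input for a heatmap).
matrix = tidy.pivot_table(
    index="state_ut", columns="crime_head", values="cases",
    aggfunc="sum", fill_value=0,
)

print(state_totals)
print(matrix)
```

With real data, `state_totals` could be passed to a Matplotlib bar chart and `matrix` to a Seaborn heatmap; the cleaning rules would need to be adapted to the actual sheet layout.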
The dataset used is published by the National Crime Records Bureau (NCRB), Ministry of Home Affairs, India, and contains detailed records of reported crimes across Indian states and union territories for the year 2020.
Key features:
- Crime statistics by age
- State/UT-wise distribution
- Gender-based breakdowns
Source: ncrb.gov.in
Before running this project, ensure you have the following installed:
- Python 3.7+
- Jupyter Notebook
- pip (Python package manager)
Required Python libraries:
- pandas
- numpy
- matplotlib
- seaborn
- openpyxl (for reading .xlsx files)
You can install them via: pip install pandas numpy matplotlib seaborn openpyxl
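To confirm that the libraries listed above are available before opening the notebook, a small check along these lines may help. It is only a sketch: the `missing_packages` helper and the `REQUIRED` list are names introduced here for illustration, and the check only tests whether each package can be located, not which version is installed.

```python
from importlib.util import find_spec

# Package names taken from the list of required libraries above.
REQUIRED = ["pandas", "numpy", "matplotlib", "seaborn", "openpyxl"]


def missing_packages(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Running this inside the same environment (or virtual environment) that Jupyter uses gives the most reliable answer.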
1. Clone the repository and change into its directory: git clone https://github.com/themrinal/Data-Analysis-Project.git && cd Data-Analysis-Project
2. (Optional) Create and activate a virtual environment: python -m venv venv && source venv/bin/activate (on Windows, activate with venv\Scripts\activate instead)
3. Install dependencies: pip install -r requirements.txt
1. Launch Jupyter Notebook: jupyter notebook
2. Open the file: Data_Analysis_Project_with_Python.ipynb
3. Run each cell step-by-step to follow the analysis.
- Python – Core programming language
- Jupyter Notebook – Interactive coding environment
- Pandas – Data manipulation and analysis
- NumPy – Numerical operations
- Matplotlib – Plotting and visualizations
- Seaborn – Statistical data visualization