Kiva Loan Part 1: Database Set Up & Normalization

Overview

This repository is the first phase of a comprehensive data analytics project leveraging the Kiva Loan datasets. In this phase, we focus on transitioning raw data from Excel to a structured MySQL database. The aim is to create a clean, normalized database ready for analysis in subsequent phases.

Objectives

- Clean datasets: Prepare and clean raw data in Excel.
- Database schema creation: Design the database structure in MySQL.
- Data import: Load the cleaned datasets into MySQL.
- Normalization: Normalize tables (up to 3NF) to eliminate redundancy and establish relationships.
- ERD creation: Generate an Entity-Relationship Diagram (ERD) to visualize the database structure.

Project Deliverables

SQL Scripts:
- create_schema.sql: Defines the database schema and creates skeleton tables for all datasets.
- data_import.sql: Imports cleaned data into the database.
- normalization.sql: Normalizes tables and establishes relationships.
ERD Diagram: A diagram illustrating the relationships between tables.

Datasets

Kiva Loans
- Rows: 671,205
- Columns: 20
- Contains information on loans, amounts, activities, sectors, and borrower details.
Kiva MPI Region Location
- Rows: 2,772
- Columns: 9
- Contains regional information, geographic coordinates, and Multidimensional Poverty Index (MPI) data.
Loan Theme IDs
- Rows: 779,093
- Columns: 4
- Contains metadata about loan themes and their types.
Loan Themes by Region
- Rows: 15,736
- Columns: 21
- Contains details about loan themes categorized by region.

Data Description

Tools Used

MySQL Workbench: For database creation, data import, normalization, and ERD creation and visualization.
Microsoft Excel: For cleaning and preparing datasets.

Steps to Reproduce

Step 1: Clean the Datasets

Duplicate each dataset before cleaning.
Standardize column names, data formats, and values.
Replace missing values and perform data validation.
Save the cleaned files in the datasets directory.
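
The cleaning in this project is done by hand in Excel. For readers who prefer a scripted check, the sketch below shows roughly what "standardize column names" and "replace missing values" could look like on a CSV export. The column names, the `Unknown` placeholder, and the helper names `standardize_column` and `clean_rows` are all hypothetical and are not taken from the actual Kiva files.

```python
import csv
import io
import re


def standardize_column(name):
    """Turn a raw header such as ' Loan Amount ' into 'loan_amount'."""
    name = name.strip().lower()
    name = re.sub(r"[^a-z0-9]+", "_", name)  # collapse spaces and punctuation
    return name.strip("_")


def clean_rows(reader, placeholder="Unknown"):
    """Yield rows with standardized keys and blank values replaced."""
    for row in reader:
        yield {
            standardize_column(key): ((value or "").strip() or placeholder)
            for key, value in row.items()
        }


# Tiny in-memory sample standing in for a real export (hypothetical columns).
sample = io.StringIO("Loan ID, Loan Amount ,Sector\n1,300,Food\n2,,\n")
cleaned = list(clean_rows(csv.DictReader(sample)))
print(cleaned)
```

A text placeholder suits categorical columns; for numeric columns such as amounts, leaving the value empty (so it loads as NULL) may be the better choice.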

Data Cleaning Guide

Step 2: Create the Database Schema

Run the create_schema.sql script in MySQL Workbench to create the database and skeleton tables for the datasets.

Step 3: Import Data

Execute the data_import.sql script to load data into the tables.

See this guide on importing large datasets into MySQL without losing or corrupting data: Here
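
One simple way to confirm that no rows were lost during an import is to compare the number of rows in the source file with the number of rows in the table afterwards. The sketch below illustrates the idea using Python's built-in `sqlite3` module as an in-memory stand-in for MySQL, so it runs without a server. The `loans_raw` table and its columns are hypothetical, and data_import.sql itself may load the data differently (for example with MySQL's `LOAD DATA INFILE`).

```python
import csv
import io
import sqlite3

# In-memory SQLite database used only as a stand-in for the MySQL database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE loans_raw ("
    "loan_id INTEGER PRIMARY KEY, loan_amount REAL, sector TEXT)"
)

# Hypothetical cleaned file; the real datasets have many more columns.
cleaned_csv = io.StringIO(
    "loan_id,loan_amount,sector\n1,300,Food\n2,575,Retail\n"
)
rows = [
    (int(r["loan_id"]), float(r["loan_amount"]), r["sector"])
    for r in csv.DictReader(cleaned_csv)
]

# Insert all rows in one transaction, then compare counts.
with conn:
    conn.executemany("INSERT INTO loans_raw VALUES (?, ?, ?)", rows)

loaded = conn.execute("SELECT COUNT(*) FROM loans_raw").fetchone()[0]
print(f"source rows: {len(rows)}, loaded rows: {loaded}")
```

The same `SELECT COUNT(*)` check can be run in MySQL Workbench after the import and compared against the row counts listed in the Datasets section.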

Step 4: Normalize Tables

Execute the normalization.sql script to normalize the tables and define relationships.
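
To illustrate the kind of change normalization makes, the sketch below moves a repeated text column into a lookup table and links it back with a foreign key. It is not the content of normalization.sql: the `loans_raw`, `sectors`, and `loans` tables are hypothetical, and SQLite (through Python's `sqlite3` module) stands in for MySQL so the example runs as-is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE loans_raw (
    loan_id INTEGER PRIMARY KEY,
    loan_amount REAL,
    sector TEXT
);
INSERT INTO loans_raw VALUES
    (1, 300, 'Food'), (2, 575, 'Retail'), (3, 150, 'Food');

-- Lookup table: each sector name is stored exactly once.
CREATE TABLE sectors (
    sector_id INTEGER PRIMARY KEY,
    sector_name TEXT NOT NULL UNIQUE
);
INSERT INTO sectors (sector_name)
SELECT DISTINCT sector FROM loans_raw;

-- Normalized loans table refers to the sector by key, not by name.
CREATE TABLE loans (
    loan_id INTEGER PRIMARY KEY,
    loan_amount REAL,
    sector_id INTEGER NOT NULL REFERENCES sectors(sector_id)
);
INSERT INTO loans
SELECT r.loan_id, r.loan_amount, s.sector_id
FROM loans_raw AS r
JOIN sectors AS s ON s.sector_name = r.sector;
""")

sector_count = conn.execute("SELECT COUNT(*) FROM sectors").fetchone()[0]
rejoined = conn.execute(
    "SELECT l.loan_id, s.sector_name "
    "FROM loans AS l JOIN sectors AS s USING (sector_id) "
    "ORDER BY l.loan_id"
).fetchall()
print(sector_count, rejoined)
```

Joining `loans` back to `sectors` reproduces the original rows, which is a quick way to check that the split removed redundancy without losing information.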

Step 5: Generate ERD

Use MySQL Workbench to create the ERD and export it as kiva_erd.png.

Entity-Relationship Diagram (ERD)

(To be added after normalization)

Future Enhancements

Add constraints to improve data integrity.
Optimize queries for faster data retrieval.

Author

Olamide Quzeem

License

This project is licensed under the MIT License.
