🌸 Perfume Scraper

A Python-based web scraping project designed to extract structured perfume product data from Liliome.com.
The scraper collects brand and product information and supports multiple database backends, so the same pipeline can store its output in either a relational or a NoSQL database.

📌 Features

✔ Robust HTTP session

- Uses `requests.Session` with retry logic
- Handles connection failures gracefully (`safe_get()`)
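
A minimal sketch of how a retrying session and a `safe_get()` helper might look. The retry count, backoff factor, status codes, timeout, and the exact `safe_get()` signature are assumptions for illustration and may differ from the project's scripts.

```python
from typing import Optional

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_session(total_retries: int = 3, backoff: float = 0.5) -> requests.Session:
    """Create a Session that retries transient failures (assumed settings)."""
    retry = Retry(
        total=total_retries,
        backoff_factor=backoff,
        status_forcelist=(429, 500, 502, 503, 504),
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


def safe_get(session, url: str, timeout: float = 10.0) -> Optional[requests.Response]:
    """Return the response, or None if the request ultimately fails."""
    try:
        response = session.get(url, timeout=timeout)
        response.raise_for_status()
        return response
    except requests.RequestException as exc:
        # Log and continue so one bad page does not stop the whole run.
        print(f"Request failed for {url}: {exc}")
        return None
```

Returning `None` instead of raising lets the caller skip a failed page and move on to the next one.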

✔ Web scraping

Extracts:
- Brand name
- English title
- Persian title
- Old price
- New price
- Product rating (Point)
- Photo URL
Automatically discovers all available brands and their product pages
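
The field extraction could be sketched with BeautifulSoup as follows. The HTML fragment, the CSS class names, and the helper names `text_of()` and `parse_products()` are hypothetical; the real Liliome markup uses its own selectors.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real product cards use different selectors.
SAMPLE_HTML = """
<div class="product">
  <h3 class="title-en">Sample Perfume</h3>
  <h4 class="title-fa">عطر نمونه</h4>
  <span class="old-price">3,200,000</span>
  <span class="new-price">2,900,000</span>
  <span class="rating">4.5</span>
  <img src="https://example.com/sample.jpg">
</div>
"""


def text_of(card, selector):
    """Return stripped text of the first match, or None if it is missing."""
    node = card.select_one(selector)
    return node.get_text(strip=True) if node else None


def parse_products(html, brand):
    """Turn each product card into a dict keyed like the Master columns."""
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for card in soup.select("div.product"):
        img = card.select_one("img")
        products.append({
            "Brand": brand,
            "EnglishName": text_of(card, ".title-en"),
            "Name": text_of(card, ".title-fa"),
            "OldPrice": text_of(card, ".old-price"),
            "NewPrice": text_of(card, ".new-price"),
            "Point": text_of(card, ".rating"),
            "Photo": img.get("src") if img else None,
        })
    return products
```

Missing elements come back as `None` rather than raising, which keeps a single incomplete card from aborting the page.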

✔ Pagination handling

Detects number of pages for each brand using total_pages()
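
One way a `total_pages()` helper might work is sketched below. It assumes pagination links carry the page number as a `page=N` query parameter, which is not confirmed for Liliome, and it uses a regular expression instead of an HTML parser to stay short. The signature is also an assumption.

```python
import re

# Assumption: pagination links look like "...?page=3" or "...&page=3".
PAGE_PARAM = re.compile(r"[?&;]page=(\d+)")


def total_pages(html: str) -> int:
    """Return the highest page number linked from the HTML, or 1 if none."""
    numbers = [int(n) for n in PAGE_PARAM.findall(html)]
    return max(numbers, default=1)
```

Falling back to 1 means brands with a single, unpaginated listing are still scraped once.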

🗄️ Supported Databases

The project has been refactored to support multiple storage backends, making it easy to switch between databases:

SQLite – lightweight local storage
SQL Server – enterprise relational database
PostgreSQL – open-source relational database
MongoDB – NoSQL document-based storage

This design enables comparison between SQL and NoSQL data models using the same scraping logic.

Two tables (collections in MongoDB) are created automatically:

`Brands`

| Column | Type | Description |
|--------|------|-------------|
| Brand_ID | INTEGER | Primary key |
| Brand_Link | TEXT | URL of brand page |
| Brand_Name | TEXT | Extracted brand name |

`Master`

| Column | Type | Description |
|--------|------|-------------|
| ID | INTEGER | Primary key |
| Brand | TEXT | Brand slug |
| EnglishName | TEXT | Product English title |
| Name | TEXT | Product Persian title |
| Point | FLOAT | Product rating |
| OldPrice | INTEGER | Old price |
| NewPrice | INTEGER | New price |
| Photo | TEXT | Image URL |
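
For the SQLite backend, the two tables above could be created roughly like this. The column names and types follow the tables; the `create_tables()` helper, the use of `IF NOT EXISTS`, and the absence of further constraints are assumptions. SQL Server and PostgreSQL would need slightly different DDL.

```python
import sqlite3

# Column names and types mirror the Brands and Master tables described above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS Brands (
    Brand_ID   INTEGER PRIMARY KEY,
    Brand_Link TEXT,
    Brand_Name TEXT
);
CREATE TABLE IF NOT EXISTS Master (
    ID          INTEGER PRIMARY KEY,
    Brand       TEXT,
    EnglishName TEXT,
    Name        TEXT,
    Point       FLOAT,
    OldPrice    INTEGER,
    NewPrice    INTEGER,
    Photo       TEXT
);
"""


def create_tables(conn: sqlite3.Connection) -> None:
    """Create both tables if they do not exist yet."""
    conn.executescript(SCHEMA)
    conn.commit()


if __name__ == "__main__":
    # An in-memory database keeps the example free of side effects.
    connection = sqlite3.connect(":memory:")
    create_tables(connection)
    connection.close()
```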

🛠 Technologies Used

Python 3
Requests
BeautifulSoup4
SQLite3
SQL Server
PostgreSQL
MongoDB
Retry & Timeout handling
Regex for price cleanup
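
The regex-based price cleanup might look like the sketch below. The `clean_price()` name and the sample price strings are assumptions; the idea is to drop separators and currency text and keep only the digits.

```python
import re
from typing import Optional


def clean_price(raw: Optional[str]) -> Optional[int]:
    """Strip everything except digits and return an int, or None if empty."""
    if not raw:
        return None
    # \D also leaves Persian digits in place, and int() accepts them.
    digits = re.sub(r"\D", "", raw)
    return int(digits) if digits else None
```

Returning `None` for text without digits lets the database store a missing price as NULL instead of a misleading zero.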

📁 Project Structure

Perfume_Scraper/
│
├── assets/
│   ├── mongodb_brands.png
│   ├── mongodb_master.png
│   ├── postgres_brands.png
│   ├── postgres_master.png
│   ├── sqlite_brands.png
│   ├── sqlite_master.png
│   ├── sqlserver_brands.png
│   └── sqlserver_master.png
│
├── db/
│   └── Perfume.db  # Automatically created database for SQLite
│
├── Scraper_MongoDB.py
├── Scraper_MongoDB_Safe.py
├── Scraper_PostgreSQL.py
├── Scraper_PostgreSQL_Safe.py
├── Scraper_SQL.py
├── Scraper_SQL_Safe.py
├── Scraper_SQLite.py
├── Scraper_SQLite_Safe.py
├── README.md
└── requirements.txt

🚀 How It Works

1️⃣ Load Liliome brand list

The script visits:

https://liliome.com/برندها-عطر-ادکلن-فروشگاه-عطر-لیلیوم

It finds all brand links and stores them in the Brands table.

2️⃣ For each brand:

Detects how many pages of products exist
Extracts products from each page
Saves structured data into the Master table
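
The per-brand loop above can be sketched independently of any particular database. The `scrape_all()` function and its callable parameters are hypothetical; passing the fetch and save steps in as functions is one way to reuse the same loop across backends, not necessarily how the scripts are organized.

```python
from typing import Callable, Iterable, List


def scrape_all(
    brands: Iterable[str],
    count_pages: Callable[[str], int],
    fetch_products: Callable[[str, int], List[dict]],
    save: Callable[[List[dict]], None],
) -> int:
    """Walk every page of every brand and hand the rows to `save`."""
    saved = 0
    for brand in brands:
        # Pages are numbered from 1 up to the detected total.
        for page in range(1, count_pages(brand) + 1):
            rows = fetch_products(brand, page)
            if rows:
                save(rows)
                saved += len(rows)
    return saved
```

Swapping `save` for a function that writes to SQLite, SQL Server, PostgreSQL, or MongoDB would leave the loop itself unchanged.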

▶️ How to Run

Clone the repository:

git clone https://github.com/SamiraSiavash/Perfume_Scraper_Multi_DB.git
cd Perfume_Scraper_Multi_DB

Install dependencies:

pip install -r requirements.txt

Run one of the scrapers with Python (for example, python Scraper_SQLite.py). The available scripts are:

Scraper_MongoDB.py
Scraper_MongoDB_Safe.py
Scraper_PostgreSQL.py
Scraper_PostgreSQL_Safe.py
Scraper_SQL.py
Scraper_SQL_Safe.py
Scraper_SQLite.py
Scraper_SQLite_Safe.py

🖼 Screenshots

SQLite

![Brands Table](assets/sqlite_brands.png)

![Master Table](assets/sqlite_master.png)

SQL Server

![Brands Table](assets/sqlserver_brands.png)

![Master Table](assets/sqlserver_master.png)

PostgreSQL

![Brands Table](assets/postgres_brands.png)

![Master Table](assets/postgres_master.png)

MongoDB

![Brands Collection](assets/mongodb_brands.png)

![Master Collection](assets/mongodb_master.png)

📝 Notes

Website layouts may change; adjust the CSS selectors to match the current site structure.
Always follow the target website’s Terms of Service.

📄 License

MIT License (optional)

✨ Author

Samira Siavash

🔗 GitHub: https://github.com/SamiraSiavash

🔗 LinkedIn: https://linkedin.com/in/samira-siavash
