A Python-based web scraping project designed to extract structured perfume product data from Liliome.com.
The scraper collects brand and product information and supports multiple database backends, so the same pipeline can write its output to either relational or NoSQL databases.
- Uses `requests.Session` with retry logic
- Handles connection failures gracefully (`safe_get()`)
- Extracts:
- Brand name
- English title
- Persian title
- Old price
- New price
- Product rating (Point)
- Photo URL
- Automatically discovers all available brands and their product pages
- Detects the number of pages for each brand using `total_pages()`
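
The retry and failure handling listed above might look roughly like the following. This is a minimal sketch, not the project's actual code: only `requests.Session` and the name `safe_get()` come from this README, while `build_session()`, the retry count, the backoff factor, the status codes, and the timeout are illustrative assumptions.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_session(total_retries=3, backoff=0.5):
    """Return a Session that retries transient failures (hypothetical helper)."""
    retry = Retry(
        total=total_retries,
        backoff_factor=backoff,  # wait longer between successive attempts
        status_forcelist=(429, 500, 502, 503, 504),  # assumed retryable codes
    )
    session = requests.Session()
    adapter = HTTPAdapter(max_retries=retry)
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


def safe_get(session, url, timeout=10):
    """Return the response, or None if the request ultimately fails."""
    try:
        response = session.get(url, timeout=timeout)
        response.raise_for_status()
        return response
    except requests.RequestException as exc:
        # Covers connection errors, timeouts, exhausted retries and HTTP errors.
        print(f"Request failed for {url}: {exc}")
        return None
```

Returning `None` instead of raising lets the calling loop skip a failed page and continue with the next one.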
The project has been refactored to support multiple storage backends, making it easy to switch between databases:
- SQLite: lightweight local storage
- SQL Server: enterprise relational database
- PostgreSQL: open-source relational database
- MongoDB: NoSQL document-based storage
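
One way to keep the scraping logic independent of the database is a small storage interface that every backend implements. The sketch below is an assumption about how this could be structured, not a description of the repository's code: `ProductStore`, `SQLiteStore`, `DocumentStore`, and `run_pipeline()` are hypothetical names, the table is reduced to three columns, and the document store is an in-memory stand-in for MongoDB so that the example runs without a server.

```python
import sqlite3
from typing import Iterable, Protocol


class ProductStore(Protocol):
    """Anything that can persist one scraped product (hypothetical interface)."""

    def save_product(self, product: dict) -> None: ...


class SQLiteStore:
    """Relational backend: one row per product."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS Master "
            "(Brand TEXT, EnglishName TEXT, NewPrice INTEGER)"
        )

    def save_product(self, product: dict) -> None:
        self.conn.execute(
            "INSERT INTO Master (Brand, EnglishName, NewPrice) VALUES (?, ?, ?)",
            (product["Brand"], product["EnglishName"], product["NewPrice"]),
        )
        self.conn.commit()


class DocumentStore:
    """In-memory stand-in for a document database such as MongoDB."""

    def __init__(self):
        self.documents = []

    def save_product(self, product: dict) -> None:
        self.documents.append(dict(product))  # store a copy, no fixed schema


def run_pipeline(products: Iterable[dict], store: ProductStore) -> int:
    """Send every scraped product to whichever backend was chosen."""
    count = 0
    for product in products:
        store.save_product(product)
        count += 1
    return count


# Placeholder data, not real scraped values.
sample = [{"Brand": "example-brand", "EnglishName": "Example Perfume", "NewPrice": 100}]
sql_store, doc_store = SQLiteStore(), DocumentStore()
run_pipeline(sample, sql_store)
run_pipeline(sample, doc_store)
```

The same `run_pipeline()` call works for both stores, which is what makes a side-by-side comparison of the data models straightforward.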
This design enables comparison between SQL and NoSQL data models using the same scraping logic.

Two tables are created automatically, `Brands` and `Master`. The `Brands` table:

| Column | Type | Description |
|---|---|---|
| Brand_ID | INTEGER | Primary key |
| Brand_Link | TEXT | URL of brand page |
| Brand_Name | TEXT | Extracted brand name |

The `Master` table:

| Column | Type | Description |
|---|---|---|
| ID | INTEGER | Primary key |
| Brand | TEXT | Brand slug |
| EnglishName | TEXT | Product English title |
| Name | TEXT | Product Persian title |
| Point | FLOAT | Product rating |
| OldPrice | INTEGER | Old price |
| NewPrice | INTEGER | New price |
| Photo | TEXT | Image URL |
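
For the SQLite backend, the two tables could be created with statements along these lines. The column names and types are taken from the tables above; the `create_tables()` helper and the in-memory database are assumptions made for the sake of a runnable sketch. Note that SQLite stores a `FLOAT` column with REAL affinity.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS Brands (
    Brand_ID   INTEGER PRIMARY KEY,
    Brand_Link TEXT,
    Brand_Name TEXT
);
CREATE TABLE IF NOT EXISTS Master (
    ID          INTEGER PRIMARY KEY,
    Brand       TEXT,
    EnglishName TEXT,
    Name        TEXT,
    Point       FLOAT,
    OldPrice    INTEGER,
    NewPrice    INTEGER,
    Photo       TEXT
);
"""


def create_tables(conn):
    """Create both tables if they do not exist yet (hypothetical helper)."""
    conn.executescript(SCHEMA)
    conn.commit()


conn = sqlite3.connect(":memory:")  # the project itself uses db/Perfume.db
create_tables(conn)
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
    )
]
print(tables)  # ['Brands', 'Master']
```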
- Python 3
- Requests
- BeautifulSoup4
- SQLite3
- SQL Server
- PostgreSQL
- MongoDB
- Retry & Timeout handling
- Regex for price cleanup
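
As an illustration of the regex-based price cleanup, the sketch below strips everything except digits from a price string and converts the result to an integer. The function name `clean_price()` and the sample strings are assumptions; the actual price format on the site may differ, and the approach assumes each string contains a single price.

```python
import re


def clean_price(text):
    """Return the price as an int, or None if the text contains no digits."""
    # \D matches any non-digit, so separators and currency words are removed.
    # Persian digits count as digits here and are understood by int().
    digits = re.sub(r"\D", "", text or "")
    return int(digits) if digits else None


print(clean_price("1,250,000 تومان"))  # 1250000
print(clean_price("۹۸۰٬۰۰۰"))          # 980000
print(clean_price("ناموجود"))          # None
```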
Perfume_Scraper/
│
├── assets/
│   ├── mongodb_brands.png
│   ├── mongodb_master.png
│   ├── postgres_brands.png
│   ├── postgres_master.png
│   ├── sqlite_brands.png
│   ├── sqlite_master.png
│   ├── sqlserver_brands.png
│   └── sqlserver_master.png
│
├── db/
│   └── Perfume.db # Automatically created database for SQLite
│
├── Scraper_MongoDB.py
├── Scraper_MongoDB_Safe.py
├── Scraper_PostgreSQL.py
├── Scraper_PostgreSQL_Safe.py
├── Scraper_SQL.py
├── Scraper_SQL_Safe.py
├── Scraper_SQLite.py
├── Scraper_SQLite_Safe.py
├── README.md
└── requirements.txt
The script visits:
https://liliome.com/برندها-عطر-ادکلن-فروشگاه-عطر-لیلیوم
It finds all brand links and stores them in the Brands table.
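
Brand discovery can be sketched with BeautifulSoup as follows. The CSS selector `a.brand-link`, the helper name `extract_brand_links()`, and the sample HTML are all hypothetical; the real page markup has to be inspected to choose the right selector.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://liliome.com/"


def extract_brand_links(html, selector="a.brand-link"):
    """Return one dict per unique brand link found in the page."""
    soup = BeautifulSoup(html, "html.parser")
    seen, brands = set(), []
    for anchor in soup.select(selector):
        href = anchor.get("href")
        if not href:
            continue
        link = urljoin(BASE_URL, href)  # turn relative links into absolute ones
        if link in seen:
            continue  # skip duplicates so each brand is stored once
        seen.add(link)
        brands.append(
            {"Brand_Link": link, "Brand_Name": anchor.get_text(strip=True)}
        )
    return brands


# Made-up markup standing in for the brands page.
sample_html = """
<ul>
  <li><a class="brand-link" href="/brand/example-one">Example One</a></li>
  <li><a class="brand-link" href="/brand/example-two">Example Two</a></li>
  <li><a class="brand-link" href="/brand/example-one">Example One</a></li>
</ul>
"""
brands = extract_brand_links(sample_html)
print(len(brands))  # 2
```

Each returned dict maps directly onto the `Brand_Link` and `Brand_Name` columns of the Brands table.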
For each brand, the script:
- Detects how many pages of products exist
- Extracts products from each page
- Saves structured data into the `Master` table
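
The per-brand steps above can be sketched as two small parsing functions. Only the name `total_pages()` comes from this README; its signature here, the `parse_products()` helper, the selectors `.pagination a`, `.product`, and `.title-en`, and the sample HTML are assumptions for illustration.

```python
from bs4 import BeautifulSoup


def total_pages(html):
    """Return the highest page number in the pagination links (1 if none)."""
    soup = BeautifulSoup(html, "html.parser")
    numbers = [
        int(a.get_text(strip=True))
        for a in soup.select(".pagination a")
        if a.get_text(strip=True).isdecimal()  # ignore "Next", "Prev", etc.
    ]
    return max(numbers, default=1)


def parse_products(html, brand):
    """Return one dict per product card, keyed by Master column names."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.select(".product"):
        title = card.select_one(".title-en")
        image = card.select_one("img")
        rows.append(
            {
                "Brand": brand,
                "EnglishName": title.get_text(strip=True) if title else None,
                "Photo": image.get("src") if image else None,
            }
        )
    return rows


# Made-up markup standing in for one brand page.
page_html = """
<div class="product">
  <span class="title-en">Example Perfume</span>
  <img src="/img/example.jpg">
</div>
<nav class="pagination"><a>1</a><a>2</a><a>3</a><a>Next</a></nav>
"""
print(total_pages(page_html))  # 3
products = parse_products(page_html, "example-brand")
```

Missing elements yield `None` rather than an exception, so one malformed product card does not stop the run.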
- Clone the repository:
git clone https://github.com/SamiraSiavash/Perfume_Scraper_Multi_DB.git
cd Perfume_Scraper_Multi_DB
- Install dependencies:
pip install -r requirements.txt
- Run one of the scrapers:
Scraper_MongoDB.py
Scraper_MongoDB_Safe.py
Scraper_PostgreSQL.py
Scraper_PostgreSQL_Safe.py
Scraper_SQL.py
Scraper_SQL_Safe.py
Scraper_SQLite.py
Scraper_SQLite_Safe.py
- Adjust CSS selectors depending on website structure.
- Website layouts may change; update selectors accordingly.
- Always follow the target website's Terms of Service.
MIT License (optional)
Samira Siavash
GitHub: https://github.com/SamiraSiavash
LinkedIn: https://linkedin.com/in/samira-siavash