This project is a Python-based web crawler that logs into NetSuite, navigates to specific sections (like Custom Records), and extracts links for analysis. It supports 2FA, security-question handling, and headless mode for automation.
✅ Automated Login (Email & Password)
✅ Handles 2FA (Console Input in Headless Mode)
✅ Answers Security Questions Automatically
✅ Crawls NetSuite Pages & Extracts Links
✅ Supports Headless Mode for Automation
✅ Navigates Directly to Custom Records
✅ Scrapes User Role Permissions across Transactions, Reports, Lists, Setup and Custom Record sections
✅ Extracts Workflow Actions for any record type
✅ Scrapes Custom List Values into CSV
✅ Exports Scraped Data to CSV
✅ Handles Multi-Page Role Lists with Pagination
git clone https://github.com/Joewus12345/netsuite_crawler.git
cd netsuite_crawler
python -m venv venv
source venv/bin/activate # On macOS/Linux
venv\Scripts\activate # On Windows
pip install -r requirements.txt
Create a config.py file in the root directory and add your NetSuite credentials:
# config.py
NETSUITE_URL = "https://your-netsuite-url.com"
NETSUITE_EMAIL = "[email protected]"
NETSUITE_PASSWORD = "your-password"
SECURITY_ANSWER = "your-security-question-answer"
HEADLESS_MODE = False # Change to True to run without opening a browser
Note: The NetSuite administrator login URL varies between companies and accounts. Replace `https://your-netsuite-url.com` with the correct admin URL for your environment.
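A typo in `config.py` otherwise only surfaces after the browser has launched, so a small pre-flight check can fail fast. A minimal sketch, assuming the three required settings above (the `missing_settings` helper is hypothetical, not part of the project):

```python
# Hypothetical pre-flight check: verify required settings exist in config.py
# before launching the browser. Illustration only, not part of the project.
REQUIRED_SETTINGS = ("NETSUITE_URL", "NETSUITE_EMAIL", "NETSUITE_PASSWORD")

def missing_settings(cfg):
    """Return the names of required settings that are absent or empty."""
    return [name for name in REQUIRED_SETTINGS if not getattr(cfg, name, None)]
```

Calling `missing_settings(config)` right at startup lets the bot abort with a clear message instead of failing mid-login.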
Choose one or more scrapers to run with the `--scrapers` flag. Scrapers run sequentially after a single login.
Available scrapers:
- `crawler`
- `workflows`
- `user-roles`
- `list-values`
Each scraper saves its results to a CSV file in the project root:
- `list-values` → `list_values.csv`, containing custom list IDs, names, and their associated values.
- `user-roles` → `user_role_permissions.csv`, capturing each role's permissions across transactions, reports, lists, and setup categories.
- `workflows` → `workflow_actions.csv`, listing workflow names, record types, and their associated actions.
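The exports are plain UTF-8 CSV files, so downstream analysis needs only the standard library. A sketch of loading `workflow_actions.csv` (the column names used below are assumptions based on the description above; check the actual header row of your export):

```python
import csv

def load_workflow_actions(path):
    """Read each CSV row into a dict keyed by the header row.

    Column names such as "Workflow", "Record Type", and "Action" are
    assumed from the export description; verify them against your file.
    """
    with open(path, newline="", encoding="utf-8") as fh:
        return list(csv.DictReader(fh))
```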
Scrape list values and user roles:
python main.py --scrapers list-values,user-roles
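One way `main.py` might parse the comma-separated flag, sketched with `argparse` (this is an illustration, not the project's actual parser):

```python
import argparse

VALID_SCRAPERS = {"crawler", "workflows", "user-roles", "list-values"}

def parse_scrapers(argv):
    """Split --scrapers into a validated list, preserving order."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--scrapers", required=True,
                        help="Comma-separated list, e.g. list-values,user-roles")
    args = parser.parse_args(argv)
    names = [s.strip() for s in args.scrapers.split(",") if s.strip()]
    unknown = [s for s in names if s not in VALID_SCRAPERS]
    if unknown:
        parser.error("unknown scraper(s): " + ", ".join(unknown))
    return names
```

Rejecting unknown names up front avoids a silent no-op when a scraper name is misspelled.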
Scrape workflows for specific record types:
The `--records` flag expects a JSON array of record-type names. Quoting rules vary by terminal:
bash (Linux/macOS):
python main.py --scrapers workflows --records '["Admin Request","Feedback"]'
cmd.exe:
python main.py --scrapers workflows --records "[\"Admin Request\",\"Feedback\"]"
For quick runs you can pre-populate `workflow_scraper.HARDCODED` with a list of record names. This list is used whenever `--records` is not passed:
# workflow_scraper.py
HARDCODED: list[str] = ["Admin Request", "Feedback"]
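The fallback could then be as simple as the sketch below (the real `workflow_scraper` logic may differ; `resolve_records` is a hypothetical helper):

```python
import json

HARDCODED = ["Admin Request", "Feedback"]  # default record types

def resolve_records(records_arg=None):
    """Decode the --records JSON array, falling back to HARDCODED."""
    if records_arg is None:
        return list(HARDCODED)
    records = json.loads(records_arg)
    if not isinstance(records, list) or not all(isinstance(r, str) for r in records):
        raise ValueError("--records must be a JSON array of strings")
    return records
```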
Edit config.py and set:
HEADLESS_MODE = True
Use the same command-line options; the browser runs hidden and prompts you in the terminal for the 2FA code.
- Provide the 2FA code when prompted.
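In headless mode there is no visible browser, so the code is collected on the terminal. A minimal sketch of such a prompt (the function name and the 6-digit assumption are illustrative, not the project's exact implementation):

```python
def prompt_2fa_code(read=input):
    """Keep asking until the operator enters a 6-digit numeric code.

    `read` defaults to input(); it is a parameter so tests can inject answers
    instead of blocking on a real terminal.
    """
    while True:
        code = read("Enter your NetSuite 2FA code: ").strip()
        if code.isdigit() and len(code) == 6:
            return code
        print("Expected a 6-digit numeric code, please try again.")
```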
📂 netsuite_crawler
┣ 📜 config.py # Stores credentials & config
┣ 📂 chromedriver            # ChromeDriver binary used by Selenium
┣ 📜 main.py # Entry point for the bot
┣ 📜 auth_utils.py # Authentication helpers
┣ 📜 crawler.py # Core logic for logging in & crawling
┣ 📜 list_values_scraper.py # Scrapes custom list values
┣ 📜 user_roles_scraper.py # Scrapes role permissions
┣ 📜 workflow_scraper.py # Scrapes workflow actions
┣ 📂 tests # Unit tests
┣ 📜 requirements.txt # Dependencies list
┗ 📜 README.md # Project documentation (You are here!)
Run the test suite with:
python -m pytest
The tests mock browser interactions and require no live NetSuite credentials.
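A typical test of this kind stubs the Selenium driver with `unittest.mock`, so no browser or credentials are needed. A sketch of the pattern (the function under test, `extract_links`, is a hypothetical stand-in for the crawler's logic):

```python
from unittest.mock import MagicMock

def extract_links(driver):
    """Hypothetical crawler helper: collect hrefs of all <a> tags."""
    return [a.get_attribute("href") for a in driver.find_elements("tag name", "a")]

def test_extract_links_with_mock_driver():
    # Fake driver: find_elements returns one fake anchor element.
    driver = MagicMock()
    link = MagicMock()
    link.get_attribute.return_value = "https://example.com/custrecord_1"
    driver.find_elements.return_value = [link]
    assert extract_links(driver) == ["https://example.com/custrecord_1"]
```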
- Ensure Google Chrome is installed & updated.
- Download ChromeDriver from: chromedriver.chromium.org
- Check if the submit button selector is correct:
div[data-type='primary'][role='button']
- Try logging in manually to ensure credentials are correct.
- Check if NetSuite has CAPTCHA enabled (this bot does not bypass CAPTCHA).