Billion-Scale Bipartite Graph Embedding: A Global-Local Induced Approach

Requirements

python 3.8
torch >= 1.8.0
numpy
tqdm

Datasets

Yelp and Amazon-Book are obtained from LightGCN. Both are already preprocessed into CSR format and included under the default dataset directory ./dataset.
The remaining datasets are obtained from GEBEp. Download and extract them into a directory of your choice, referred to below as GEBE_DATA_ROOT_DIR.
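
For readers unfamiliar with CSR (compressed sparse row), here is a minimal NumPy sketch of how a bipartite edge list maps to CSR arrays. The helper name edges_to_csr and the array names indptr and indices are illustrative assumptions; the on-disk layout written by the preprocessing scripts in this repository may differ.

```python
import numpy as np

def edges_to_csr(edges, num_u):
    """Convert (u, v) edge pairs into CSR arrays (indptr, indices).

    Hypothetical helper for illustration; not the repository's own code.
    """
    edges = np.asarray(edges, dtype=np.int64).reshape(-1, 2)
    # Stable sort by source node keeps the original neighbor order per row.
    order = np.argsort(edges[:, 0], kind="stable")
    u, v = edges[order, 0], edges[order, 1]
    # indptr[i]:indptr[i + 1] delimits the neighbors of node i in `indices`.
    counts = np.bincount(u, minlength=num_u)
    indptr = np.concatenate(([0], np.cumsum(counts)))
    return indptr, v

indptr, indices = edges_to_csr([(0, 1), (2, 0), (0, 2), (1, 1)], num_u=3)
# Neighbors of u = 0 are the slice between its two row pointers.
neighbors_of_0 = indices[indptr[0]:indptr[1]]
```

Storing only the row pointers and the neighbor ids keeps memory proportional to the number of edges, which is what makes the format practical for very large graphs.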

Preprocessing

1. Recommendation

# generate train/test splits
python -u ./code/preprocess_recommendation_convert_2_csr.py --dataset DATASET --edge_root_dir GEBE_DATA_ROOT_DIR --output_root_dir ./dataset
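
As a rough picture of what a train/test split over edges involves, the sketch below holds out a uniformly random fraction of edges. This is an assumption for illustration only: the README does not state which strategy or ratio the preprocessing script uses, and split_edges, test_ratio, and seed are hypothetical names.

```python
import numpy as np

def split_edges(edges, test_ratio=0.2, seed=0):
    """Randomly hold out a fraction of edges as a test set.

    Hypothetical helper; the repository's script may split differently
    (for example, per user rather than globally).
    """
    edges = np.asarray(edges)
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(edges))
    n_test = int(len(edges) * test_ratio)
    # The first n_test shuffled positions become test edges, the rest train.
    test_idx, train_idx = perm[:n_test], perm[n_test:]
    return edges[train_idx], edges[test_idx]

all_edges = np.arange(20).reshape(10, 2)
train_edges, test_edges = split_edges(all_edges, test_ratio=0.2, seed=0)
```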

2. Link prediction

# generate train/test splits
python -u ./code/preprocess_lp_convert_2_csr.py --dataset DATASET --edge_root_dir GEBE_DATA_ROOT_DIR --output_root_dir ./dataset

# generate link prediction samples
python -u ./code/generate_lp_samples.py --dataset DATASET --edge_root_dir ./dataset --output_root_dir ./dataset
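
Link prediction samples commonly pair observed edges (positives) with sampled non-edges (negatives). The sketch below shows that idea with simple rejection sampling. It is an assumption about the general technique, not a description of generate_lp_samples.py, and the function and parameter names are hypothetical.

```python
import numpy as np

def sample_negative_edges(pos_edges, num_u, num_v, num_samples, seed=0):
    """Sample distinct (u, v) pairs that are not observed edges.

    Hypothetical helper for illustration; not the repository's own code.
    """
    rng = np.random.default_rng(seed)
    existing = {(int(u), int(v)) for u, v in pos_edges}
    if num_samples > num_u * num_v - len(existing):
        raise ValueError("not enough non-edges to sample from")
    negatives = set()
    while len(negatives) < num_samples:
        u = int(rng.integers(num_u))
        v = int(rng.integers(num_v))
        # Reject pairs that are real edges; the set drops repeated draws.
        if (u, v) not in existing:
            negatives.add((u, v))
    return sorted(negatives)

negatives = sample_negative_edges([(0, 0), (1, 1)], num_u=2, num_v=2, num_samples=2)
```

Rejection sampling is cheap when the graph is sparse, since almost every random pair is a non-edge.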

Usage

Parameter settings for all datasets are provided in run.sh. A few example runs follow.

1. Recommendation

Yelp (small dataset)

python -u train.py --dataset Yelp --gpu 0 --reg 0.005 --data_root_dir ./dataset

MAG (large dataset)

python -u train_large_graph.py --dataset MAG --gpu 0 --reg 0.002 --test_batch_interval 200 --test_batch 100 --data_root_dir ./dataset

2. Link prediction

Pinterest (small dataset)

# train AnchorGNN
python -u train.py --dataset Pinterest --gpu 0 --reg 0.005 --save_model --data_root_dir ./dataset --model_root_dir ./models

# train LP classifier
python -u train_lp.py --model_file ./models/Pinterest/model.epoch.15.pt --batch_testing --test_batch 100000

Orkut (large dataset)

# train AnchorGNN
python -u train_large_graph.py --dataset Orkut --reg 0.002 --test_batch_interval 1000 --test_batch 40 --save_model --data_root_dir ./dataset --model_root_dir ./models

# train LP classifier
python -u train_lp.py --model_file ./models/Orkut/model.epoch.0.batch.3000.pt --batch_testing --test_batch 100000
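
The two --model_file values above suggest two checkpoint naming patterns: model.epoch.E.pt and model.epoch.E.batch.B.pt. The sketch below picks the most recent checkpoint from a list of names, assuming every checkpoint follows one of these two patterns (inferred from the examples, not confirmed elsewhere in this README). The helper names are hypothetical, and placing epoch-only checkpoints before batch-indexed ones of the same epoch is an arbitrary choice.

```python
import re
from pathlib import Path

# Matches model.epoch.15.pt and model.epoch.0.batch.3000.pt
_CKPT = re.compile(r"^model\.epoch\.(\d+)(?:\.batch\.(\d+))?\.pt$")

def checkpoint_key(path):
    """Return (epoch, batch) for a checkpoint path, or None if it does not match."""
    m = _CKPT.match(Path(path).name)
    if m is None:
        return None
    epoch = int(m.group(1))
    # Assumption: a missing batch index sorts before any batch of that epoch.
    batch = int(m.group(2)) if m.group(2) is not None else -1
    return (epoch, batch)

def latest_checkpoint(paths):
    """Return the path with the largest (epoch, batch) key, or None."""
    keyed = [(checkpoint_key(p), p) for p in paths]
    keyed = [(k, p) for k, p in keyed if k is not None]
    return max(keyed)[1] if keyed else None
```

The result can then be passed to train_lp.py through --model_file.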

FAQ

If an out-of-memory (OOM) error occurs when training the link prediction classifier, adjust test_batch and train_ratio so that the job fits in the available RAM.
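
To see why a batch setting affects memory, the sketch below scores node pairs in fixed-size chunks so that only one chunk of gathered embeddings is held at a time. It assumes test_batch controls how many samples are processed at once, which is an interpretation rather than something this README states. A plain dot product stands in for the repository's classifier, and all names are hypothetical.

```python
import numpy as np

def batched_dot_scores(emb_u, emb_v, pairs, batch_size):
    """Score (u, v) pairs with a dot product, one chunk at a time.

    Hypothetical stand-in for a real classifier; illustrates batching only.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    pairs = np.asarray(pairs, dtype=np.int64).reshape(-1, 2)
    scores = np.empty(len(pairs), dtype=emb_u.dtype)
    for start in range(0, len(pairs), batch_size):
        chunk = pairs[start:start + batch_size]
        # Peak extra memory is about 2 * batch_size * dim, not 2 * len(pairs) * dim.
        left, right = emb_u[chunk[:, 0]], emb_v[chunk[:, 1]]
        scores[start:start + len(chunk)] = np.einsum("ij,ij->i", left, right)
    return scores

emb_u = np.arange(12, dtype=np.float64).reshape(4, 3)
emb_v = np.ones((5, 3))
scores = batched_dot_scores(emb_u, emb_v, [(0, 0), (1, 4), (3, 2)], batch_size=2)
```

The scores are identical for any batch size; a smaller batch only trades throughput for a lower peak memory footprint.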

Repository: iBoom2333/AnchorGNN
