Cloud edition: https://collab.dev
collab-dev is an open-source tool that generates collaboration metrics and insights from GitHub pull request data. Use it to analyze collaboration patterns, review workflow, process efficiency, and more.
- Data Collection: Fetches pull request data from any public or private GitHub repository (requires a GitHub token).
- Visualization: Generates interactive charts using Plotly.
- Command Line Interface: Run analysis with a single command.
- Portable & Minimal: Designed to work with CSV data to keep things simple.
- Extensible: Add new charts by adding them to the chart modules list.
- Python 3.12+
- Python Dependency Manager (pdm) - Installation Instructions
- A GitHub Personal Access Token with repository read permissions
- Clone the repository:
git clone https://github.com/pullflow/collab-dev.git
cd collab-dev
- Install dependencies:
pdm install
- Set up your GitHub API token as an environment variable:
export GITHUB_TOKEN=your_token_here
To download data from a GitHub repository, run:
pdm collect owner/repo_name
This generates CSV files with pull request data in the data/ directory.
You can specify the number of PRs to fetch using the -n flag:
pdm collect -n 100 owner/repo_name
For example, to collect 100 PRs from the React repository using your GitHub token:
GITHUB_TOKEN=your_token pdm run collect -n 100 facebook/react
Alternatively, you can save your GitHub token in a .env file.
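The README does not say how the .env file is read, so the following is only a minimal sketch of the lookup order one might expect: the environment variable first, then a GITHUB_TOKEN=... line in .env. The function name read_github_token and the assumption that the file uses the same GITHUB_TOKEN key are illustrative, not taken from the collab-dev source.

```python
import os
from pathlib import Path


def read_github_token(env_file=".env", environ=None):
    """Return GITHUB_TOKEN from the environment, else from a .env file, else None.

    Hypothetical helper: collab-dev may load the token differently.
    """
    environ = os.environ if environ is None else environ
    token = environ.get("GITHUB_TOKEN")
    if token:
        return token

    path = Path(env_file)
    if not path.is_file():
        return None

    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "GITHUB_TOKEN":
            # Drop optional surrounding quotes around the value.
            return value.strip().strip("'\"") or None
    return None
```

Checking the environment first means a token exported in the shell overrides whatever is saved in the file, which matches the usual convention for .env files.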
To analyze the data and view the results:
- Start the Flask application:
pdm serve
- Open your browser and navigate to:
- You'll see a list of repositories you've collected data for using the collect script.
- Click on any repository to view its detailed metrics and visualizations at /report/owner/repo.
collab-dev organizes collected data in a hierarchical file structure:
./data/
├── {owner}/
│ ├── {repo_name}/
│ │ ├── repository.csv # Repository metadata
│ │ ├── pull_requests.csv # All PR data for this repo
│ │ ├── all_events.csv # Consolidated events from all PRs
│ │ ├── pr_{number}/
│ │ │ └── events.csv # Events for specific PR
│ │ ├── pr_{number}/
│ │ │ └── events.csv
│ │ └── ...
- repository.csv: Contains metadata about the GitHub repository
- pull_requests.csv: Stores information about all pull requests collected from the repository
- all_events.csv: Consolidates timeline events from all PRs for easier analysis
- events.csv: In each PR subdirectory, stores the timeline events for that specific PR
This layout mirrors GitHub's owner/repository hierarchy: each PR keeps its own events in a separate directory, while all_events.csv provides a single file for analysis across PRs.
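As a rough illustration of the layout above, the sketch below reads a collected repository back into plain Python dictionaries using only the standard library. The function names (repo_dir, read_rows, load_repo) are hypothetical, and no column names are assumed because the README does not list them.

```python
import csv
from pathlib import Path


def repo_dir(owner, repo_name, data_root="data"):
    """Directory that holds the CSV files for one repository."""
    return Path(data_root) / owner / repo_name


def read_rows(csv_path):
    """Read a CSV file into a list of dicts keyed by the header row."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def load_repo(owner, repo_name, data_root="data"):
    """Load the repository-level files plus the per-PR events files."""
    base = repo_dir(owner, repo_name, data_root)
    # Each pr_{number}/ directory holds one events.csv for that PR.
    events_by_pr = {
        pr_path.name: read_rows(pr_path / "events.csv")
        for pr_path in sorted(base.glob("pr_*"))
        if (pr_path / "events.csv").is_file()
    }
    return {
        "repository": read_rows(base / "repository.csv"),
        "pull_requests": read_rows(base / "pull_requests.csv"),
        "all_events": read_rows(base / "all_events.csv"),
        "events_by_pr": events_by_pr,
    }
```

For example, load_repo("facebook", "react")["pull_requests"] would return one dictionary per collected PR, assuming the collect step has already been run for that repository.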
Charts are defined in the CHART_MODULES list in src/collab_dev/components/charts/chart_renderer.py. To add a custom chart:
- Create a new module in src/collab_dev/components/charts/
- Implement a render(repo_df) function in your module
- Add your module to the CHART_MODULES list in chart_renderer.py
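The steps above describe a simple plug-in contract: every chart module exposes render(repo_df), and the renderer walks a list of modules. The sketch below imitates that pattern with stand-in objects so it runs on its own. It is not the actual chart_renderer.py code: the names my_chart and render_all, the HTML-string return value, and the use of a plain list in place of the DataFrame are all assumptions made for illustration.

```python
from types import SimpleNamespace


def _render_row_count(repo_df):
    """Hypothetical chart: report how many rows the input contains.

    A real chart would build a Plotly figure from the pull request data;
    the return type expected by collab-dev is an assumption here.
    """
    return f"<div>{len(repo_df)} rows</div>"


# Stand-in for a module file such as
# src/collab_dev/components/charts/my_chart.py (hypothetical name).
my_chart = SimpleNamespace(__name__="my_chart", render=_render_row_count)

# Registering the module is what makes the renderer pick it up.
CHART_MODULES = [my_chart]


def render_all(repo_df, modules=None):
    """Call render(repo_df) on every registered module, in list order."""
    modules = CHART_MODULES if modules is None else modules
    return [module.render(repo_df) for module in modules]
```

Because charts are discovered only through the list, a module that is created but never added to CHART_MODULES will not appear in the report.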
Existing chart types include:
- Workflow (Sankey diagram)
- Contributor distribution patterns
- Bot contribution analysis
- Review coverage metrics
- Review funnel analysis
- Review turnaround time
- Request approval time analysis
- Merge time distribution
We use ruff for code formatting and linting:
# Run linter
pdm lint
# Format code
pdm format
# Fix auto-fixable issues
pdm lint-fix
We're looking for help in the following areas:
- Validate and improve data and calculations: Help ensure our metrics are accurate and meaningful.
- Improve current charts and other visualizations: Enhance the clarity and usefulness of existing visualizations.
- Add new charts that help measure collaboration: Develop new metrics and visualizations that provide insights into team collaboration patterns.
To contribute:
- Fork the repository
- Create your branch:
git checkout -b feature/my-new-feature
- Commit your changes:
git commit -m "Add some feature"
- Push to the branch:
git push origin feature/my-new-feature
- Open a Pull Request
This project is licensed under the MIT License. See the LICENSE file for details.
For issues and feature requests, please use the GitHub Issues page.