A simple Scrapy-based crawler that extracts information about users who have starred a GitHub repository.
- Make sure you have Python installed (Python 3.6+ recommended)
- Create a virtual environment:

```shell
# Navigate to the project directory
cd /path/to/leads-crawler/github_stargazers

# Create a virtual environment
python -m venv venv

# Activate the virtual environment
# On macOS/Linux:
source venv/bin/activate
# On Windows:
# venv\Scripts\activate
```
- Install dependencies:

```shell
# With the virtual environment activated
pip install -r requirements.txt
```
```shell
# Make sure your virtual environment is activated
python run.py --repo username/repository
```

Options:

- `--repo` or `-r`: GitHub repository in the format "username/repository" (default: "langfuse/langfuse")
- `--output` or `-o`: Custom output filename without extension (optional)
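The internals of `run.py` are not shown here, but the two options above map naturally onto `argparse`. The sketch below is an illustration of that interface, not the project's actual code; the function name `parse_args` is hypothetical:

```python
import argparse


def parse_args(argv=None):
    """Parse the CLI options described above (sketch; run.py's real code may differ)."""
    parser = argparse.ArgumentParser(
        description="Crawl the stargazers of a GitHub repository"
    )
    parser.add_argument(
        "--repo", "-r",
        default="langfuse/langfuse",
        help='GitHub repository in the format "username/repository"',
    )
    parser.add_argument(
        "--output", "-o",
        default=None,
        help="Custom output filename without extension (optional)",
    )
    return parser.parse_args(argv)


# With no arguments, the default repository is used:
print(parse_args([]).repo)  # langfuse/langfuse
```

Both long and short forms behave identically, so `python run.py -r openai/openai-python -o openai_stars` is equivalent to the long-form invocation shown in the examples below.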
Examples:

```shell
# Default repository (langfuse/langfuse)
python run.py

# Custom repository
python run.py --repo openai/openai-python

# Custom repository with custom output filename
python run.py --repo openai/openai-python --output openai_stars
```

You can also run the spider directly with Scrapy:

```shell
# Make sure your virtual environment is activated
scrapy crawl stargazers -a repo_url="https://github.com/username/repository"
```

Example:

```shell
scrapy crawl stargazers -a repo_url="https://github.com/openai/openai-python"
```

Results will be saved to CSV files in the `results` directory with the following format:
- Custom filename: `custom_filename.csv`
- Default filename: `stargazers_owner_repo_YYYYMMDD_HHMMSS.csv`
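The default naming convention above can be reproduced with a small helper. This is a sketch of the filename format only, not the crawler's actual code; the function name `default_csv_name` is hypothetical:

```python
from datetime import datetime


def default_csv_name(repo, now=None):
    """Build a stargazers_owner_repo_YYYYMMDD_HHMMSS.csv name from "owner/repo"."""
    owner, repo_name = repo.split("/", 1)
    now = now or datetime.now()
    return "stargazers_{}_{}_{}.csv".format(
        owner, repo_name, now.strftime("%Y%m%d_%H%M%S")
    )


print(default_csv_name("openai/openai-python", datetime(2024, 1, 2, 3, 4, 5)))
# stargazers_openai_openai-python_20240102_030405.csv
```

Passing `--output` bypasses this scheme entirely and uses the custom name with a `.csv` extension.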