aitrace-dev/github-starts-scraper

GitHub Stargazers Crawler

A simple Scrapy-based crawler that extracts information about users who have starred a GitHub repository.

Setup Instructions

Setting up a Virtual Environment

  1. Make sure you have Python installed (Python 3.6+ recommended)

  2. Create a virtual environment:

    # Navigate to the project directory
    cd /path/to/leads-crawler/github_stargazers
    
    # Create a virtual environment
    python -m venv venv
    
    # Activate the virtual environment
    # On macOS/Linux:
    source venv/bin/activate
    # On Windows:
    # venv\Scripts\activate

  3. Install dependencies:

    # With the virtual environment activated
    pip install -r requirements.txt

Running the Crawler

Option 1: Using the run.py script (Recommended)

# Make sure your virtual environment is activated
python run.py --repo username/repository

Options:

  • --repo or -r: GitHub repository in the format "username/repository" (default: "langfuse/langfuse")
  • --output or -o: Custom output filename without extension (optional)
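
The two options above map naturally onto Python's argparse module. The following is a minimal sketch of how such a command line could be parsed. It is not the actual contents of run.py: the function names build_parser and parse_args are hypothetical, and only the flag names and the default repository come from this README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the options documented in this README."""
    parser = argparse.ArgumentParser(
        description="Crawl the stargazers of a GitHub repository."
    )
    # Repository slug such as "openai/openai-python"; default taken from the README.
    parser.add_argument(
        "--repo", "-r",
        default="langfuse/langfuse",
        help='GitHub repository in the format "username/repository"',
    )
    # Optional output filename without extension; None means "use the default name".
    parser.add_argument(
        "--output", "-o",
        default=None,
        help="Custom output filename without extension",
    )
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    """Parse argv (or sys.argv[1:] when argv is None)."""
    return build_parser().parse_args(argv)
```

Calling parse_args([]) yields the default repository with no custom output name, while parse_args(["-r", "openai/openai-python", "-o", "openai_stars"]) corresponds to the third example below.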

Examples:

# Default repository (langfuse/langfuse)
python run.py

# Custom repository
python run.py --repo openai/openai-python

# Custom repository with custom output filename
python run.py --repo openai/openai-python --output openai_stars

Option 2: Using Scrapy directly

# Make sure your virtual environment is activated
scrapy crawl stargazers -a repo_url="https://github.com/username/repository"

Example:

scrapy crawl stargazers -a repo_url="https://github.com/openai/openai-python"
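
Option 1 takes a "username/repository" slug, whereas Option 2 expects a full URL in the repo_url spider argument. As a rough illustration of how the two relate, the sketch below converts a slug into the URL form and assembles the equivalent Scrapy command. The helper names repo_to_url and scrapy_command are hypothetical and the validation rule is an assumption; run.py may well do this differently.

```python
from typing import List


def repo_to_url(repo: str) -> str:
    """Turn "username/repository" into "https://github.com/username/repository"."""
    parts = repo.strip().split("/")
    # Assumed validation: exactly two non-empty segments.
    if len(parts) != 2 or not all(parts):
        raise ValueError(
            'expected "username/repository", got {!r}'.format(repo)
        )
    owner, name = parts
    return "https://github.com/{}/{}".format(owner, name)


def scrapy_command(repo: str) -> List[str]:
    """Build the argument list for the Scrapy invocation shown in this README."""
    return [
        "scrapy", "crawl", "stargazers",
        "-a", "repo_url={}".format(repo_to_url(repo)),
    ]
```

The resulting list can be passed to subprocess.run from the project directory with the virtual environment activated.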

Output

Results are saved as CSV files in the results directory. The filename depends on whether --output was given:

  • Custom filename: custom_filename.csv
  • Default filename: stargazers_owner_repo_YYYYMMDD_HHMMSS.csv
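
The naming scheme can be sketched as follows. This is an illustration only: the function name output_path is hypothetical, and it assumes that owner and repo are inserted into the default name unchanged and that the timestamp uses local time, neither of which this README states.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def output_path(
    repo: str,
    output: Optional[str] = None,
    now: Optional[datetime] = None,
    results_dir: str = "results",
) -> Path:
    """Return the CSV path for a crawl of repo ("owner/repo")."""
    if output:
        # Custom filename: the user supplies the name without an extension.
        filename = "{}.csv".format(output)
    else:
        # Default filename: stargazers_owner_repo_YYYYMMDD_HHMMSS.csv
        owner, name = repo.split("/", 1)
        stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
        filename = "stargazers_{}_{}_{}.csv".format(owner, name, stamp)
    return Path(results_dir) / filename
```

Passing an explicit now value makes the default name reproducible, which is convenient for testing.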
