CLI Reference

Command-line interface for fetching, processing, and managing Drupal.org data.

Quick Start

# Get all data and create BigQuery tables
make cli extract-all
make cli deploy-all

# Keep data updated
make data

Core Commands

Data Workflow

# Extract: Download data from Drupal.org
make cli extract <resource>     # Single resource
make cli extract-all            # All resources

# Deploy: Create BigQuery tables from local data
make cli deploy <resource>      # Single resource
make cli deploy-all             # All resources

# Update: Incremental updates (run regularly)
make data                     # Update all resources
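
When only a few resources are needed, the extract and deploy steps can be chained per resource instead of running the -all variants. The following is a minimal sketch, not part of the CLI itself: the run and refresh_resources helpers and the DRY_RUN variable are hypothetical names introduced for illustration, and only the make cli extract and make cli deploy commands come from this reference. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: extract, then deploy, a hand-picked subset of resources.
# Assumption: DRY_RUN, run, and refresh_resources are hypothetical helpers,
# not something the CLI provides.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Extract and then deploy each resource passed as an argument, in order.
refresh_resources() {
  for resource in "$@"; do
    run make cli extract "$resource"
    run make cli deploy "$resource"
  done
}

refresh_resources project issue release
```

Setting DRY_RUN=0 in the environment makes the script execute the commands instead of printing them. Because of set -e, a failed extract stops the script before the matching deploy runs.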

Storage Operations

# Sync: Balance local ↔ cloud storage
make cli sync <resource>
make cli sync-all

# Transform: Convert JSON → Parquet
make cli transform <resource>
make cli transform-all

Utilities

# Check version
python cli.py version

# Query BigQuery data
python cli.py select <resource>

Supported Resources

  • project - Drupal modules, themes, distributions
  • issue - Bug reports, feature requests
  • release - Software releases and versions
  • user - User profiles and activity
  • forum - Forum posts and discussions
  • organization - Companies and groups
  • changenotice - API change notifications
  • casestudy - Case studies and examples
  • event - Events and meetups
  • term - Taxonomy terms
  • vocabulary - Taxonomy vocabularies

Replace <resource> in the commands above with one of these names.
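
A wrapper script can check a name against this list before calling the CLI, so that a typo is caught early. This is a sketch under assumptions: is_supported_resource and SUPPORTED_RESOURCES are hypothetical names, and the list is copied by hand from the resources above rather than read from the CLI, so it would need updating if the supported set changes.

```shell
#!/bin/sh
# Sketch: reject unknown resource names before invoking the CLI.
# Assumption: is_supported_resource is a hypothetical helper; the names
# are copied from the Supported Resources list in this reference.
set -eu

SUPPORTED_RESOURCES="project issue release user forum organization changenotice casestudy event term vocabulary"

# Return 0 if $1 is a supported resource name, 1 otherwise.
is_supported_resource() {
  for name in $SUPPORTED_RESOURCES; do
    if [ "$name" = "$1" ]; then
      return 0
    fi
  done
  return 1
}

# Demonstrate the check on one valid and one misspelled name.
for candidate in issue issues; do
  if is_supported_resource "$candidate"; then
    echo "$candidate: ok"
  else
    echo "$candidate: not a supported resource"
  fi
done
```

The comparison is an exact string match, so plural or capitalized forms such as issues or Project are rejected.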