Data Fetcher

HTTP client for the Drupal.org API with automatic rate limiting and session management.

Purpose

Downloads paginated data from Drupal.org API endpoints efficiently while respecting rate limits.

Key Features

  • Rate limiting: Automatic delays between requests
  • Session management: Handles cookies and authentication
  • Retry logic: Automatic retries on failures
  • Pagination: Handles multi-page datasets
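
The features above might combine roughly as follows. This is a minimal sketch, not the module's actual code: the name fetch_all_pages, the parameters min_interval and max_retries, and the exponential backoff are all assumptions made for illustration. It also assumes each page arrives as a dict with a "list" of items and a "next" entry when more pages exist. The request itself is passed in as a callable, so the loop can be exercised without network access.

```python
import time
from typing import Callable, Iterator


def fetch_all_pages(
    fetch_page: Callable[[int], dict],
    min_interval: float = 1.0,
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield items from every page, pausing between requests and retrying failures.

    All names here are hypothetical; `fetch_page(page)` stands in for whatever
    performs the real HTTP request and returns the decoded JSON payload.
    """
    page = 0
    while True:
        for attempt in range(max_retries + 1):
            try:
                payload = fetch_page(page)
                break
            except OSError:
                # Built-in ConnectionError and TimeoutError derive from OSError.
                if attempt == max_retries:
                    raise
                # Back off before retrying: 1x, 2x, 4x, ... the base interval.
                sleep(min_interval * 2 ** attempt)
        yield from payload.get("list", [])
        if not payload.get("next"):
            return
        # Fixed pause before requesting the next page (simple rate limiting).
        sleep(min_interval)
        page += 1
```

Injecting sleep keeps the delays testable; a real client would more likely hold a single HTTP session and pass a bound method as fetch_page.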

Usage

The fetcher is typically used through CLI commands rather than imported directly in Python:

# Fetch missing data for a resource
make cli extract project

# Check what needs to be fetched
python cli.py fetch project

Configuration

API endpoints and resource mappings are defined in src/config.py.
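
For orientation, a resource mapping of this kind could look like the sketch below. The names API_BASE, RESOURCES, and build_url are hypothetical and may not match what src/config.py actually contains; the endpoint and query values are illustrative assumptions as well.

```python
from urllib.parse import urlencode

# Assumed base URL and resource table; the real src/config.py may differ.
API_BASE = "https://www.drupal.org/api-d7"

RESOURCES = {
    "project": {"endpoint": "node.json", "params": {"type": "project_module"}},
    "user": {"endpoint": "user.json", "params": {}},
}


def build_url(resource: str, page: int = 0) -> str:
    """Build the request URL for one page of a named resource."""
    spec = RESOURCES[resource]
    # Merge the resource's fixed query parameters with the page number.
    query = urlencode({**spec["params"], "page": page})
    return f"{API_BASE}/{spec['endpoint']}?{query}"
```

Keeping the mapping in one table means a CLI argument such as project can be resolved to a URL without endpoint strings scattered through the fetcher.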

Rate limits and timeouts are handled automatically, in line with Drupal.org's API policies.