Data Updater
Performs incremental updates to fetch only the latest changes from Drupal.org.
Purpose
Efficiently maintains up-to-date datasets by fetching only new or modified records instead of re-downloading everything.
How It Works
- Check timestamp: Finds the latest change timestamp in BigQuery
- Fetch changes: Downloads only records newer than that timestamp
- Process data: Transforms and cleans the new records
- Upsert data: Updates BigQuery tables with new/changed records
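The four steps above can be sketched in a few lines of Python. This is a minimal illustration and not the project's actual implementation. A plain dict stands in for the BigQuery table. The `id` and `changed` field names, the `fake_fetch` helper, and the sample records are all assumptions made for the example.

```python
from typing import Callable, Dict, Iterable

Record = Dict[str, object]  # assumed shape: an "id" plus a "changed" Unix timestamp


def latest_timestamp(table: Dict[int, Record]) -> int:
    """Step 1: newest "changed" value already stored (0 if the table is empty)."""
    return max((row["changed"] for row in table.values()), default=0)


def clean(record: Record) -> Record:
    """Step 3: placeholder transform that only trims whitespace from strings."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}


def incremental_update(
    table: Dict[int, Record],
    fetch_since: Callable[[int], Iterable[Record]],
) -> int:
    """Run one update pass and return the number of rows upserted."""
    since = latest_timestamp(table)                    # 1. check timestamp
    changed = [clean(r) for r in fetch_since(since)]   # 2. fetch, 3. process
    for record in changed:                             # 4. upsert by primary key
        table[record["id"]] = record
    return len(changed)


# Hypothetical remote data standing in for the Drupal.org API.
remote = [
    {"id": 1, "title": "views ", "changed": 100},
    {"id": 2, "title": "token", "changed": 200},
    {"id": 3, "title": " pathauto", "changed": 300},
]


def fake_fetch(since: int) -> list:
    """Return only the records modified after the given timestamp."""
    return [r for r in remote if r["changed"] > since]


# The table already holds record 1, so only records 2 and 3 are fetched.
table = {1: {"id": 1, "title": "views", "changed": 100}}
updated = incremental_update(table, fake_fetch)
```

Running the update a second time against unchanged remote data fetches nothing, which is what makes frequent runs cheap.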
Common Commands
# Update single resource with latest changes
make cli update project
# Update all resources (main workflow)
make data
Supported Resources
Most resources support incremental updates, except:
- user: Requires full re-extraction
- term: Requires full re-extraction
- vocabulary: Requires full re-extraction
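A wrapper script might pick the update mode per resource along these lines. This is a sketch only: the `update_mode` function and the `FULL_REEXTRACTION` set are hypothetical names, and only the three resource names come from the list above.

```python
# Resources listed above as requiring a full re-extraction.
FULL_REEXTRACTION = frozenset({"user", "term", "vocabulary"})


def update_mode(resource: str) -> str:
    """Return "full" for resources that cannot be updated incrementally."""
    return "full" if resource in FULL_REEXTRACTION else "incremental"
```

For example, `update_mode("project")` returns `"incremental"`, while `update_mode("user")` returns `"full"`.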
Benefits
- Faster: Only downloads changed data
- Efficient: Reduces API calls and bandwidth
- Fresh data: Keeps datasets current without full rebuilds
- Automated: Can be run regularly via cron/scheduler
When to Use
- Regular maintenance: Daily/weekly automated updates
- After initial setup: Once BigQuery tables exist
- Monitoring changes: Track new issues, releases, etc.
Use make data as your primary command for keeping data current.