This dataset contains the analysis data used in the report “International Comparative Study Report on Open Source Software Publication Activities by Governments: Quantitative Analysis of Government Agency Repositories on GitHub” published by the Information-technology Promotion Agency, Japan (IPA).
It compiles two sets of snapshot data collected at different points in time, focusing on repositories from official accounts of government agencies and related organizations published on GitHub. This data is organized to enable comparison of quantitative trends in open source software (OSS) publishing activities.
| File Name | Content | Data Acquisition Period | Primary Key | Relationship |
|---|---|---|---|---|
| organization_master.csv | Master information for target government agency GitHub organization accounts | Fixed (as of 2025 survey) | organization_id | Base for each statistical data point |
| github_org_stats_firstcommit.csv | Snapshot A: Repository statistics primarily acquired from late September to early October 2025 | 2025/9/30–2025/10/3, 2025/12/17–2025/12/18 | (organization_id, repository) | One-to-many relationship with organization master |
| github_org_stats_pr.csv | Snapshot B: Repository statistics at the time of acquisition in September and December 2025 | 2025/9/16–2025/9/24, 2025/12/17–2025/12/18 | (organization_id, repository) | One-to-many relationship with organization master |

Columns of organization_master.csv:

| Column Name | Description | Type | Example |
|---|---|---|---|
| organization_id | Organization identifier (internally unique) | int | 12 |
| organization_account | GitHub organization account name | string | govuk |
| repository_count | Number of target repositories | int | 53 |
| country_code | Country code (ISO 3166-1 alpha-2 format) | string | GB |
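
As a rough illustration of how the master file might be used, the sketch below sums `repository_count` per `country_code` with Python's standard `csv` module. The inline sample is a stand-in for `organization_master.csv`: only the column names and the `organization_id` 12 row follow the table above, and the other rows and account names are made up. Reading the real file (and its encoding) is left as an assumption.

```python
import csv
import io
from collections import defaultdict

# Illustrative stand-in for organization_master.csv. Column names come from
# the table above; rows 13 and 14 are made-up sample values.
SAMPLE_MASTER = """organization_id,organization_account,repository_count,country_code
12,govuk,53,GB
13,example-agency-a,7,GB
14,example-agency-b,20,JP
"""

def repos_per_country(csv_file):
    """Sum repository_count for each country_code."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row["country_code"]] += int(row["repository_count"])
    return dict(totals)

totals = repos_per_country(io.StringIO(SAMPLE_MASTER))
print(totals)
```

With the real data, `io.StringIO(SAMPLE_MASTER)` would be replaced by an open file handle, for example `open("organization_master.csv", newline="")`.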

Columns of github_org_stats_firstcommit.csv (Snapshot A):

| Column Name | Description | Type | Example |
|---|---|---|---|
| organization_id | Organization ID (corresponds to master) | int | 12 |
| repository | Repository name | string | govuk-frontend |
| star | Number of stars (at Snapshot A) | int | 1420 |
| fork | Number of forks (at Snapshot A) | int | 205 |
| branch | Number of branches (at Snapshot A) | int | 8 |
| people | Number of contributors (at Snapshot A) | int | 24 |
| first_commit_date | First commit date and time | string (ISO 8601) | 2015-09-01T09:30:00Z |
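
The `first_commit_date` values are ISO 8601 strings, so they can be parsed into timezone-aware timestamps before computing things such as repository age. The following is a minimal sketch using only the standard library. The helper names and the reference date (the first day of the Snapshot A window) are assumptions chosen for illustration.

```python
from datetime import datetime, timezone

def parse_first_commit(value):
    """Parse an ISO 8601 timestamp such as '2015-09-01T09:30:00Z'."""
    # datetime.fromisoformat() rejects a trailing 'Z' before Python 3.11,
    # so rewrite it as an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)

def age_in_days(first_commit, as_of):
    """Whole days elapsed between the first commit and a reference time."""
    return (as_of - parse_first_commit(first_commit)).days

# Hypothetical reference time: the first day of the Snapshot A window.
as_of = datetime(2025, 9, 30, tzinfo=timezone.utc)
days = age_in_days("2015-09-01T09:30:00Z", as_of)
print(days)
```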

Columns of github_org_stats_pr.csv (Snapshot B):

| Column Name | Description | Type | Example |
|---|---|---|---|
| organization_id | Organization ID (corresponds to master) | int | 12 |
| repository | Repository name | string | govuk-frontend |
| star | Number of stars (at Snapshot B) | int | 1456 |
| fork | Number of forks (at Snapshot B) | int | 210 |
| branch | Number of branches (at Snapshot B) | int | 8 |
| people | Number of commit contributors (at Snapshot B) | int | 24 |
| issue | Total number of issues (at Snapshot B) | int | 158 |
| pull_request | Total number of pull requests (at Snapshot B) | int | 233 |
| contributor | Number of contributors (at Snapshot B) | int | 45 |
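
Because both snapshot files share the key `(organization_id, repository)`, their common columns can be compared row by row. The sketch below computes Snapshot B minus Snapshot A for `star` and `fork`, using the example values from the tables above as inline sample data. The function names are illustrative. Note also that the acquisition windows of the two snapshots partly overlap, so a difference should not be read as growth over a fixed interval without checking when each row was collected.

```python
import csv
import io

# Illustrative stand-ins for the two snapshot files, reduced to the columns
# used here. The rows reuse the example values from the tables above.
SNAPSHOT_A = """organization_id,repository,star,fork
12,govuk-frontend,1420,205
"""
SNAPSHOT_B = """organization_id,repository,star,fork
12,govuk-frontend,1456,210
"""

def index_by_key(csv_file):
    """Map (organization_id, repository) to its row."""
    return {
        (int(row["organization_id"]), row["repository"]): row
        for row in csv.DictReader(csv_file)
    }

def metric_deltas(rows_a, rows_b, columns=("star", "fork")):
    """Return B minus A per column for keys present in both snapshots."""
    deltas = {}
    for key in rows_a.keys() & rows_b.keys():
        deltas[key] = {
            col: int(rows_b[key][col]) - int(rows_a[key][col]) for col in columns
        }
    return deltas

deltas = metric_deltas(
    index_by_key(io.StringIO(SNAPSHOT_A)),
    index_by_key(io.StringIO(SNAPSHOT_B)),
)
print(deltas)
```

Repositories that appear in only one snapshot are skipped in this sketch. Whether to report them separately is a design choice left to the analyst.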

- Snapshot A (github_org_stats_firstcommit.csv): September 30–October 3, 2025, and December 17–18, 2025
- Snapshot B (github_org_stats_pr.csv): September 16–24, 2025, and December 17–18, 2025

| File | Primary Key | Foreign Key |
|---|---|---|
| organization_master.csv | organization_id | - |
| github_org_stats_firstcommit.csv | (organization_id, repository) | organization_id → organization_master |
| github_org_stats_pr.csv | (organization_id, repository) | organization_id → organization_master |
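
The key relationships above can be checked before analysis. The following is a minimal sketch, not an official validation procedure: it reports duplicate `(organization_id, repository)` keys in a statistics file and rows whose `organization_id` is missing from the master. The sample rows are made up and deliberately contain one duplicate and one orphan to show what the check returns.

```python
import csv
import io

# Made-up sample data. The second file intentionally violates both
# constraints so that the check has something to report.
MASTER = """organization_id,organization_account
12,govuk
"""
STATS = """organization_id,repository
12,govuk-frontend
12,govuk-frontend
99,orphan-repo
"""

def check_keys(master_file, stats_file):
    """Return (duplicate primary keys, keys whose organization_id is not in master)."""
    master_ids = {row["organization_id"] for row in csv.DictReader(master_file)}
    seen, duplicates, orphans = set(), [], []
    for row in csv.DictReader(stats_file):
        key = (row["organization_id"], row["repository"])
        if key in seen:
            duplicates.append(key)  # composite primary key repeated
        seen.add(key)
        if row["organization_id"] not in master_ids:
            orphans.append(key)  # foreign key has no match in master
    return duplicates, orphans

duplicates, orphans = check_keys(io.StringIO(MASTER), io.StringIO(STATS))
print(duplicates, orphans)
```

Both lists being empty for each statistics file would indicate that the files are consistent with the key definitions in the table.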
| Item | Content |
|---|---|
| Data Collection Period | Snapshot A (firstcommit.csv): 2025/9/30–10/3, 2025/12/17–12/18; Snapshot B (pr.csv): 2025/9/16–9/24, 2025/12/17–12/18 |
| Target Countries | Japan, Estonia, Singapore, Germany, France, United States, United Kingdom |
| Data Release Date | January 28, 2026 |
| Update Schedule | To be determined |
This dataset is provided under the Creative Commons Attribution 4.0 International License (CC BY 4.0). It may be freely used for commercial or non-commercial purposes provided the source is clearly attributed.
Citation Example:
Information-technology Promotion Agency, Japan (IPA) “International Comparative Study Report on Open Source Software Publication Activities by Governments: Quantitative Analysis of Government Agency Repositories on GitHub” Analysis Data (2026)
IPA makes no warranties of any kind regarding this dataset, including its usefulness, its accuracy, or its non-infringement of intellectual property rights.
Information-technology Promotion Agency, Japan (IPA)
Digital Infrastructure Center
Email: disc-info@ipa.go.jp