International Comparative Study Report on Open Source Software Publication Activities by Governments: Analysis Data

Overview

This dataset contains the analysis data used in the report “International Comparative Study Report on Open Source Software Publication Activities by Governments: Quantitative Analysis of Government Agency Repositories on GitHub” published by the Information-technology Promotion Agency, Japan (IPA).

It compiles two snapshots, collected at different points in time, of repositories published on GitHub by the official accounts of government agencies and related organizations. The data is organized to enable quantitative comparison of open source software (OSS) publishing activity.

Dataset Structure

| File Name | Content | Data Acquisition Period | Primary Key | Relationship |
| --- | --- | --- | --- | --- |
| organization_master.csv | Master information for the target government agency GitHub organization accounts | Fixed (as of 2025 survey) | organization_id | Base table referenced by each statistics file |
| github_org_stats_firstcommit.csv | Snapshot A: Repository statistics, primarily acquired from late September to early October 2025 | 2025/9/30–2025/10/3, 2025/12/17–2025/12/18 | (organization_id, repository) | One-to-many relationship with organization master |
| github_org_stats_pr.csv | Snapshot B: Repository statistics acquired in September and December 2025 | 2025/9/16–2025/9/24, 2025/12/17–2025/12/18 | (organization_id, repository) | One-to-many relationship with organization master |
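The one-to-many relationship between the organization master and a statistics file can be sketched as follows. This is a minimal illustration, not the method used in the report. It assumes the CSV files are comma-delimited with a header row whose names match the column definitions in this document. The inline samples reuse this document's example values, and the `example-repo` row is a hypothetical placeholder rather than actual data.

```python
import csv
import io
from collections import defaultdict

# Inline samples standing in for the real files (assumed: comma-delimited, header row).
MASTER_CSV = """organization_id,organization_account,repository_count,country_code
12,govuk,53,GB
"""
STATS_CSV = """organization_id,repository,star,fork,branch,people,first_commit_date
12,govuk-frontend,1420,205,8,24,2015-09-01T09:30:00Z
12,example-repo,0,0,1,1,2020-01-01T00:00:00Z
"""  # the example-repo row is hypothetical


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def group_by_org(master_rows, stats_rows):
    """Map each organization account to the repositories listed for it."""
    accounts = {row["organization_id"]: row["organization_account"] for row in master_rows}
    grouped = defaultdict(list)
    for row in stats_rows:
        grouped[accounts[row["organization_id"]]].append(row["repository"])
    return dict(grouped)


repos_by_account = group_by_org(read_rows(MASTER_CSV), read_rows(STATS_CSV))
print(repos_by_account)
```

To run this against the real files, replace `io.StringIO(text)` with an opened file handle, for example `open("organization_master.csv", newline="", encoding="utf-8")`. The encoding is an assumption and should be checked against the distributed files.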

Data Item Definitions

1. organization_master.csv

| Column Name | Description | Type | Example |
| --- | --- | --- | --- |
| organization_id | Organization identifier (internally unique) | int | 12 |
| organization_account | GitHub organization account name | string | govuk |
| repository_count | Number of target repositories | int | 53 |
| country_code | Country code (ISO 3166-1 alpha-2 format) | string | GB |

2. github_org_stats_firstcommit.csv (Snapshot A)

| Column Name | Description | Type | Example |
| --- | --- | --- | --- |
| organization_id | Organization ID (corresponds to master) | int | 12 |
| repository | Repository name | string | govuk-frontend |
| star | Number of stars (at Snapshot A) | int | 1420 |
| fork | Number of forks (at Snapshot A) | int | 205 |
| branch | Number of branches (at Snapshot A) | int | 8 |
| people | Number of contributors (at Snapshot A) | int | 24 |
| first_commit_date | Date and time of the first commit | string (ISO 8601) | 2015-09-01T09:30:00Z |
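Because `first_commit_date` is stored as an ISO 8601 string, it needs to be parsed before any date arithmetic. The sketch below shows one way to do that with the Python standard library. The helper names and the reference date (the last day of the autumn acquisition window for Snapshot A) are assumptions chosen for illustration.

```python
from datetime import datetime, timezone


def parse_first_commit(value):
    """Parse an ISO 8601 timestamp such as 2015-09-01T09:30:00Z into an aware datetime."""
    # Older Python versions reject a trailing "Z" in fromisoformat(),
    # so it is mapped to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def age_in_days(first_commit, as_of):
    """Whole days elapsed between the first commit and a reference date."""
    return (as_of - first_commit).days


first_commit = parse_first_commit("2015-09-01T09:30:00Z")
# Assumed reference date: end of the 2025/9/30-2025/10/3 acquisition window.
as_of = datetime(2025, 10, 3, tzinfo=timezone.utc)
print(first_commit.isoformat(), age_in_days(first_commit, as_of))
```

Keeping both values timezone-aware avoids the error Python raises when a naive datetime is subtracted from an aware one.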

3. github_org_stats_pr.csv (Snapshot B)

| Column Name | Description | Type | Example |
| --- | --- | --- | --- |
| organization_id | Organization ID (corresponds to master) | int | 12 |
| repository | Repository name | string | govuk-frontend |
| star | Number of stars (at Snapshot B) | int | 1456 |
| fork | Number of forks (at Snapshot B) | int | 210 |
| branch | Number of branches (at Snapshot B) | int | 8 |
| people | Number of commit contributors (at Snapshot B) | int | 24 |
| issue | Total number of issues (at Snapshot B) | int | 158 |
| pull_request | Total number of pull requests (at Snapshot B) | int | 233 |
| contributor | Number of contributors (at Snapshot B) | int | 45 |
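Since both snapshots share the key (organization_id, repository) and the columns `star`, `fork`, `branch`, and `people`, per-repository differences can be computed by matching rows on that key. The following is a minimal sketch under the assumption that the files are comma-delimited with a header row. The inline rows reuse the example values from the column definitions and are not actual measurements. Note also that the acquisition periods of the two snapshots overlap in part, so a difference does not necessarily represent change over a fixed interval.

```python
import csv
import io

# Inline samples built from this document's example values (not actual data rows).
SNAPSHOT_A = """organization_id,repository,star,fork
12,govuk-frontend,1420,205
"""
SNAPSHOT_B = """organization_id,repository,star,fork
12,govuk-frontend,1456,210
"""


def index_by_key(text):
    """Index rows by the composite key (organization_id, repository)."""
    reader = csv.DictReader(io.StringIO(text))
    return {(row["organization_id"], row["repository"]): row for row in reader}


def metric_deltas(a_text, b_text, metrics=("star", "fork")):
    """Return B minus A for each metric, for repositories present in both snapshots."""
    a_rows = index_by_key(a_text)
    b_rows = index_by_key(b_text)
    deltas = {}
    for key in sorted(a_rows.keys() & b_rows.keys()):
        deltas[key] = {m: int(b_rows[key][m]) - int(a_rows[key][m]) for m in metrics}
    return deltas


deltas = metric_deltas(SNAPSHOT_A, SNAPSHOT_B)
print(deltas)
```

Repositories that appear in only one snapshot are skipped here; whether to report them separately is a design choice left to the analyst.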

Data Creation Method

Primary Key and Relation Structure

| File | Primary Key | Foreign Key |
| --- | --- | --- |
| organization_master.csv | organization_id | - |
| github_org_stats_firstcommit.csv | (organization_id, repository) | organization_id → organization_master |
| github_org_stats_pr.csv | (organization_id, repository) | organization_id → organization_master |
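The key structure above can be checked before analysis. The sketch below looks for duplicate composite keys in a statistics file and for `organization_id` values that have no matching row in the master. The function name `check_keys` and the sample rows, including the deliberately broken ones, are hypothetical and exist only to show both checks firing.

```python
from collections import Counter


def check_keys(master_rows, stats_rows):
    """Return (duplicate composite keys, organization_ids missing from the master)."""
    master_ids = {row["organization_id"] for row in master_rows}
    key_counts = Counter((row["organization_id"], row["repository"]) for row in stats_rows)
    duplicates = sorted(key for key, count in key_counts.items() if count > 1)
    orphans = sorted({row["organization_id"] for row in stats_rows} - master_ids)
    return duplicates, orphans


# Hypothetical rows: one duplicated key and one organization_id absent from the master.
master_rows = [{"organization_id": "12", "organization_account": "govuk"}]
stats_rows = [
    {"organization_id": "12", "repository": "govuk-frontend"},
    {"organization_id": "12", "repository": "govuk-frontend"},
    {"organization_id": "99", "repository": "example-repo"},
]

duplicates, orphans = check_keys(master_rows, stats_rows)
print(duplicates, orphans)
```

On files that respect the stated keys, both returned lists should be empty.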

Analysis Considerations

Update History & Version Information

| Item | Content |
| --- | --- |
| Data Collection Period | Snapshot A (github_org_stats_firstcommit.csv): 2025/9/30–10/3, 2025/12/17–12/18; Snapshot B (github_org_stats_pr.csv): 2025/9/16–9/24, 2025/12/17–12/18 |
| Target Countries | Japan, Estonia, Singapore, Germany, France, United States, United Kingdom |
| Data Release Date | January 28, 2026 |
| Update Schedule | To be determined |

License

This dataset is provided under the Creative Commons Attribution 4.0 International License (CC BY 4.0). It may be freely used for commercial or non-commercial purposes provided the source is clearly attributed.

Citation Example:

Information-technology Promotion Agency, Japan (IPA) “International Comparative Study Report on Open Source Software Publication Activities by Governments: Quantitative Analysis of Government Agency Repositories on GitHub” Analysis Data (2026)

Disclaimer

IPA makes no warranties whatsoever regarding the usefulness, accuracy, non-infringement of intellectual property rights, or any other aspect of the content of this dataset.

Contact

Information-technology Promotion Agency, Japan (IPA)

Digital Infrastructure Center

Email: disc-info@ipa.go.jp