Enabling digital transformations in industries and a society

2. Purpose and Methodology

International Comparative Study Report on Open Source Software Publication Activities by Governments

2.1 Purpose

The purpose of this survey is to grasp the overall picture of "the actual state of OSS disclosure by public institutions worldwide," specifically "which institutions in which countries are disclosing what OSS and when." To achieve this purpose, we collect information from repositories created by national government organizations on GitHub, the world's largest OSS development platform. Note that this survey excludes local governments and targets only central ministries and agencies and nationwide public sector entities.
GitHub was selected as the information collection platform because it allows systematic observation of OSS release activities by government agencies worldwide. Some government agencies do use open-source platforms other than GitHub. For example, German government agencies publish OSS on GitLab alongside GitHub. Furthermore, the Indian government has established an inner-source platform limited to domestic use, while Estonia and Singapore also use both GitHub and government-managed inner-source platforms for different purposes. Inner-source platforms offer advantages such as mitigating the risks of external disclosure by sharing solutions within a specific scope (e.g., among government agencies or domestically). However, their limited accessibility from other countries makes their actual use difficult to investigate, which places them beyond the scope of this study. Thus, while this study acknowledges the existence and significance of the other OSS platforms and inner-source platforms used by governments, it prioritized data consistency by limiting observation to GitHub.
Considering the limitations regarding the target platforms mentioned above, the countries selected for this first-year survey were determined through desk research based on the following criteria: "Countries where the majority of activity can be observed on GitHub" and "Countries where advanced examples related to the context of digital government and OSS utilization can be confirmed." As a result, it was decided to conduct the survey in seven countries: Japan, Estonia, France, the United States, Germany, Singapore, and the United Kingdom.

2.2 Research Methodology

2.2.1 Setting Data Collection Items

This survey established the following data collection items for the OSS activities of government agencies on GitHub, based on four perspectives: "activity level," "maturity level," "organizational penetration," and "project duration."

A) Activity

OSS activity level can be primarily evaluated by metrics such as "number of repositories," "number of stars," "number of forks," and "number of branches." Activity level relates to how actively an organization publishes OSS and the degree of external interest and reuse it receives. This survey references the following values as observable metrics correlated with activity level.

Number of Repositories (Repositories)

The number of publicly available OSS projects. A higher number indicates that the organization is actively publishing OSS.

Number of Stars (Star)

An indicator showing interest in and popularity of an OSS project. It reflects external attention and usage history.

Number of Forks (Fork)

The number of times others have created derivative developments or reused the project. Indicates the ripple effect of the OSS and its utilization within external developer communities.

Number of Branches

Indicates development diversity and the status of parallel work. Shows an active development structure and simultaneous progress on multiple features.

B) Maturity

Maturity relates to whether the OSS development and operation processes function continuously and the extent to which external improvement proposals and participation are accepted. This survey refers to the following values as observable metrics related to maturity. Higher values for these indicators suggest the organization's OSS operations may be institutionally and culturally mature, enabling sustained improvement and diverse collaboration. However, factors like the quality, stability, and scope of the software itself also influence these metrics, making it important to interpret multiple indicators holistically.

Issue Count

The volume of interactions between users and developers, such as bug reports and feature requests. This indicates the level of engagement and improvement activities with the external developer community.

Pull Request Count (PR)

The number of code contribution proposals from external and internal sources. This indicates the system's readiness to accept external contributions and the maturity of collaboration.

Number of Contributors

The number of individuals who have contributed code or documentation. A higher number indicates a more open development culture and diverse participation.

C) Organizational Penetration

Organizational penetration reflects how widely the significance and culture of publishing OSS have spread across the government organizations of each country. This perspective suggests the presence or absence of strategic initiatives such as cross-ministerial OSS promotion and the establishment of unified national-level strategies and frameworks.

Number of Organizations

The number of government organizations participating in OSS activities. This indicates the spread of decentralized OSS promotion and cross-ministerial initiatives.

D) Project Duration

Project duration is a key perspective for evaluating the "sustainability" of OSS activities. Projects with earlier start dates are more likely to have been maintained and improved over a long period, suggesting that OSS activities are well-established within the organization. Conversely, a high number of recently published projects within an organization suggests that OSS activities are expanding.

First Commit Date

The date each OSS project first published code on GitHub. This provides insight into the project's start date and historical background, serving as important reference information for evaluating the "continuity" of OSS activities.

By collecting data based on the above items, it is possible to broadly observe not only the number of projects but also the degree of collaboration with external developer communities, the extent of reuse, and the level of institutional and cultural maturity.
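As an illustration of how the items above fit together, the following is a minimal sketch of one record per repository and a roll-up across the four perspectives. The class name, field names, and `summarize` helper are hypothetical and are not taken from the survey's script.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class RepoMetrics:
    """One row per repository (hypothetical layout, not the survey's schema)."""
    country: str
    organization: str
    repository: str
    stars: int = 0            # A) activity
    forks: int = 0            # A) activity
    branches: int = 0         # A) activity
    issues: int = 0           # B) maturity
    pull_requests: int = 0    # B) maturity
    contributors: int = 0     # B) maturity
    first_commit: Optional[date] = None  # D) project duration


def summarize(records):
    """Roll repository records up into totals for the four perspectives."""
    dates = [r.first_commit for r in records if r.first_commit is not None]
    return {
        "repositories": len(records),
        "stars": sum(r.stars for r in records),
        "forks": sum(r.forks for r in records),
        "branches": sum(r.branches for r in records),
        "issues": sum(r.issues for r in records),
        "pull_requests": sum(r.pull_requests for r in records),
        "contributors": sum(r.contributors for r in records),
        # C) organizational penetration: distinct publishing organizations
        "organizations": len({r.organization for r in records}),
        "earliest_first_commit": min(dates) if dates else None,
    }
```

Note that summing contributors across repositories counts a person once per repository, so the total is an upper bound on distinct individuals.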

2.2.2 Implementing Data Collection

Data was collected with a Python script that uses the GitHub API. The primary GitHub APIs used and the information retrieved are shown in Table 2-1.

Table 2-1 Primary GitHub APIs and Retrieved Information Used for Data Collection

GitHub API                                    Retrieved Information
GET /orgs/{org}/repos                         Retrieve list of repositories in an organization
GET /orgs/{org}/members                       Get list of organization members
GET /repos/{org}/{repo}/branches              Number of branches
GET /repos/{org}/{repo}/issues?state=all      Number of issues (excluding pull requests)
GET /repos/{org}/{repo}/pulls?state=all       Number of pull requests
GET /repos/{org}/{repo}/contributors          Number of contributors
GET /repos/{owner}/{repo}/commits?per_page=1  First commit date and time
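Two details of these endpoints affect how the counts and dates can be obtained. The issues endpoint also returns pull requests, which carry a `pull_request` key, and the commits endpoint lists the newest commit first, so the oldest commit sits on the last page. The following is a minimal sketch of one way to handle both, using only the Python standard library. It is not the survey's script, and all function names are hypothetical.

```python
import json
import re
import urllib.request

API_ROOT = "https://api.github.com"


def last_page_from_link(link_header):
    """Return the page number of the rel="last" link, or None if absent."""
    if not link_header:
        return None
    for part in link_header.split(","):
        match = re.search(r'<[^>]*[?&]page=(\d+)[^>]*>;\s*rel="last"', part)
        if match:
            return int(match.group(1))
    return None


def count_items(first_page_items, link_header):
    """With per_page=1, the last page number equals the total item count."""
    last_page = last_page_from_link(link_header)
    if last_page is not None:
        return last_page
    return len(first_page_items)  # no Link header: zero or one item


def only_issues(items):
    """Drop pull requests from an /issues response."""
    return [item for item in items if "pull_request" not in item]


def get_page(path, token=None):
    """Fetch one page; returns (parsed JSON, Link header or None).

    Requires network access. Unauthenticated calls are tightly rate limited.
    """
    request = urllib.request.Request(API_ROOT + path)
    request.add_header("Accept", "application/vnd.github+json")
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return json.load(response), response.headers.get("Link")


def first_commit_date(owner, repo, token=None):
    """Return the author date of the oldest commit on the default branch.

    An empty repository makes the API return an error, which urllib
    raises as HTTPError; callers may want to catch it.
    """
    base = f"/repos/{owner}/{repo}/commits?per_page=1"
    commits, link = get_page(base, token)
    last_page = last_page_from_link(link)
    if last_page and last_page > 1:
        commits, _ = get_page(f"{base}&page={last_page}", token)
    return commits[0]["commit"]["author"]["date"] if commits else None
```

The same `count_items` approach applies to branches and pull requests when each is requested with `per_page=1`. Issues need the `only_issues` filter over all pages, or the pull request count subtracted from the combined total.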

This data was saved in CSV format by the script and then aggregated and analyzed by country and theme. The script is included verbatim in the main text, which makes the collection easy to reproduce and to extend to other countries.
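The CSV step can be sketched as follows. This is a minimal illustration, assuming one row per repository. The column names and helper functions are hypothetical and do not reproduce the survey's actual file layout.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column layout; the survey's CSV may differ.
FIELDS = ["country", "organization", "repository", "stars", "forks"]


def write_rows(rows, handle):
    """Write repository rows (dicts keyed by FIELDS) as CSV."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def totals_by_country(handle):
    """Aggregate repository count, stars, and forks per country."""
    totals = defaultdict(lambda: {"repositories": 0, "stars": 0, "forks": 0})
    for row in csv.DictReader(handle):
        entry = totals[row["country"]]
        entry["repositories"] += 1
        entry["stars"] += int(row["stars"])
        entry["forks"] += int(row["forks"])
    return dict(totals)


if __name__ == "__main__":
    # Illustrative values only, not survey data.
    sample = [
        {"country": "JP", "organization": "org-a", "repository": "x",
         "stars": 3, "forks": 1},
        {"country": "EE", "organization": "org-b", "repository": "y",
         "stars": 7, "forks": 2},
    ]
    buffer = io.StringIO()  # for a real file: open(path, "w", newline="")
    write_rows(sample, buffer)
    buffer.seek(0)
    print(totals_by_country(buffer))
```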

The data collection period is as follows.

  • First commit date and time: September 16, 2025 - September 26, 2025
  • Initial commit date and time: September 30, 2025 - October 3, 2025

2.2.3 Data Analysis

This analysis quantitatively compared the scale and nature of OSS activities using various metrics collected from national government GitHub accounts, including repository count, star count, fork count, issue count, pull request count, contributor count, and first commit date. Based on this data, we organized each country's characteristics from the perspectives of OSS activity level, institutional support, and cultural background, performing grouping and trend analysis.
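One simple form of trend analysis is to count how many repositories were started in each year, based on the first commit timestamps. The sketch below assumes timestamps in the ISO 8601 form that the GitHub API returns (for example `2015-03-02T10:15:00Z`). The function names are hypothetical, and the survey's own analysis may have been carried out differently.

```python
from collections import Counter
from datetime import datetime


def first_commit_year(timestamp):
    """Return the year of a timestamp such as '2015-03-02T10:15:00Z'."""
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ").year


def new_repos_per_year(timestamps):
    """Count repositories by the year of their first commit, oldest first."""
    counts = Counter(first_commit_year(t) for t in timestamps)
    return dict(sorted(counts.items()))
```

A distribution weighted toward early years points to long-running activity, while one weighted toward recent years points to expansion, in line with the project duration perspective in 2.2.1.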