GitHub REST API Call Returns Truncated Result-set

Large result-sets returned from calls to GitHub REST API endpoints may undergo truncation.

For example, suppose a call to the list organization repositories endpoint is made for an organization hosting 200 repos. By default, the response contains only the first 30 repositories.

A closer look at the default values of the following query parameters explains why:

  • per_page: the number of results per page (default = 30, maximum = 100)
  • page: the page number of the results to fetch (default = 1)
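
To make the arithmetic concrete, here is a minimal sketch of how many requests are needed to retrieve every result for a given page size. The helper name pages_needed is hypothetical and only illustrates the rounding-up calculation.

```shell
#!/bin/bash
# pages_needed TOTAL PER_PAGE (hypothetical helper):
# number of requests needed to fetch TOTAL items, rounded up.
pages_needed() {
  local total=$1 per_page=$2
  echo $(( (total + per_page - 1) / per_page ))
}

pages_needed 200 30    # default page size      -> 7
pages_needed 200 100   # maximum page size      -> 2
```

With the defaults, a single call therefore returns only the first of seven pages for the 200-repository organization.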

This article outlines two approaches for dealing with this behaviour. Both authenticate with a scoped API token, generated as described in the next section.

Generate Scoped API Token for API Calls

  • Log in to GitHub with a user account that has admin privileges on the target organization
  • Generate a token by navigating to
    • Settings/Developer Settings
    • Personal access tokens
      • Tokens (classic)
    • Generate new token
      • Generate new token (classic)
      • Enter a note for the token, e.g. programmatic access
      • Ensure repo scope is ticked
  • Store the token in a secure location, e.g. a device whose drive is encrypted at rest
  • The examples in this article use the location $HOME/tokens/github-api.txt
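
As one possible way to create the token file, the sketch below saves a token read from standard input so that only the file owner can read it. The function name store_token is hypothetical, and restrictive file permissions are an extra precaution assumed here, not a substitute for the encrypted drive mentioned above.

```shell
#!/bin/bash
# store_token FILE (hypothetical helper): read a token from standard input
# and save it to FILE, readable and writable by the owner only.
store_token() {
  local file=$1
  mkdir -p "$(dirname "$file")"
  # umask 077 inside a subshell so the file is created with mode 600
  # without changing the umask of the calling shell
  ( umask 077 && cat > "$file" )
}

# Example: paste the token, press Enter, then Ctrl-D
# store_token "$HOME/tokens/github-api.txt"
```

Reading the token from standard input keeps it out of the shell history and the process list.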

Approach 1: Bash Script and Curl

The sample script $HOME/list-git-repos.sh below lists private repositories belonging to a given organization. It reads the owned_private_repos count from the organization record to work out how many pages to request.

#!/bin/bash
# List the private repositories of a GitHub organization, one per line.

target_org=$1
results_file="$HOME/repo-list.txt"
api_token_file="$HOME/tokens/github-api.txt"
rm -f "${results_file}"

# Total number of private repositories, read from the organization record
private_repos_cnt=$(curl -s \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: Bearer $(cat "${api_token_file}")" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  -L "https://api.github.com/orgs/${target_org}" | jq .owned_private_repos)

# Request the maximum page size and round the page count up
limit_per_page=100
pages_required=$(( (private_repos_cnt + limit_per_page - 1) / limit_per_page ))

# type=private restricts the listing to the repositories counted above
for ((page = 1; page <= pages_required; page++)); do
  curl -s -H "Accept: application/vnd.github+json" \
    -H "Authorization: Bearer $(cat "${api_token_file}")" \
    -H "X-GitHub-Api-Version: 2022-11-28" \
    -L "https://api.github.com/orgs/${target_org}/repos?type=private&page=${page}&per_page=${limit_per_page}" |
    jq -r '.[] | "Repo_id: \(.id), Repo_Name: \(.full_name)"' >>"${results_file}"
done

It accepts the GitHub organization name as an input parameter.
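
The script works out the page count up front. An alternative is to follow the Link response header that GitHub returns with paginated results, which avoids needing a total count. The sketch below is an assumption-laden illustration, not part of the script above. The function names next_link and fetch_all_pages are hypothetical, and the parsing assumes entries of the form <url>; rel="next" with no commas inside the URL.

```shell
#!/bin/bash
# next_link HEADER_VALUE (hypothetical helper): print the URL tagged rel="next"
# in a Link header value, or nothing if there is no next page.
next_link() {
  # Put each comma-separated entry on its own line, then print only the URL
  # of the entry marked rel="next".
  printf '%s\n' "$1" | tr ',' '\n' |
    sed -n -E 's/^[^<]*<([^>]*)>; *rel="next".*$/\1/p'
}

# fetch_all_pages URL TOKEN_FILE (hypothetical helper): print the JSON body of
# every page, following rel="next" links until none remains.
fetch_all_pages() {
  local url=$1 token_file=$2 headers
  headers=$(mktemp)
  while [ -n "$url" ]; do
    # -D writes the response headers to a file so the Link header can be read
    curl -s -D "$headers" \
      -H "Accept: application/vnd.github+json" \
      -H "Authorization: Bearer $(cat "$token_file")" \
      "$url"
    url=$(next_link "$(sed -n -E 's/^[Ll]ink: *//p' "$headers" | tr -d '\r')")
  done
  rm -f "$headers"
}

# Parsing a sample header value (the URLs are made up for illustration):
sample='<https://api.github.com/organizations/1/repos?per_page=100&page=2>; rel="next", <https://api.github.com/organizations/1/repos?per_page=100&page=3>; rel="last"'
next_link "$sample"
# prints https://api.github.com/organizations/1/repos?per_page=100&page=2
```

A call such as fetch_all_pages "https://api.github.com/orgs/foo-org/repos?per_page=100" "$HOME/tokens/github-api.txt" would then emit one JSON array per page, which can be piped to jq as in the script.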

Usage Instructions

  • Copy the script to target path: $HOME/list-git-repos.sh
  • Make the script executable
    • chmod +x $HOME/list-git-repos.sh
  • Store the GitHub API token in file: $HOME/tokens/github-api.txt

Run the script, passing the GitHub organization name as a parameter.

For example, for an organization name foo-org:

$ $HOME/list-git-repos.sh foo-org

The results are written to the file $HOME/repo-list.txt.

The following head command displays the first twenty records from the output file:

$ head -n 20 $HOME/repo-list.txt

Approach 2: GitHub CLI

Prerequisite: Ensure the GitHub CLI is installed.

  • Use gh api with the --paginate option, which the CLI help describes as follows:
    --paginate: Make additional HTTP requests to fetch all pages of results
  • Authenticate using the API token:
$ gh auth login --with-token < $HOME/tokens/github-api.txt
  • The following command returns the names of all repositories within organization foo-org:
$ gh api /orgs/foo-org/repos --paginate --jq '.[].name'
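
To produce the same record format as Approach 1 with the GitHub CLI, something along the following lines should work. This is a sketch under stated assumptions: the function names repos_endpoint and list_org_repos are hypothetical, and the per_page=100 and type=private query parameters are added here to reduce the number of requests and to restrict the listing to private repositories.

```shell
#!/bin/bash
# repos_endpoint ORG (hypothetical helper): print the API path that lists an
# organization's private repositories, 100 per page.
repos_endpoint() {
  printf '/orgs/%s/repos?per_page=100&type=private\n' "$1"
}

# list_org_repos ORG (hypothetical helper): print one line per repository,
# letting gh follow the pagination.
list_org_repos() {
  gh api "$(repos_endpoint "$1")" --paginate \
    --jq '.[] | "Repo_id: \(.id), Repo_Name: \(.full_name)"'
}

# Example: list_org_repos foo-org > "$HOME/repo-list.txt"
```

Because gh handles the paging itself, no page count or loop is needed.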