Results produced by static code analysis tools such as GitHub CodeQL are generally stored in SARIF (Static Analysis Results Interchange Format). SARIF files present the analysis results as JSON. Because JSON is so widely used among developers, a range of publicly available SARIF-parsing utilities exists, capable of transforming analysis results into report-style output formats such as HTML or Word.
In this post, we’ll walk through downloading the SARIF files from a historical CodeQL scan performed on a sample GitHub repository. We’ll then use microsoft/sarif-tools to transform them into a more presentable, HTML-based report.
Prerequisites
Sample commands/code presented throughout the article assume the utilities they call are already installed: the GitHub CLI (gh), jq, and microsoft/sarif-tools (the sarif command).
Sample GitHub Repository
Repository URL | https://github.com/my-git-user/test.git |
Branch | main |
GitHub Token
A personal access token was generated to allow for relevant GitHub REST authentication and API calls.
The following fictitious value is substituted for the token in the sample commands used throughout this article:
github_pat_tttttttttttttttttttttttxxxxxxxx
Retrieve Commit Details for CodeQL Analysis
As part of the final CodeQL analysis scan phase, the results (SARIF file(s)) are uploaded to GitHub using the codeql github upload-results command:
codeql github upload-results --sarif=<file> ....[--commit=<commit>]
The --commit argument ties the analysis results (SARIF file) to a specific commit. It is shown as optional in the synopsis above; when it is omitted, the CLI attempts to determine the commit from the checkout path.
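As an illustration of how the commit is passed at upload time, the sketch below assembles such a command and prints it rather than running it. The helper name build_upload_cmd and the SARIF file name python.sarif are hypothetical; the repository, ref, and commit values are the sample ones used in this article.

```shell
# Print (rather than run) an upload command that ties a SARIF file to a commit.
# Arguments: SARIF file, OWNER/REPO, git ref, full commit SHA (all illustrative).
build_upload_cmd() {
  printf 'codeql github upload-results --sarif=%s --repository=%s --ref=%s --commit=%s\n' \
    "$1" "$2" "$3" "$4"
}

# Show the command for the sample repository so it can be reviewed first.
build_upload_cmd python.sarif my-git-user/test refs/heads/main \
  68d2a854xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```

Printing the command first makes it easy to confirm which commit the results will be attached to before anything is uploaded.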
For our sample repository, we will aim to obtain SARIFs related to a specific commit.
To identify the commit we’re interested in, navigate to:
- Security > Code scanning
- Click on “Tools”
- Under Setup types, click on “API upload”

- We note down the full value of the commit SHA: 68d2a854xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
- Also, notice that for this particular analysis there were two invocations of CodeQL, i.e., for java and python
- The screenshot below is a sample of the analysis results, as viewed from GitHub

Retrieve Code Scanning Analysis ID(s)
For Specific Commit
We start by identifying the code scanning analysis ID(s) by calling the REST API list-code-scanning-analyses-for-a-repository endpoint and filtering by the commit SHA 68d2a854xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.
- The commands below set up the call to the endpoint, along with a filter for the specific commit. The token is exported so that the gh child process can read it:
$ export GITHUB_TOKEN=github_pat_tttttttttttttttttttttttxxxxxxxx
$ REPOS=(test)
$ OWNER=my-git-user
$ for ((i=0; i<${#REPOS[@]}; i++));
do
  # List all analyses, keep those for the target commit, and print them as a table
  gh api "/repos/$OWNER/${REPOS[i]}/code-scanning/analyses" --paginate | jq -r \
  '["ID","URL","CATEGORY","BRANCH","COMMIT_SHA"],
  (sort_by(.category) | .[] | select(.commit_sha=="68d2a854xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx") |
  [.id,.url,.category,.ref,.commit_sha]) | @tsv' | column -ts $'\t'
done
- Sample output
ID         URL                                                                             CATEGORY  BRANCH           COMMIT_SHA
234567890  https://api.github.com/repos/my-git-user/test/code-scanning/analyses/234567890  python    refs/heads/main  68d2a854xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
123456789  https://api.github.com/repos/my-git-user/test/code-scanning/analyses/123456789  java      refs/heads/main  68d2a854xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
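The filter above hard-codes the commit SHA inside the jq program. A possible variation, sketched here, passes the SHA in with jq's --arg option so the same filter can be reused for other commits. The function name filter_by_commit is hypothetical, and the function expects the JSON array returned by the analyses endpoint on standard input.

```shell
# Keep only the analyses recorded for one commit.
# Reads the analyses JSON array from stdin; $1 is the full commit SHA.
# Prints one tab-separated line per match: ID, category, URL.
filter_by_commit() {
  jq -r --arg sha "$1" \
    '.[] | select(.commit_sha == $sha) | [.id, .category, .url] | @tsv'
}
```

It could be used as gh api /repos/my-git-user/test/code-scanning/analyses --paginate | filter_by_commit 68d2a854xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.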
Each URL listed can be used as input to the REST API code scanning analysis GET request for downloading the respective SARIF file:
- When using curl, the fully qualified URL should be used:
https://api.github.com/repos/my-git-user/test/code-scanning/analyses/234567890
https://api.github.com/repos/my-git-user/test/code-scanning/analyses/123456789
- When working with the GitHub CLI command gh api, the “https://api.github.com” prefix can be dropped:
/repos/my-git-user/test/code-scanning/analyses/234567890
/repos/my-git-user/test/code-scanning/analyses/123456789
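For readers who prefer curl, the following is a minimal sketch of the same request. The helper names api_url and download_sarif_curl are hypothetical, and the token is assumed to be available in the GITHUB_TOKEN environment variable. The Accept header requests SARIF output, matching the gh api commands in the next section.

```shell
API_ROOT="https://api.github.com"

# Turn a gh-style path (leading slash optional) into the fully qualified URL
# that curl needs.
api_url() {
  printf '%s/%s\n' "$API_ROOT" "${1#/}"
}

# Download one analysis as SARIF and write it to stdout.
# $1 = path such as /repos/OWNER/REPO/code-scanning/analyses/ID
download_sarif_curl() {
  curl --fail --silent --show-error --location \
    -H "Accept: application/sarif+json" \
    -H "Authorization: Bearer ${GITHUB_TOKEN}" \
    -H "X-GitHub-Api-Version: 2022-11-28" \
    "$(api_url "$1")"
}
```

For example, download_sarif_curl /repos/my-git-user/test/code-scanning/analyses/234567890 > sarif_python.json would save the Python analysis.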
Downloading SARIFs
Call the REST API endpoint, using the URLs from the previous section.
Python Analysis
- Call the GitHub API to retrieve the SARIF associated with the Python analysis, saving the output to the file sarif_python.json:
$ gh api \
-H "Accept: application/sarif+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
/repos/my-git-user/test/code-scanning/analyses/234567890 | jq . > sarif_python.json
Note the change in the request header. It asks the endpoint to return the analysis in SARIF format instead of the default JSON summary:
-H "Accept: application/sarif+json"
Java Analysis
- Repeat the procedure for the Java analysis, saving the output to the file sarif_java.json:
$ gh api \
-H "Accept: application/sarif+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
/repos/my-git-user/test/code-scanning/analyses/123456789 | jq . > sarif_java.json
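The two downloads differ only in the analysis ID and the output file name, so they could be wrapped in a small helper. This is a sketch under that assumption; the function name download_sarif and its argument order are hypothetical.

```shell
# Download one analysis as SARIF and pretty-print it to sarif_<label>.json.
# $1 = OWNER/REPO, $2 = analysis ID, $3 = label used in the file name.
download_sarif() {
  gh api \
    -H "Accept: application/sarif+json" \
    -H "X-GitHub-Api-Version: 2022-11-28" \
    "/repos/$1/code-scanning/analyses/$2" | jq . > "sarif_$3.json"
  # The pipeline reports jq's status, so also fail if nothing was written.
  [ -s "sarif_$3.json" ]
}
```

With this in place, download_sarif my-git-user/test 234567890 python and download_sarif my-git-user/test 123456789 java would produce the same two files as the commands above.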
So, we now have the SARIFs stored as local files, i.e.:
sarif_python.json
sarif_java.json
At this point, we can use these files as inputs to microsoft/sarif-tools in order to generate our HTML-based output.
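Before converting, it can be useful to sanity-check what was downloaded. SARIF stores findings under runs[].results[], each carrying a ruleId, so a short jq query can summarise them. The function name count_results_by_rule is hypothetical.

```shell
# Print "count<TAB>ruleId" for every rule that has findings in a SARIF file.
# $1 = path to the SARIF (JSON) file.
count_results_by_rule() {
  jq -r '[.runs[] | .results[]? | .ruleId]
         | group_by(.)
         | map([length, .[0]])
         | .[] | @tsv' "$1"
}
```

Running count_results_by_rule sarif_python.json lists each rule that fired along with its number of findings; an empty result means the file contains no findings.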
HTML Report from SARIF Files
Python (sarif_python.json)
- Generate an HTML file, sarif_python.html, from the sarif_python.json file by running:
$ sarif html --output ./sarif_python.html --no-autotrim sarif_python.json
- The following shows the HTML output, sarif_python.html, as viewed through a standard web browser

- What if we wanted the report to include the URL used in the API request when downloading the SARIF, i.e.:
https://api.github.com/repos/my-git-user/test/code-scanning/analyses/234567890
- We can achieve the desired output by using sed to append a line at the correct location within the raw sarif_python.html file, i.e., just after “Sarif Summary: CodeQL”:
$ sed -i "/<h3>Sarif Summary: <b>CodeQL<\/b><\/h3>/a\
<h4>Git URL: <b>https://api.github.com/repos/my-git-user/test/code-scanning/analyses/234567890<\/b><\/h4>\
" ./sarif_python.html

Java (sarif_java.json)
Repeat the steps in the previous section to generate the HTML report for the Java analysis, using the corresponding URL:
https://api.github.com/repos/my-git-user/test/code-scanning/analyses/123456789
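Spelled out, the Java run might look like the sketch below, which combines the sarif html call and the sed step into one helper. The function name build_report is hypothetical, the heading pattern is copied from the sed command used for the Python report, and the one-line form of sed's a command assumes GNU sed.

```shell
ANALYSES_URL="https://api.github.com/repos/my-git-user/test/code-scanning/analyses"

# Build sarif_<label>.html from sarif_<label>.json, then add a line showing
# the analysis URL right after the summary heading.
# $1 = label (e.g. java), $2 = analysis ID.
build_report() {
  sarif html --output "./sarif_$1.html" --no-autotrim "sarif_$1.json" || return 1
  sed -i "/<h3>Sarif Summary: <b>CodeQL<\/b><\/h3>/a <h4>Git URL: <b>${ANALYSES_URL}/$2</b></h4>" \
    "./sarif_$1.html"
}
```

Calling build_report java 123456789 would then produce sarif_java.html with the URL above shown under the summary heading.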
Report Automation
Wouldn’t it be nice if we could automate the whole process, and add support for uploading the reports as GitHub Pages?