Writing GitHub Web Hooks with Bash
Bring your GitHub repository to the next level of functionality.
For the past year, since Microsoft acquired GitHub, I've been hosting
my Git repositories on a private server. Although I relished the opportunity
and challenge of setting it all up, and the end product works well for my
needs, doing this was not without its sacrifices. GitHub offers a clean interface
for configuring many Git features that otherwise would require more time and
effort than simply clicking a button. Of the features that GitHub makes easier
to implement, the one I was fondest of was web hooks.
A web hook is executed when a specific event occurs within the GitHub
application. Upon execution, data is sent via an HTTP POST to a specified URL.
This article walks through how to set up a custom web hook, including configuring a web server, processing the POST data from GitHub and creating a few basic web hooks using Bash.
Preparing Apache
For the purpose of this project, let's use the Apache web server to host
the web hook scripts. The module that Apache uses to run server-side shell
scripts is mod_cgi, which is available on major Linux distributions.
Once the module is enabled, it's time to configure the directory permissions and virtual host within Apache. Use the /opt/hooks directory to host the web hooks, and give ownership of this directory to the user that runs Apache. To determine the user running an Apache instance, run the following command (provided Apache is currently running):
ps -e -o %U%c| grep 'apache2\|httpd'
This command returns two-column output containing the name of the user running Apache and the name of the Apache binary (typically either httpd or apache2). Grant ownership of the directory with the following chown command (where USER is the name of the user shown in the previous ps command):
chown -R USER /opt/hooks
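If you'd rather not pick the user out of the ps output by eye, the filtering can be sketched as a small function. The name apache_users is my own invention for illustration, and it assumes its input is two-column "USER COMMAND" text like the output described above.

```shell
#!/bin/bash
# Hypothetical helper: read "USER COMMAND" lines on standard input and
# print each distinct user that owns an apache2 or httpd process.
apache_users() {
  awk '$2 == "apache2" || $2 == "httpd" { print $1 }' | sort -u
}
```

One way to feed it is ps -e -o user= -o comm= | apache_users. On many systems the parent Apache process is owned by root, so the list may show root alongside the unprivileged user that actually serves requests; the unprivileged user is the one to pass to chown.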
Within this directory, two sub-directories will be created: html and
cgi-bin. The html folder will be used as a web root for the virtual host,
and cgi-bin will contain all shell scripts for the virtual host. Be aware
that as new sub-directories and files are created under /opt/hooks, you
may need to rerun the above chown command to ensure proper access to files
and sub-directories.
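As a minimal sketch of that layout, the function below creates the two sub-directories and then reapplies ownership. The function name make_hook_dirs and its arguments are assumptions made for illustration; the chown step is skipped when no user name is given.

```shell
#!/bin/bash
# Hypothetical helper: create the html and cgi-bin sub-directories
# under a base directory (for example /opt/hooks).
# $1 = base directory, $2 = optional user that should own the tree
make_hook_dirs() {
  local base="$1" owner="${2:-}"
  mkdir -p "$base/html" "$base/cgi-bin" || return 1
  # Reapply ownership only when a user name was supplied
  if [ -n "$owner" ]; then
    chown -R "$owner" "$base" || return 1
  fi
}
```

Run as root, make_hook_dirs /opt/hooks USER would produce the directory structure used in the rest of this article.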
Here's the configuration for the virtual host within Apache:
<VirtualHost *:80>
ServerName SERVERNAME
ScriptAlias "/cgi-bin" "/opt/hooks/cgi-bin"
DocumentRoot /opt/hooks/html
</VirtualHost>
Change the value of the ServerName directive from SERVERNAME to the name of the host that the web hook will access. This configuration provides the base functionality to host files and execute shell scripts. The DocumentRoot directive specifies the root of the virtual host using an absolute path on the local system. The ScriptAlias directive takes two arguments: a URL path within the virtual host and an absolute path on the local system. The URL path is mapped to the local system path, and mod_cgi handles all requests made to the path specified in the ScriptAlias directive.
(Note: additional configuration, such as SSL or logging, isn't covered in this article.)
CGI Basics
You'll need a basic understanding of the HTTP protocol and Bash scripting to understand how CGI scripts work. When a request is made to an HTTP server, a response is generated and sent back to the client. The HTTP request contains headers that instruct the server how to handle the request. Likewise, the HTTP response contains headers that instruct the client how to handle the response. Viewing and analyzing HTTP traffic can be very simple using the developer tools on any modern browser. Here's a simple example of an HTTP request and response:
Request:
POST /cgi-bin/clone.cgi HTTP/1.1
Host: hooks.andydoestech.com
Content-Length: 87

{"repository":{"name":"webhook-test","url":"https://github.com/bng44270/webhook-test"}}
Response:
HTTP/1.1 200 OK
Date: Tue, 11 Jun 2019 02:44:52 GMT
Content-Length: 18
Content-Type: text/json

{"success":"true"}
The request makes a POST to the clone.cgi file located in https://hooks.andydoestech.com/cgi-bin/. The response contains the response code, the date and time when the request was handled, the length of the content body (in bytes), the content type and the content body itself. Although there are instances when binary data may be sent via HTTP, the examples in this article deal only with clear-text transmissions.
Given the robust text-processing capabilities and commands available, Bash is well suited for constructing and manipulating the text in an HTTP transaction. If the above HTTP request were to be handled by a Bash script, it might look like this:
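#!/bin/bash
# Capture the POST body from standard input
JSONPOST="$(cat -)"
# Response headers, followed by a blank line
echo "Date: $(date)"
echo "Content-Length: 18"
echo "Content-Type: text/json"
echo ""
# Response body (printf adds no trailing newline, so it matches Content-Length)
printf '%s' '{"success":"true"}'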
Although this script is lacking in logic, it nicely illustrates how the HTTP POST data is captured as the JSONPOST variable, and how the HTTP response headers and data are returned to the client via standard script output.
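The hard-coded Content-Length of 18 is correct only for that exact body. As a sketch of one way around this, the function below derives the length from whatever body it is given. The name cgi_respond is hypothetical and not part of the scripts in this article.

```shell
#!/bin/bash
# Hypothetical helper: print CGI response headers and a body,
# deriving Content-Length from the body instead of hard-coding it.
cgi_respond() {
  local body="$1"
  # With LC_ALL=C, ${#body} counts bytes rather than characters
  local LC_ALL=C
  printf 'Content-Type: text/json\n'
  printf 'Content-Length: %d\n' "${#body}"
  printf '\n'
  # No trailing newline, so the body matches the declared length
  printf '%s' "$body"
}
```

Calling cgi_respond '{"success":"true"}' emits the same headers and body as the script above, minus the Date header, with the length calculated for you.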
Parsing JSON
Although many GitHub events can trigger web hooks, this article focuses specifically on the push event that fires when data is remotely pushed into a code repository. When the HTTP POST request of a web hook is made, a JSON object is posted to the URL. This JSON object contains many pieces of information relating to the push operation, including information about the repository and the commits contained in the push. The command used here to parse individual values out of the POST JSON is jq, which is available on major Linux distributions. The syntax for the command requires the desired property to be specified in dot notation. As an example, consider the following snippet of the JSON object sent by GitHub:
{
  "repository": {
    "name": "webhook-test",
    "git_url": "git://github.com/bng44270/webhook-test.git",
    "ssh_url": "git@github.com:bng44270/webhook-test.git",
    "clone_url": "https://github.com/bng44270/webhook-test.git"
  }
}
To return the value of the attribute named clone_url using jq, you would use the following syntax:
jq -r '.repository.clone_url' <<< 'JSON'
After replacing JSON with the text representation of the JSON object, this command would return the HTTP repository clone URL. Using command substitution, the value of a JSON attribute can be assigned to a Bash variable for use within a script.
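To make the command substitution concrete, here is a short sketch. The json_field function and the SAMPLE variable are names I've made up for illustration, and the sample payload is a trimmed-down version of the snippet above. The demonstration runs only if jq is installed.

```shell
#!/bin/bash
# Hypothetical helper: print one value from a JSON document.
# $1 is a jq filter in dot notation, $2 is the JSON text.
json_field() {
  jq -r "$1" <<< "$2"
}

# Sample payload modeled on the snippet above
SAMPLE='{"repository":{"name":"webhook-test","clone_url":"https://github.com/bng44270/webhook-test.git"}}'

# Only run the demonstration when jq is available
if command -v jq > /dev/null 2>&1; then
  # Command substitution stores the jq output in a shell variable
  REPOURL="$(json_field '.repository.clone_url' "$SAMPLE")"
  echo "Clone URL: $REPOURL"
fi
```

The -r flag tells jq to print the raw string without surrounding quotes, which is what you want when the value is going into a shell variable.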
Hook #1: Simple Backup
The first hook I want to cover will create a backup of the repository on the Apache server hosting the web hook scripts. The above VirtualHost configuration will be used in this example. Here's the repository backup web hook script:
1 #!/bin/bash
2
3 REPODIR="/opt/hooks/html/repos"
4
5 json_resp() {
6 echo '{"result":"'"$([[ $1 -eq 0 ]] && echo "success"
↪|| echo "failure")"'"}'
7 }
8
9 POSTJSON="$(cat -)"
10
11 REPOURL="$(jq -r ".repository.clone_url" <<< "$POSTJSON")"
12 REPONAME="$(jq -r ".repository.name" <<< "$POSTJSON")"
13
14 echo "Content-type: text/json"
15 echo ""
16
17 if [ -d "$REPODIR/$REPONAME" ]; then
18 pushd . > /dev/null
19 cd "$REPODIR/$REPONAME"
20 git pull > /dev/null 2>&1
21 json_resp $?
22 popd > /dev/null
23 else
24 mkdir "$REPODIR/$REPONAME"
25 git clone "$REPOURL" "$REPODIR/$REPONAME" > /dev/null 2>&1
26 json_resp $?
27 fi
The REPODIR variable at the beginning of the script indicates the directory that will contain all repository directories. The json_resp function allows the code that generates a JSON response to be reused multiple times in the script. Just like in the example above, the HTTP POST data is captured in the POSTJSON variable. In lines 11 and 12, the clone_url and name attributes are pulled from the POSTJSON variable using jq. Line 14 begins the creation of HTTP response headers. The if block on lines 17–27 determines whether the repository already has been cloned. If it has, the script moves to the repository folder, pulls down repository changes and returns to the original working directory. If the folder does not exist, the directory is created, and the repository is cloned to the new directory. Note the use of the $REPODIR variable that was set at the beginning of the script. Whether the repository is cloned or updates are pulled down, the json_resp function is called to generate the response JSON, which will contain a single attribute named "result" with a value of "success" or "failure" depending on the outcome of the respective git command.
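The one-line json_resp on line 6 is compact but dense. As a sketch of the same idea, the version below spells out the branch explicitly. It is a rewrite offered for readability and should behave the same way as the original line.

```shell
#!/bin/bash
# Readable variant of json_resp: map an exit status to a JSON result.
# $1 is the exit status of the preceding git command.
json_resp() {
  if [ "$1" -eq 0 ]; then
    echo '{"result":"success"}'
  else
    echo '{"result":"failure"}'
  fi
}
```

Either form is called the same way: json_resp $? immediately after the git command whose outcome you want to report.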
Hook #2: Build and Package
Backing up repositories can be useful. With the vast number of build tools available on the command line, it makes sense to create a web hook that will deliver a built package for code in a repository. This could be built out into a robust solution filling the need for Continuous Integration/Deployment (CI/CD). Here's the build/deploy web hook script:
1 #!/bin/bash
2
3 WEBROOT="/opt/hooks/html/archive"
4 REPODIR="/opt/hooks/html/repos"
5 WEBURL="https://hooks.andydoestech.com/archive"
6
7 json_package() {
8 echo '{"result":'"$([[ $1 -eq 0 ]] && echo
↪"\"success\",\"url\":\"$2\"" ||
↪echo "\"package failure\"")"'}'
9 }
10
11 run_make() {
12 [[ -d $REPODIR/$REPONAME/build ]] && make -s -C
↪$REPODIR/$REPONAME clean
13 if [ $1 -eq 0 ]; then
14 make -s -C $REPODIR/$REPONAME
15 if [ -d $REPODIR/$REPONAME/build ]; then
16 FILENAME="$REPONAME-$COMMITTIME.tar.gz"
17 tar -czf $WEBROOT/$FILENAME -C
↪$REPODIR/$REPONAME/build .
18 json_package "$?" "$WEBURL/$FILENAME"
19 else
20 echo '{"result":"build failure"}'
21 fi
22 else
23 echo '{"result":"clone/pull failure"}'
24 fi
25 }
26
27 POSTJSON="$(cat -)"
28
29 REPOURL="$(jq -r ".repository.url" <<< "$POSTJSON")"
30 REPONAME="$(jq -r ".repository.name" <<< "$POSTJSON")"
31 COMMITTIME="$(jq -r '.commits[0].timestamp' <<<
↪"$POSTJSON" | date -d "$(cat -)" +"%m-%d-%YT%H-%M-%S")"
32
33 echo "Content-type: text/json"
34 echo ""
35
36 if [ -d "$REPODIR/$REPONAME" ]; then
37 pushd . > /dev/null
38 cd "$REPODIR/$REPONAME"
39 git pull > /dev/null 2>&1
40 run_make $?
41 popd > /dev/null
42 else
43 mkdir "$REPODIR/$REPONAME"
44 git clone "$REPOURL" "$REPODIR/$REPONAME" > /dev/null 2>&1
45 run_make $?
46 fi
In a similar manner to Hook #1, variables are defined at the beginning of the script to specify the directory where repositories will be cloned, the directory where build packages will be stored and the base URL of build packages. The two functions defined on lines 7–25 will be used later in the script. Lines 27–31 capture the JSON POST data and parse out attributes into shell variables using jq. Note that the format of the date in COMMITTIME is modified from its original form (this will make sense later). Lines 33–46 are almost identical to Hook #1 in terms of setting HTTP headers and cloning/pulling the repository, with the addition of a call to the run_make function. The return status of the clone/pull is passed to the run_make function. If the clone/pull ran successfully, the function assumes there is a Makefile in the root of the repository. The Makefile is assumed to behave in the following manner:
- When make is executed, the solution is built into a folder named "build" within the repository.
- When make clean is executed, the "build" folder is deleted.
Beginning on line 12, if the build folder exists, make clean is executed to remove it. If the clone/pull status checked on line 13 is zero, make is run on line 14. If the build folder exists afterward, an archive filename is constructed using REPONAME and COMMITTIME. Note that the value of COMMITTIME contains no spaces, which makes it suitable for a filename. The status code of the tar command on line 17 is passed into the json_package function, along with the URL of the archive. If the archive was created successfully, a JSON object containing two attributes is returned: result is set to "success", and url is set to the URL of the archive. If the archive could not be created, the result attribute is set to "package failure".
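The timestamp conversion and the packaging step can be tried in isolation. The sketch below uses two hypothetical helpers, commit_stamp and package_build, that are not part of the hook script. It assumes GNU date (for the -d option) and adds -u so the result does not depend on the server's time zone, which the original line 31 does not do.

```shell
#!/bin/bash
# Hypothetical helper: convert an ISO 8601 timestamp into the
# space-free form used for COMMITTIME. Requires GNU date (-d).
# -u is an addition here so the output is always in UTC.
commit_stamp() {
  date -u -d "$1" +"%m-%d-%YT%H-%M-%S"
}

# Hypothetical helper: archive the contents of a build folder.
# $1 = build directory, $2 = destination directory, $3 = archive name stem
package_build() {
  local build="$1" dest="$2" stem="$3"
  # Fail early if the build produced no output folder
  [ -d "$build" ] || return 1
  # -C switches into the build folder so paths in the archive are relative
  tar -czf "$dest/$stem.tar.gz" -C "$build" .
}
```

A call such as package_build "$REPODIR/$REPONAME/build" "$WEBROOT" "$REPONAME-$(commit_stamp "$TIMESTAMP")" mirrors lines 15–17 of the script, where TIMESTAMP stands in for the value jq extracts from the payload.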
GitHub provides many features, but without question, web hooks provide the DevOps engineer with tools to accomplish almost any task. Leveraging the functionality of Apache with CGI and Bash scripting in such a way that it can be consumed by GitHub allows for almost endless possibilities.