Responding to GitHub Updates

This post is the fourth in a series, beginning with Automating Web Site Updates.

Most of the picture of the process we need is filled in now. What's left is what happens between a GitHub PR being merged and an updated website being deployed on Firebase Hosting. This post deals only with the first part of that: responding to a GitHub PR merge.

We need to know when a PR is merged so we can kick off the rest of the update process. Lucky for us, GitHub has a feature that will tell us that: webhooks. At its core it’s a really simple idea: when an event you care about happens, GitHub will make a web request to a URL of your choosing with information about the event in its body. You just need to provide a web request handler to receive it. So before we set up the webhook, let’s figure out what we will use to receive those requests. We need:

  • to run our own custom code
  • when triggered by an HTTP (actually HTTPS) request
  • containing information about a merged PR
  • without costing much when nothing is happening (which in this case, is probably 99% or more of the time)

That sounds tailor-made for a Cloud Function. We can write code in Go, Node, or Python and say we want it triggered by an HTTPS request. Cloud Functions gives us a URL and runs our code whenever a request is sent to that URL. We don’t pay for anything except time and memory while our code is running, not while it is idle waiting for a notification. The only problem is that functions are limited in what they can do. They can’t run long jobs, they have only a small file system available, and they only provide a few language options. We can’t install other software in them, either. But none of that is a problem because we are not trying to handle the website update in the cloud function, we just need to kick that off when it’s appropriate (a subject for the next blog post).

So we will use the Google Cloud Platform console to create a new HTTP-triggered Python Cloud Function. For now, we’ll leave the default sample code in it; we just want to know the URL for the next step: setting up a GitHub webhook.

Authorized GitHub repository users can set up a GitHub webhook in the repository’s Settings section. There’s a section just for Webhooks, and a button to add a new webhook. After you click that, there are some choices to be made:

  • The Payload URL is the address that GitHub will send the request to. That’s our cloud function’s URL from the step above.
  • The Content type specifies the format of the body of the request GitHub will send. The default is application/x-www-form-urlencoded, which is what a web page might send when a user submits a form. Since we want to get a possibly complicated data structure from GitHub, the second option, application/json, is a better choice for us.
  • The Secret is a string (shared between GitHub and your receiving application) that GitHub will use to create a signature for each web request. This is a non-standard way to check that a request really comes from the GitHub webhook you created, and not somewhere else. I created a long random password with a password manager for this.
  • We finally reach the question “Which events would you like to trigger this webhook?” We can choose “just the push event”, but we don’t much care about pushes, we want to know about merges. The second option, “send me everything,” would certainly include the merge events, but we don’t want to be bothered about the vast majority of events we’d be told about then. So we can say “Let me select individual events” and just hear about Pull Requests. That choice still includes a lot of events we don’t care about (creating PRs, closing them unmerged, labeling them, and so on) but it seems to be the narrowest choice that includes PR merges.
  • And we’re going to want to make this webhook Active.

When we click the Add Webhook button, GitHub will send a test request to the URL to see if there’s something at that address accepting incoming data. That should pass, since we already created a cloud function there, but if not, that’s okay for now. We need it to work once the Cloud Function is finished.
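Before the real function exists, it can help to see roughly what GitHub will send. The sketch below builds a heavily trimmed pull request payload and signs it the way the Secret setting describes: an HMAC-SHA1 of the raw request body, sent as sha1=<hex digest> in the X-Hub-Signature header. The helper name build_signed_delivery and the sample secret are made up for illustration, and a real payload carries far more fields than the three shown here.

```python
import hashlib
import hmac
import json


def build_signed_delivery(secret, payload):
    """Return (headers, body) resembling a GitHub webhook delivery.

    Hypothetical helper for local testing; not part of any GitHub library.
    """
    body = json.dumps(payload).encode()
    digest = hmac.new(secret.encode(), body, hashlib.sha1).hexdigest()
    headers = {
        'Content-Type': 'application/json',
        'X-GitHub-Event': 'pull_request',
        'X-Hub-Signature': 'sha1=' + digest,
    }
    return headers, body


# A heavily trimmed stand-in for a "PR closed by merge" notification.
sample_payload = {
    'action': 'closed',
    'number': 1,
    'pull_request': {'merged': True},
}
headers, body = build_signed_delivery('not-a-real-secret', sample_payload)
```

Posting that body with those headers to the function's URL, using any HTTP client, should exercise the same path a real merge would, as long as the secret matches the one the function is configured with.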

Now that we have a webhook, every action on a PR on this repository will cause GitHub to send a JSON object to our cloud function. We need to verify that this request describes a PR merge, since we will get notification of all sorts of other PR events, too. And we need to make sure that this information really comes from our GitHub webhook, and not somebody trying to fool us into thinking a merge happened. Here’s an outline of what we need to do:

  • Load the JSON data in the request body into a Python object
    notification = request.get_json()
  • Check to see that this is a PR that is closed by a merge; if not, just exit, nothing to do here
    if notification.get('action') != 'closed':
        return 'OK', 200
    else:
        if notification.get('pull_request') is None:
            return 'OK', 200
        else:
            if not notification['pull_request'].get('merged'):
                return 'OK', 200
  • Check that the signature is valid; if not, just return a Forbidden response and exit, we aren’t going to deal with fake requests
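    import hashlib, hmac, os
    secret = os.environ.get('SECRET', 'Missing!').encode()
    # Default to an empty string so a missing header simply fails the check
    signature = request.headers.get('X-Hub-Signature', '')
    body = request.get_data()
    calc_sig = hmac.new(secret, body, hashlib.sha1)
    # hexdigest() belongs to the HMAC object, not to the formatted string
    expected = 'sha1={}'.format(calc_sig.hexdigest())
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return 'Forbidden', 403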
  • Kick off the next step that will actually publish an updated website based on the contents of the repository
    update_the_site()
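Put together, the outline might look something like the sketch below. It makes a few assumptions beyond the steps above: the entry point name github_webhook is arbitrary, update_the_site() is only a stub here because the real one is the subject of the next post, and the signature check is moved ahead of the JSON parsing so that unsigned requests are rejected before their contents are looked at. The helpers signature_is_valid and is_merged_pr are hypothetical names, split out only so each check can be tried on its own.

```python
import hashlib
import hmac
import os


def update_the_site():
    """Stub for illustration; the real work is covered in the next post."""
    pass


def signature_is_valid(secret, body, signature):
    """Compare the X-Hub-Signature value with our own HMAC-SHA1 of the body."""
    expected = 'sha1=' + hmac.new(secret, body, hashlib.sha1).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest((signature or '').encode(), expected.encode())


def is_merged_pr(notification):
    """True only for a pull request that was closed by being merged."""
    if not isinstance(notification, dict):
        return False
    if notification.get('action') != 'closed':
        return False
    pull_request = notification.get('pull_request') or {}
    return bool(pull_request.get('merged'))


def github_webhook(request):
    """HTTP entry point; request is assumed to be a Flask request object."""
    secret = os.environ.get('SECRET', 'Missing!').encode()
    signature = request.headers.get('X-Hub-Signature')
    if not signature_is_valid(secret, request.get_data(), signature):
        return 'Forbidden', 403
    # silent=True yields None instead of an error for a non-JSON body.
    if not is_merged_pr(request.get_json(silent=True)):
        return 'OK', 200
    update_the_site()
    return 'OK', 200
```

Checking the signature first is a design choice rather than a requirement. The order in the outline works as well, since nothing is acted on until both checks have passed.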

Notice that the secret for the signature, which was provided to GitHub when creating the webhook, is fetched from an environment variable. That returns a string, and it needs to be converted to bytes in order to send to the hash function. You can set up the necessary environment variable when creating or redeploying a cloud function. This is a better option than keeping the secret in the source code itself, which might be available to others in a source repository at some point.
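One possible refinement, sketched below, is to drop the 'Missing!' fallback and refuse to run when the variable isn't set, so that a misconfigured deployment can't end up validating signatures against a well-known placeholder value. The function name load_secret is hypothetical; the variable name SECRET matches the snippet above.

```python
import os


def load_secret(environ=os.environ):
    """Return the webhook secret as bytes, or raise if it is not configured."""
    value = environ.get('SECRET')
    if not value:
        raise RuntimeError('SECRET environment variable is not set')
    # The HMAC functions work on bytes, so encode the string before use.
    return value.encode()
```

Whether a hard failure is preferable to a quiet 403 on every request is a judgment call; the failure at least shows up clearly in the function's logs.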

Which leaves us with one big piece left to build, update_the_site(). That will be covered in the next post. Spoiler alert: the cloud function won’t be doing the update, it will just kick off some other tool to handle that.

So, our update process picture is nearly complete:
