This post is the fifth (and for now, last) in a series, beginning with Automating Web Site Updates.
We’ve been narrowing the “miracle” step in our solution. This post fills in the remaining gap.
We have a Cloud Function ready to kick off the missing step that will build and deploy the updated site. At a high level, this step will need to:
- Fetch a repository with static web content plus source directories that each need to be converted to web pages
- Convert each source directory to web pages in the desired format
- Build the new static website structure
- Deploy the pages to a Firebase hosting project
We need a place to run our code that can use some high-level tools: git, the Firebase CLI, and anything that converts source to web pages. We also need a file system to build the new static site in. Plus, this process may take longer than a cloud function is allowed to run (or at least longer than the GitHub webhook is willing to wait for a response). Those requirements are why we couldn’t just do these steps in the cloud function that responds to the GitHub webhook. We need something more general purpose than that.
Cloud Run looks like a possible solution. The managed version is a lot like Cloud Functions in that it takes your code, runs it in response to HTTP requests, and only charges for resources while the code is running. But instead of just providing source code in a supported language, you provide Cloud Run with a container. That container could run any supporting software you need, not just the supported language environment of a cloud function. Cloud Run will even build the container for you, from your specifications.
Any negatives to using Cloud Run? There are several for this use case, though they can possibly be worked around:
- Cloud Run is still in beta, so it is subject to change before becoming final.
- Containers require maintenance. If a security update is needed for any software, the container needs to be rebuilt with the new versions.
- The container runs only so long as it is serving a web request, so if the requesting program is only willing to wait a short time (for example, 10 seconds for a GitHub webhook or 10 minutes for a Google Cloud Pub/Sub push subscription) we have to be able to build and deploy our site in that amount of time.
- In any case, each run is limited to no more than 15 minutes at this time.
- The file system size is limited by the memory allocated to the service (no more than 2GB).
- Invocations of the service can be concurrent, so if you are building a site in the file system you have to be sure concurrent invocations don’t step on each other, and don’t use up all the memory.
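The concurrency concern in the last bullet has a common workaround: give each invocation its own scratch directory, so two simultaneous builds never share paths. What follows is only a minimal Python sketch of that idea. The names `build_site_in_isolation` and `example_build` are hypothetical, and on Cloud Run the scratch directory would still count against the service's memory, so this addresses the "step on each other" problem but not the memory limit.

```python
import tempfile
from pathlib import Path
from typing import Callable


def build_site_in_isolation(build: Callable[[Path], str]) -> str:
    """Run `build` inside a scratch directory unique to this invocation."""
    # TemporaryDirectory creates a uniquely named directory and removes it,
    # along with everything in it, when the with-block exits (even on error).
    with tempfile.TemporaryDirectory(prefix="site-build-") as workdir:
        return build(Path(workdir))


def example_build(workdir: Path) -> str:
    # Hypothetical stand-in for the real "convert sources to pages" step.
    page = workdir / "index.html"
    page.write_text("<h1>Hello</h1>")
    return page.read_text()
```

Calling `build_site_in_isolation(example_build)` builds the page in a fresh directory and cleans it up afterward; two concurrent calls would each get a different directory.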
Despite these negatives, I find Cloud Run an intriguing approach to this problem. I’m not going to use it here, but I’ll keep thinking about how it can solve problems like this one.
So what is the solution for the current problem? I’m going to go old school, and use a virtual machine for this. In Google terms, I’m going to use a Compute Engine instance. At first look this may seem to go against my goal to use “services that require little or no customization or coding on our part”. And I also said I don’t want to maintain any servers. But the way I’m going to use Compute Engine will not require any coding other than that specifically aimed at our business logic, and won’t need any server maintenance, either.
We will launch a virtual machine with a startup script that will:
- Install the standard tools we need (git, Firebase CLI, etc.)
- Fetch the source from a GitHub repository
- Build the web pages using existing tools specific to our needs
- Deploy the site to Firebase hosting
- Destroy itself when done
The first and last steps are key: this virtual machine installs what it needs when run and then deletes itself when it has finished the task. That way we aren’t paying for an idle machine standing by waiting for work to do; we’re only paying for what we really use. Further, by creating a new machine for each task and then throwing it away, we don’t need to worry about updates – we always launch and then install the latest versions of the tools we’re using.
For more background on this technique of creating, using, then deleting Compute Engine instances, see this tutorial by Laurie White and me.
The virtual machine’s actions all need to be scripted in advance, so they can run without human intervention. Once the script is in place, we can enhance the cloud function from the last blog post to create a new Compute Engine instance that will run that script. The script needs to end by deleting the instance it runs on.
We aren’t going to build the whole solution here, just give the outline. Here’s what the script will look like:
```sh
#!/bin/sh
apt update; apt install -y git
#
# Install other tools, get code from GitHub, run business
# logic to build web site pages, deploy to Firebase hosting
# -- Not included in this post
#
# Instance deletes itself below (see tutorial for details)
export METADATA=metadata.google.internal/computeMetadata/v1/instance
export NAME=$(curl -X GET http://$METADATA/name -H 'Metadata-Flavor: Google')
export ZONE=$(curl -X GET http://$METADATA/zone -H 'Metadata-Flavor: Google')
gcloud --quiet compute instances delete $NAME --zone=$ZONE
```
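One detail about the script's metadata lookups: the metadata server reports the zone as a path of the form `projects/PROJECT_NUMBER/zones/ZONE`, not as a bare zone name. The script hands that value straight to `gcloud`. If you ever pass the zone to something that expects only the short name, a small helper like the one below can strip the prefix. This is a sketch, and `short_zone_name` is a hypothetical name of my own.

```python
def short_zone_name(metadata_zone: str) -> str:
    """Return the last path component of a metadata-server zone value."""
    # The value looks like "projects/123456789/zones/us-central1-f";
    # splitting on the final "/" leaves just the zone name. A value that
    # is already a bare zone name passes through unchanged.
    return metadata_zone.strip().rsplit("/", 1)[-1]
```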
If we launch a machine with the startup script above (filled in with all the business logic specific details) it will pull our source content from GitHub, build website pages from that, then deploy it to our Firebase hosting site. Which leaves us with one more question: how do we launch such a machine from our Cloud Function (that unfinished update_the_site() function from the last post)? We use the google-api-python-client library. It’s pretty low-level, but there’s good sample code available you can adapt to do this.
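To make that concrete, here is a rough sketch of what the launch might look like with google-api-python-client. It is not the code from this series. The helper names `instance_config` and `launch_builder`, the `e2-small` machine type, the Debian 12 image family, and the default network and service account are all assumptions chosen for illustration. The key idea is that the startup script travels in the instance metadata under the `startup-script` key, and the instance needs a Compute Engine scope so it can delete itself.

```python
def instance_config(name, zone, startup_script,
                    machine_type="e2-small",
                    source_image="projects/debian-cloud/global/images/family/debian-12"):
    """Build the request body for creating a throwaway builder instance."""
    return {
        "name": name,
        "machineType": f"zones/{zone}/machineTypes/{machine_type}",
        # Boot disk is deleted along with the instance, so nothing lingers.
        "disks": [{
            "boot": True,
            "autoDelete": True,
            "initializeParams": {"sourceImage": source_image},
        }],
        # External address so the instance can reach GitHub and Firebase.
        "networkInterfaces": [{
            "network": "global/networks/default",
            "accessConfigs": [{"type": "ONE_TO_ONE_NAT", "name": "External NAT"}],
        }],
        # Assumption: the default service account, scoped to manage Compute
        # Engine so the instance is allowed to delete itself.
        "serviceAccounts": [{
            "email": "default",
            "scopes": ["https://www.googleapis.com/auth/compute"],
        }],
        # The startup script is passed in as instance metadata.
        "metadata": {
            "items": [{"key": "startup-script", "value": startup_script}],
        },
    }


def launch_builder(project, zone, name, startup_script):
    """Ask Compute Engine to create the instance; returns the operation."""
    # Imported here so instance_config can be used and tested without the
    # library installed.
    from googleapiclient import discovery

    compute = discovery.build("compute", "v1")
    body = instance_config(name, zone, startup_script)
    return compute.instances().insert(
        project=project, zone=zone, body=body).execute()
```

Something like `update_the_site()` could then read the startup script from a file bundled with the function and call `launch_builder` with a project ID, a zone, and a unique instance name.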
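So that’s the pipeline now: a push to GitHub fires a webhook, the webhook invokes our Cloud Function, the Cloud Function launches a Compute Engine instance, and that instance builds the site, deploys it to Firebase hosting, and deletes itself.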
I’m going to put this topic to rest for a while, but there are tips and tricks regarding secrets and permissions I’ll probably talk about soon.