How I set up my Ghost CMS blog to be deployable in under 5 minutes

A couple of weeks ago, I made a likely-doomed commitment to begin blogging on a consistent basis. Time will determine the value of my word, but as an optimistic pessimist I logged in to the back end of my blog and wrote my first post in almost a year. After a glass of celebratory bourbon, however, I realized it had been almost a year since I last upgraded the Ghost CMS platform that powers my blog and even longer since I provisioned the Digital Ocean Droplet that it runs on. I've learned quite a bit about running production-grade applications in the 11 months since joining Stripe as an infrastructure engineer and, after a second glass of contemplation bourbon, decided my lowly blog was worthy of a better deployment story.

At the very least, I wanted to make it as easy as possible to update the Ghost CMS platform that my blog runs on. Ghost makes it really simple with ghost update, so why not stop there and call it a day? There were two primary reasons why I wasn't satisfied with simply using the Ghost CLI to run regular updates. The first is that Ghost is written on a Node.js stack, and if I know anything about Node.js, it's that one day I'd run ghost update only to find out my barely 4-month-old version of Node.js is out of date and I would need to update it first before continuing. But guess what? Ubuntu's LTS release probably wouldn't support that version, so down the rabbit hole I would go, destroying the cleanliness of my droplet with third-party PPAs and hastily-run sudo commands. Not me! Luckily, Ghost provides a well-supported community Docker image which shields me from the terrors of unsupported Node.js versions. However, this introduces a different type of maintenance complexity – that is, managing Docker versions, image versions, mount points for SSL certificates, and a whole lot more. This brought me to reason number two why I was unsatisfied with ghost update: it's not just Ghost that needs to stay updated.

The truth is, even the simplest of websites and applications these days require a lot of configuration. They are likely hosted with a cloud provider like Digital Ocean or AWS, and they have virtual firewalls, SSL certs, databases, software from official packages and third-party vendors, and configuration gluing it all together. Oftentimes we bang our fingers across the keyboard to get something running and forget that all of that work is lost to the Bash history of years past – the work vanishes, but the system remains. Until it breaks. If there is one thing I've learned building infrastructure, it's that the longer something as general-purpose as a server sits and bakes, the lurkier and harder to maintain it becomes. Infrastructure should be immutable, even if it's a lowly singleton Digital Ocean Droplet running my minimally-trafficked blog.

And so, I decided that I should be able to destroy and recreate the entire droplet running my blog in a reproducible and reliable way with minimal effort. With a couple of commands, my blog should be running on a fresh droplet with the latest packages and dependencies without losing state. As I write this post, my blog is running on a 2-hour-old droplet that was successfully bootstrapped with three commands from my MacBook. In this article, I will share exactly how I achieved it and some interesting findings along the way. As even the most basic engineering projects often reveal, it was more difficult than I expected.

The concept

It all started out with a concept: reliably redeploy my blog at any time with as few manual steps as possible.

Just how many manual steps it would require was difficult to identify upfront, but I aimed for fewer than five. I also wanted all steps to be performed on my local machine, on the command line, in under five minutes. I've built systems like this at much larger scale, so I knew it was possible, but I also wanted to keep things simple considering I would be the sole maintainer of this thing. With that in mind, I went on to identify the requirements of running my blog.

Identifying requirements

As simple as a blog can be (assuming you didn't create the CMS yourself), there's still a handful of moving parts. I took a look at the current state of my blog and identified a high-level list of requirements for spinning up an instance of it.

  • Firewall configuration for security
  • Digital Ocean Droplet to run the blog platform
  • Ghost CMS as the blog platform
  • DNS for samrapdev.com pointing to the droplet's IP

The next crucial step in building ephemeral infrastructure – that is, infrastructure that can be safely replaced at whim – is identifying the state that needs to be preserved. In the case of a Ghost blog, there are two pieces of state that need to be preserved across any instance:

  • The database, which stores the blog posts, comments, accounts, etc.
  • The filesystem, which stores the themes, image uploads, and some settings for the blog.

I also have SSL certificates from Let's Encrypt and my public SSH key for passwordless authentication to the droplet. These will need to be preserved across each instance as well.

Identifying state is important here, as we don't want to lose the blog posts or our SSL certificates that serve the site securely! We'll come back to how to maintain the state in a bit, but for now let's continue thinking about how we want to provision the infrastructure and run the blog.

Basic setup

As mentioned, Ghost CMS provides a Docker image to run the platform on. Using this is a no-brainer as it prevents us from needing to install and configure a bunch of software on the droplet; just deploy the image and start it up! However, we do need to install some software on the box. At the very least we know we need to install Docker. We probably have other software we'll need to install, such as certbot for automatic SSL certificate renewal, and probably some generic Ubuntu configurations. Of course, we'll need to upload some Docker configurations as well and start the service. Obviously it'd be nice to automate all of this, so I chose to use Ansible to configure and install software on the droplet.
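
To give a concrete feel for this, here's roughly what running the community image looks like. This is a sketch, not my exact setup: the container name, host paths, port binding, and URL below are illustrative.

# Run the community Ghost image, persisting content on the host so it
# survives container upgrades. Ghost listens on 2368 inside the container.
docker run -d \
  --name ghost \
  -e url=https://samrapdev.com \
  -p 127.0.0.1:2368:2368 \
  -v /opt/ghost/content:/var/lib/ghost/content \
  ghost:latest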

Configuring software on the droplet takes care of running Ghost CMS in Docker, but what about our firewalls, DNS entries, and the droplet itself? This is all infrastructure that can be managed in the Digital Ocean UI, but it would be nice to automate these steps as well. Luckily, this is super easy to do with Terraform. Terraform is an open source Hashicorp product that lets you define your infrastructure as code. With Terraform, we can define our Digital Ocean firewalls, Floating IPs, Droplets, and whatever else we might need using the Hashicorp Configuration Language, or HCL. Here's what a basic Terraform configuration looks like for spinning up a new Digital Ocean Droplet running Ubuntu 18.04:

resource "digitalocean_droplet" "samrapdev" {
  image  = "ubuntu-18-04-x64"
  name   = "samrapdev"
  region = "sfo2"
  size   = "s-1vcpu-3gb"
}

We can also add a cloud-init script and SSH keys in this configuration, which we can use to set up just the bare minimum during instance bootstrapping in order to run Ansible once the machine is running.
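
As a sketch, those additions might look something like this. The variable name and file path are illustrative, but ssh_keys and user_data are both real arguments on the droplet resource:

resource "digitalocean_droplet" "samrapdev" {
  image  = "ubuntu-18-04-x64"
  name   = "samrapdev"
  region = "sfo2"
  size   = "s-1vcpu-3gb"

  # ssh_keys accepts key IDs or fingerprints; user_data takes the raw
  # cloud-init script to run on first boot.
  ssh_keys  = [var.ssh_key_fingerprint]
  user_data = file("files/cloud-init.yml")
}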

Cool! So we can use Terraform to define our infrastructure as code and Ansible to configure the Droplet and deploy our Ghost CMS blog via Docker. The last thing to figure out is how to preserve the state we mentioned above. The simplest approach is to periodically back up the stateful parts of the filesystem and then download those backups onto whatever new Droplet we're spinning up. Digital Ocean has a service called Spaces, which provides S3-compatible object storage. We can create a Space (essentially an S3 Bucket) using Terraform, back up the state on a cron, and then download the backups via Ansible when we are provisioning the new instance:

(Diagram: periodically back up the state from droplet-1 and restore it onto the new droplet-2.)
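
Defining the Space itself in Terraform takes only a couple of lines (the bucket name here is a placeholder):

# A Space is essentially an S3 Bucket with a name and region.
resource "digitalocean_spaces_bucket" "backups" {
  name   = "samrapdev-backups"
  region = "sfo2"
}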

I built a basic command-line Go binary called systools to handle this exact behavior. It currently has two commands: backup and restore, which do exactly what they say. What's cool about this program is that you can pass it a single file or a directory and it handles both. It also creates a lockfile for each backup that points to the current and previous versions, allowing for a full history of backups. You can check it out on my GitHub.
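
Usage looks roughly like this. The paths are illustrative and the exact flags live in the repository:

# Back up Ghost's content directory to the Space
systools backup /var/lib/ghost/content

# On a freshly provisioned droplet, restore the latest backup
systools restore /var/lib/ghost/content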

Implementation

Provisioning and deployment of the blog is broken into tiers. Tiers are logical groupings of steps that, when performed in order, provide a running instance of the blog. Each tier provides a critical piece of infrastructure and/or configuration.

Tier  Name             Description
0     Base tier        SSH keys, Spaces, tags, and firewalls that persist across blog instances
1     Droplet tier     Droplet creation and critical provisioning
2     Bootstrap tier   Full configuration of the droplet and bootstrapping of the blog
3     Deployment tier  Points the Floating IP to the new droplet

This setup requires that a DNS record and a single Floating IP be created manually in the Digital Ocean web console. This is the only manual setup required prior to provisioning. Everything else is configured in code.

Base tier

The Base tier, tier0, only needs to be provisioned once upon project or account creation. It provides the SSH keys, tags, and firewalls that are shared between any instance of the blog. Additionally, it creates a Digital Ocean Space that is used for backup and restoration.
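
As a rough sketch, tier0 might contain resources along these lines. The names and rules are illustrative of the idea rather than my exact configuration, and outbound rules are omitted for brevity:

resource "digitalocean_ssh_key" "samrapdev" {
  name       = "samrapdev"
  public_key = file("~/.ssh/id_rsa.pub")
}

# Applies to any droplet carrying the "samrapdev" tag, so future
# droplets pick up the firewall automatically.
resource "digitalocean_firewall" "web" {
  name = "samrapdev-web"
  tags = ["samrapdev"]

  # Only allow SSH and web traffic in; everything else stays closed.
  inbound_rule {
    protocol         = "tcp"
    port_range       = "22"
    source_addresses = ["0.0.0.0/0", "::/0"]
  }

  inbound_rule {
    protocol         = "tcp"
    port_range       = "80"
    source_addresses = ["0.0.0.0/0", "::/0"]
  }

  inbound_rule {
    protocol         = "tcp"
    port_range       = "443"
    source_addresses = ["0.0.0.0/0", "::/0"]
  }
}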

Droplet tier

The Droplet tier, tier1, is run when spinning up a new droplet. This will be done initially to run the blog and subsequently to swap out the underlying droplet. It contains a cloud-init script that performs the minimal provisioning of the droplet such that it is safely configured for SSH access, which is required to run the following tier. Specifically, the cloud-init script creates a non-root user, locks down SSH access, and ensures that authentication uses the Base tier SSH keys.
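
In cloud-config form, that boils down to something like this sketch (the username and key are placeholders):

#cloud-config
users:
  - name: samrap
    groups: sudo
    shell: /bin/bash
    sudo: ["ALL=(ALL) NOPASSWD:ALL"]
    ssh_authorized_keys:
      - ssh-rsa AAAA...  # the tier0 public key

# Lock down SSH: no root login, no password authentication
disable_root: true
ssh_pwauth: false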

Bootstrap tier

The Bootstrap tier, tier2, should always be run after and in conjunction with the Droplet tier. It uses Ansible to apply the host's configuration, install required packages, restore stateful backups (such as the blog's database and SSL certs), install crons, and finally start up the Docker service. After running the Bootstrap tier, the blog is ready to begin accepting secure HTTPS traffic.
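
The playbook's flow looks roughly like the following. The module choices and paths here illustrate the sequence, not my exact server.yml:

- hosts: blog
  become: true
  tasks:
    - name: Install required packages
      apt:
        name: [docker.io, certbot]
        state: present
        update_cache: yes

    - name: Restore stateful backups from the Space
      command: systools restore /var/lib/ghost/content

    - name: Install the daily backup cron
      cron:
        name: backup
        hour: "3"
        minute: "0"
        job: systools backup /var/lib/ghost/content

    - name: Start the Ghost container
      command: docker start ghost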

The Bootstrap tier is idempotent and can be run on already-bootstrapped hosts. One use case for this is updating the Ghost version by downloading the latest Docker image. However, this is also a great candidate for simply provisioning a brand new droplet by running tiers 1-3!

Deployment tier

The Deployment tier, tier3, is the final tier for provisioning the blog. It simply points the Digital Ocean Floating IP address to the newly created droplet by asking for the name given to the droplet in the Droplet tier. Assuming DNS is managed by Digital Ocean and points to that IP, users connecting to samrapdev.com will at that point be connecting to the newly provisioned droplet. At this point, it is safe to destroy the previous droplet.
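
In Terraform, the whole tier essentially boils down to one resource. The IP below is a stand-in for the manually created Floating IP, and in practice the droplet ID comes from tier1's state rather than an inline reference:

resource "digitalocean_floating_ip_assignment" "samrapdev" {
  ip_address = "203.0.113.10"                     # the Floating IP
  droplet_id = digitalocean_droplet.samrapdev.id  # the tier1 droplet
}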

Execution

This multi-tier setup lives under the same repository (samrap/samrapdev), with each tier in its own folder. This allows each tier to be run individually. As mentioned above, tier0 only needs to be run once. To replace an instance, we simply run tiers 1-3:

# Create the new droplet
(cd tier1 ; terraform apply)

# Provision the droplet. The output from tier1 will include the new
# instance's IP address, which should be added to the Ansible hosts
# file before running this command.
(cd tier2 ; ansible-playbook -i hosts server.yml)

# Assign Floating IP to the new Droplet, effectively sending all traffic to samrapdev.com to the new Droplet
(cd tier3 ; terraform apply)

That's it! With these three commands (plus updating the Ansible configuration to include the new IP), I can tear down and spin up a new instance of my blog on Digital Ocean.

Future Work

Right now, if I were to publish this blog post and immediately replace the underlying Droplet, I would probably lose the article. This is because the backup job runs as a daily cron. It would be nice if I could somehow hook into events within Digital Ocean to fire off the backup job whenever I run Terraform. This isn't crucial though, since I'm the only one managing this infrastructure.
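
Concretely, the backup today amounts to a daily crontab entry along these lines (the time and path are illustrative):

# Back up Ghost's stateful content to the Space every day at 3am
0 3 * * * /usr/local/bin/systools backup /var/lib/ghost/content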

Additionally, I'd like to clean up the Ansible configuration as there are a lot of hardcoded values scattered throughout. Eventually, it would be nice to package this up as a general purpose solution for anyone looking to spin up a Ghost CMS blog of their own on Digital Ocean in a matter of minutes! Until then, at least I can rest easy knowing that all my software is up-to-date and my Droplet is young and happy.
