Hosting Ghost in Kubernetes
Ghost is traditionally meant to be installed on a single VM. This article explores how to get Ghost running inside Kubernetes.
Ghost is currently geared towards hosting only one copy of Ghost per site. In fact, this is explicitly written in the FAQs:
Ghost doesn’t support load-balanced clustering or multi-server setups of any description, there should only be one Ghost instance per site.
and also the hosting page:
Clustering or sharding is not supported in any way.
At my work, amazee.io, we specialise in hosting applications in Kubernetes (both native Kubernetes and OpenShift), so I treated this as a challenge to see how well Ghost would sit on a completely different architecture than a single VM in DigitalOcean (which Ghost recommends as a web host).
Why is Kubernetes so different from a traditional VM?
Kubernetes runs your applications inside containers, and those containers run in pods. At any one time your pod may be running, or it may not (e.g. it may be in the process of being moved to another node). Building resilient architectures in Kubernetes often involves ensuring your application can run in a Highly Available (HA) setup, which often means running 2 or more pods at any one time. It also means avoiding any single points of failure.
This way your application can survive the loss of a single node, or even an Availability Zone, with zero changes (it is transparent to your application).
This is a radical deviation from Ghost's recommended hosting approach.
Approach to getting Ghost running on Kubernetes
The approach I took was:
- Inherit from an existing official Docker image for Ghost (to avoid me having to learn how to install Ghost). The GitHub repo is here. I used this image as a builder image.
- Use a Node.js 14 based image with Alpine Linux as the operating system (to ensure the image is as small as possible). Node.js 14 is an LTS version (with security support to April 2023) and is also Ghost's recommended version to run.
- Use a Persistent Volume (PV) for all uploaded files (e.g. images, themes etc) in Ghost (path is /var/lib/ghost/content/). This way these files will survive pod restarts. The PV will be ReadWriteMany to ensure multiple pods can read and write to the PV at the same time.
- Use a MySQL database through the Cloud provider in question (in my case AWS RDS Aurora). You never want to host a database yourself in Kubernetes, if you want it to be HA. You certainly do not want to use SQLite.
- Shim the supplied environment variables into the names that Ghost expects.
The above approach ended up working quite well. As you will see later on, the code to get this done is quite tiny.
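To make the environment variable shim a little more concrete, here is a minimal sketch of the idea in Python. It is not the code from my image. The left-hand names (DB_HOST, SITE_URL and so on) are hypothetical stand-ins for whatever your platform injects, and the right-hand names assume Ghost's convention of reading nested config keys from the environment with a double underscore separator.

```python
import os

# Hypothetical platform-supplied names (left) mapped to the names Ghost is
# assumed to read from the environment (right). Adjust both sides to taste.
ENV_MAP = {
    "DB_HOST": "database__connection__host",
    "DB_USER": "database__connection__user",
    "DB_PASSWORD": "database__connection__password",
    "DB_NAME": "database__connection__database",
    "SITE_URL": "url",
}


def shim_env(env):
    """Return a copy of env with Ghost-style names added for mapped variables."""
    out = dict(env)
    for source, target in ENV_MAP.items():
        # Never overwrite a Ghost-style variable that was set explicitly.
        if source in env and target not in out:
            out[target] = env[source]
    # If any connection detail is present, default the client to MySQL.
    if any(key.startswith("database__connection__") for key in out):
        out.setdefault("database__client", "mysql")
    return out


if __name__ == "__main__":
    shimmed = shim_env(dict(os.environ))
    # Print only the variable names, so no secrets end up in the pod logs.
    for key in sorted(set(shimmed) - set(os.environ)):
        print(f"would set {key}")
```

In a real container entrypoint you would hand the resulting environment to the Ghost process instead of printing it. A few lines of shell exporting the same names does the job just as well.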
What this ends up looking like
Here is a screenshot of Lens IDE, which lists the pods running for this site (meta, am I right?):
I also created a Horizontal Pod Autoscaler (HPA) to show that you can scale this deployment automatically with load.
The HPA likely will not be used (Ghost is not heavy on the CPU), but it is there in any case.
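For a feel of when an HPA would actually add pods, the core scaling rule from the Kubernetes documentation can be sketched in a few lines of Python. This is a simplification: it ignores the tolerance band, stabilisation windows and pod readiness, and the numbers in the usage lines are made up for illustration rather than taken from my cluster.

```python
import math


def desired_replicas(current_replicas, current_cpu, target_cpu,
                     min_replicas, max_replicas):
    """Simplified HPA rule: ceil(current * currentMetric / targetMetric).

    current_cpu and target_cpu are average CPU utilisation percentages.
    The result is clamped to the configured min/max replica counts.
    """
    raw = math.ceil(current_replicas * current_cpu / target_cpu)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Two pods averaging 90% CPU against a 50% target.
    print(desired_replicas(2, 90, 50, min_replicas=1, max_replicas=5))
    # One mostly idle pod stays at the minimum.
    print(desired_replicas(1, 10, 50, min_replicas=1, max_replicas=5))
```

With a blog that barely touches the CPU, the ratio stays well below 1 and the replica count simply sits at the minimum.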
N.B. Ghost appears to do some in-memory caching, which means that running 2 or more pods will cause issues (e.g. some URLs for newly created posts will respond with HTTP 404). It is likely best to run with 1 pod for now.
Example repository
I created an example to show this in action, which you can use to give yourself a head start:
Things still to do/gotchas
A few things I still need to solve, and/or write about:
- Email configuration - not essential for this blog, but it would be nice to actually get this working. Seems like just a few more environment variables to configure.
- The PV mount replaces the default files at /var/lib/ghost/content/, so the theme Casper will not be available the first time the application runs. Ghost gets grumpy about that. This can likely be fixed with another entrypoint that copies the files into the correct place during pod startup if they are missing.
- How I set up caching in Fastly. Ghost does not play nicely with CDNs out of the box. I had to tune this.
- Fix systemd complaints - systemd does not seem to be in Alpine Linux.
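The missing theme problem could be handled by something along the lines of the Python sketch below. I have not wired this into the image. It assumes a pristine copy of the default content is kept somewhere outside the mount (the content.orig path here is a hypothetical location), and it only copies themes that are absent from the PV, so anything already uploaded is left alone.

```python
import shutil
from pathlib import Path


def seed_themes(defaults: Path, content: Path) -> list:
    """Copy default themes into the mounted content directory if missing.

    Returns the names of the themes that were copied.
    """
    source_themes = defaults / "themes"
    if not source_themes.is_dir():
        return []
    target_themes = content / "themes"
    target_themes.mkdir(parents=True, exist_ok=True)
    copied = []
    for theme in sorted(source_themes.iterdir()):
        target = target_themes / theme.name
        # Skip anything already on the PV so user changes are never overwritten.
        if theme.is_dir() and not target.exists():
            shutil.copytree(theme, target)
            copied.append(theme.name)
    return copied


if __name__ == "__main__":
    # Hypothetical paths: the second is the PV mount from this article, the
    # first is an assumed location for a copy of the image's default content.
    copied = seed_themes(Path("/var/lib/ghost/content.orig"),
                         Path("/var/lib/ghost/content"))
    print("seeded themes:", ", ".join(copied) if copied else "none")
```

Running this before Ghost starts on every pod startup is safe, because a second run finds the themes already in place and copies nothing.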
I will update this blog post as further details come in and I solve the above things.
Comments
Keen to hear from anyone else who is interested in this, or has questions or concerns about this approach.