Scaling Dedicated Game Servers with Kubernetes: Part 4 – Scaling Down

This is part four of a four-part series on scaling game servers with Kubernetes.

In the previous three posts, we hosted our game servers on Kubernetes, measured and limited their resource usage, and scaled up the nodes in our cluster based on that usage. Now we need to tackle the harder problem: scaling down the nodes in our cluster as resources are no longer being used, while ensuring that in-progress games are not interrupted when a node is deleted.

On the surface, scaling down nodes in our cluster may seem particularly complicated. Each game server has in-memory state of the current game and multiple game clients are connected to an individual game server playing a game. Deleting arbitrary nodes could potentially disconnect active players — and that tends to make them angry! Therefore, we can only remove nodes from a cluster when a node is empty of dedicated game servers.

This means that if you are running on Google Kubernetes Engine (GKE), or similar, you can’t use a managed autoscaling system. To quote the documentation for the GKE autoscaler: “Cluster autoscaler assumes that all replicated Pods can be restarted on some other node…” — which in our case is definitely not going to work, since it could easily delete nodes that have active players on them.

That being said, when looking at this situation more closely, we discover that we can break this down into three separate strategies that when combined together make scaling down a manageable problem that we can implement ourselves:

  1. Group game servers together to avoid fragmentation across the cluster
  2. Cordon nodes when CPU capacity is above the configured buffer
  3. Delete a cordoned node from the cluster once all the games on the node have exited

Let’s look at each of these in detail.

Grouping Game Servers Together in the Cluster

We want to avoid fragmentation of game servers across the cluster, so we don’t end up with a small, wayward set of game servers still running across multiple nodes, which would prevent those nodes from being shut down and their resources reclaimed.

This means we don’t want a scheduling pattern that creates game server Pods on random nodes across our cluster like this:

Fragmented across the cluster

Instead, we want our game server Pods scheduled as tightly packed as possible, like this:

Packed together on the cluster

To group our game servers together, we can take advantage of the Kubernetes PodAffinity configuration with the PreferredDuringSchedulingIgnoredDuringExecution option. This gives us the ability to tell Kubernetes that we prefer to group Pods by the hostname of the node they are currently on, which essentially means that Kubernetes will prefer to put a dedicated game server Pod on a node that already has one on it.

In an ideal world, we would want a dedicated game server Pod to be scheduled on the node with the most dedicated game server Pods, as long as that node also has enough spare CPU resources. We could definitely do this if we wanted to write our own custom scheduler for Kubernetes, but to keep this demo simple, we will stick with the PodAffinity solution. That being said, when we consider the short length of our games, and that we will be adding (and explaining) cordoning nodes shortly, this combination of techniques is good enough for our requirements, and removes the need for us to write additional complex code.

When we add the PodAffinity configuration to the previous post’s configuration, we end up with the following, which tells Kubernetes to put Pods with the label sessions: game on the same node as each other whenever possible.
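A sketch of what that configuration might look like (the image path here is a placeholder, not necessarily the one used in the actual project):

```yaml
apiVersion: v1
kind: Pod
metadata:
  generateName: game-
  labels:
    sessions: game
spec:
  affinity:
    podAffinity:
      # Prefer (but don't require) scheduling next to other game Pods
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                sessions: game
            # Group by the hostname of the node
            topologyKey: kubernetes.io/hostname
  containers:
    - name: game-server
      image: gcr.io/myproject/game-server:0.1
```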

Cordoning Nodes

Now that we have our game servers relatively well packed together in the cluster, we can discuss “cordoning nodes”. What does cordoning nodes really mean? Very simply, Kubernetes gives us the ability to tell the scheduler: “Hey scheduler, don’t schedule anything new on this node here”. This ensures that no new Pods get scheduled on that node. In fact, in some places in the Kubernetes documentation, this is simply referred to as marking a node unschedulable.

Cordoning nodes

In the code below, if you focus on the section s.bufferCount < available, you will see that we make a request to cordon nodes whenever the amount of CPU buffer we currently have available is greater than the amount we have configured as our requirement. We’ve stripped some parts out for brevity, but you can see the original here.
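The shape of that check can be sketched in plain Go. The function names here are illustrative stand-ins, not the actual implementation: bufferCount is the configured buffer of spare game server capacity, and available is the spare capacity we currently have.

```go
package main

import "fmt"

// shouldCordonNodes reports whether we currently have more spare
// game server capacity than the configured buffer requires, meaning
// we can start cordoning the emptiest nodes.
func shouldCordonNodes(bufferCount, available int64) bool {
	return bufferCount < available
}

// shouldUncordonNodes reports whether we have dropped below the
// configured buffer, in which case cordoned nodes should be brought
// back into service before paying for brand new ones.
func shouldUncordonNodes(bufferCount, available int64) bool {
	return available < bufferCount
}

func main() {
	fmt.Println(shouldCordonNodes(30, 45))   // spare capacity above buffer
	fmt.Println(shouldUncordonNodes(30, 20)) // spare capacity below buffer
}
```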

As you can also see from the code above, if we drop below the configured CPU buffer, we can uncordon any available cordoned nodes in the cluster. This is faster than adding a whole new node, so it’s important to check for cordoned nodes before creating one from scratch. Because of this, we also have a configured delay before a cordoned node is deleted (you can see the source here), to limit unnecessary thrashing from creating and deleting nodes in the cluster.

This is a pretty great start. However, when we want to cordon nodes, we want to cordon only the nodes that have the least number of game server Pods on them, as in this instance, they are most likely to empty first as game sessions come to an end.

Thanks to the Kubernetes API, it’s relatively straightforward to count the number of game server Pods on each node and sort the nodes in ascending order. From there we can do the arithmetic to determine whether we would still remain above the desired CPU buffer if we cordoned each of the available nodes. If so, we can safely cordon those nodes.
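That selection logic can be sketched in plain Go. The node type here is a simplified stand-in for the data the real implementation reads from the Kubernetes API, not the actual types used in the project:

```go
package main

import (
	"fmt"
	"sort"
)

// node is a simplified stand-in for a Kubernetes node: its name, how
// many game server Pods it currently hosts, and how many more game
// servers its spare CPU could fit.
type node struct {
	name     string
	podCount int64
	capacity int64 // spare game server slots on this node
}

// nodesToCordon sorts nodes by game server Pod count (ascending, so
// the nodes most likely to empty first come first) and greedily
// cordons them, as long as the remaining uncordoned capacity stays at
// or above the configured buffer.
func nodesToCordon(nodes []node, buffer int64) []string {
	sort.Slice(nodes, func(i, j int) bool {
		return nodes[i].podCount < nodes[j].podCount
	})

	var available int64
	for _, n := range nodes {
		available += n.capacity
	}

	var cordoned []string
	for _, n := range nodes {
		// Only cordon if we would still remain above the buffer.
		if available-n.capacity >= buffer {
			available -= n.capacity
			cordoned = append(cordoned, n.name)
		}
	}
	return cordoned
}

func main() {
	nodes := []node{
		{"node-a", 8, 2},
		{"node-b", 1, 9},
		{"node-c", 4, 6},
	}
	fmt.Println(nodesToCordon(nodes, 8)) // cordons the emptiest node(s)
}
```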

Removing Nodes from the Cluster

Now that we have nodes in our clusters being cordoned, it is just a matter of waiting until the cordoned node is empty of game server Pods before deleting it. The code below also makes sure the node count never drops below a configured minimum as a nice baseline for capacity within our cluster.

You can see this in the code below, and in the original context:
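A simplified sketch of that logic in plain Go (the type and names here are illustrative stand-ins for what the real implementation reads from the Kubernetes API, not the actual source):

```go
package main

import "fmt"

// scalerNode is a minimal stand-in for the node information the
// scaler needs: whether the node is cordoned, and how many game
// server Pods it still hosts.
type scalerNode struct {
	name     string
	cordoned bool
	podCount int
}

// nodesToDelete returns the cordoned nodes that are now empty of game
// server Pods and can safely be removed, while making sure the total
// node count never drops below minNodes.
func nodesToDelete(nodes []scalerNode, minNodes int) []string {
	remaining := len(nodes)
	var deletable []string
	for _, n := range nodes {
		if n.cordoned && n.podCount == 0 && remaining > minNodes {
			deletable = append(deletable, n.name)
			remaining--
		}
	}
	return deletable
}

func main() {
	nodes := []scalerNode{
		{"node-a", true, 0},  // cordoned and empty: safe to delete
		{"node-b", true, 2},  // cordoned, but games still running
		{"node-c", false, 5}, // active node
	}
	fmt.Println(nodesToDelete(nodes, 1))
}
```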


We’ve successfully containerised our game servers, scaled them up as demand increases, and now scaled our Kubernetes cluster down, so we don’t have to pay for underutilised machinery — all powered by the APIs and capabilities that Kubernetes makes available out of the box. While it would take more work to turn this into a production level system, you can already see how to take advantage of the many building blocks available to you.

Before we finish, I would like to apologise for the delay in producing the fourth part in this series. If you saw the announcement, you may have guessed that a lot of my time got taken up developing and releasing Agones, the open source, productised version of this series of posts on running game servers on Kubernetes.

On that note, this will also be the last installment in this series. I had already completed the work to implement scaling down before starting on Agones, and rather than build out new functionality for global cluster management on Paddle Soccer, I’m going to focus those efforts on building out awesome new features for Agones and bring it up from its current 0.1 alpha release to a full 1.0, production-ready milestone.

I’m very excited about the future of Agones, and if my series of blog posts has inspired you, watch the GitHub repository, join the Slack, follow us on Twitter and get involved on the mailing list. We’re actively seeking more contributors, and would love to have you involved.

Lastly, I welcome questions and comments here, or reach out to me via Twitter. You can also see my presentation at GDC and GCAP from 2017 on this topic, as well as check out the code in GitHub.

All posts in this series:

  1. Containerising and Deploying
  2. Managing CPU and Memory
  3. Scaling Up Nodes
  4. Scaling Down Nodes
  5. Running Globally

Scaling Dedicated Game Servers with Kubernetes: Part 3 – Scaling Up Nodes

This is part three of a four-part series on scaling game servers with Kubernetes.

In the previous two posts we looked at hosting dedicated game servers on Kubernetes and measuring and limiting their memory and CPU resources. In this instalment we look at how we can use the CPU information from the previous post to determine when we need to scale up our Kubernetes cluster because we’ve run out of room for more game servers as our player base increases.

Separating Apps and Game Servers

The first step we should take before starting to write code to increase the size of the Kubernetes cluster is to separate our applications — such as match makers, the game server controllers, and the soon-to-be-written node scaler — onto different nodes in the cluster from those where the game servers will be running.

This has several benefits:

  1. The resource usage of our applications is now going to have no effect on the game servers, as they are on different machines. This means that if the matchmaker has a CPU spike for some reason, there is an extra barrier to ensure there is no way it could unduly affect a dedicated game server in play.
  2. It makes scaling up and down capacity for dedicated game servers easier – as we only need to look at game server usage across a specific set of nodes, rather than all potential containers across the entire cluster.
  3. We can use bigger machines with more CPU cores and memory for the game server nodes, and smaller machines with fewer cores and less memory for the controller applications, since they need fewer resources. We are essentially able to pick the right size of machine for the job at hand, which gives us great flexibility while still being cost effective.

Kubernetes makes setting up a heterogeneous cluster relatively straightforward and gives us the tools to specify where Pods are scheduled within the cluster – via the power of Node Selectors on our Pods.

It’s worth noting that there is also a more sophisticated Node Affinity feature in beta, but we don’t need it for this example, so we’ll ignore its extra complexity for now.

To get started, we need to assign labels (a set of key-value pairs) to the nodes in our cluster. This is exactly the same as you would have seen if you’ve ever created Pods with Deployments and exposed them with Services, but applied to nodes instead. I’m using Google Cloud Platform’s Container Engine, and it uses Node Pools to apply labels to nodes in the cluster as they are created and set up heterogeneous clusters – but you can also do similar things on other cloud providers, as well as directly through the Kubernetes API or the command line client.

In this example, I added the labels role:apps and role:game-server to the appropriate nodes in my cluster. We can then add a nodeSelector option to our Kubernetes configurations to control which nodes in the cluster Pods are scheduled onto.

For example, here is the configuration for the matchmaker application, where you can see the nodeSelector set to role:apps to ensure it has container instances created only on the application nodes (those tagged with the “apps” role).
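A sketch of what that configuration might look like (the image path is a placeholder, not necessarily the one used in the actual project):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: matchmaker
spec:
  # Only schedule on nodes labelled with role: apps
  nodeSelector:
    role: apps
  containers:
    - name: matchmaker
      image: gcr.io/myproject/matchmaker:0.1
```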

By the same token, we can adjust the configuration from the previous article to make all the dedicated game server Pods schedule just on the machines we specifically designated for them, i.e. those tagged with role: game-server:
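A sketch of the adjusted game server configuration (again, the image path is a placeholder):

```yaml
apiVersion: v1
kind: Pod
metadata:
  generateName: game-
  labels:
    sessions: game
spec:
  # Only schedule on nodes labelled with role: game-server
  nodeSelector:
    role: game-server
  containers:
    - name: game-server
      image: gcr.io/myproject/game-server:0.1
```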

Note that in my sample code, I use the Kubernetes API to provide a configuration identical to the one above, but the yaml version is easier to understand, and it is the format we’ve been using throughout this series.

A Strategy for Scaling Up

Kubernetes on cloud providers tends to come with automated scaling capabilities, such as the Google Cloud Platform Cluster Autoscaler, but since they are generally built for stateless applications, and our dedicated game servers store the game simulation in memory, they won’t work in this case. However, with the tools that Kubernetes gives us, it’s not particularly difficult to build our own custom Kubernetes cluster autoscaler!

Scaling up and down the nodes in a Kubernetes cluster probably makes more sense for a cloud environment, since we only want to pay for the resources we need and use. If we were running on our own premises, it may make less sense to change the size of our Kubernetes cluster; we could just run one or more large clusters across all the machines we own and leave them at a static size, since adding and removing physical machines is far more onerous than in the cloud, and wouldn’t necessarily save us money since we own or lease the machines for much longer periods.

There are multiple potential strategies for determining when you want to scale up the number of nodes in your cluster, but for this example we’ll keep things relatively simple:

  • Define a minimum and maximum number of nodes for game servers, and make sure we are within that limit.
  • Use CPU resource capacity and usage as our metric to track how many dedicated game servers we can fit on a node in our cluster (in this example we’re going to assume we always have enough memory).
  • Define a buffer of CPU capacity for a set number of game servers at all times in the cluster. That is, add more nodes if, at any point in time, we couldn’t add n more game servers to the cluster without running out of CPU resources.
  • Whenever a new dedicated game server is started, calculate if we need to add a new node in the cluster because the CPU capacity across the nodes is under the buffer amount.
  • As a fail-safe, every n seconds, also calculate if we need to add a new node to the cluster because the measured CPU capacity resources are under the buffer.

Creating a Node Scaler

The node scaler essentially runs an event loop to carry out the strategy outlined above.

Using Go in combination with the native Kubernetes Go client library makes this relatively straightforward to implement, as you can see below in the Start() function of my node scaler.

Note that I’ve removed most of the error handling and other boilerplate to make the event loop clearer, but the original code is here if you are interested.

For those of you who aren’t as familiar with Go, let’s break this down a little bit:

  1. kube.ClientSet() – we have a small piece of utility code, which returns to us a Kubernetes ClientSet that gives us access to the Kubernetes API of the cluster that we are running on.
  2. gw, _ := s.newGameWatcher – Kubernetes has APIs that allow you to watch for changes across the cluster. In this particular case, the code returns a data structure containing a Go Channel (essentially a blocking queue) that will return a value whenever a Pod for a game is added or deleted in the cluster. Look here for the full source for the gameWatcher.
  3. tick := time.Tick(s.tick) – this creates another Go Channel that returns a value each time the configured interval elapses, in this case every 10 seconds. If you would like to look at it, here is the reference for time.Tick.
  4. The main event loop is under the “// ^^^ MAIN EVENT LOOP HERE ^^^” comment. Within this code block is a select statement. This essentially declares that the system will block until either the gameWatcher channel or the tick channel (firing every 10 seconds) returns a value, and then execute s.scaleNodes(). This means that a scaleNodes command will fire whenever a game server is added or removed, and at least once every 10 seconds.
  5. s.scaleNodes() – run the scale node strategy as outlined above.

Within s.scaleNodes() we query the CPU limits that we set on each Pod, as well as the total CPU available on each Kubernetes node within the cluster, through the Kubernetes API. We can see the configured CPU limits in the Pod specification via the Rest API and Go Client, which gives us the ability to track how much CPU each of our game servers is taking up, as well as any of the Kubernetes management Pods that may also exist on the node. Through the Node specification, the Go client can also track the amount of CPU capacity available in each node. From here it is a case of summing up the amount of CPU used by Pods, subtracting it from the capacity for each node, and then determining if one or more nodes need to be added to the cluster, such that we can maintain that buffer space for new game servers to be created in.
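A sketch of that arithmetic in plain Go (CPU figures are in millicores; the function and parameter names are illustrative, not from the actual source):

```go
package main

import "fmt"

// nodesToAdd works out how many new nodes are needed to maintain the
// configured buffer of spare game server slots. cpuRequest is the CPU
// limit of one game server; nodeCapacity and nodeUsed are the per-node
// CPU capacity and the summed CPU limits of the Pods on each node;
// nodeCPU is the capacity of one new node.
func nodesToAdd(nodeCapacity, nodeUsed []int64, cpuRequest, bufferCount, nodeCPU int64) int64 {
	// Sum the spare game server slots across all nodes.
	var freeSlots int64
	for i := range nodeCapacity {
		freeSlots += (nodeCapacity[i] - nodeUsed[i]) / cpuRequest
	}
	if freeSlots >= bufferCount {
		return 0 // buffer is already maintained
	}

	// Round the missing slots up to whole nodes.
	slotsPerNode := nodeCPU / cpuRequest
	missing := bufferCount - freeSlots
	return (missing + slotsPerNode - 1) / slotsPerNode
}

func main() {
	// Two 2000m nodes, one nearly full; 100m per game server,
	// and a buffer of 30 game servers.
	fmt.Println(nodesToAdd([]int64{2000, 2000}, []int64{1800, 900}, 100, 30, 2000))
}
```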

If you dig into the code in this example, you’ll see that we are using the APIs on Google Cloud Platform to add new nodes to the cluster. The APIs that are provided for Google Compute Engine Managed Instance Groups allow us to add (and remove) instances from the Nodepool in the Kubernetes cluster. That being said, any cloud provider will have similar APIs to let you do the same thing, and here you can see the interface we’ve defined to abstract this implementation detail in such a way that it could be easily modified to work with another provider.

Deploying the Node Scaler

Below you can see the deployment YAML for the node scaler. As you can see, environment variables are used to set all the configuration options, including:

  • Which nodes in the cluster should be managed
  • How much CPU each dedicated game server needs
  • The minimum and maximum number of nodes
  • How much buffer should exist at all times

You may have noticed that we set the deployment to have replicas: 1. We did this because we always want only one instance of the node scaler active in our Kubernetes cluster at any given point in time. This ensures that we do not have more than one process attempting to scale up, and eventually scale down, our nodes within the cluster, which could definitely lead to race conditions and likely cause all kinds of weirdness.

Similarly, to ensure that the node scaler is properly shut down before creating a new instance of it if we want to update the node scaler, we also configure strategy.type: Recreate so that Kubernetes will destroy the currently running node scaler Pod before recreating the newer version on updates, also avoiding any potential race conditions.

See it in Action

Once we have deployed our node scaler, let’s tail the logs and see it in action. In the video below, we see via the logs that when we have one node in the cluster assigned to game servers, we have capacity to potentially start forty dedicated game servers, and have configured a requirement of a buffer of 30 dedicated game servers. As we fill the available CPU capacity with running dedicated game servers via the matchmaker, pay attention to how the number of game servers that can be created in the remaining space drops and eventually, a new node is added to maintain the buffer!

Next Steps

The fact that we can do this without having to build so much of the foundation is one of the things that gets me so excited about Kubernetes. While we touched on the Kubernetes client in the first post in this series, in this post we’ve really started to take advantage of it. This is what I feel the true power of Kubernetes really is – an integrated set of tools for running software over a large cluster, that you have a huge amount of control over. In this instance, we haven’t had to write code to spin up and spin down dedicated game servers in very specific ways – we could just leverage Pods. When we want to take control and react to events within the Kubernetes cluster itself, we have the Watch APIs that enable us to do just that! It’s quite amazing the core set of utilities that Kubernetes gives you out of the box – things many of us have been building ourselves for years and years.

That all being said, scaling up nodes and game servers in our cluster is the comparatively easy part; scaling down is a trickier proposition. We’ll need to make sure nodes don’t have game servers on them before shutting them down, while also ensuring that game servers don’t end up widely fragmented across the cluster, but in the next post in this series we’ll look at how Kubernetes can also help in these areas as well!

In the meantime, as with the previous posts – I welcome questions and comments here, or reach out to me via Twitter. You can see my presentation at GDC this year as well as check out the code in GitHub, which is still being actively worked on!

All posts in this series:

    1. Containerising and Deploying
    2. Managing CPU and Memory
    3. Scaling Up Nodes
    4. Scaling Down Nodes
    5. Running Globally

Scaling Dedicated Game Servers with Kubernetes: Part 2 – Managing CPU and Memory

This is part two of a four-part series on scaling game servers with Kubernetes.

In part one of this series, we looked at how we could take our dedicated game server, containerise it with Docker, and then host and manage it on Kubernetes, which was an excellent start. However, since our Kubernetes clusters are generally of a fixed size, we could potentially run out of available capacity to run all the game server containers we need to match all the players who want to play our game – and that would be a very bad thing.

There are lots of options for scaling up and down Kubernetes clusters, and we will go deeper into a custom Kubernetes node scaler in future posts. First we have to work out one very important thing: How much CPU and memory does my game server actually take up?

Without this knowledge, there is no way we can match up the CPU and/or memory utilisation of our game server with the available resources in our Kubernetes cluster, and therefore know how many game servers can be run within a cluster of a given size. We also won’t be able to calculate how many nodes to add when the number of players in our game increases (as it naturally will 😉) and we want to make sure the cluster is large enough to accommodate all of them. We will also want to scale down our cluster if traffic starts to drop to take advantage of cost savings by removing cluster nodes that aren’t being used.

Kubernetes Dashboard

The great thing is, visualisation of CPU and memory usage is another feature that Kubernetes provides out of the box. Kubernetes is bundled with a dashboard that gives us a nice visual representation of what is happening inside our cluster – such as listing Pods and Services, providing us with graphs of CPU, memory usage, and more.

(The Kubernetes dashboard!)

To securely connect to the Kubernetes dashboard, Kubernetes’ command line tool has the command:

kubectl proxy

Which when run, will return with:

Starting to serve on 127.0.0.1:8001

This creates a proxied connection to the Kubernetes master (assuming you have permission to access the cluster), which hosts the Kubernetes cluster’s API and the dashboard that is displayed above.

If we now point our browser to http://localhost:8001/ui, we will see the dashboard.

Determine CPU and Memory Usage

You may have noticed that the dashboard gives us aggregate statistics of CPU and memory across the cluster, but it can also provide us with the same information at the Pod level as well! Therefore, all we need to do to determine how much CPU and memory our game server is using, is deploy a Pod containing our game server (which we set up in the previous post), do some load tests by running multiple game sessions on it, and have a look at the provided graphs.

I did this for the Paddle Soccer game, and you can see one of the results below:

In the test above, this simple dedicated game server peaks at around 0.08 CPU cores and just over 34 megabytes of memory. This was the maximum usage we saw when doing load tests on the dedicated game server, so we’ll draw a line in the sand here, say that this is the upper limit of what our server utilises, add some buffer, and plan accordingly.

Restricting CPU and Memory Usage

One of the very useful capabilities of software containers such as Docker is the ability to put constraints on both the CPU and memory usage of the running container, and therefore the processes within it. Kubernetes exposes this to us through its Pod configuration, which means we can explicitly ensure that CPU and memory usage won’t go over a certain threshold and adversely impact other game servers running on the same node. Given how sensitive game servers can be to CPU fluctuations, this is a very useful thing!

Therefore, if we want to limit the CPU usage of the Pod that contains our game server, we can do so by updating our previous Pod definition:
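A minimal sketch of that update (the image path is a placeholder):

```yaml
apiVersion: v1
kind: Pod
metadata:
  generateName: game-
spec:
  containers:
    - name: game-server
      image: gcr.io/myproject/game-server:0.1
      resources:
        limits:
          cpu: "0.1"   # one tenth of a CPU core
```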

It’s worth noting that in my sample code, I use the Kubernetes API to provide a configuration identical to the one above, but the yaml version is easier to understand, and it is the format we’ve been using throughout this series.

In the above example we have set a limit of “0.1”, which ensures that our game server only ever has access to one tenth of a CPU on the node that it is running on. We did this by adding the resources section with a corresponding limits and cpu section in our yaml for the game server container definition.

I chose to set a maximum CPU usage of 0.1 to give some padding over the game server’s 0.08 core usage result we saw above, while still letting me fit 10 game servers per core on each Kubernetes cluster node, which should scale well for our needs.

We could also do similar limits with memory usage, but for the sake of simplicity we’re going to just be limiting CPU usage, and ultimately using just CPU for our scaling metrics as well. If you want to learn more, the “Managing Compute Resources for Containers” documentation is also an excellent guide on these Kubernetes features.

Next Steps

Now that we have a known CPU usage and limits on our game servers, in the next posts we will combine this with the node capacity that the Kubernetes API reports, to enable scaling the cluster size up and down in relation to the number of game servers requested on the cluster.

In the meantime, as with the previous post – I welcome questions and comments here, or reach out to me via Twitter. You can see my presentation at GDC this year on this topic as well as check out the code in GitHub, which is still being actively worked on!

All posts in this series:

  1. Containerising and Deploying
  2. Managing CPU and Memory
  3. Scaling Up Nodes
  4. Scaling Down Nodes
  5. Running Globally

Scaling Dedicated Game Servers with Kubernetes: Part 1 – Containerising and Deploying

This is part one of a four-part series on scaling game servers with Kubernetes.

For the past year and a half, I’ve been researching and building a variety of multiplayer games. Particularly those in the MMO or FPS genres, as they have some unique and very interesting requirements for communication protocols and scaling strategies. Due to my background with software containers and Kubernetes, I started to explore how they could be used to manage and run dedicated game servers at scale. This became an area of exploration for me, and honestly I’ve been quite impressed with the results.

Why Are You Doing This Thing?

While containers and Kubernetes are cool technologies, why would we want to run game servers on this platform?

  • Game server scaling is hard, and often the work of proprietary software – software containers and Kubernetes should make it easier, with less coding.
  • Containers give us a single deployable artifact that can be used to run game servers. This removes the need to install dependencies or configure machines during deployment, as well as greatly increases confidence that software will run the same on development and testing as it will in production.
  • The combination of software containers and Kubernetes lets us build on top of a solid foundation for running essentially any type of software at scale – from deployment, health checking, log aggregation, scaling and more, with APIs to control these things at almost all levels.
  • At its core, Kubernetes is really just a cluster management solution that works for almost any type of software. Running dedicated games at scale requires us to manage game server processes across a cluster of machines – so we can take advantage of the work already done in this area, and just tailor it to fit our specific need.
  • Both of these projects are open source and actively developed, so we can also take advantage of any new features that are developed going forward.

If you want to learn more about the intricacies of designing and developing communication and backends for FPS/MMO multiplayer games from the ground up, I suggest starting with the articles on Gaffer on Games or, if you prefer books, The Development and Deployment of Multiplayer Games.


I will put one proviso on all of this. While I am aware of more than a few companies that are containerising their game servers and running them in production, I am not aware of any that are doing so on Kubernetes. I have heard of some that are experimenting with it, but don’t have any confirmed production uses at this stage. That being said, Kubernetes itself is being used by many large enterprises, and I feel this technology combination is solid, super interesting, and could potentially save game studios a lot of time. However, I am still working out where all the edges are. Regardless, I will happily share the results of all my research – and would love to hear from others about their experiences.

Paddle Soccer

(Paddle Soccer in action. What a game!)

To test out my theories, I created a very simple Unity based game called Paddle Soccer, which is essentially exactly as described. It’s a two-player, online game in which each player is a paddle, and they play soccer, attempting to score goals against each other. It has a Unity client as well as a Unity dedicated server. It takes advantage of the Unity High Level Networking API to provide the game state synchronisation and UDP transport protocol between the servers and the client. If you are curious, all the code is available on GitHub for your perusal.

It’s worth noting that this is a session-based game; i.e. you play for a while, and then the game finishes and you go back to the lobby to play again. We will be focusing on that kind of scaling, as well as using that design to our advantage when deciding when to add or remove server instances. That being said, in theory these techniques would work with an MMO-type game, with some adjustment.

Paddle Soccer Architecture

Paddle Soccer uses a traditional overall architecture for session-based multiplayer games:

Architecture diagram

  1. Players connect to a matchmaker service, which pairs them together, using Redis to help facilitate this.
  2. Once two players are joined for a game session, the matchmaker talks to the game server manager to get it to provide a game server on our cluster of machines.
  3. The game server manager creates a new instance of a game server that runs on one of the machines in the cluster.
  4. The game server manager also grabs the IP address and the port that the game server is running on, and passes that back to the matchmaker service.
  5. The matchmaker service passes the IP and port back to the players’ clients.
  6. …and finally the players connect directly to the game server and can now start playing the game against each other.

Since we don’t want to build this type of cluster management and game server orchestration ourselves, we can rely on the power and capabilities of containers and Kubernetes to handle as much of this work as possible.

Containerising the Game Server

The first step in this process is putting the game server into a software container, so that Kubernetes can deploy it. Putting the game server inside a Docker container is essentially the same as containerising any other piece of software.

If this is not something you have done before, you’ll want to follow Docker’s tutorial, or if you like books, have a read of The Docker Book: Containerization is the new virtualization.

Here is the Dockerfile that is used to put the Unity dedicated game server in a container:
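The actual Dockerfile lives in the repository; a sketch along the lines described below might look like this (the base image, archive name, and binary name are placeholders):

```dockerfile
FROM ubuntu:16.04

# Docker runs processes as root by default, so create an
# unprivileged "unity" user to run the game server under.
RUN useradd --create-home unity
WORKDIR /home/unity

# The build process produces a tarball of the Linux dedicated server
# build; ADD auto-extracts it into the unity user's home directory.
ADD server.tar.gz /home/unity/
USER unity

# Tell Unity to write its logs to stdout so that Docker and
# Kubernetes can collect and aggregate them.
ENTRYPOINT ["./server.x86_64", "-logFile", "/dev/stdout"]
```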

Because Docker runs as root by default, I like to create a new user and run all my processes inside a container under that account. Therefore, I’ve created a “unity” user for the game server and copied the game server into its home directory. As part of my build process, I create a tarball of my dedicated game server, and it’s been built such that it will run on a Linux operating system.

The only other interesting detail is that when I set the ENTRYPOINT (the process to run when the container starts), I tell Unity to output the logs to /dev/stdout (standard out, i.e. display in the foreground), as that is where Docker and Kubernetes collect logs from for aggregation.

From here I am able to build this image and push it to a Docker registry, so that I can share and deploy this image to my Kubernetes cluster. I use Google Cloud Platform’s private Container Registry for this, so that I have a private and secure repository of my Docker images.

Running the Game Server

For more traditional systems, Kubernetes provides several really useful constructs, including the ability to run multiple instances of an application across a cluster of machines and great tooling to load-balance between them. However, for game servers, this is the direct opposite of what we want. Game servers usually maintain stateful data about players and the game in memory, and require very low latency connections to keep that state synchronised with game clients, such that players do not notice a delay. Therefore, we need a direct connection to the game server, without any intermediaries in the way adding latency, as every millisecond counts.

The first step is to run the game server. Each instance is stateful, so no two are interchangeable, which means we can’t use a Deployment like we would for most stateless systems (such as web servers). Instead, we will lean on the most basic building block for deploying software on Kubernetes – the Pod.

A Pod is simply one or more containers that run together with some shared resources, such as an IP address and port space. In this particular instance, we will only have one container per Pod, so if it makes things easier to understand, just think of a Pod as synonymous with a software container for the duration of this article.

Connecting Directly to the Container

Normally, a container runs in its own network namespace and it isn’t directly connectable via the host without some work to forward the open ports inside the running container to the host. Running containers on Kubernetes is no different – usually you use a Kubernetes Service as a load balancer to expose one or more backing containers. However, for game servers, that just won’t work, due to the low latency requirement for network traffic.

If you want to learn about the basics of deploying to Kubernetes, try the interactive tutorials.

Fortunately, Kubernetes allows Pods to use the host networking namespace directly by setting hostNetwork to true when configuring the Pod. Since the container runs on the same kernel as the host, this gives us a direct network connection without additional latencies, and means we can connect directly to the IP of the machine the Pod is running on and connect directly to the running container.

While my example code makes a direct API call against Kubernetes to create the Pod, common practice is to keep your Pod definitions in YAML files that are sent to the Kubernetes cluster through the command line tool kubectl. Here’s an example of a YAML file that tells Kubernetes to create a Pod for the dedicated game server, so that we can discuss the finer details of what is going on:
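Reconstructed to match the field-by-field breakdown below (the container name and image path are my assumptions), such a Pod specification looks roughly like:

```yaml
apiVersion: v1
kind: Pod
metadata:
  generateName: "game-"
spec:
  hostNetwork: true
  restartPolicy: Never
  containers:
    - name: soccer-server
      # the image path is an assumption for illustration
      image: gcr.io/soccer/soccer-server:0.1
      env:
        - name: SESSION_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
```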

Let’s break this down:

  1. kind
    Tell Kubernetes that we want a Pod!
  2. metadata > generateName
    Tell Kubernetes to generate a unique name for this Pod within the cluster, with the prefix “game-”
  3. spec > hostNetwork
    Since this is set to true, the Pod will run in the same network namespace as the host.
  4. spec > restartPolicy
    By default, Kubernetes will restart a container if it crashes. In this instance, we don’t want that to happen: we hold game state in memory, and if the server crashes, there is no way to restart the game where it left off. Therefore, we turn this behaviour off.
  5. spec > containers > image
    Tells Kubernetes which container image to deploy to the Pod. Here we are using the container image we created earlier for the dedicated game server.
  6. spec > containers > env > SESSION_NAME
    We are going to pass into the container the cluster-unique name for the Pod as an environment variable SESSION_NAME, as we will use it later. This is powered by the Kubernetes Downward API.

If we deploy this YAML file to Kubernetes with the kubectl command line tool, and we know which port the game server will open, we can use the command line tools and/or the Kubernetes API to find the IP of the node in the cluster it is running on, and send that to the game client so it can connect directly!

Since we can also create a Pod via the Kubernetes API, Paddle Soccer has a game server management system called sessions, which has a /create handler for creating new instances of the game server on Kubernetes. When called, it creates a game server as a Pod with the above details. The matchmaking service can then invoke this endpoint whenever it needs a new game server started so that two players can play a game!

We can also use the built-in Kubernetes API to determine which node in the cluster the new Pod is on, by looking it up from its generated Pod name. In turn, we can then look up the external IP of the node, and now we know what IP address to send to game clients.
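In code, that lookup boils down to finding the node a Pod was scheduled onto and picking the node’s external address. The types below are simplified stand-ins for the real Kubernetes API objects (which the sessions service accesses through the Kubernetes client library), just to illustrate the logic:

```go
package main

import "fmt"

// Simplified stand-ins for the Kubernetes API types; the real service
// would use the Kubernetes client library instead.
type NodeAddress struct {
	Type    string // e.g. "InternalIP", "ExternalIP"
	Address string
}

type Node struct {
	Name      string
	Addresses []NodeAddress
}

// externalIP returns the external IP of the node a Pod landed on,
// which is the address we ultimately hand back to game clients.
func externalIP(node Node) (string, bool) {
	for _, a := range node.Addresses {
		if a.Type == "ExternalIP" {
			return a.Address, true
		}
	}
	return "", false
}

func main() {
	node := Node{
		Name: "gke-node-1",
		Addresses: []NodeAddress{
			{Type: "InternalIP", Address: "10.0.0.2"},
			{Type: "ExternalIP", Address: "35.192.10.20"},
		},
	}
	if ip, ok := externalIP(node); ok {
		fmt.Println(ip) // the address sent to the matchmaker
	}
}
```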

This solves some problems for us already:

  • We have a prebuilt solution for deploying a server to our cluster of machines through container images and Kubernetes.
  • Kubernetes manages scheduling the game servers across the cluster, without us having to write our own bin-packing algorithm to optimise our resource usage.
  • New versions of the game server can be deployed through standard Docker/Kubernetes mechanisms; we don’t need to write our own.
  • We get all sorts of goodies for free – from log aggregation to performance monitoring and more.
  • We don’t have to write much code (~500 LOC) to coordinate game servers across a cluster of machines.

Port Management

Since we will likely have multiple dedicated game servers running on each of the nodes within our Kubernetes cluster, they will each need their own port to run on. Unfortunately, this isn’t something that Kubernetes will help us with, but solving this problem isn’t particularly difficult.

The first step is to decide on a range of ports that you want traffic to go through. This makes network rules for your cluster simpler (if you don’t want to add/remove network rules on the fly), and also makes things easier for your players if they ever need to set up port forwarding or the like on their own networks.

To solve this problem, I tend to keep things as simple as possible: I pass the usable port range as two environment variables when creating my Pod, and have the Unity dedicated server randomly select a port within that range until it opens a socket successfully.
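The actual implementation is Unity C#, but the technique is easy to sketch in Go (the port range and attempt limit here are arbitrary illustrations, not Paddle Soccer’s values):

```go
package main

import (
	"fmt"
	"math/rand"
	"net"
)

// selectPort keeps picking random ports within [min, max] until one
// opens successfully, mirroring the approach described above. The
// attempt limit guards against an exhausted range.
func selectPort(min, max, attempts int) (net.Listener, int, error) {
	for i := 0; i < attempts; i++ {
		port := min + rand.Intn(max-min+1)
		l, err := net.Listen("tcp", fmt.Sprintf(":%d", port))
		if err == nil {
			return l, port, nil
		}
	}
	return nil, 0, fmt.Errorf("no free port found in %d-%d", min, max)
}

func main() {
	l, port, err := selectPort(40000, 40100, 50)
	if err != nil {
		panic(err)
	}
	defer l.Close()
	fmt.Println("listening on port", port)
}
```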

You can see the Paddle Soccer Unity game server doing exactly this:

Each call to SelectPort chooses a random port within the range, to be opened on StartServer invocation. StartServer returns false if it was unable to open a port and start the server.

You may also have noticed the call to instance.Register. Kubernetes doesn’t give us any way to introspect which port this container started on, so we need to report it ourselves. To that end, the Paddle Soccer game server manager has a simple /register REST endpoint, backed by Redis for storage, that takes the Pod name Kubernetes provides (passed through via an environment variable) and stores the port the server started on. It also provides a /get endpoint for looking up the port a given game server started on. This is packaged along with the REST endpoints that create game servers, so we have a single service for managing game servers within Kubernetes.

Here is the dedicated game server registering code:

You can see where the game server takes the environment variable SESSION_NAME containing the cluster-unique Pod name and combines it with the port. This combination is then sent as a JSON packet to the game server manager’s /register handler.

Putting it All Together

If we combine this with the Paddle Soccer game clients, and a very simple matchmaker, we end up with the following:

Kubernetes API step through

  1. One player’s client connects to the matchmaker service, which waits, since it needs two players for a game
  2. A second player’s client connects to the matchmaker service, and the matchmaker determines it needs a game server to connect these two players to, so it sends a request to the game server manager
  3. The game server manager makes a call to the Kubernetes API to tell it to start a Pod in the cluster with the dedicated game server inside it
  4. The dedicated game server starts
  5. The dedicated game server registers itself with the game server manager, telling it what port it started on
  6. The game server manager grabs the aforementioned port information, and the IP information for the Pod from Kubernetes, and passes it back to the matchmaker
  7. The matchmaker passes the port and IP information back to the two player clients
  8. The clients now connect directly to the dedicated game server, and play the game
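Steps 1 and 2 above reduce to holding the first player until a second arrives. The real matchmaker keeps its queue in Redis; this hypothetical in-process Go sketch shows just the pairing logic:

```go
package main

import "fmt"

// Matchmaker pairs players in arrival order. A buffered channel stands
// in here for the Redis-backed queue the real service uses.
type Matchmaker struct {
	waiting chan string
}

func NewMatchmaker() *Matchmaker {
	return &Matchmaker{waiting: make(chan string, 128)}
}

// Join returns a paired opponent, or ok=false if the player must wait
// for someone else to arrive.
func (m *Matchmaker) Join(player string) (opponent string, ok bool) {
	select {
	case opponent = <-m.waiting:
		return opponent, true // two players: time to request a game server
	default:
		m.waiting <- player // first player waits
		return "", false
	}
}

func main() {
	mm := NewMatchmaker()
	if _, ok := mm.Join("alice"); !ok {
		fmt.Println("alice is waiting")
	}
	if opp, ok := mm.Join("bob"); ok {
		fmt.Println("bob matched with", opp)
	}
}
```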

Et Voilà! We have a multiplayer dedicated game running in our cluster!

In this example, a relatively small amount of custom code (~500 LOC) was able to deploy, create, and manage game servers across a large cluster of machines. Honestly, it’s pretty awesome how much power containers and Kubernetes give you!

This is but part one in the series, however! In the next episode, we will look at how we can use APIs to scale up our Kubernetes cluster as demand for our game increases.

In the meantime, I welcome questions and comments here, or you can reach out to me via Twitter. You can see my presentation at GDC this year on this topic, and check out the code on GitHub, which is still being actively worked on!

All posts in this series:

  1. Containerising and Deploying
  2. Managing CPU and Memory
  3. Scaling Up Nodes
  4. Scaling Down Nodes
  5. Running Globally

Cloud NEXT for Game Developers

If you are a game developer working on any kind of mobile, multiplayer or online connected game, you may want to attend Google Cloud’s premier conference in San Francisco, Cloud NEXT, especially since it is conveniently timed right after GDC. Even if you aren’t attending GDC, if you are building any backend infrastructure for online or multiplayer games, or doing analytics or machine learning over player data, there is plenty to see at Cloud NEXT.

That said, there are over 200 breakout sessions over the three days of the event, and it can be hard to work out which sessions might actually be useful to you, as a game developer, so this guide is here to help!

While there is a wide variety of sessions applicable to any kind of backend development for online games, here are a few that are directly impactful for game developers and are definite must-sees at Cloud NEXT.

Open Persistent, Multiplayer Worlds

The session I’ll be co-presenting with Rob Whitehead, CTO of Improbable, is Building Massive Online Worlds with SpatialOS and Google Cloud Platform.

Late last year we announced our partnership with Improbable, who run their product SpatialOS on Google Cloud Platform. This is particularly exciting because it puts building almost any kind of large, persistent open world, with multitudes of players as well as autonomous agents such as NPCs, within the reach of indie developers. It’s built as a backend-as-a-service, so SpatialOS will manage scaling to meet your needs without your team’s intervention, saving your team a lot of infrastructure management responsibilities and the costs associated with them.

Location Based Games

With the explosion that was Pokémon Go, location-based gaming is becoming a really hot topic. The session Location-based gaming: trends and outlook will look at emerging trends in this space and how you can use Google tools to get an accurate representation of the world around you and use it within your game. I particularly love the combination of this technology with augmented reality games, and I’m predicting we’ll see a lot more of these types of games in the future.


GPUs in the Cloud

While not a session aimed specifically at game developers, if you need GPUs on your virtual machines to render models, or to run games in real time in the cloud, I would also mark Supercharge performance using GPUs in the cloud as a must-see session. I’m particularly excited about streaming alternate viewing cameras in online games, especially in e-sports, by running connected client instances of games in the cloud – and for that, you need GPUs.

Game Analytics

Finally, if you want to see the Product Manager for Games at Google Cloud Platform, Daniel Grachanin, who is co-presenting this talk, or are looking for a very practical session on how you can use Big Data to instrument and introspect your game, you should check out the session Game changer: Google Cloud’s powerful analytics tools to collect and visualize game telemetry and data. I always really enjoy talks like this, because so often players don’t end up doing what you, as a game designer, would expect them to do. Tools like these give you insight into how your players are actually engaging with your game, which is always fascinating.

On the surface, Cloud NEXT may not seem like an event aimed at game developers, but if you are looking to build any kind of backend infrastructure to support your game, there is a plethora of information available at this event: from the talks above, to many sessions and bootcamps on general application development, security, big data and analytics, machine learning and more.

Finally, if you come – make sure to wander up and say hello!

Recording: Creating interactive multiplayer experiences with Firebase

Earlier this year, I had the honour of presenting at Google I/O 2016, on how to build multiplayer games with Firebase.

If you aren’t familiar with Firebase, it’s a fantastic Backend-as-a-Service with several features, most notably the Realtime Database, that can drastically cut down development time for certain types of games.

In this presentation, I build a simple web-based game, using only client-side JavaScript (no frameworks!) and Firebase, that shows off real-time chat, some very simple matchmaking, and a warped version of Rock-Paper-Scissors that can be played over the internet: Happy-Angry-Surprised.

The premise of the game is this: a webcam takes a picture of your face, and a picture of the person you are playing against. Each player’s facial emotion is determined via the Cloud Vision API, and the winner is determined by:

  1. Happy beats Angry
  2. Surprised beats Happy
  3. Angry beats Surprised
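The original game logic is client-side JavaScript, but the three rules above are just a Rock-Paper-Scissors variant, simple enough to sketch in a few lines of Go (a hypothetical illustration, not the actual game code):

```go
package main

import "fmt"

// beats maps each emotion to the emotion it defeats:
// Happy beats Angry, Surprised beats Happy, Angry beats Surprised.
var beats = map[string]string{
	"happy":     "angry",
	"surprised": "happy",
	"angry":     "surprised",
}

// winner returns 1 or 2 for the winning player, or 0 for a draw.
func winner(p1, p2 string) int {
	switch {
	case p1 == p2:
		return 0
	case beats[p1] == p2:
		return 1
	case beats[p2] == p1:
		return 2
	default:
		return 0 // unrecognised emotion
	}
}

func main() {
	fmt.Println(winner("happy", "angry"))     // 1
	fmt.Println(winner("happy", "surprised")) // 2
	fmt.Println(winner("angry", "angry"))     // 0
}
```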

Game screenshot

That’s really about it!

This was an incredibly fun demo to build, and also presentation to give. I especially love that I am able to play the game with an audience member when I demo the final application – it’s a load of fun.

If you want to see the code, it’s available on GitHub for your download and perusal as well.

One Year as a Developer Advocate

A little while ago, I passed the one-year mark of working for Google. This also marked the first anniversary of my move from a (long?) career as a software engineer into a full-time position in Developer Relations, and more specifically, Developer Advocacy.

I wanted to write a post about this because, when I talk to fellow tech-community members about the transition from software engineer to Developer Advocate, I normally get a response along the lines of:

“Oh, I don’t think I could stop being an engineer”

“How is it going not writing code anymore?”

And various things along those lines. Usually at that point, we go into a more detailed and nuanced conversation about what it means to be a Developer Advocate, at least from my somewhat limited perspective, and those sorts of comments tend to fade into the background.

So, in lieu of having this conversation with everyone over and over again, I thought I’d write down some of my thoughts and feelings about what it means to be a Developer Advocate, the transition from being a full-time software engineer, and some of what I’ve been up to for the past year since joining Google, and more specifically Google Cloud Platform. Hopefully you will find it interesting.

Before I get started, I’ll throw out a big disclaimer: the whole developer relations space, and the various roles found in it, seems to vary quite greatly from company to company, team to team, and even between individuals within teams. Please take this as my own personal opinion, and feel free to take it with a large handful of salt (I’ve only been professionally at this for a year!). What I call “Developer Advocacy”, other people may call other things, and it may blend into other jobs in other places.

So what does it mean to me to be a Developer Advocate? I feel that at its most core level, it means that my job is to advocate for external developers of a given product. Which is about as literal a meaning as you can get.

This means that I want to communicate with external developers and tell them about the product I represent (in this case, Google Cloud Platform), both to get them excited about our product and to give them enough information to know whether it fits their needs. To belabour this point a little: I advocate for developers, so if there is a better product for your needs, I feel my job as a Developer Advocate is to point you in that direction. My job is not sales (which is a totally different, yet just as important, role), and I don’t have a quota. If you are happy using the tools and services you already use, that is fantastic; if I can give you some more information that may be of service down the line, then great, and if not, that’s totally fine as well.

I have people skills!

Advocating for developers also happens internally. This is the part that many people don’t see; often the perception is that the job is entirely outward facing. I want to be able to hand developers the best version of a product that I can, because I am on their side, and that means providing feedback internally too: to product managers, documentation, engineering, and anywhere else I can see a place to make an improvement.

I’ve found this role has suited me quite well. Even when I was a full-time software engineer, I always had a strong interest in how to build community, how to effectively engage with people, and the whole world of social interaction. In many ways, the open source projects I started, and the communities I subsequently helped build around those projects, were a natural extension of that interest. In the open source arena, I quickly learned that the greatest project in the world wouldn’t have any users if all you did was whisper its praise into a well. Getting the message out into the world in an interesting and engaging way was the only way to (a) get people interested in the project and (b) get people invested enough to contribute back to it, via feedback, pull requests or otherwise, and thereby improve the project overall.

On top of that, I invested heavily in the communities I was involved with. I was an Adobe Community Professional for many years, I presented at a variety of conferences, including Adobe MAX, I ran a conference in Melbourne for a number of years, and I took on quite a few other endeavours in a similar vein. So when the opportunity raised its head to move into the developer relations space, it was really just a flip: from software engineering outweighing advocacy in my day-to-day life, to advocacy outweighing software engineering.

I will readily admit that it was an interesting mental shift making that flip. I know that I personally self-identified strongly as a “software engineer” first and foremost (and from anecdotal evidence from talking to peers, many feel the same), and it was an initial struggle to think I was giving that up, before I truly understood this role and let go of some cognitive dissonance. Now that I’ve had some time in this position, I can say that I’ve never felt I’ve stopped being a software engineer. In fact, being a software engineer is a critical part of doing my job effectively; I just use it in a slightly different way. Yes, I do sometimes miss solving large architectural problems or consistently writing code all day long, but on the flip side, the code I write is generally only the code I want to write. If I want to spend some time working out how I can use Haskell on Google Cloud Platform, I can decide to do so (this is on my todo list, once I get a better understanding of Haskell). On top of that, no one is going to call me at 3am to let me know a server is on fire and I need to get up and fix it. So there are pros and cons both ways.

So, given the above, what does a typical “day at the office” look like for me? Honestly, one of my favourite things about this job is that there are really no “typical days”. Each day differs vastly from the next, but here is a (most likely incomplete) list of things I’ve been up to in the past year, just to give you an idea of what I’ve been involved with.

Doing a Lot of Research

This may not be immediately obvious from the outside, but I spend a lot of time reading articles, running through online training, and building proofs of concept, just to learn new things. I was thankful that I had prior experience with Google Cloud Platform before joining, but I quickly found (a) that I really had only a small understanding of the platform, naturally biased towards the types of applications I had previously built on it, and (b) that Google Cloud Platform is constantly releasing new things, so choosing which products will be relevant to the audience I identify with, and learning about them, is a constant, ongoing task.

This wasn’t just limited to Google Cloud either – I spent an inordinate amount of time playing with Docker containers, in some rather bizarre and interesting ways, to try and get a deep understanding of the platform (and also because it’s a ridiculous amount of fun).

On top of that, in the past five months or so I’ve become increasingly focused on game development. Over the past few years, I’ve had a romantic notion of one day getting more involved with game development, and working on Google Cloud has allowed me to foster that. This means I’ve spent a lot of time reading and playing with various tools and libraries, but also talking to various people across Google and outside, from sales and marketing to product management and game developers, to truly understand what game developers need out of cloud infrastructure, and also how having an internet-connected game changes the types of games that can be built and how they are developed in the first place. I’ve still got a lot more to learn in this area, as it is massive and varied, but it’s been an absolute delight to get involved in something I’ve become increasingly passionate about as I get older.

Writing Software

It’s true! I do spend time writing software! Crazy!

More often than not, this consists of building fun and interesting demonstrations of Google Cloud Platform, which eventually become content for presentations, and which are also a perfect avenue for truly understanding the product(s) I’m talking about; this goes hand in hand with the research section above.

Sometimes this is just open source for open source’s sake, because it’s an interest, or useful to what I do. The demos I build become open source projects as well, although that being said, I’m currently behind on releasing about three things, just because I’ve got to send them through code reviews and the like.

Presenting at Conferences and Events

Unsurprisingly, I do a fair amount of presenting at conferences and other events, such as meetups. I’ve had the pleasure of attending some fantastic events, and particularly enjoyed having the opportunity to present at Google I/O this year.

I find a distinct pleasure in getting up on stage and building a narrative that people can use to understand a topic, and having them walk away with a deeper understanding than they had before. I work very hard to make sure people leave a presentation with enough understanding to know whether the topic at hand is something they are keen to explore further, or whether it is not relevant to them at this time, which, in my mind, is just as important. Also, getting a laugh out of an audience is just a great feeling as well.

Creating Online Content

Online content has also been an interesting adventure. I’ve always been a relatively intermittent blogger, and while this job has me writing somewhat more than before, I’m nowhere near what I would call “prolific”. That being said – I did have the wonderful opportunity to start the weekly Google Cloud Platform Podcast with my teammate Francesc Campoy Flores.

Podcast setup

Credit: Robert Kubis @ GCP NEXT, San Francisco

I’ve been an avid podcast listener for a long time, and started the sporadically released 2 Developer Down Under podcast with Kai Koenig several years ago, but the Google Cloud Platform Podcast has been a far more professional endeavour, with a much larger audience. It has given me the opportunity to interview a wide variety of people inside Google, as well as a slew of customers and users of Google Cloud outside it. We’ve had guests from San Francisco to South Africa, and feedback on the podcast has been consistently positive, which has been incredible.

Interacting with the Community

Interacting with the community is also an integral part of what I do as a Developer Advocate. This can be in person at meetups and conferences, via Hangouts for in-depth customer meetings, or online via Twitter and the Google Cloud Slack community that I founded with teammates Sandeep Dinesh and Ray Tsang.

If you are interested in Google Cloud Platform, the Slack community has grown by leaps and bounds, and is a great place for Google Cloud users to interact with each other and share experiences. There are also a large number of Googlers who usually hang out there for you to chat with. I’m very happy to see how much the community has come together through a shared passion for Google Cloud Platform, especially considering how widely dispersed its members are.

Overall, community interaction is a hugely valuable activity for me, because this is where I really get to see what software people are building “out in the wild”, ideally help them find solutions that will aid them in their endeavours, and mentally match that back to what we are doing at Google so we can build a better service.

Product Feedback

Doing all of the above gives me the research I need to effectively provide feedback to the right people in Google Cloud Platform. Without my external-facing endeavours, it would simply be my own opinion (and feel free to give that the level of respect you feel is appropriate), nowhere near as valuable as it is when combined with the voices of the people I meet at events, interact with online, and talk to on the podcast.

Feedback takes many forms: reports from events that are shared far and wide, regular old bug filing, and reports of general pain points. It can be sit-down meetings with product managers to discuss existing features, or even proposals for new features that I feel would be a great addition to a product.

I feel this is the part most people don’t think of when they think of developer advocacy, which is totally understandable; it’s not immediately obvious to the outside world. But closing this loop, from external to internal advocacy, is an important part of my job and of developer advocacy in general, because if I’m not making sure the product is getting better, I’m not making sure the people I advocate for, the developers, are getting a better product.

Overall, I have to admit, I’m adoring this new role. I feel like it’s been a very natural progression for me, and most importantly, it’s an incredible amount of fun building crazy-silly demos, running around and talking to people and generally just spending every day involved in some super, super cool technology at Google.

On top of that, this is honestly, the best team I’ve ever worked on, with the widest variety of incredibly intelligent and creative people I’ve had the pleasure of being in a team with.

Hopefully, that gives you a better understanding of what it means to be a Developer Advocate, or at least my current perspective on the subject. Please add comments, questions or suggestions below; this is a fascinating topic for me, so I’m always keen to engage in meaningful discourse around it. Alternatively, if you want to talk privately about developer relations, pretty much anything to do with the cloud, finding a speaker for an event, or almost anything at all vaguely related to tech, don’t hesitate to drop me a line via the contact methods above. I love to talk to people, and am always looking to have a chat.

Brute 0.4.0 – From CLJX to Reader Conditionals

This release of Brute provides no new features; instead, it migrates from using CLJX to support both Clojure and ClojureScript at once, to Clojure 1.7 and its (relatively?) new feature of Reader Conditionals.

In terms of changing the implementing code, this was a fairly straightforward task. It was essentially a case of changing file extensions from cljx to cljc, moving the code out of the cljx folders and back into the src directory, and then converting the markers that switch between clj and cljs implementations, depending on which platform the code is running on.

For example, the function I use to generate UUIDs used to look like this:
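The original snippet isn’t embedded here, but a CLJX version of such a function looked roughly like this (a sketch, not the exact Brute source; the cljs UUID helper is an assumption):

```clojure
(defn uuid
  "Generate a platform-appropriate random UUID."
  []
  #+clj (java.util.UUID/randomUUID)
  #+cljs (cljs-uuid-utils.core/make-random-uuid))
```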

But now it looks like this:
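With reader conditionals, the same sketch becomes (again, an approximation of the shape, not the exact Brute source):

```clojure
(defn uuid
  "Generate a platform-appropriate random UUID."
  []
  #?(:clj  (java.util.UUID/randomUUID)
     :cljs (cljs-uuid-utils.core/make-random-uuid)))
```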

As you can see, there are some minor syntactic changes, but essentially it’s the same structure and format. Not a hard thing to do with some time and some patience.

The more interesting part was finding a test framework that worked well with reader conditionals. Previously, I had all my tests implemented in Midje, but found that its autotest functionality doesn’t work with reader conditionals, which for me was its biggest feature. It also didn’t have core support for ClojureScript; that was instead implemented by a different developer as part of the purnam library. While the ClojureScript implementation is quite good, it hadn’t been touched in ten months, which had me concerned.

After hunting around a bit, I ended up settling on good ol’ clojure.test and cljs.test. It’s well supported across both platforms, and as I recently discovered, has amazing integration with Cursive Clojure! It took me a little while to port all the tests across, but otherwise the experience was very smooth. Now I have a testing platform I can continue to develop on, and one I know will continue to support both Clojure and ClojureScript for the foreseeable future.

I have some minor enhancements for Brute that I will probably jump on at some point in the near future, but as always, feature requests and bug reports are always appreciated.

Recording: Containers in Production – Crazy? Awesome? or Crazy Awesome!

Late last year I was invited to participate in a panel discussion at New Relic’s FutureStack conference on using software containers (which generally means Docker) in production environments. We had an excellent discussion about what is good about the ecosystem, what needs to improve, and best practices and approaches for running containers in production, at either large or small scale.

I was super happy that I managed to get one of my main talking points about Containers into the conversation, specifically that Containers do not necessarily equal micro-services. I've seen people assume that the two are inextricably linked, and they definitely are not. You are perfectly able to leverage the benefits of Containers without taking on the extra complexity of micro-services (which have their own set of pros and cons) if you do not want to.

During the discussion I referenced a talk by John Wilkes discussing Borg at Google, and how it directly influenced the design decisions behind Kubernetes. It's one of my favourite presentations, and well worth a watch as well.

FutureStack was a great event, and it was a pleasure to be able to attend.

Recording: Wrapping Clojure Tooling in Containers (Part 2)

A few weeks ago, I had the distinct opportunity to attend and present at clojure/conj in Philadelphia. This was my first time attending the event, but it had been on my list of conferences for a very long time. Now that I live in San Francisco, it's great that I can take advantage of the situation and attend the events I had watched enviously from across the ocean – especially since I've been playing with, and (very occasionally) working with, Clojure for a while now.

Wrapping Clojure Tooling in Containers (link to video)

The talk I gave at the conference was an update on a previous talk I had done at the local Clojure Meetup. The difference is that when I wrote the original talk, I was attempting to build one-size-fits-all Docker development environments by leveraging ZSH. Since then, I've switched to per-project Docker development environments, powered by Makefiles that are shipped along with, and contain all the dependencies for, said project. This talk represented that change, as well as several other tips and tricks I've discovered along the way (cross-platform GUIs running in containers, anyone?).
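To give a feel for the per-project approach, here is a minimal sketch of the kind of Makefile I mean – the image name, mount paths, and targets are illustrative assumptions, not the code from the talk:

```make
# Hypothetical Makefile shipped in a project's repository.
# All Clojure tooling runs inside the project's own Docker image,
# so contributors need only Docker and make on their machine.
IMAGE := my-project-dev
RUN   := docker run --rm -it -v $(PWD):/project -w /project $(IMAGE)

build:
	docker build -t $(IMAGE) .

repl: build
	$(RUN) lein repl

test: build
	$(RUN) lein test
```

Because the Makefile travels with the repository, each project pins its own toolchain, rather than sharing one global development environment.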

Hopefully you can't tell, but the first section of the presentation was given without the slides. We had some technical difficulties with my laptop and the projector, but the excellent people at the event managed to get it all working just in the nick of time for the demo. While it did cut my presentation time a little short, I still had time to cover the points I wanted to.

If you are interested in the code that powers this talk, it is all up on GitHub, so you can pick it apart at your leisure.