We all love procedural generation! A little bit of randomness goes a long way in keeping a game fresh and unexpected. But this unpredictability can also make the designer’s job more difficult – how do we ensure that randomised elements don’t throw off our game’s balance?
Over the years, Roguelike designers have developed many ways of tackling this problem. In this article I’m going to look at some common techniques and also explain some of the tricks I’ve used in my own game. Let’s get started!
Note: In this article I’m talking about balance for single-player games (that is, maintaining an optimal level of challenge for the player). This is different to balance in multiplayer games (that is, making sure each player has a fair set of options).
The Law of Large Numbers
A match structure is perhaps the simplest way to balance randomness. If your game is split up into separate “runs” or matches, then over a large number of these matches everything will roughly balance out. This is the law of large numbers:
As a sample size grows, the sample’s average converges to the theoretical average.
So while any particular match might be too easy or too hard, plenty of “opposite” matches will show up over time, and the average difficulty evens out.
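To make that concrete, here is a minimal sketch of the averaging effect. It assumes each run's difficulty is a uniform random number between 0 and 1 (a made-up model, not how any real game rolls difficulty), and `average_difficulty` is a hypothetical helper name.

```python
import random

def average_difficulty(num_runs, rng):
    """Mean difficulty over num_runs runs, each drawn uniformly from [0, 1]."""
    return sum(rng.random() for _ in range(num_runs)) / num_runs

rng = random.Random(42)  # fixed seed so the sketch is repeatable
for num_runs in (1, 10, 100, 10_000):
    # The more runs we average over, the closer we tend to get to 0.5.
    print(num_runs, round(average_difficulty(num_runs, rng), 3))
```

A single run can land anywhere, but the average over thousands of runs should sit very close to the theoretical 0.5.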
This balancing method has several advantages:
- It’s easy to implement.
- The variety between runs creates extra novelty and excitement.
- It can easily be combined with other methods.
But when used haphazardly it can also cause problems. Let’s take a concrete example.
Spelunky is a Rogue-inspired platformer where you jump, climb and dodge your way through a perilous cave system. You’ll periodically find items stored in boxes or offered at shops. But not all these items are equally powerful! The Jetpack, in particular, is extremely strong. It lets you fly for a short distance with each jump, allowing you to easily cross a wide variety of the game’s obstacles. The Spelunky wiki says this:
“The extraordinary mobility granted by the jetpack means that, for most intents and purposes, it replaces the Rope, Spring Shoes, Climbing Gloves and Cape.”
As far as I can tell, the game’s items are distributed purely at random. This means you can find a jetpack on the first level if you’re lucky enough. But if you do, you’ll then find that the rest of your run is substantially easier. You’ll be able to fly over all sorts of obstacles that would otherwise have risked damage or death.
Somebody needs rescuing in the top-right corner. Let me just zoooom up over there and sort it right out.
On other runs you might just get lots of Ropes from your item boxes. You can throw a rope in the air and then use it to climb higher in that particular location – so they’re sort of like a single-use Jetpack (and they also require more skill to use). A high-Rope run will be substantially harder than an early-Jetpack run.
A little bit of variance can be fun, encouraging the player to try harder to beat a deck stacked against them. But too much can make a player feel frustrated, like they’re just waiting for a high roll. And when a high roll does come along, the player might even feel like they didn’t really deserve to win that match! When randomness has such a large impact on difficulty, it’s hard to know if you’re truly building skills or just getting lucky sometimes.
You could mitigate some of this by giving the jetpack a limited number of uses. That way it only gives you a temporary boost and doesn’t unbalance your entire run. This is actually sort of like applying the law of large numbers inside your match (as opposed to just across matches) because at some points in the match the player will have more bonuses than at others.
That said, many players hate having powerups taken away from them! So if we want better balancing inside of any particular match, we might want to look at some other techniques.
Some games are tight and discrete, like Chess for example. Each of your moves is clearly delineated, with its own predictable outcomes. This makes Chess a great test of tactical skill, but it also makes the game’s difficulty hard to fiddle with.
How do you make chess 10% harder, or 10% easier? It’s not immediately obvious. Maybe you can give one player an extra pawn – but it’s not clear how much that would actually help. Considering Chess’ complexly interconnecting movement rules, that bonus pawn might eventually end up blocking the path of a more important piece!
Other games (especially videogames) are loose and continuous. In these games, the line between success and failure is much fuzzier. Suppose you have to dodge an enemy attack. You might just barely dodge, or you might dodge by a lot. Either way you’ll avoid being hit, but the exact way in which you dodged also has a subtle impact on how you’ll dodge future attacks.
Let’s consider Enter the Gungeon, a rogue-inspired “bullet hell” game where you dodge hundreds (if not thousands) of tiny bullets in every battle. Dodging all these bullets makes up the game’s primary challenge.
The upshot of this is that Enter the Gungeon’s difficulty is easy to fiddle with. To make the game roughly 10% easier or harder, we can just change the number of bullets by that much! (In practice the developers also have other levers for modifying difficulty, like bullet speed and enemy health.)
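As a rough illustration of that kind of lever, here is a tiny sketch. The function name `scale_bullets` and the base count of 200 are my own placeholders, not anything taken from Enter the Gungeon.

```python
def scale_bullets(base_bullets, difficulty):
    """Scale a bullet count by a difficulty multiplier (1.0 means unchanged)."""
    # Always keep at least one bullet so a pattern never becomes empty.
    return max(1, round(base_bullets * difficulty))

print(scale_bullets(200, 1.1))  # roughly 10% harder
print(scale_bullets(200, 0.9))  # roughly 10% easier
```

Because the challenge is continuous, a single multiplier like this gives a smooth difficulty dial. A discrete game like Chess has no obvious equivalent.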
By basing the game around a dexterity challenge, we can offload the balancing from our generation onto this challenge. This makes our lives much easier.
But what if you’re not making a dexterity based game? Patch Quest (my own game) arguably has more in common with Chess than Enter the Gungeon, despite being Roguelike-inspired. The game is discrete, tile-based and gives you time to plan out your moves. As a consequence, balancing the randomised elements of Patch Quest has been pretty tricky! I talked about some of my ideas in part 1 of this article series. But I’ve done a lot of tinkering since then, and I’ve made some key improvements to the generation algorithm.
I’ll share these improvements with you, but first let’s have a quick refresher on Patch Quest’s generation system.
I’ve talked about content layering in previous articles. The idea is to split your content up into separate, independently varying layers – so that you can mix and match these pieces to create a wide variety of situations.
A simple example would be Pokemon. Any given Pokemon has health, stats, moves, an item, an ability, and (possibly) a status ailment. Each of these is on a separate layer, and in combination they can create thousands of unique Pokemon.
The core idea here is to build your content out of small, customisable and variable building blocks rather than by hand-coding all the rules for each piece of content (in practice, many Pokemon do have fiddly, hidden rules – but let’s ignore that right now for simplicity).
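Here is a small sketch of why independent layers multiply so quickly. The layer names and values are invented for illustration (they are not real Pokemon data), and I've kept it to three tiny layers.

```python
from itertools import product

# Hypothetical layers for illustration; these are not real game data.
layers = {
    "species": ["Sprout", "Ember", "Ripple"],
    "item": [None, "Berry", "Charm"],
    "status": [None, "Asleep", "Poisoned"],
}

# Every combination of one value per layer is a distinct piece of content.
combinations = [dict(zip(layers, combo)) for combo in product(*layers.values())]
print(len(combinations))  # 3 * 3 * 3 = 27
```

Nine hand-written values produce 27 distinct creatures, and each extra layer multiplies that total again.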
Patch Quest is a tile-based game. These tiles have 3 content layers: Terrain, Scenery and Creature. Terrain and scenery are stuck to their tile – but creatures can move about freely. In the image below, you can see all of these on one tile: a Sea Tree (Scenery) and a Spin Star (Creature) on Prickleweed (Terrain).
These layers make a lot of sense for tiles. Having only one creature per tile simplifies a lot of the rules. You can clearly see the creature, and you can’t step onto its tile (unless you swap places with it). Creatures also interact with the scenery and the terrain of the tile they’re standing on. Having these 3 independently varying layers lets me create a wide variety of situations – all of which have sensible, clearly defined rules.
But Patch Quest has a coarser level of subdivision than tiles. I talked in part 1 about subdividing your game world into chunks, each of which is a self-contained challenge that we balance individually. In Patch Quest, the world is chunked into Patches (unsurprisingly). At the start of each run these patches are shuffled, so that traversing them always requires mixing up your tactics.
Some example patches made using the old generation algorithm
When generating my Patches I naively used these same 3 layers: each Patch could have a Creature, Scenery and Terrain. These elements were then distributed over the Patch’s various tiles. You can see this above – each Patch has only 3 elements (plus a secondary terrain type called the “basic” terrain).
But this approach created a host of balancing problems! Let’s take a look at 2 key examples.
Some sceneries, like rocks, block you from entering that tile. On the patch below we can see 3 rocks (they always spawn in threes). But some creatures also block you from entering their tile! Those 2 blue shells are actually Hermit Crabs – creatures that act like rocks when far away, but some of them spring to life when you get close.
This means that 5 out of 9 tiles are impassable (at least some of the time). Of the remaining 4 passable tiles, half of them also have the Water terrain hazard. This Patch would be very, very difficult to cross. Some Patches might even have no solutions!
On the other hand, we can get particularly easy Patches. The Patch below has a Steagull (an annoying but not particularly dangerous creature that steals your items) and some Soreheal Berries (which cure Sore damage, helping you). This Patch would be pretty trivial to traverse – especially if you don’t have any valuable items.
A little bit of variety in Patch difficulty is actually a good thing! It keeps the game fresh and unexpected for longer. But the problem here is that the difficulty swings themselves were happening randomly. Sometimes there would just be no reasonable way to get past, and at other times the game would give you several trivial Patches in a row.
These problems were both caused by the types of layers I chose. “Terrain, Creature and Scenery” made sense at the tile level, where they keep the game’s basic rules simple and clean. But at the Patch level, they aren’t actually connected to the gameplay role that a given piece of content aims to fill. In the first example, both cards are filling the role of “Obstacle”. In the second example, there aren’t enough cards filling the role of “Hazard”. So the first big improvement I made to the algorithm was to choose more appropriate content layers for Patches:
- Centerpiece: This layer is for everything that doesn’t play nicely with other content (like rocks that block your path, or large sand dunes that take up a lot of space).
- Terrain: I kept the terrain layer, because having only one terrain hazard per Patch (in addition to the “basic” terrain type) keeps Patches visually simple, at a glance.
- Assist: This layer is for sceneries that help you, either by healing you or by providing you with items.
- Hazard: This layer is for everything else – all of the creatures and sceneries that aren’t Centerpieces and aren’t Assists. Unlike the other layers, a Patch can have multiple cards from this layer.
Each patch can have up to 4 cards – which could mean one from each layer. But since multiple hazards are allowed, a Patch could also have extra hazards if some of the other layers are empty.
Let’s look at how this would solve our examples. In the first case, rocks and Hermit Crabs are both considered Centerpieces – because they both take up a lot of space. This means that the new generation algorithm would never spawn these on the same patch (it would maybe pick one of them and then fill out the patch with some different hazards). In the second example, we have an Assist, Hazard and Terrain – but the Centerpiece slot is empty! This means we could put another hazard down on that Patch, like an aggressive creature.
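The slot rules can be sketched roughly like this. It is a simplified stand-in rather than Patch Quest's actual generator: `build_patch`, `fill_chance` and the pool contents are assumptions, and for simplicity it always tops the Patch up to 4 cards.

```python
import random

MAX_CARDS = 4
SINGLE_LAYERS = ("Centerpiece", "Terrain", "Assist")  # at most one card each

# Example pools using content named in the article; the pools are incomplete.
POOLS = {
    "Centerpiece": ["Rocks", "Hermit Crab"],
    "Terrain": ["Water", "Sinkweed"],
    "Assist": ["Soreheal Berries"],
    "Hazard": ["Spin Star", "Thunder Fish", "Poislug", "Steagull"],
}

def build_patch(pools, rng, fill_chance=0.7):
    """Pick at most one card per single layer, then fill up with hazards."""
    patch = []
    for layer in SINGLE_LAYERS:
        # fill_chance is an assumed knob: sometimes a layer is left empty.
        if pools[layer] and rng.random() < fill_chance:
            patch.append((layer, rng.choice(pools[layer])))
    while len(patch) < MAX_CARDS:
        # Empty slots become extra hazards (duplicates allowed in this sketch).
        patch.append(("Hazard", rng.choice(pools["Hazard"])))
    return patch

print(build_patch(POOLS, random.Random(7)))
```

Since Rocks and Hermit Crabs share the Centerpiece pool, they can never land on the same Patch, and an empty slot always turns into another hazard.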
So just by choosing more appropriate content layers (layers that actually fit the roles of the gameplay we’re trying to encourage) we can greatly improve the balance of our procedurally generated chunks.
The best part is that these new Patch layers don’t interfere with the old tile layers. Each piece of content now just has both a “Patch Layer” (Centerpiece, Terrain, Assist, Hazard) and a “Tile Layer” (Terrain, Scenery, Creature). When generating Patches we can query one of these, and when generating tiles we can query the other. This flexibility is a strength of layering your content.
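One way to picture this double tagging is a record with two independent fields. The class and function names below are hypothetical. The layer assignments follow the examples in this article.

```python
from dataclasses import dataclass
from enum import Enum

class TileLayer(Enum):
    TERRAIN = "Terrain"
    SCENERY = "Scenery"
    CREATURE = "Creature"

class PatchLayer(Enum):
    CENTERPIECE = "Centerpiece"
    TERRAIN = "Terrain"
    ASSIST = "Assist"
    HAZARD = "Hazard"

@dataclass(frozen=True)
class Content:
    name: str
    tile_layer: TileLayer    # queried when placing content on tiles
    patch_layer: PatchLayer  # queried when choosing content for a Patch

CONTENT = [
    Content("Rocks", TileLayer.SCENERY, PatchLayer.CENTERPIECE),
    Content("Hermit Crab", TileLayer.CREATURE, PatchLayer.CENTERPIECE),
    Content("Soreheal Berries", TileLayer.SCENERY, PatchLayer.ASSIST),
    Content("Steagull", TileLayer.CREATURE, PatchLayer.HAZARD),
    Content("Sinkweed", TileLayer.TERRAIN, PatchLayer.TERRAIN),
]

def by_patch_layer(layer):
    """Names of all content filling the given Patch-level role."""
    return [c.name for c in CONTENT if c.patch_layer is layer]

def by_tile_layer(layer):
    """Names of all content living on the given tile-level layer."""
    return [c.name for c in CONTENT if c.tile_layer is layer]

print(by_patch_layer(PatchLayer.CENTERPIECE))
print(by_tile_layer(TileLayer.CREATURE))
```

The two queries never need to know about each other, so adding a new Patch role later wouldn't disturb the tile rules.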
This only gets us halfway, though! We’ve made each Patch contain a variety of sensible gameplay roles – but we still don’t have any means of granularly modifying a Patch’s difficulty (like we discussed in the “Dexterity” section). This brings us to the second key improvement.
The idea behind generation points is simple. First, give each piece of content a number of points, and give each chunk a target score. During the generation process, we choose a set of contents that bring us close to that target score. By altering the target score, we can alter the difficulty of that chunk.
But, as is always the case with game development, the devil is in the details. In part 1 I talked about using Decks (“endless bag random”) to keep the variance of your generated world in-check. But drawing cards from a deck and choosing cards based on their points are quite different ideas, and to combine them we’re gonna need a new algorithm. Put on your programmer caps!
Let’s start by considering an example “Hazards-layer” deck with 3 Spin Stars, 2 Thunder Fish, 2 Poislugs and one copy of the very dangerous Warnet Swarm (a new enemy from the “Deadly Dungeons” Patch Quest update).
A quick refresher on endless decks: Once all cards have been dealt, the deck is refilled and reshuffled, meaning we can keep dealing cards forever (but in guaranteed ratios). The deck has duplicates to increase the rate at which that card appears.
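A minimal sketch of an endless deck might look like this. The class name `EndlessDeck` and its `deal` method are my own naming for illustration, not Patch Quest's actual code.

```python
import random

class EndlessDeck:
    """Deals cards forever; refills and reshuffles whenever it runs out."""

    def __init__(self, cards, rng=None):
        self._cards = list(cards)
        self._rng = rng if rng is not None else random.Random()
        self._pile = []

    def deal(self):
        if not self._pile:
            # All cards have been dealt: refill with a fresh shuffled copy.
            self._pile = self._cards[:]
            self._rng.shuffle(self._pile)
        return self._pile.pop()

# The example Hazards deck: duplicates raise a card's appearance rate.
HAZARD_DECK = (["Spin Star"] * 3 + ["Thunder Fish"] * 2
               + ["Poislug"] * 2 + ["Warnet Swarm"])

deck = EndlessDeck(HAZARD_DECK, random.Random(3))
print([deck.deal() for _ in range(8)])
```

Every block of 8 deals contains exactly 3 Spin Stars, 2 Thunder Fish, 2 Poislugs and 1 Warnet Swarm, just in a different order each time.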
Using this deck, we deal a lookahead list. All of the cards in this list are scored relative to how many points they give and how close they are to the top of the deck. The lowest scoring card in the lookahead list is then chosen.
Let’s see an example. Below, we can see a half-filled Patch generator, and a lookahead list created from the above deck.
Our Patch generator has already chosen Yellow Rocks as a Centerpiece (which take up a lot of space, making them worth 2 points) and Sinkweed as Terrain (which isn’t particularly dangerous, and is only worth 1 point). It has no Assist. This means the patch has 2 empty slots, and 3 empty points.
Given this Patch generator, we can go down the lookahead list and score the first appearance of each type of card. To score a card, we add (A) how far it would leave us from the points target and (B) its position in the lookahead list divided by 3 (here, the number “3” is just intended to make this factor less important in the final score). Lower scores are better.
The first card would give us 1 point (meaning we are now 2 points away from our target). It’s in position 1 of the list, so we also add (⅓) to its score – giving us a total score of (2 + ⅓) = 2.33.
Once we’ve done this for every type of card in the list, we choose the first occurrence of the lowest scoring card (in this case, the Poislug in position 2, which scores 1 + ⅔ = 1.67). We remove it from the lookahead list and then place it in slot C of our Patch generator – meaning it will have 3 of 4 slots filled, for 5 out of 6 points. (Our endless deck can then deal a replacement card onto the tail end of the lookahead list.)
Since we have extra space and spare points in our Patch generator, we can run this algorithm again. This time it will choose the Spin Star in position one (for 0 + ⅓ = 0.33). The Patch generator has now been filled up, and the algorithm stops.
(If the order of cards had been different, we would have chosen different cards. For example, if the Warnets had been slightly closer to the top we would have chosen them – because they perfectly fill out the points meter. The Patch generator would therefore stop generation early, after filling just 3 slots. That’s okay though, because Warnets are a particularly dangerous enemy. This is actually the reason why we have a points target.)
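The walkthrough above can be sketched in code. Treat it as an approximation under some assumptions: the Poislug (2) and Warnet Swarm (3) point values are inferred from the example, the Thunder Fish value and the order of the lookahead list are guesses, overshooting the target is penalised with a plain absolute difference, and `pick_card` and `fill_hazards` are hypothetical names. Dealing replacement cards from the deck is left out to keep it short.

```python
# Point values: Spin Star is stated above; the rest are inferred or assumed.
POINTS = {"Spin Star": 1, "Poislug": 2, "Thunder Fish": 2, "Warnet Swarm": 3}

def pick_card(lookahead, points_left):
    """Return (index, score) of the lowest-scoring card in the lookahead list."""
    best = None
    seen = set()
    for index, card in enumerate(lookahead):
        if card in seen:
            continue  # only the first appearance of each card type is scored
        seen.add(card)
        # (A) distance from the target after taking this card,
        # (B) 1-based position in the list, divided by 3.
        score = abs(points_left - POINTS[card]) + (index + 1) / 3
        if best is None or score < best[1]:
            best = (index, score)
    return best

def fill_hazards(lookahead, slots_left, points_left):
    """Keep choosing cards until we run out of slots, points or cards."""
    lookahead = list(lookahead)
    chosen = []
    while slots_left > 0 and points_left > 0 and lookahead:
        index, _ = pick_card(lookahead, points_left)
        card = lookahead.pop(index)
        chosen.append(card)
        slots_left -= 1
        points_left -= POINTS[card]
    return chosen

# Assumed lookahead order; 2 empty slots and 3 spare points as in the example.
LOOKAHEAD = ["Spin Star", "Poislug", "Thunder Fish",
             "Spin Star", "Poislug", "Warnet Swarm"]
print(fill_hazards(LOOKAHEAD, slots_left=2, points_left=3))
```

With this list the sketch picks the Poislug first and then the Spin Star, matching the walkthrough. Moving the Warnet Swarm up to position 4 makes it win instead, and generation stops after that single card.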
Phew! We made it through the technical section.
By adding a lookahead list to our deck, and prioritising cards that bring us closer to a points total, we can have granular control over the generation difficulty. To raise the difficulty, just raise the target points (and maybe even increase the number of content slots). In Patch Quest, the world is laid out such that the patches have varying difficulties, creating controlled difficulty swings.
Just for fun, let’s look at some Patches made by the new and improved generation system.
I talked about Patch Quest for much of this article – but these techniques can be applied to a wide variety of games. Let’s recap, and consider some key takeaways.
- A match structure can improve the balance of many games by averaging difficulty over many matches. This even works in games that aren’t traditional Roguelikes (such as the Dungeon Run mode in Hearthstone).
- A secondary challenge (like dexterity or memorisation) can offload balancing from the generation onto the challenge itself.
- Layering your content into sensible roles allows you to ensure most or all roles are filled for a given chunk.
- Choosing content based on a points system allows you to granularly tweak generation difficulty (if your generation uses decks, you can add a lookahead list to achieve this).
I hope this article has been helpful in some way!
The Patch Quest “Deadly Dungeons” update released today! (get it for free) This update was focused on increasing the game’s level of challenge – and contains new dungeons that will test your tactical skills. Why not take a look at the update video? It will run you through the full changes.