Famicom Party

17. Object Pools

Table of Contents:

With the ability to control the spaceship, our sample project is starting to feel like a real game and not just a tech demo. Let's continue to build on that momentum by introducing enemy spaceships. For now, we won't worry about detecting collisions - that will come in the next chapter. Instead, the main focus in this chapter will be object pools, a technique for handling multiple objects in a flexible way.

Why Pooling?

In the context of this book, we'll consider an object pool to be a fixed block of memory that is used to hold data about a particular type of entity in a game. Each entity needs to track certain information—what tiles it is drawn with, its X and Y positions, etc.—and that information will need to be stored somewhere in memory. In games that are not turn-based, entities are updated nearly every frame, meaning the best place to store an entity's data is in zero-page RAM.

In modern game development environments, adding a few more entities when needed is not a concern. Modern systems can easily spare a few bytes (or kilobytes, or megabytes) for additional entities. On the NES, though, zero-page space is limited and finite, so every byte has to be planned in advance. Setting up an object pool guarantees that the game will never use more memory than we allocated in advance, and lets us more easily plan out how the 256 bytes of zero-page space will be used.

Object pools also help from a gameplay perspective. If we use an object pool for player projectiles and give it three slots, then the player can only have three projectiles on screen at one time. Without that kind of limitation, a player could use a turbo controller (or tap rapidly) to fill the screen with projectiles, taking up all of the OAM slots, making a number of sprites invisible due to the eight-sprites-per-scanline limitation and likely causing massive slowdown due to the additional demands on collision detection.

Designing Pools

An object pool should be flexible enough to accommodate a range of entity types, so long as those entities have some shared role. It is common to put entities that can harm the player into a pool together, separate from entities that are neutral or helpful. An alternative approach could be keeping objects that participate in collision detection in one pool, and those that do not in another. There is no set way of deciding what goes into a pool together and what does not. In general, a good object pool design will make it easy to iterate through the entities in the pool to do common tasks, without needing to write logic that excludes certain entities. As an example, if you put both enemies and player projectiles into a single pool, collision detection would be extremely complicated, since you would need to constantly test whether any two items can actually affect one another. Separate pools for enemies and player projectiles let you cycle through the entities in one pool and compare them to entities from the other, greatly reducing the complexity.

Once you've decided which types of entities will be pooled together, the next step is establishing what data you will be storing for each entity. For reasons we'll get to soon, each entity in a pool will need to store the same data as every other entity in the pool. For our space shooter game, we will make an enemies pool, where each enemy has the following data:

This is a good starting point, but there is plenty of other data you could choose to store. You could give each enemy a health counter, allowing enemies to take more than one attack to defeat. Enemies will likely need some kind of "AI" behavior; you could keep a list of behaviors somewhere else in code and reference the current behavior pattern with an ID number. Enemies could also have animations, which would require you to store the current animation state (animation type and frame number) for each entity. Note that even if we add all of these properties, the pool is still quite flexible. Giving pool entities a "type" value could translate into nearly anything, since the code that transforms a type number into sprites drawn on screen can live outside of the pool. This pool could easily handle enemies ranging from a single sprite tile to giant boss monsters without any changes.

Similarly, let's create a pool to store player projectiles. Player projectiles do not share a role with enemies, and since they don't vary much from one another, the pool design can be simpler as well:

Here we assume that all projectiles have the same velocity and use the same graphics, so all we need to track is their position on screen. If you wanted to add different weapons (or weapon upgrades), you would need to add more data to the entities in the pool.

Implementing object pools

In most modern programming languages, if you wanted to track a set of similar entities, you would reach for something like an array of objects (or a list of instances, or a slice of structs; the terminology here varies by language). Here's an example of that approach in JavaScript:

let enemiesPool = [
    x: 57,
    y: 12,
    type: 3,
    active: true
    x: 36,
    y: 12,
    type: 3,
    active: false

This method makes it very easy to loop through the objects within the array, and then take action on each object in turn. On the NES, however, this approach is very inefficient. Using this sort of setup (which is called array of structures), NES assembly code to loop through all entities would need to:

  1. Store a "current object address" and "current object index" somewhere, as well as the length of each field and the total length of each object.
  2. Access each property of the current object by using a table of offsets from the start of the object (load the offset of the property into the X register, then e.g. LDA current_obj_address, X).
  3. To get to the next object, add the object length to the current object's starting address.
  4. Check each time you move to the next object that the current object index has not exceeded the total length of the array.

In contrast, due to the 6502's wealth of addressing modes, a more popular approach on the NES is the structure of arrays. To go back to our previous JavaScript example, a structure of arrays approach would look like this:

let enemiesPool = {
  x: [57, 36],
  y: [12, 12],
  type: [3, 3],
  active: [true, false]

While this approach might seem counter-intuitive at first, it works very well with the 6502's indexed addressing modes. Here is our hypothetical pool traversal code under structure of arrays:

  1. Store a current object index in the X register.
  2. To look up a property, use indexed addressing with the label that marks the start of that array, e.g. LDA enemy_types, X to get the type of the current enemy.
  3. To go to the next object in the pool, increment the X register.
  4. Check each time you increment if the current object index has exceeded the length of the pool.

The number of steps is the same, but each step is much simpler and also faster to execute.

Now that we have both a design and an implementation approach for the pool, let's actually create it. We'll need to add the following to our .segment "ZEROPAGE":


; enemy object pool
enemy_x_pos: .res NUM_ENEMIES
enemy_y_pos: .res NUM_ENEMIES
enemy_x_vels: .res NUM_ENEMIES
enemy_y_vels: .res NUM_ENEMIES
enemy_flags: .res NUM_ENEMIES

; player bullet pool
bullet_xs: .res 3
bullet_ys: .res 3

; export all of this
.exportzp enemy_x_pos, enemy_y_pos
.exportzp enemy_x_vels, enemy_y_vels
.exportzp enemy_flags

With this code, we've reserved 31 bytes of zero-page RAM across two pools, to store the state of our non-player objects in a structure of arrays format. The enemy_flags byte for each enemy is a series of bitfields; here, I'll use three bits to store the enemy type (giving us eight possible enemy types), one bit to store whether or not the enemy at this slot is 'active' (should be drawn to the screen), and four bits reserved for future use.

To make use of these pools, we will want to cycle through them each frame, making any updates as needed. For now, this will be relatively simple: creating new enemies if there are open slots, updating the positions of existing enemies, and freeing up enemy slots when they have exited the screen. In the future, this loop is where we will add collision detection, "AI" behavior, and animation updates.

Because there will be so much going on in this loop, there is actually one more thing to do before we implement it. Up to this point, our main code has set things up and then just waited for NMIs to occur, with all actual logic being part of the NMI handler. While our projects were simple, this was sufficient, but as we add more and more functionality, we will quickly run out of CPU time. Vblank (the time during which the NMI handler runs) lasts for about 2,250 CPU cycles, which is a decent amount but not a whole lot of time. In contrast, the CPU spends about 27,000 cycles outside of Vblank each frame. Currently, that time is wasted just waiting for the next Vblank. A better approach would be to spend that time setting things up for the next Vblank, and letting our NMI handler focus on sending those changes to the PPU. Nearly all of this section is based on "The frame and NMIs", from the NESDev Wiki.

Moving code from NMI to main

To make the most of Vblank time, we'll want to limit the NMI handler to only handling things that occur every frame - even if the game is paused, or slowdown is occurring. Generally those items will include:

Everything else should be handled as part of the main loop, since there is so much more CPU time available there. In fact, there is so much CPU time available that we will need to do something to keep the CPU busy while we wait for the next Vblank, or else our game logic will race far ahead of what is being displayed on the screen. To start, we need a way to track whether or not Vblank (an NMI) has occured yet. I'll use one byte of zeropage to track whether the non-Vblank code should "sleep" (wait for NMI), or run itself.

sleeping: .res 1

If sleeping is non-zero, we are waiting for the next NMI before doing anything else. The end of our main loop can change from the current "jump forever":

  JMP forever

To something that takes sleeping into account:

  INC sleeping
  LDA sleeping
  BNE sleep

  JMP mainloop

Here, we increment sleeping so it is not zero and then repeatedly load it, until a load results in zero. How does sleeping revert to zero? In our NMI handler. At the end of the NMI handler, we add:

  LDA #$00
  STA sleeping

So, the entire flow is now:

Now, the main loop and the NMI handler pass control back and forth to one another, letting us make the most of both. Note that the end of our main proc jumps to a new label - mainloop - instead of jumping back to the beginning of main. main begins with some initialization that we don't want to repeat every frame, so we need to add a mainloop label within main at the point that we want to run repeatedly.

It's time to move code that is not essential to NMIs out into the main loop. Looking through the NMI handler from the last chapter, the following operations are worth keeping in NMI (because they can only be done during Vblank):

The rest (including calculating what values to write to PPUCTRL and decrementing scroll) should be moved to the main loop. If we remove those items, our new, shorter NMI handler looks like this:

.proc nmi_handler

  ; copy sprite data to OAM
  LDA #$00
  LDA #$02

  ; set PPUCTRL
  LDA ppuctrl_settings

  ; set scroll values
  LDA #$00 ; X scroll first
  LDA scroll

  ; all done
  LDA #$00
  STA sleeping


Notice that the NMI handler has a new beginning and ending. Previously, all of our game logic occurred in the NMI handler. Nothing else could ever interrupt our game logic or stealthily change the memory addresses we were working with, but now it's very possible for something in main code to change memory locations or registers that NMI expects to use, and vice-versa. (We literally just did that sort of thing, by having both NMI and main make use of sleeping.) This means that our NMI handler must now act like any other subroutine, saving the state of registers and restoring them when it is done.

Next, let's move the items we removed from the NMI handler into the main loop. Here is just the mainloop portion of our new .proc main:

  ; Read controllers.
  JSR read_controller1

  ; Update the player and prep to draw
  JSR update_player
  JSR draw_player

  ; Check if PPUCTRL needs to change
  LDA scroll ; did we reach the end of a nametable?
  BNE update_scroll
  ; if yes,
  ; Update base nametable
  LDA ppuctrl_settings
  EOR #%00000010 ; flip bit 1 to its opposite
  STA ppuctrl_settings
  ; Reset scroll to 240
  LDA #240
  STA scroll

  DEC scroll

  ; Done processing; wait for next Vblank
  INC sleeping
  LDA sleeping
  BNE sleep

  JMP mainloop

Most of this is copied as-is from the old NMI handler, but note that we are not writing to PPUCTRL here. Writes to the PPU can only occur during Vblank (in the NMI handler), or else you will get distracting visual artifacts on the screen.

With the main loop and NMI now separated, we have ample time to make use of our new object pools in the main loop to generate, draw, and manage enemies and projectiles in the game.

Enemy routines

We'll need some enemies to manage, of course. I've created graphics tiles for two enemy types, seen here:

Our two enemy types - space turtles and galactic snakes. I am not an artist but I'm trying my best.

I'm going to call the turtles "type 1" and the snakes "type 2", and we will use those values in the "type" field for each enemy in the pool. Those types will let us easily determine which subroutines to call in order to update a particular enemy and draw it. For better organization, I am going to put enemy subroutines into their own file, enemies.asm, and import them into the main file.

The drawing subroutine is the simpler of the two. If all of our enemy types can be drawn with 2x2 tile metasprites, they can share a single drawing routine that picks tiles to draw based on enemy type. If the enemies are different sizes, then multiple drawing subroutines will be required. Instead of calling draw_enemy directly, the code that loops through the enemy pool will need to check the enemy's type to determine the appropriate draw function. Using current_enemy as an index into various arrays, the drawing subroutine can:

The drawing subroutine will need to make use of both index registers: one to provide an offset into each enemy info array, and one to keep track of the offset into OAM as each byte is written. Additionally, the X register will need to do double duty, generally using current_enemy for offsets, but changing to the current enemy's type (stored in zeropage as current_enemy_type) when loading tile numbers. I am using bit 7 of the enemy_flags byte as a marker for "active" or "inactive". As discussed in a previous chapter, that bit can be turned on with ORA #%10000000, and turned off with EOR #%10000000. Here is the drawing subroutine (in enemies.asm):

126.export draw_enemy
127.proc draw_enemy
128 PHP
129 PHA
130 TXA
131 PHA
132 TYA
133 PHA
135 ; First, check if the enemy is active.
136 LDX current_enemy
137 LDA enemy_flags,X
138 AND #%10000000
139 BNE continue
140 JMP done
143 ; Find the appropriate OAM address offset
144 ; by starting at $0210 (after the player
145 ; sprites) and adding $10 for each enemy
146 ; until we hit the current index.
147 LDA #$10
148 LDX current_enemy
149 BEQ oam_address_found
151 CLC
152 ADC #$10
153 DEX
154 BNE find_address
157 LDX current_enemy
158 TAY ; use Y to hold OAM address offset
160 ; Find the current enemy's type and
161 ; store it for later use. The enemy type
162 ; is in bits 0-2 of enemy_flags.
163 LDA enemy_flags, X
164 AND #%00000111
165 STA current_enemy_type
167 ; enemy top-left
168 LDA enemy_y_pos, X
169 STA $0200, Y
170 INY
171 LDX current_enemy_type
172 LDA enemy_top_lefts, X
173 STA $0200, Y
174 INY
175 LDA enemy_palettes, X
176 STA $0200, Y
177 INY
178 LDX current_enemy
179 LDA enemy_x_pos, X
180 STA $0200, Y
181 INY
183 ; enemy top-right
184 LDA enemy_y_pos, X
185 STA $0200, Y
186 INY
187 LDX current_enemy_type
188 LDA enemy_top_rights, X
189 STA $0200, Y
190 INY
191 LDA enemy_palettes, X
192 STA $0200, Y
193 INY
194 LDX current_enemy
195 LDA enemy_x_pos, X
196 CLC
197 ADC #$08
198 STA $0200, Y
199 INY
201 ; enemy bottom-left
202 LDA enemy_y_pos, X
203 CLC
204 ADC #$08
205 STA $0200, Y
206 INY
207 LDX current_enemy_type
208 LDA enemy_bottom_lefts, X
209 STA $0200,Y
210 INY
211 LDA enemy_palettes, X
212 STA $0200, Y
213 INY
214 LDX current_enemy
215 LDA enemy_x_pos, X
216 STA $0200, Y
217 INY
219 ; enemy bottom-right
220 LDA enemy_y_pos, X
221 CLC
222 ADC #$08
223 STA $0200, Y
224 INY
225 LDX current_enemy_type
226 LDA enemy_bottom_rights, X
227 STA $0200,Y
228 INY
229 LDA enemy_palettes, X
230 STA $0200,Y
231 INY
232 LDX current_enemy
233 LDA enemy_x_pos, X
234 CLC
235 ADC #$08
236 STA $0200, Y
239 PLA
240 TAY
241 PLA
242 TAX
243 PLA
244 PLP
245 RTS

Notice that on lines 139-140, a branch (BNE) skips over a JMP to near the end of the subroutine, rather than branching directly to that label. This is intentional; a branch command, when assembled into machine code, takes a relative movement as its operand, and because that operand is one (signed) byte, a branch can only move up to 128 bytes backward or 127 bytes forward. The done label is more than 127 bytes ahead of where we would branch, so only a JMP can be used here.

In keeping with the structure-of-arrays model, the tile numbers for each enemy type are stored in four arrays: enemy_top_lefts, enemy_top_rights, enemy_bottom_lefts, and enemy_bottom_rights. The enemy type serves as the index into each of these arrays, so e.g. the first enemy type (type zero) will be the first element of each array. Here, for reference, are those arrays, down in .segment "RODATA". Each array lists the appropriate turtle tile, followed by the appropriate snake tile. These arrays are stored in RODATA.

248.segment "RODATA"
251.byte $09, $0d
253.byte $0b, $0e
255.byte $0a, $0f
257.byte $0c, $10

For enemy behavior, I'll keep things simple for now with a single subroutine that increments the enemy's Y position by its Y velocity (from enemy_y_vels) and, if it is greater than 239 (the bottom of the screen), marks it as inactive. This subroutine is also in enemies.asm.

12.export update_enemy
13.proc update_enemy
14 PHP
15 PHA
16 TXA
17 PHA
18 TYA
19 PHA
21 ; Check if this enemy is active.
22 LDX current_enemy
23 LDA enemy_flags, X
24 AND #%10000000
25 BEQ done
27 ; Update Y position.
28 LDA enemy_y_pos, X
29 CLC
30 ADC enemy_y_vels, X
31 STA enemy_y_pos, X
33 ; Set inactive if Y >= 239
34 CPY #239
35 BCC done
36 LDA enemy_flags, X
37 EOR #%10000000
38 STA enemy_flags, X
41 PLA
42 TAY
43 PLA
44 TAX
45 PLA
46 PLP
47 RTS

With these subroutines written, we are ready (finally?) to begin spawning enemies.

Spawning enemies via object pool

There are many approaches to spawning enemies, but let's keep it simple for now. Each frame, we will use CPU time (outside of NMI) to loop through the enemies list. If an enemy slot is active, we will call update_enemy. If a slot is inactive, we will start a timer that counts down from 20 to zero, at which point we will spawn an enemy in the first inactive slot our loop finds. This gives us a new enemy 1/3 of a second (20 frames) after an enemy goes off screen. When not in use, the timer will store $ff, to give a clear difference between "counting down to zero" and "waiting to start counting down".

First, the code to loop through enemies and update them. Here is our new subroutine, process_enemies, in enemies.asm:

.export process_enemies
.proc process_enemies
  ; Push registers onto the stack

  ; Start with enemy zero.
  LDX #$00

  STX current_enemy
  LDA enemy_flags, X
  ; Check if active (bit 7 set)
  AND #%10000000
  BEQ spawn_or_timer
  ; If we get here, the enemy is active,
  ; so call update_enemy
  JSR update_enemy
  ; Then, get ready for the next loop.
  JMP prep_next_loop
  ; Start a timer if it is not already running.
  LDA enemy_timer
  BEQ spawn_enemy ; If zero, time to spawn
  CMP #20 ; Otherwise, see if it's running
  ; If carry is set, enemy_timer > 20
  BCC prep_next_loop

  LDA #20
  STA enemy_timer
  JMP prep_next_loop
	; TODO!

  BNE enemy

  ; Done with all enemies. Decrement
  ; enemy spawn timer if 20 or less
  ; (and not zero)
  LDA enemy_timer
  BEQ done
  CMP #20
  BEQ decrement
  BCS done
  DEC enemy_timer

  ; Restore registers, then return


This is a large subroutine, but the pieces are fairly straightforward. The loop that processes enemies starts at the enemy label. Using the X register as the index of which enemy number is currently being processed, the loop first checks if the current enemy is active by testing its entry in the enemy_flags array. If the enemy is active, update_enemy is called and the rest of the loop logic is skipped. Otherwise, a new zeropage address (enemy_timer) is checked. A value of 255 means that the timer is not currently in use; when a timer starts, it is set to 20 and counts down to zero. If enemy_timer is zero, BEQ spawn_enemy branches to the portion of code that spawns an enemy (to be described next). If it is not zero, a second check tests if enemy_timer is greater than 20. If so, the timer is not currently in use, so a new timer can be started. Otherwise, the timer is already in use and we are done processing this enemy.

How should new enemies be spawned? Depending on the kind of game you are looking to make, a good approach might be to implement a random number generator to select the enemy type to spawn, or hard-coding particular patterns of enemies that appear during a stage. The whole idea of having different "modes" in a game - whether that's different playstyles, different stages, or even simple things like a title screen or pause state - will be its own separate chapter. Since this chapter is already quite long, I'm going to opt for something much simpler instead: specific slots are always specific enemy types. The first three slots will be turtles, which will move at a speed of 1px per frame, and the last two will be snakes, which will move twice as fast. Here is the setup code at the beginning of our main subroutine:

115 ; set up enemy slots
116 LDA #$00
117 STA current_enemy
118 STA current_enemy_type
120 LDX #$00
122 LDA #$00 ; turtle
123 STA enemy_flags,X
124 LDA #$01
125 STA enemy_y_vels,X
126 INX
127 CPX #$03
128 BNE turtle_data
129 ; X is now $03, no need to reset
131 LDA #$01
132 STA enemy_flags,X
133 LDA #$02
134 STA enemy_y_vels,X
135 INX
136 CPX #$05
137 BNE snake_data
139 LDX #$00
140 LDA #$10
142 STA enemy_x_pos,X
143 CLC
144 ADC #$20
145 INX
147 BNE setup_enemy_x

Notice also that each enemy slot has a specific X coordinate, so that they do not overlap with one another.

Now, when it's time to spawn a new enemy, the first inactive slot our loop finds is marked as active and has its Y position set to zero. Here is that spawn_enemy label from the process_enemies subroutine, with "TODO" replaced by actual code:

  ; Set this slot as active
  ; (set bit 7 to "1")
  LDA enemy_flags,X
  ORA #%10000000
  STA enemy_flags,X
  ; Set y position to zero
  LDA #$00
  STA enemy_y_pos,X
  ; IMPORTANT: reset the timer!
  LDA #$ff
  STA enemy_timer

When all of the code above is assembled and linked, the result shows off our new object pools in action!

Notice how our object pool limits the game to having no more than five enemies on screen at a time. Just having different Y velocities already introduces some measure of randomness (or at least "random-seeming") in how enemies appear on screen.


Here are a few changes you might like to try out. See how each of them affects the "feel" of the game, and experiment with other changes to what we've already done here. To get you started, here is a zip file with the source code from this chapter.