“Making Games for the Atari 2600”

Chapter 1, page 7: The “LDA Cycle” diagram should show LDA $1234 instead of LDA $1223. The Data Bus values should be $AD, $34, $12, and $7F.

Chapter 3, page 25: The DataArray variable starts at $83, not $82.

Chapter 4, page 28: The ZeroZP loop clears every zero-page value except for address $0. (One way to fix this is to add a STA $0 after the loop ends.)
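To see why address $0 is skipped, here is a small C model of the loop. It assumes the loop has the usual shape (LDX #$FF, then STA $0,X / DEX / BNE ZeroZP); the book's exact listing is not reproduced here, and the function name is made up for illustration.

```c
#include <string.h>

/* Hypothetical model of the ZeroZP loop, assuming it is
   LDX #$FF / STA $0,X / DEX / BNE ZeroZP.
   Returns how many zero-page bytes are still non-zero afterwards. */
int zero_zp_uncleared(int apply_fix)
{
    unsigned char zp[256];
    unsigned char x = 0xFF;          /* LDX #$FF */
    int i, count = 0;

    memset(zp, 0xAA, sizeof zp);     /* pretend memory holds garbage */

    do {
        zp[x] = 0;                   /* STA $0,X */
        x--;                         /* DEX */
    } while (x != 0);                /* BNE ZeroZP: falls through when X == 0 */

    if (apply_fix)
        zp[0] = 0;                   /* STA $0 after the loop ends */

    for (i = 0; i < 256; i++)
        if (zp[i] != 0)
            count++;
    return count;
}
```

Without the fix the model leaves one byte (address $0) untouched; with the extra store, nothing is left.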

Chapter 6, page 40: The comment for the sta PF0 line should read “set the PF0 playfield pattern register”.

Chapter 8, page 52: lda ColorFrame,y should be lda ColorFrame0,y; also the comments on the Frame0 color table can be ignored.

Chapter 9, page 57: Says “We’ve timed everything so that the store will place exactly on cycle 23 when zero is passed” but the example shows A (the horizontal position) loaded with #70, not #0.

Chapter 9, page 58: The Atari 2600 game “Raiders of the Lost Ark” was designed by Howard Scott Warshaw, not Warren Robinett.

Chapter 12, page 69: The target timer value is computed by (N ∗ 76 - 13) / 64, where N is the number of scanlines. There is also a special case for two timer events in a line (see xmacro.h).
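As a quick check of the formula, here is a sketch in C. The function name is hypothetical, and it assumes the division truncates toward zero the way an integer expression would; the special case mentioned above is not modeled.

```c
/* Hypothetical helper: timer value for N scanlines, using the
   formula above, (N * 76 - 13) / 64. Integer division truncates
   (an assumption about how the macro expression is evaluated). */
int timer_value_for_lines(int lines)
{
    return (lines * 76 - 13) / 64;
}
```

For example, 30 scanlines gives (2280 - 13) / 64 = 35, and 29 scanlines gives 34.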

Chapter 17: Forgot to describe NUSIZ register bits in detail:

Binary        Hex Value   Description
00xxxx        $00         1 pixel wide
01xxxx        $10         2 pixels wide
10xxxx        $20         4 pixels wide
11xxxx        $30         8 pixels wide
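The table can also be read as "width = 2 to the power of bits 5-4". A minimal sketch in C, with a made-up function name, that decodes the width from a NUSIZ value:

```c
/* Hypothetical helper: decode the width (in pixels) selected by
   bits 5-4 of a NUSIZ value, following the table above.
   The low four bits (the x's in the table) are ignored. */
unsigned nusiz_width_pixels(unsigned char nusiz)
{
    return 1u << ((nusiz >> 4) & 0x3u);
}
```

So $00 decodes to 1, $10 to 2, $20 to 4, and $30 to 8, whatever the low bits are.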

The VERTICAL_SYNC macro takes 4 scanlines to complete, not 3. Therefore most of the examples add up to 263 scanlines, not 262. Change TIMER_SETUP 30 to TIMER_SETUP 29 to fix them.

Chapter 35: The bank switching examples should set the S (Stack) register to #$FF at startup. (The real NMOS 6502 sets the stack pointer to #$FD at power-up, but the emulator doesn’t emulate this undocumented behavior.) The example has also been rewritten with additional macros to make it a bit clearer, and the origin moved to $1000.

“Making 8-Bit Arcade Games in C”

To future-proof the code examples from compiler updates, some code has been changed.

Some interrupt handlers must start at a specific address, and the SDCC compiler currently does not have an easy way to force this. Therefore, the current approach is to put padding bytes between functions so that when lined up in memory, the interrupt handlers are at the right address.

We’ve modified the Galaxian-Scramble demo game to make this more predictable. We insert padding bytes at the end of the start() routine’s __asm block:

.ds   0x66 - (. - _start)

The .ds directive means “insert N bytes here.” The expression (. - _start) subtracts the _start label from the current program counter, giving the number of bytes emitted since _start. We do this to convert the program counter to an absolute constant (a quirk specific to the SDCC assembler). Then we subtract that value from 0x66, which is where we want our next function to reside.

We also use the __naked function decorator so that the compiler doesn’t insert a RET instruction or stack frame instructions, modifying our code length.

The end result of this is that our next C function definition will reside at address 0x66, where we want it.
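The arithmetic behind the .ds expression can be sketched in C. The function name and the sample byte count are illustrative only, and the sketch assumes _start sits at address 0 so that offsets and addresses coincide.

```c
/* Hypothetical helper mirroring 0x66 - (. - _start):
   bytes_emitted plays the role of (. - _start), the number of
   bytes assembled since the _start label. Assumes _start is at
   address 0. A negative result would mean the code before the
   padding has already grown past the target address. */
int padding_bytes(int target_address, int bytes_emitted)
{
    return target_address - bytes_emitted;
}
```

If start() had emitted 0x40 bytes, for instance, the directive would insert 0x26 bytes of padding.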

The bcd_add() function has also been changed, as the inline assembly made assumptions about the code that are not portable across compiler flags. It also now uses the __naked decorator.

In the VIC Dual sections, RAM is defined as starting at $8000, but the code defines it at the mirrored location $c000. Note that it is a 32 x 32 array (1024 bytes) but only 28 rows (896 bytes) are visible.

p. 41: The line typedef unsigned char sbyte should define a signed char type, not unsigned char.
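A minimal sketch of the corrected definition and why it matters. The byte typedef and the two helper functions are assumptions added for comparison, not the book's listing.

```c
typedef unsigned char byte;   /* assumed companion type, range 0..255 */
typedef signed char sbyte;    /* corrected definition, range -128..127 */

/* Hypothetical helpers: store a value in each type and read it back. */
int as_sbyte(int v) { sbyte s = (sbyte)v; return s; }
int as_byte(int v)  { byte b = (byte)v;  return b; }
```

Storing -1 in an sbyte reads back as -1, while the unsigned type reads back 255, so signed velocities or offsets would be wrong with the original line.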

Shift/Rotate Chapter: The function definition for halve_score has a redundant LD IY,#_score instruction.

On page 115, the field _unused is padding for the AttackingEnemy struct on page 114, but there is no such field in the example code.

“Designing Video Game Hardware in Verilog”

Chapter 9: The constants in the book differ from what’s in the hvsync_generator.v file, which are:

parameter V_TOP    =   5; // vertical top border
parameter V_BOTTOM =  14; // vertical bottom border
parameter V_SYNC   =   3; // vertical sync # lines

The simulated CRT values are based on NTSC, the goal being to get a 256x240 pixel display like a NES. NTSC needs 262 lines, and 3 lines of VSYNC. I chose the top and bottom borders to center the frame on my TV when using the FPGA.
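These values can be checked against the 262-line total. The sketch below is in C rather than Verilog, and the V_DISPLAY name for the 240 visible lines is an assumption.

```c
/* Vertical timing from the text above. V_DISPLAY is an assumed
   name for the 240 visible lines of the 256x240 display. */
enum {
    V_DISPLAY = 240,  /* visible lines */
    V_TOP     = 5,    /* vertical top border */
    V_BOTTOM  = 14,   /* vertical bottom border */
    V_SYNC    = 3     /* vertical sync # lines */
};

/* Total number of lines per frame. */
int total_lines(void)
{
    return V_DISPLAY + V_TOP + V_BOTTOM + V_SYNC;
}
```

The sum is 240 + 5 + 14 + 3 = 262, matching the NTSC line count.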

In the simulator, the horizontal scanlines are shortened (23 + 7 + 23 cycles) to save CPU time, while still being long enough to do stuff (read RAM etc.) between scanlines. The FPGA examples use a full 381-cycle scanline, but have the same vertical timing.

The Digits and 7 Segment Decoder examples use arrays of 5-bit words. The expression used to index them is (xofs ^ 3'b111) which reverses the bits left-to-right, but also indexes bit indices 5-7 which are out of range. Verilator returns 0 for this case, but some FPGA toolchains return undefined values. To fix this, replace the g assignment with this:

wire g = display_on && (xofs >= 3'b011) && bits[xofs ^ 3'b111];
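To make the index mapping concrete, here is a rough C model of the corrected expression. The function names are invented for illustration; in C the out-of-range shift simply yields 0, so the model only shows which indices the guard excludes.

```c
/* XOR with 7 maps xofs 0..7 to bit index 7..0. */
unsigned bit_index(unsigned xofs)
{
    return (xofs & 7u) ^ 7u;
}

/* Hypothetical model of the corrected g assignment for a 5-bit word:
   only xofs 3..7 (bit indices 4..0) may read the word. */
int pixel_g(int display_on, unsigned xofs, unsigned bits)
{
    unsigned word = bits & 0x1Fu;   /* keep the 5 valid bits */
    return display_on && (xofs >= 3u) && ((word >> bit_index(xofs)) & 1u);
}
```

Here xofs values 0, 1 and 2 map to indices 7, 6 and 5, which is exactly the range the added comparison masks off.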

In the RAM Text Display example, the cells are updating improperly when the beam is offscreen. This change to line 75 fixes it:

ram_writeenable <= display_on && rom_yofs == 7;

In the Sprite Rotation example, some third-party toolchains may have issues with the syntax. You may have to make these sorts of changes:

function [3:0] trunc_int_to_4(input integer val);
  trunc_int_to_4 = val[3:0];
endfunction

0: sin_16x4 = trunc_int_to_4(y);

player_x_fixed <= player_x_fixed + 12'(sin_16x4(player_rot+8));

(Updated Aug 1 2019)