By Andrew Davie (adapted by Duane Alan Hahn)
This session we're going to have a preliminary look at vertical movement of sprites.
In the previous sessions we have seen that there are two 8-pixel wide sprites, each represented by a single 8-bit register in the TIA itself. The TIA displays the contents of the sprite registers at the same horizontal position on each scanline, corresponding to where on an earlier scanline the RESP0 or RESP1 register was toggled. We explored how to use this knowledge to develop some generic "position at horizontal pixel x" code which greatly simplified the movement of sprites in a horizontal direction.
Instead of having to work with the odd RESPx timing, we have abstracted that aspect of the hardware and now reference the sprite position through a variable in RAM, and our code positions the sprite to the pixel number indicated by this variable.
Let's now have a look at how to position a sprite vertically.
Our examples so far have shown how sprites appear as a vertical strip the entire height of the screen. This is due, of course, to the single byte of sprite data (8 bits = 8 pixels) being duplicated by the TIA (for each sprite) on each scanline. If we change the data held in the TIA sprite graphics registers (ie: in GRP0 or GRP1), then the next time the TIA draws the relevant sprite, we see a change in the shape that the TIA draws on-screen. We still see 8 pixels, directly under the 8 pixels of the same sprite on the previous scanline—but if we've changed the relevant GRPx register then we will see different pixels on (solid) and different pixels off (transparent).
To achieve vertical movement of 'a sprite'—and by this, we mean a recognizable shape like a balloon, for example—we need to modify the data that we are writing to the GRPx register. When we're on scanlines where the shape is not visible, then we should be writing 0 to the GRPx register—and when on scanlines where the shape is visible, we should be writing the appropriate line of that shape to the GRPx register. Doing this quickly, and with little RAM or ROM usage, is the trickiest bit. Conceptually, it's quite simple.
There are several ways to tackle the problem of writing the right line of the shape on the right line of the screen, and nothing when the shape isn't on the line we're drawing. Some of them take extra ROM, some require more RAM, and some of them require more cycles per line.
Most kernels keep one of the registers as a 'line counter' for use in indexing into tables of data for playfield graphics—so that the correct line of data is placed in the graphics registers for each scanline. The kernels we've created so far also use this line counter to determine when we have done sufficient lines in our kernel. For example. . .
        ldx #0              ;2
Kernel  lda PF0Table,x      ;4
        sta PF0             ;3
        lda PF1Table,x      ;4
        sta PF1             ;3
        lda PF2Table,x      ;4
        sta PF2             ;3
        sta WSYNC           ;3
        inx                 ;2
        cpx #192            ;2
        bne Kernel          ;3(2)
The above code segment shows a loop which counts the X register up from 0 and exits when it reaches 192, writing three playfield registers on each of the 192 scanlines it 'generates'. We've covered all of this in previous sessions. The numbers after the semicolon (the comment area) indicate the number of cycles that instruction will take (not taking into account possible page-boundary crossing, etc). We can see that this simple symmetrical playfield modification will take at least 31 cycles of our available 76 cycles just to do the three playfield registers on each scanline. That leaves only 45 cycles to do sprites, missiles, and ball, and let's not forget the other three playfield writes if we're doing an asymmetrical playfield.
Clearly, our scanline loop is extremely starved of cycles, and any code we put in there must be extremely efficient. The biggest waste in the code above is the comparison. Remember earlier we indicated that the 6502 has a flags register, and some of these flags are set/cleared automatically after certain operations (on loads and arithmetic operations—including register increments and decrements, the negative and zero flags are automatically set/cleared). From now on we're going to use the 'standard' way of looping and instead of specifically comparing a line count with a desired value (eg: counting up to 192), we'll switch to starting at our top value and decrementing the line counter and branching UNTIL the counter gets to 0. By using our knowledge about the automatic flag setting, we are able to remove the comparison from our loop. . .
        ldx #192            ;2
Kernel  lda PF0Table,x      ;4
        sta PF0             ;3
        lda PF1Table,x      ;4
        sta PF1             ;3
        lda PF2Table,x      ;4
        sta PF2             ;3
        sta WSYNC           ;3
        dex                 ;2
        bne Kernel          ;3(2)
The trick here is that the 'dex' instruction will set the Z (zero) flag to 1 if the x register is zero after the instruction has executed, and 0 if it is non-zero. The 'bne' instruction stands for "branch if not equal" and branches when the Z flag is clear, that is, when the result was not equal to zero. In short, the branch will be taken if the x register is non-zero. Thus we have removed two cycles from our inner scanline loop. But at what cost? Since the loop is now counting 'down' instead of 'up', our tables will now be accessed upside-down (that is, the first scanline will show data from the bottom of the tables), and our whole playfield will 'flip' upside-down. That's fine: the solution for this is to change the tables themselves so they are upside-down, too! Note also that X now runs from 192 down to 1 rather than from 191 down to 0, so the tables are read at offsets 1 to 192.
All of that was a bit of a diversion—but it's important to understand how we are accessing our data in an upside-down fashion merely for the purposes of efficiency—in this case, saving us just 2 cycles per scanline. But those 2 cycles are some 2.6% of the time we have, and every little bit counts.
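To make the upside-down idea concrete, here is a small sketch of a count-down loop and the table it reads. The 8-line height and the bit patterns are made up for illustration, and only PF1 is written to keep it short. Because the index runs from its top value down to 1, this sketch references the table one byte early (PF1Table-1) so that index 1 reaches the first byte; that is one way to handle the offset, not the only one.

```asm
; Sketch only: an 8-line loop counting X from 8 down to 1.
        ldx #8
Kernel  lda PF1Table-1,x    ; X = 8..1 reads table bytes 7..0
        sta PF1
        sta WSYNC
        dex
        bne Kernel

; The table lives elsewhere in ROM, not directly after the loop.
; It is stored bottom line first, so the LAST byte is displayed FIRST.
PF1Table
        .byte %11111111     ; bottom line of the picture (read when X = 1)
        .byte %10000001
        .byte %10000001
        .byte %10011001
        .byte %10011001
        .byte %10000001
        .byte %10000001
        .byte %11111111     ; top line of the picture (read when X = 8)
```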
Even with this improvement, we have just 47 cycles left to do everything else. Let's have a look at what we need to add to this to get sprites up and running. Assume we are loading our sprite data from a table, just as with the playfield data. We'd need to add. . .
        lda Sprite0Data,x   ;4
        sta GRP0            ;3
That's 7 cycles, which is OK—but we find that we have an immovable (we have no ability to change the vertical position) block of sprite data the whole height of the screen—read from the table 'Sprite0Data'. This setup would also require that our sprite data table is 192 lines high.
Let's assume, just for a minute, that Sprite0Data was in RAM. Then we'd have the ability to use this kernel to do the display and have another part of our program draw different shapes into that RAM table (ie: if we were drawing a Pac-Man sprite, we could have the first 20 'lines' of the table with 0, then the next 16 lines with the shape for the Pac-Man sprite, then the remainder with 0). To move this sprite up or down, we'd simply change where in the RAM table we were drawing the sprite—and when our kernel came to do the display, it wouldn't really care where the sprite was, it would just draw the continuous strip of sprite data from the RAM table, and voila! Vertically moving sprites.
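As a rough sketch of that idea, the routine below clears a RAM strip and then copies a shape into it at a chosen position. Everything here is hypothetical: SpriteBuffer is assumed to be 192 bytes of plain read/write RAM (which a stock 2600 does not have), SpriteY is an assumed zero page variable holding the strip index where the shape starts, and PacManShape is an assumed 16-byte ROM table.

```asm
; Sketch only. SpriteY + SHAPE_HEIGHT must not exceed 192,
; otherwise the copy would run off the end of the strip.
SHAPE_HEIGHT = 16

DrawSpriteToBuffer
        lda #0
        ldx #192
ClearLoop
        sta SpriteBuffer-1,x    ; clears entries 191 down to 0
        dex
        bne ClearLoop

        ldx SpriteY             ; where in the strip the shape starts
        ldy #0
CopyLoop
        lda PacManShape,y       ; next line of the shape from ROM
        sta SpriteBuffer,x      ; drop it into the strip
        inx
        iny
        cpy #SHAPE_HEIGHT
        bne CopyLoop
        rts
```

Moving the sprite is then just a matter of changing SpriteY and calling the routine again before the next frame is displayed.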
And this is exactly how the Atari home computers manage vertical movement of sprites. They, too, have a single register holding the sprite data—and they, too, modify this register on-the-fly to change the shape of the sprite that is being shown on each scanline. But the difference is that the Atari computers have a bit of hardware which does EXACTLY what our little kernel above does—that is, copy sprite data from a RAM buffer into the hardware sprite register.
The problem for Atari 2600 kernels is that we simply don't have 192 bytes of RAM to spend on a draw buffer/table for each player sprite. In fact, we only have 128 bytes of RAM in total for our entire program! So it's a nice solution, and certainly one that should be used if you are programming for some cartridge format with ample RAM, because it provides extremely quick (7 cycles) drawing of sprites.
But for normal usage, this technique is not possible or practical.
Unfortunately, the available alternatives are costly—in terms of processing time. The quickest 'generic sprite draw' that I'm aware of at the moment takes 18 cycles. Given our 47 cycles remaining in the scanline, 36 of these would be taken up drawing just two sprites—and that makes asymmetrical playfield, balls and missiles a very problematic task. How can we fit all of these into the remaining 11 cycles of time?
The short answer is: we can't. And this is why many games revert to what is termed a "2 scanline kernel". Instead of trying to fit ALL of the updates into a single scanline, the 2 scanline kernel tries to fit all of the updates into two scanlines—taking advantage of the TIA's persistent state so that registers which have been modified on one scanline will remain the same until next modified. A typical two scanline kernel will modify the playfield (left side), sprite 0, playfield (right side) on the first scanline, then the playfield (left side), sprite 1, playfield (right side) on the second scanline—and then repeat the process.
The upshot of this is that our sprites have a maximum resolution of two scanlines (that is, we can only modify the shape of a sprite once every two lines), and in fact each sprite is updated on alternate lines. There's a bit of hardware (a graphics delay of 1 scanline, controlled through the VDELP0 and VDELP1 registers) to compensate for this, so that the sprites APPEAR to update on the same scanline. This interesting hardware capability shows clearly that the designers of the '2600 were well aware of the time limitations inherent in trying to update playfield registers, sprites, missiles and ball in a single scanline, and that they designed the hardware accordingly to mask this problem.
But we're not concerned with two scanline kernels this session. Please be aware that they are extremely common—and many games extend this concept to multiple-scanline kernels—where different tasks are performed in each scanline, and after n scanlines this process repeats to build up the screen out of 'meta-scanlines'. It's a useful technique to get around the limitations of cycles per line.
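For orientation only, here is a stripped-down sketch of the shape such a kernel might take. It is much simpler than the typical arrangement described above (one playfield register, no right-side rewrites), and the table names are placeholders. Each table is assumed to hold 96 entries, referenced one byte early because Y runs from 96 down to 1.

```asm
; Sketch only: each pass through the loop produces two scanlines.
        ldy #96
Kernel2
        ; first scanline of the pair: playfield and sprite 0
        lda PF1Table-1,y
        sta PF1
        lda Sprite0Table-1,y
        sta GRP0
        sta WSYNC
        ; second scanline of the pair: sprite 1
        lda Sprite1Table-1,y
        sta GRP1
        sta WSYNC
        dey
        bne Kernel2
```

In a sketch like this, setting bit 0 of VDELP0 before the loop should make the TIA hold each GRP0 write until GRP1 is next written, which is how the two sprites can be made to change shape on the same line.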
Before we continue, let's have a think about what we want a sprite draw to do—it's fine to be able to display a sprite shape anywhere on the screen (we've already touched on the horizontal positioning, and now we're well on the way to understanding how the vertical positioning works)—but sprites typically animate. How can we use the code shown so far to animate our sprites as well?
If we used the Atari computer method—presented above—of using a 'strip' of RAM to represent the table from which data is written to the screen, and modifying the data written to that table, then the problem is fairly simple—we just write different shapes to the table. But if we don't HAVE a RAM table, and we're forced to use a ROM table, then to get different shapes onscreen, we're going to have to use different tables. We can't modify the contents of tables in ROM! But the code above has the table hardwired into the code itself. That is. . .
        lda Sprite0Data,x
        sta GRP0
The problem here is that the address of the table is hardwired at the time we write our code—and the assembler will happily predetermine where this table is in the ROM, and the code will always fetch the data from the same table. What we really want to do with a sprite routine is not only fetch the data from a table—but also be able to change WHICH table we fetch the data from.
And here is an ideal use for a new addressing mode of the 6502, indirect indexed addressing:

        lda (zp),y

In the above code, 'zp' is a zero page two-byte variable which holds a memory address. The 6502 takes the contents of that variable (ie: the address of our table), adds the y register to it, and then uses the resulting address as our location from which to load a byte of data. It's quite an expensive instruction, taking 5 cycles to execute (6 if adding Y crosses a page boundary).
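A concrete case may help. The numbers below are invented purely for illustration; 'zp' stands for whatever two-byte zero page variable you have set aside.

```asm
; Worked example with made-up numbers.
; Suppose zp holds $80 and zp+1 holds $F4, so the pointer is $F480
; (low byte first, then high byte).
        ldy #5
        lda (zp),y      ; reads the byte at $F480 + 5 = $F485
```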
But now our code for drawing sprites (in principle) can look like this. . .
        lda (SpriteTablePtr),y
        sta GRP0
The problem this introduces is that the Y register is used for indexing the data table, whereas we were previously using the X register. There's no way around this—the addressing mode does not work with the X register! So let's change our kernel around a bit, and instead of using the X register to count the scanlines, we'll switch to the Y register. . .
        ldy #192                ;2
Kernel  lda PF0Table,y          ;4
        sta PF0                 ;3
        lda PF1Table,y          ;4
        sta PF1                 ;3
        lda PF2Table,y          ;4
        sta PF2                 ;3
        lda (SpriteTablePtr),y  ;5
        sta GRP0                ;3
        sta WSYNC               ;3
        dey                     ;2
        bne Kernel              ;3(2)
This is a bit better. Now (as long as we have previously set up the zero page 2-byte variable to point to our table) we are able to display any sprite shape that we desire, using the one bit of code. Here's what you'd need to do to set up your variable to point to the sprite shape data. . .
        lda #<Sprite0Data
        sta SpriteTablePtr
        lda #>Sprite0Data
        sta SpriteTablePtr+1
Additionally, the variable should be defined in the RAM segment like this. . .
SpriteTablePtr ds 2
Now let's review all of that and make sure we understand exactly what is happening. . . We have a zero page variable (2 bytes long) which holds the address of the sprite table containing the shape we want to display. Addresses are 16-bits long, and we've already seen how the 6502 represents 16-bit addresses by a pair of bytes—the low byte followed by the high byte (little-endian order). So into our sprite pointer variable, we are writing this byte-pair. The '>' operator tells the assembler to use the high byte of an address, and the '<' operator tells the assembler to use the low byte of an address. These are standard operators, but there's another way to do it. . .
        lda #address&$FF    ; low byte
        sta var
        lda #address/256    ; high byte
        sta var+1
Other ways exist. It doesn't really matter which one you use—the result is the same. We end up with a zero page variable which POINTS to the table which is used to give the data for the shape of the sprite. In fact, the variable points to the very start of the table.
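Since the pointer is just two bytes of RAM, animation comes down to storing a different table address in it each time the shape should change. Here is a minimal sketch of that, assuming a hypothetical FrameCounter variable that is incremented once per frame and two hypothetical shape tables, FrameA and FrameB. It would run outside the kernel, for example during vertical blank.

```asm
; Sketch only. FrameCounter, FrameA and FrameB are hypothetical names.
; Bit 3 of the counter flips every 8 frames, alternating the two shapes.
        lda FrameCounter
        and #%00001000
        bne UseFrameB
        lda #<FrameA
        sta SpriteTablePtr
        lda #>FrameA
        sta SpriteTablePtr+1
        jmp PtrReady
UseFrameB
        lda #<FrameB
        sta SpriteTablePtr
        lda #>FrameB
        sta SpriteTablePtr+1
PtrReady
```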
And this is our new problem! As we have earlier seen, if we had a RAM table, then we could move the sprite up and down by drawing it into that table and let our kernel display the whole 'strip' of sprite data. The effect would be that the sprite moved up and down on screen. But because we don't have that much RAM, we must programmatically determine on which scanline(s) the sprite data is to be displayed from the table, and which scanline(s) should contain 0-data for the sprite.
Essentially the process consists of comparing the current line-counter (the Y register) with the vertical position required for the sprite. If the counter comparison indicates that the sprite should be visible on the current scanline, then the data is fetched from the table—else a 0 value is used for the sprite data. Rather than stepping through the entire process and deriving the optimum result, we're going to just drop in the method used by nearly all games these days. . .
        sec                     ; 2 can often be guaranteed, and omitted
        tya                     ; 2
        sbc SpriteEnd           ; 3
        adc #SPRITE_HEIGHT      ; 2
        bcs .MBDraw3            ; 2(3)
        nop                     ; 2
        nop                     ; 2
        sec                     ; 2
        bcs .skipMBDraw3        ; 3
.MBDraw3
        lda (Sprite),y          ; 5
        sta GRP0                ; 3
.skipMBDraw3
Now here things start to get a bit complex! What the above code shows is a sprite draw routine which effectively takes a constant 18 cycles of time to either draw the sprite data from a table (when it's visible), or skip the draw entirely (when it's not visible). There are a few assumptions here. . .
So, that's a bit much to deal with in one whack—and to be honest you don't really need to understand the intricacies. Basically the code has two different sections—one where the sprite data is drawn from the table, and one where the draw is skipped. Each section is carefully timed so that after they rejoin at the bottom, they have both taken EXACTLY the same number of cycles to execute.
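One detail worth sketching is how the pointer might be prepared. Working through the carry arithmetic, the draw branch is taken while SpriteEnd - SPRITE_HEIGHT <= Y < SpriteEnd, yet the load is indexed by Y, the scanline counter, rather than by the line within the shape. One way to make that work is to bias the pointer before the kernel starts. The following is a sketch under those assumptions, not necessarily how any particular game does it; SPRITE_BIAS is a helper name introduced here, and SpriteData is assumed to be stored bottom line first.

```asm
; Sketch only: Sprite = SpriteData + SPRITE_HEIGHT - SpriteEnd.
; With that bias, Sprite + Y lands on SpriteData+0 when
; Y = SpriteEnd - SPRITE_HEIGHT (lowest visible line) and on
; SpriteData + SPRITE_HEIGHT - 1 when Y = SpriteEnd - 1 (top line).
SPRITE_BIAS = SpriteData + SPRITE_HEIGHT

        sec
        lda #<SPRITE_BIAS
        sbc SpriteEnd           ; low byte of the biased address
        sta Sprite
        lda #>SPRITE_BIAS
        sbc #0                  ; propagate the borrow into the high byte
        sta Sprite+1
```

To move the sprite vertically, change SpriteEnd and recompute the pointer once per frame. Bear in mind that if the biased address plus Y crosses a page boundary, the indirect load costs an extra cycle, which would upset the constant timing.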
Thomas Jentzsch has presented more optimal code, in the form of his 'skipdraw' routine - and frankly, I've not bothered taking the time to fully understand how it works, either! These sections of code are pretty much guaranteed to work efficiently and correctly, provided you set up the variables properly.
Thomas said that Dennis Debro explained it quite well, so here's Skipdraw Explained by Dennis Debro:
; Thomas Jentzsch Skipdraw
;========================================================================
;The best way, I knew until now, was (if y contains linecounter):
        tya                     ; 2
;       sec                     ; 2 <- this can sometimes be avoided
        sbc SpriteEnd           ; 3
        adc #SPRITEHEIGHT       ; 2
        bcx .skipDraw           ; 2 = 9-11 cycles
;       ...
; --------- or ------------
;If you like using illegal opcodes, you can use dcp (dec,cmp) here:
        lda #SPRITEHEIGHT       ; 2
        dcp SpriteEnd           ; 5 initial value has to be adjusted
        bcx .skipDraw           ; 2 = 9
;       ...
;Advantages:
;- state of carry flag doesn't matter anymore (may save 2 cycles)
;- a remains constant, could be useful for a 2nd sprite
;- you could use the content of SpriteEnd instead of y for accessing sprite data
;- ???
;========================================================================
;An Example:
;
; skipDraw routine for right player
        TXA                             ; 2 A-> Current scanline
        SEC                             ; 2 Set Carry
        SBC slowP1YCoordFromBottom+1    ; 3
        ADC #SPRITEHEIGHT+1             ; 2 calc if sprite is drawn
        BCC skipDrawRight               ; 2/3 To skip or not to skip?
        TAY                             ; 2
        lda P1Graphic,y                 ; 4
continueRight:
        STA GRP0
;----- this part outside of kernel
skipDrawRight                           ; 3 from BCC
        LDA #0                          ; 2
        BEQ continueRight               ; 3 Return...
In the meantime, though we have covered a lot of ground today, I hope you now understand the basic principles of vertical sprite movement. In summary. . .
That should keep you busy. Enjoy!
Session 23: Moving Sprites Vertically