By Andrew Davie (adapted by Duane Alan Hahn)
One of the joys of writing '2600 programs involves the quest for efficiency—both in processing time used, and in ROM space required for the code. Every now and then, modern-day '2600 programmers will become obsessed with some fairly trivial task and try to see how efficient they can make it.
If you were about to go up on the Space Shuttle, you wouldn't expect them to just put in the key, turn it on, and take off. You'd want the very first thing they do to be making sure that all those switches are set to their correct positions. When our Atari 2600 (which, I might point out in a tenuous link to the previous sentence, is of the same vintage as the Space Shuttle) powers up, we should assume that the 6502, RAM and TIA (and other systems) are in a fairly unknown state. It is considered good practice to initialize these systems. Unless you really, *really* know what you're doing, it can save you problems later on.
At the end of this session I'll present a highly optimized (and best of all, totally obscure) piece of code which manages to initialize the 6502, all of RAM *and* the TIA using just 9 bytes of code-size. That's quite amazing, really. But first, we're going to do it the 'long' way, and learn a little bit more about the 6502 while we're doing it.
We've already been introduced to the three registers of the 6502—A, X, and Y. X and Y are known as index registers (we'll see why, very soon) and A is our accumulator—the register used to do most of the calculations (addition, subtraction, etc).
Let's have a look at the process of clearing (writing 0 to) all of our RAM. Our earlier discussions of the memory architecture of the 6502 showed that the '2600 has just 128 bytes ($80 bytes) of RAM, starting at address $80. So, our RAM occupies memory from $80 - $FF inclusive. Since we know how to write to memory (remember the "stx COLUBK" we used to write a color to the TIA background color register), it should be apparent that we could do this. . .
    lda #0          ; load the value 0 into the accumulator
    sta $80         ; store accumulator to location $80
    sta $81         ; store accumulator to location $81
    sta $82         ; store accumulator to location $82
    sta $83         ; store accumulator to location $83
    sta $84         ; store accumulator to location $84
    sta $85         ; store accumulator to location $85
    ; 119 more lines to store 0 into location $86 - $FC. . .
    sta $FD         ; store accumulator to location $FD
    sta $FE         ; store accumulator to location $FE
    sta $FF         ; store accumulator to location $FF
You're right, that's ugly! The code above uses 258 bytes of ROM (2 bytes for each store, and 2 for the initial accumulator load). We can't possibly afford that—and especially since I've already told you that it's possible to initialize the 6502 registers, clear RAM, *AND* initialize the TIA in just 9 bytes total!
The index registers have their name for a reason. They are useful in exactly the situation above, where we have a series of values we want to read or write to or from memory. Have a look at this next bit of code, and we'll walk through what it does. . .
            ldx #0
            lda #0
ClearRAM    sta $80,x
            inx
            cpx #$80
            bne ClearRAM
Firstly, this code is nowhere near efficient, but it does do the same job as our first attempt and uses only 11 bytes. It achieves this saving by performing the clear in a loop, writing 0 (the accumulator) to one RAM location every iteration. The key is the "sta $80,x" line. In this "addressing mode", the 6502 adds the address given in the instruction ($80 in this example; remember, this is the start of RAM) to the current value of the X register, and uses the resulting final address as the source/destination for the operation.
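To make the indexed store concrete, here is a minimal sketch with X fixed at 3 instead of counting in a loop. The `processor` and `org` lines are only there so the fragment assembles on its own (assuming DASM as the assembler), and $F000 is an assumed ROM origin, not something this session specifies.

```asm
        processor 6502
        org $F000       ; assumed ROM origin, for illustration only

        ldx #3          ; X = 3
        lda #0          ; A = 0, the value we want to store
        sta $80,x       ; final address = $80 + X = $83, so this clears $83
```

Change the value loaded into X and the same "sta $80,x" instruction writes to a different RAM location, which is exactly what the loop exploits.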
We have initialized X to 0, and increment it every time through the loop. The line "cpx #$80" is a comparison, which causes the 6502 to check the value of X against the number $80 (remember, we have $80 bytes of RAM, so this is basically saying "has the loop done 128 ($80) iterations yet?"). The next line, "bne ClearRAM", will transfer program flow back to the label "ClearRAM" every time that comparison returns "no". The end result is that the loop will iterate exactly 128 times, and that the indexing will end up writing to 128 consecutive memory locations starting at $80. We can tighten this loop a little, though. . .
            ldx #$80
            lda #0
ClearRAM    sta 0,x
            inx
            bne ClearRAM
Well, that's not a LOT different, but we're now using only 9 bytes to clear RAM—somehow we've managed to get rid of that comparison! And how come we're writing to 0,x not $80,x? All will be revealed. . .
When the 6502 performs operations on registers, it keeps track of certain properties of the numbers in those registers. In particular, it has internal flags which indicate if the number it last used was zero or non-zero, positive or negative, and also various other properties related to the last calculation it did. We'll get to all of that later. All of these flags are stored in an 8-bit register called the "flags register". We don't have easy direct access to this register, but we do have instructions which base their operation on the flags themselves.
We've already used one of these instructions: the "bne ClearRAM" in our earlier version of the code. This instruction, as noted, will transfer program flow back to the label "ClearRAM" every time that comparison returns "no". The comparison returns "no" by clearing the zero flag in the processor's flags register, and "bne" (branch if not equal) branches only while that flag is clear!
In actuality, this zero/non-zero flag is also set or cleared upon a load to a register, an increment or decrement of register or memory, and whenever a calculation is done on the accumulator. Whenever a value in these circumstances is zero, then the zero flag is set. Whenever the result is non-zero, the zero flag is cleared. So, we don't even need to compare for anything being 0—as long as we have just done one of the operations mentioned (load, increment, etc)—then we know that the zero flag (and possibly others) will tell us something about the number. The 6502 documentation gives extensive information for all instructions about what flags are set/cleared, under what circumstance.
We briefly discussed how index registers, holding only 8-bit values, "wrap around" from $FF (%11111111) to 0 when incremented, and from 0 to $FF when decremented. Our code above uses this "trick": it increments the X register, knowing that the zero flag will be clear (result non-zero) after the increment unless X has become 0. And X will only be 0 if it was previously $FF. Instead of having X be a "counter" to give 128 iterations, this time we're using it as the actual address and looping it from $80 (the start of RAM) to $FF (the end of RAM) + 1. So our store (which, remember, takes the address in the instruction, adds the value of the X register and uses that as the final address) is now "sta 0,x". Since X holds the correct address to write to, we are adding 0 to that.
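The same flag behaviour applies to decrements, so a loop can also count down and let "dex" set the zero flag. The following is only a sketch of that variant: the label `ClearDown` is hypothetical, and the `processor`/`org` lines (with an assumed origin of $F000) are there just so it assembles on its own with DASM.

```asm
        processor 6502
        org $F000       ; assumed ROM origin, for illustration only

        ldx #$80        ; X counts down from 128 to 1
        lda #0
ClearDown
        sta $7F,x       ; $7F + X runs from $FF down to $80
        dex             ; sets the zero flag when X reaches 0
        bne ClearDown   ; keep looping while X is non-zero
```

Like the count-up version it needs no comparison, takes 9 bytes, and leaves X = 0 and A = 0 when it finishes.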
I would *highly* recommend that you don't worry too much about this sort of optimization while you're learning. The version with the comparison is perfectly adequate, safe, and easy to understand. But sometimes you find that you do need the extra cycles or bytes. The optimized version above drops the 2-cycle "cpx #$80" from each of its 128 iterations, so it is 256 cycles faster, and that's 256 x 3 color clocks = 768 color clocks = more than three whole scanlines quicker (a scanline is 228 color clocks). So you can see how crucial timing can be: by taking out a single instruction in a loop, and rearranging how our loop counted, we saved more than three scanlines, (very) roughly 1% of the total processing time available in one frame of a TV picture!
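For anyone who wants to check the timing by hand, here are the two loops side by side with standard 6502 cycle counts in the comments. This is a sketch for counting only: the labels `ClearA` and `ClearB` are hypothetical, the origin is assumed, and the branch figures assume the loop does not straddle a page boundary (a taken branch that crosses a page costs one more cycle).

```asm
        processor 6502
        org $F000       ; assumed ROM origin, for illustration only

; Version with the comparison: 11 cycles per pass
        ldx #0          ; 2 cycles, executed once
        lda #0          ; 2 cycles, executed once
ClearA  sta $80,x       ; 4 cycles (zero page,X store)
        inx             ; 2 cycles
        cpx #$80        ; 2 cycles
        bne ClearA      ; 3 cycles when taken, 2 on the final pass

; Version without the comparison: 9 cycles per pass
        ldx #$80        ; 2 cycles, executed once
        lda #0          ; 2 cycles, executed once
ClearB  sta 0,x         ; 4 cycles
        inx             ; 2 cycles
        bne ClearB      ; 3 cycles when taken, 2 on the final pass
```

The only per-pass difference is the 2-cycle "cpx", paid on every one of the 128 passes.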
Initializing the TIA is a similar process to initializing the RAM—we just want to write 0 to all memory locations from 0 to $7F (where the TIA lives!). This is safe—trust me—and we don't really need to know what we're writing to at this stage, just that after doing this the TIA will be nice and happy. We could do this in a second loop, similar to the first, but how about this. . .
        ldx #0
        lda #0
Clear   sta $80,x       ; clear a byte of RAM
        sta 0,x         ; clear a byte of TIA register
        inx
        cpx #$80
        bne Clear
That's a perfectly adequate solution. Easy to read and maintain, and reasonably quick. We could, however, take advantage of the fact that RAM and the TIA are consecutive in memory (TIA from 0 - $7F, immediately followed by RAM $80 - $FF) and do the clear in one go. . .
        ldx #0
        lda #0
Clear   sta 0,x
        inx
        bne Clear
The above example uses 9 bytes, again, but now clears RAM and TIA in one 'go' by iterating the index register (which is the effective address when used in "sta 0,x") from 0 to 0 (i.e., it increments 256 times, wraps back to 0, and the loop halts). This is starting to get into "elegant" territory, something the experienced guys strive for!
Furthermore, after this code has completed, X = 0 and A = 0, a nice known state for two of the 6502's three registers.
That's all I'm going to explain for the initialization at this stage—we should insert this code just after the "Reset" label and before the "StartOfFrame" label. This would cause the code to be executed only on a system reset, not every frame (as, every frame, the code branches back to the "StartOfFrame" for the beginning of the next frame).
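As a rough sketch of where the clear sits, here is a skeleton with the loop placed between the two labels. The "Reset" and "StartOfFrame" labels come from the earlier sessions; the $F000 origin, the vector table at $FFFA, and the placeholder comment standing in for the frame code are assumptions made so the fragment assembles on its own with DASM.

```asm
        processor 6502
        org $F000       ; assumed start of cartridge ROM

Reset
        ldx #0
        lda #0
Clear   sta 0,x         ; clears TIA ($00-$7F) and RAM ($80-$FF)
        inx
        bne Clear       ; falls through once X wraps back to 0

StartOfFrame
        ; ... per-frame code from the earlier sessions goes here ...
        jmp StartOfFrame

        org $FFFA       ; 6502 interrupt vectors (assumed layout)
        .word Reset     ; NMI
        .word Reset     ; RESET
        .word Reset     ; IRQ
```

Because the jump at the bottom targets "StartOfFrame" rather than "Reset", the clear runs once per power-up or reset and never again.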
Before we end today's session, though, I thought I'd share the "magical" 9-byte system clear with you. There's simply no way that I would expect you to understand this bit of code at the moment—it pulls every trick in the book—but this should give you some taste of just how obscure a bit of code CAN be, and how beautifully elegant clever coding can do amazing things.
; CLEARS ALL VARIABLES, STACK
; INIT STACK POINTER
; ALSO CLEARS TIA REGISTERS
; DOES THIS BY "WRAPPING" THE STACK - UNUSUAL

        LDX #0
        TXS
        PHA             ; BEST WAY TO GET SP=$FF, X=0
        TXA
CLEAR   PHA
        DEX
        BNE CLEAR

; 9 BYTES TOTAL FOR CLEARING STACK, MEMORY
; STACK POINTER NOW $FF, A=X==0
Though the above was a truly magical piece of code, I've since developed an EIGHT byte solution to the problem of clearing RAM and initializing the stack and registers.
        ldx #0
        txa
Clear   dex
        txs
        pha
        bne Clear
After the above, X=A=0, all of RAM and the TIA have been initialized to 0, and the stack pointer is initialized to $FF. Amazing!
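For the curious, here is the eight-byte version again with a comment on each line, as a sketch of how it appears to work rather than an authoritative explanation. It leans on one hardware detail that this session does not cover: on the 2600, the 6502's stack page ($0100-$01FF) lands on the same TIA registers and RAM as $00-$FF, so a push is effectively a store to the address held in the stack pointer. The `processor` and `org` lines are assumptions added so the fragment assembles on its own with DASM.

```asm
        processor 6502
        org $F000       ; assumed ROM origin, for illustration only

        ldx #0          ; X = 0
        txa             ; A = 0 (copied from X)
Clear   dex             ; X = X - 1 (first pass wraps 0 -> $FF); updates the zero flag
        txs             ; stack pointer = X (flags are not changed)
        pha             ; write A (0) at the stack pointer, then SP = SP - 1 (flags not changed)
        bne Clear       ; tests the zero flag left by dex; exits once X = 0
```

On the last pass X reaches 0, the push writes to address 0, and the stack pointer wraps from 0 back to $FF, which is how the loop finishes with the stack pointer already in place.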
See you next time!
Session 12: Initialization