Atari 2600 Programming for Newbies

Session 16: Letting the Assembler do the Work

By Andrew Davie (adapted by Duane Alan Hahn, a.k.a. Random Terrain)

Original Session

Original Session 16 at AtariAge

This session we're going to have a brief look at how DASM (our assembler) builds up the binary ROM file, and how we can use DASM to help us organize our RAM.

As we've discovered, DASM keeps a list of all symbols and as it is assembling our code, it assigns values (= numbers, or addresses) to those symbols. When it is creating the binary ROM image, it replaces opcodes (=instructions) with appropriate values representing the opcode, and it replaces symbols with the value of the symbol from its internal symbol table.

Symbols

OK, that basic process should be clear by now. When we view our symbol table (which is output when we use the -v5 switch on our command-line when assembling a file), we will see that there are some symbols which are unused (the used ones have (R ) after them, in the symbol table output). We can see, then, that it is not necessary for a symbol to actually be in the ROM binary file for it to have a value. There are several reasons why we'd want to have a symbol with a value, but not have that symbol "do anything" or relate to anything in the binary.

For example, we could use a symbol as a switch to tell the assembler which section of code to assemble. A symbol could also be used as a value to tell us how many scanlines to draw, e.g.:


SCANLINES = 228   ; PAL picture lines (the 312-line frame total will not fit an 8-bit compare)






;...later



    iny

    cpy #SCANLINES   ; at the end?

    bne AnotherLine     ; do another line

We can even implement a compile-time PAL/NTSC switch something like this…


PAL = 0

NTSC = 1



SYSTEM = PAL   ; change this to PAL or NTSC





#if SYSTEM = PAL

   ; insert PAL-only code here

#endif



#if SYSTEM = NTSC

   ; insert NTSC-only code here

#endif

This sort of use of symbols to drive the DASM assembly process can be quite useful when you want various sections of code to behave differently—for whatever reason. You might have a test bit of code which you can conditionally compile by defining a symbol as in the above example.
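As a small sketch of how the two ideas above can be combined, the switch can also choose the value of another symbol. The scanline counts used here (228 for PAL, 192 for NTSC) are the usual picture heights and are assumed for illustration. The conditionals are written without the leading '#', a spelling DASM also accepts.

```asm
    processor 6502

PAL     = 0
NTSC    = 1
SYSTEM  = NTSC          ; change this to PAL or NTSC

    IF SYSTEM = PAL
SCANLINES = 228         ; assumed PAL picture height
    ELSE
SCANLINES = 192         ; assumed NTSC picture height
    ENDIF
```

Code further down can keep using `cpy #SCANLINES` unchanged, whichever system is selected.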

Variables

Now that we're comfortable with DASM's use of symbols as part of the compilation process, let's have a look at how we've been managing our RAM so far…


VARIABLE = $80  ; variable using the 1st byte of RAM

VAR2 = $81      ; another variable using the 2nd byte of RAM 

VAR3 = $82      ; etc

That's perfectly fine—and as we already know, lines like this will add the symbols to DASM's internal symbol table, and whenever DASM sees those symbols it will instead use the associated value. Consider the following example…


VARIABLE = $80  ; variable using 1st TWO bytes of RAM

VAR2 = $82      ; another variable must start after the
                ; 1st variable's space

In this case we've created a 2-byte variable starting at the beginning of RAM. So the second variable has to start at $82 instead of $81, because the first variable requires locations $80 and $81. The above will work fine, but there's no clear correspondence between the variable declaration (which is really just assigning a number/address to a symbol) and the amount of space required for the variable. Furthermore, if we later decided that we really needed 4 bytes (instead of 2) for VARIABLE, then we'd have to 'shift down' all following variables; that is, VAR2 would have to be changed to $84, etc. This is not only extremely annoying and time-consuming, it is a disaster waiting to happen, because you humans are fallible.

What we really want to do is let DASM manage the calculation of the variable/symbol addresses, and simply say "here's a variable, and it's this big". And fortunately, we can do that.

First, let's consider 'normal code'


    ORG $8000

LABEL1    .byte 1,34,12,3

LABEL2    .byte 0

When assembled, DASM will assign $8000 as the value of the symbol LABEL1, and $8004 as the value of the symbol LABEL2. That is, it assembles the code so that starting at location $8000 (which is also the value of LABEL1) we will see 4 bytes (1, 34, 12, 3), and then another byte (0) at $8004, the value of the symbol LABEL2.
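Because labels are just numbers in the symbol table, they can be used in expressions too. The sketch below repeats the example and lets DASM work out the size of the first table. The symbol LABEL1_SIZE is a hypothetical name added for illustration.

```asm
    processor 6502

    ORG $8000

LABEL1    .byte 1,34,12,3
LABEL2    .byte 0

LABEL1_SIZE = LABEL2 - LABEL1   ; $8004 - $8000 = 4
```

If more bytes are later added to the first table, LABEL1_SIZE changes with it and nothing has to be edited by hand.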

.byte

Note, the '.byte' instruction (actually it's called a pseudo-op, as it's an instruction to the assembler, not an actual 6502 instruction) is just a way of telling DASM to put particular byte values in the ROM at that location.

Remember when we wrote 'NOP' to insert a no-operation instruction—which causes the 6502 to execute a 2 cycle delay? When we looked at the listing file, we saw that the NOP was replaced in the ROM image by the value $EA. Well, instead of letting DASM work out what the op-code's value is, we can actually just put that value in ourselves, using a .byte instruction to DASM. Example…


    .byte $EA   ; a NOP instruction!

Now, this isn't often done—but there are extremely rare cases where you might want to do this (typically with extremely obscure and highly godlike optimizations). We won't worry about that for now. But it's important to understand that just like DASM—which simply replaces a list of instructions with their values, we can just as easily do the same thing and put the values there ourselves.
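The same idea works for instructions that take an operand. As a minimal sketch, both lines below put the same two bytes into the ROM, since $A9 is the opcode for LDA with an immediate value. The origin $F000 is only an assumption to make the fragment complete.

```asm
    processor 6502

    ORG $F000

    lda #0          ; DASM emits $A9 $00
    .byte $A9,$00   ; the same two bytes, written by hand
```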

ds (Define Space)

Now it's easy to see how DASM gets its values for the labels from the address of the data it is currently assembling. In the earlier example, we started assembly (the ORG pseudo-op) at $8000, and then DASM encountered the label LABEL1, which was given the value $8000, and so on. We then inserted 4 bytes with the '.byte' pseudo-op. Instead of '.byte', which places specific values into the output binary file, we could have used the 'ds' pseudo-op, which stands for 'define space'. For example, the following would give LAB1 and LAB2 the same two addresses as LABEL1 and LABEL2 in the above example, but the data put into the binary would differ…


    ORG $8000

LAB1 ds 4

LAB2 ds 1

Typically, the 'ds' pseudo-op will place 0's in the ROM—as many bytes as specified in the value after the 'ds'. In the above example, we'll see 4 0's starting at $8000 followed by another at $8004.
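The 'ds' pseudo-op also takes an optional second value, the byte to fill the space with. The sketch below is a variation on the example above. The fill value $FF is an arbitrary choice for illustration.

```asm
    processor 6502

    ORG $8000

LAB1    ds 4,$FF    ; four bytes, each filled with $FF
LAB2    ds 1        ; one byte, default fill of 0
```

The labels get the same addresses as before ($8000 and $8004); only the bytes written to the binary change.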

Now let's consider our RAM… which starts at $80. What would we have if we did something like this…?


    ORG $80 ; start of RAM

VARIABLE  ds 3 ; define 3 bytes of space for this variable

VAR2  ds 1     ; define 1 byte of space for this one

VAR3  ds 2     ; define 2 etc..

Now that's much nicer, isn't it! It won't work, though :) The problem is, DASM will quite happily assemble this—and it will correctly assign values $80 to VARIABLE, $83 to VAR2 and $84 to VAR3—but it will ALSO generate a binary ROM image containing data at locations $80-$85. That's RAM, not ROM—and it most definitely doesn't belong in a ROM binary. In fact, our ROM would now also be HUGE—because DASM would figure that it needs to create an image from location $80 - $FFFF (ie: it will be about 64K, not 4K).

What we need to do is tell DASM that we're really just using this code-writing-style to calculate the values of the symbols, and not actually creating binary data for our ROM. And we can do that. Let's plunge right in…

SEG.U


    SEG.U variables

    ORG $80

VARIABLE  ds 3 ; define 3 bytes of space for this variable

VAR2  ds 1     ; define 1 byte of space for this one

VAR3  ds 2     ; define 2 etc..

The addition is the 'SEG.U' pseudo-op, followed by a segment name. This is telling DASM that all the following code (until a next 'SEG' pseudo-op is encountered) is an uninitialized segment. When it encounters a 'segment' defined like this, DASM will not generate actual binary data in the ROM—but it will still correctly calculate the address data for the symbols.

Note: It is important to give the segment a name (though this parameter is optional, you should choose a unique name for each segment). Naming segments assists the assembler in keeping track of exactly which parts of your code are initialized and uninitialized.

If you now go back and have a close look at the vcs.h file, you may begin to understand exactly how the values for all of the TIA registers are actually defined/calculated. Yes, they're defined as an uninitialized segment starting at a specific location. Typically this start-location is 0, and each register is assigned one byte. We keep the register symbols in the correct order and let DASM work out the addresses for us. There's a reason for this—to do with bankswitching cartridge formats—but the general lesson here is that it's nice to let DASM do the work for us—particularly when defining variables—and let it worry about the actual addresses of stuff—we just tell it the size.
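To make that concrete, here is a heavily abridged sketch in the style of vcs.h. It is a simplified illustration, not a verbatim copy of the file. Only the first few write registers are shown, and the origin is written as a literal 0.

```asm
    processor 6502

    SEG.U TIA_REGISTERS_WRITE
    ORG 0           ; registers start at address 0

VSYNC   ds 1        ; $00
VBLANK  ds 1        ; $01
WSYNC   ds 1        ; $02
RSYNC   ds 1        ; $03
```

Each 'ds 1' moves the address on by one, so the order of the lines alone decides which address each register name receives.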

One final word on the SEG pseudo-op. Though it is not strictly necessary, all of our code uses it. Without the .U extension, SEG will create binary data for our ROM. With the .U, SEG just allows DASM to populate its symbol table with names/values.

So from now on, let's define variables 'the proper way'. We'll use an uninitialized segment starting at $80, and give each variable a size using the 'ds' pseudo-op. And don't forget after our variable definitions to place another 'SEG' which will effectively tell DASM to start generating binary ROM data. Here's an example…


  SEG.U vars  ; the label "vars" will appear in our symbol
              ; table's segment list
  ORG $80     ; start of RAM

Variable ds 1  ; a 1-byte variable





    SEG    ; end of uninitialized segment - start of ROM binary

    ORG $F000
; code....
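Putting the pieces together, a minimal skeleton might look like the sketch below. The variable names Score and Lives, the segment names and the do-nothing loop are all hypothetical, chosen only to show where each part goes.

```asm
    processor 6502

    SEG.U vars          ; uninitialized: symbols only, no ROM bytes
    ORG $80             ; start of RAM

Score       ds 2        ; $80-$81
Lives       ds 1        ; $82

    SEG code            ; initialized: bytes go into the ROM from here on
    ORG $F000

Reset
    lda #3
    sta Lives           ; assembles as a store to $82
    jmp Reset

    ORG $FFFA
    .word Reset         ; NMI
    .word Reset         ; RESET
    .word Reset         ; IRQ
```

If Score later needs 3 bytes, only its 'ds' line changes and DASM moves Lives to $83 by itself.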

Variable Overlays

This is as good a place as any to mention variable overlays. This is a handy 'trick' you can use to re-use RAM by assigning different usage (=meaning) to RAM locations based on the premise that some RAM locations are only needed for some parts of a game, and some for others. If you have two variables which do not clash in terms of the area in the code they are used, then there's no real reason why those variables can't use the same RAM location.

Stella List Posting

Here's my original post to the stella list on this issue (7/Feb/2001).

As I'm trying to optimize RAM usage, I'd been using a general scratchpad variable ('temp') and using that in the code wherever I need to. I managed the allocation and meaning of the variables manually. That is, I might know that 'temp+1' is the variable for the line #, etc., etc. It works, but it is prone to error.

So, I was thinking of a better way, and came up with this...


    org $80   ; start of our overlay section

temp        ds 8           ; general area for variable overlays

   ; other RAM variable declarations here....




   ; and now come the 'overlays'... these effectively use the
   ; 'temp' RAM location, referenced by other names...



   ; overlay section 1

    org temp         ; <--- this is the bit that is the trick


overlayvar1    ds 1        ; effectively 'temp'

overlayvar2    ds 2        ; effectively 'temp+1'

overlayvar3    ds 2        ; effectively 'temp+3'



   ; overlay section 2

    org temp                ; ANOTHER overlay on the 'temp' variable

linecounter    ds 1         ; effectively 'temp'

indirect        ds 2         ; effectively 'temp+1'

   ; etc...



   ; overlay section 3

    org temp

sect3var        ds 8

   ; can't add more in this overlay (#3) as it's already
   ; used all of temp's size

This all works fine... as long as you remember that when you are using variables in overlays, you can't use two different overlays at the same time. That is, the same routine (or section of code) CANNOT use variables in overlay section 1 AND overlay section 2. It's not that much of a restriction, and allows you to use nice variable names throughout your code.

Just be careful your overlays don't get bigger than the general area allocated for each section.

The advantages of this system are that you can CLEARLY see what your variables are, and you only have to change sizes/declarations/usage in a single place (the RAM overlay declaration) … not hunt through your code when you decide to change usage.

/end of posting to stella

To summarize, we declare one 'variable' which is a block of RAM which is used for sharing RAM. This is our overlay section. We then declare each of our Overlays by setting the origin to the start of the overlay section and define new variables. This works because the assembler is generating an UNINITIALIZED segment for our RAM variables. What that means is that we're just using the assembler to assign values to labels (to its symbols), but not actually generating ROM data. So each overlay section starts in the same spot, and defines variables (ie: assigns addresses to labels) starting at that spot. We essentially share RAM locations for those variables, with other variables which are also defined the same way.

I've used this technique now for many demos. It can give the effect of dramatically increasing available RAM. You just have to be careful that you don't try to use two variables sharing the same location at the same time. With a bit of careful management it comes naturally.
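The size limit can also be checked by the assembler rather than by eye. The following is a sketch under some assumptions: OVERLAY_SIZE is a hypothetical symbol holding the size of the shared block, and the check relies on DASM's '*' (current address), ECHO and ERR.

```asm
    processor 6502

    SEG.U vars
    ORG $80

OVERLAY_SIZE = 8            ; hypothetical name for the size of the shared block
temp        ds OVERLAY_SIZE ; general area for variable overlays

    ORG temp                ; overlay section 1
overlayvar1 ds 1
overlayvar2 ds 2
overlayvar3 ds 2
    IF * - temp > OVERLAY_SIZE
        ECHO "overlay section 1 is bigger than temp"
        ERR                 ; stop the assembly
    ENDIF
```

Here the overlay uses 5 of the 8 bytes, so the assembly continues. Growing it past 8 bytes would stop the build with the message.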

Here's a generic 'shell' with comments I use for overlay RAM variables...


    ; This overlay variable is used for the overlay variables. That's OK.
    ; However, it is positioned at the END of the variables so, if on the
    ; off chance we're overlapping stack space and variable, it is LIKELY
    ; that won't be a problem, as the temp variables (especially the
    ; latter ones) are only used on rare occasions.

    ; FOR SAFETY, DO NOT USE THIS AREA DIRECTLY (ie: NEVER reference
    ; 'Overlay' in the code). ADD AN OVERLAY FOR EACH
    ; ROUTINE'S USE, SO CLASHES CAN BE EASILY CHECKED.


Overlay    ds 0    ; --> overlay (share) variables
                   ; (make sure this is as big as the biggest overlay
                   ; subsection)




;----------------------------------------------------------------------------
;
; OVERLAYS!

; These variables are overlays, and should be managed with care. That is,
; variables are ALREADY DEFINED, and we're reusing RAM for other purposes.


; EACH OF THESE IS A (TEMPORARY) VARIABLE USED BY ONE ROUTINE (AND ITS
; SUBROUTINES); THAT IS, LOCAL VARIABLES. USE 'EM FREELY, THEY COST NOTHING.


; TOTAL SPACE USED BY ANY OVERLAY GROUP SHOULD BE <= SIZE OF 'Overlay'


;----------------------------------------------------------------------------



                org Overlay



   ; ANIMATION/LOGIC SYSTEM

   ; place variables here


;----------------------------------------------------------------------------



                org Overlay



   ; DRAWING SYSTEM

   ; place variables here



   ; etc

Hope that's clear enough.

Summary

That will do nicely for this session—see you next time!