Atari 2600 Programming for Newbies

Session 9: 6502 and DASM - Assembling the Basics

By Andrew Davie (adapted by Duane Alan Hahn, a.k.a. Random Terrain)


Original Session

Original Session 9 at AtariAge

This session we're going to have a look at the assembler "DASM", what it does, how it does it, why it does it, and how to get it to do it.

The job of an assembler is to convert our source code into a binary image which can be run by the 6502. This conversion process ultimately replaces the mnemonics (the words representing the 6502 instructions we use when writing in assembler) and the symbols (the various names we use for things, such as labels to which we can branch, and various other things like the names of TIA registers, etc) with numerical values.

So ultimately, all the assembler needs to do is figure out a numerical value for all the things which become part of the binary—and place that value in the appropriate place in the binary.

NOP

We've already had a brief introduction to a 6502 instruction—the one called NOP. This is the no-operation instruction which simply takes 2 cycles to execute. Whenever we enter NOP into our source code, the assembler recognizes this as a 6502 instruction and inserts into the binary the value $EA. This shows that there can be a simple 1:1 relationship between source-code and the binary.

NOP is a single-byte instruction: all it requires is the opcode, and the 6502 will happily execute it. Some instructions require additional 'parameters' (values that are passed along with the instruction), called the 'operands'. Operands are the objects that are manipulated, and operators are the symbols that represent specific actions: in the expression 5 + x, x and 5 are operands and + is the operator (definitions adapted from Webopedia). The 6502 microprocessor can use an additional 1 or 2 bytes of operand data for some instructions, so the total number of bytes for a 6502 'instruction' can be 1, 2 or 3.
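To make the 1, 2 or 3 byte idea concrete, here's a small sketch with one instruction of each size, and the bytes I'd expect the assembler to produce shown in the comments. The label 'Start' and the start address $F000 are assumptions made up for this illustration, and 'sta $02' uses a raw address rather than a name so that nothing else needs to be defined.

```asm
        processor 6502
        org $F000           ; assumed start address, for illustration only

Start                       ; hypothetical label
        nop                 ; 1 byte:  $EA          (opcode only)
        sta $02             ; 2 bytes: $85 $02      (opcode + 1 operand byte)
        jmp Start           ; 3 bytes: $4C $00 $F0  (opcode + 2 operand bytes, low byte first)
```

Notice that the operand bytes simply follow the opcode in the binary; the opcode itself tells the 6502 how many of them to expect.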

DASM

DASM is the assembler used by most (if not all) modern-day '2600 programmers. It is a multi-platform assembler written in 1988 by Matt Dillon (you should all find his email address and send him a "thank-you" sometime). It's a great tool.

DASM isn't just capable of assembling 6502 (and variant) code—it also has inbuilt capability to assemble code for several other microprocessors. Consequently, one of the very first things that it is necessary to do in our source code is tell DASM what processor the source code is written for…


     processor 6502

This should be just about the first line in any '2600 program you write. If you don't include it, DASM will probably get confused and spit out errors. That's simply because it is trying to assemble your code as if it were written for another processor.

We've just seen how mnemonics (the standard names for instructions) are converted into numerical values by the assembler. Another job the assembler does is convert labels and symbols into values. We've already encountered both of these in our previous sessions, but you may not be familiar with their names.

Symbol Table

Whenever DASM is doing its job assembling, it keeps a list of all the 'words' it encounters in a file in an internal structure called a symbol table. Think of a symbol as a name for something. Remember the 'sta WSYNC' instruction we used to halt the 6502 and wait for the scanline to be rendered? The 'sta' is the instruction, and 'WSYNC' is a symbol. When it first encounters this symbol, DASM doesn't know much about it, other than what it's called (i.e., 'WSYNC'). What DASM needs to do is work out what the *value* of that symbol is, so that it can insert that value into the binary file.

When it's assembling, DASM puts all the symbols it finds into its symbol table—and associated with each of these is a value. If it doesn't 'know' the value, that's OK—DASM will keep assembling the rest of the file quite happily. At some point, something in the code might tell DASM what the value for a symbol actually IS—in which case DASM will put that value in its symbol table alongside the symbol. So whenever that symbol is used anywhere, DASM now knows its correct value to put into the binary file.

In fact, it is absolutely necessary for all symbols which go into the binary file to be given values at some point. DASM can't guess values—it's up to you, the programmer, to make sure this happens. A symbol doesn't have to be given a value at any PARTICULAR point in the code, but it does have to be given a value somewhere in the code. DASM will make multiple 'passes'—basically going through the code from beginning to end again and again until it manages to resolve all the symbols to correct values.
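Here's a rough sketch of why more than one pass can be needed. The labels 'Start' and 'Later' and the $F000 start address are made up for this example. When DASM first reaches the 'jmp Later' line it has not yet seen where 'Later' is, so it can't fill in that address until it has been through the rest of the code.

```asm
        processor 6502
        org $F000           ; assumed start address, for illustration only

Start                       ; hypothetical label, value $F000
        jmp Later           ; 'Later' has no value yet the first time DASM gets here
        nop
Later                       ; now it does: $F004 (3 bytes of jmp + 1 byte of nop after $F000)
        jmp Start           ; 'Start' was already known, so this one is easy
```

If 'Later' were never defined anywhere in the file, no number of passes would help, and that is exactly the situation DASM can't guess its way out of.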

vcs.h

We've already seen in some sample code how 'sta WSYNC' appears in our binary file as the bytes $85 $02. The first byte $85 is the 'sta' instruction (one variant of many—but let's keep it simple for now) and it is followed by a single byte giving the address of the location into which the byte in the 'A' register is to be stored. We can see this address is location 2 in memory. Somehow, DASM has figured out from the code that the symbol WSYNC has a value of 2, and when it creates the binary file it replaces all occurrences of the symbol with the numeric value 2.

How did it get the value 2? Remember, WSYNC is one of the TIA registers. It appears to the 6502 as a memory location, as the TIA registers are 'mapped' into locations 0 - $7F. The file 'vcs.h' defines (in a roundabout way) the values and names (symbols) for all of the TIA registers. By including the file 'vcs.h' as a part of the assembly for any source file, we automatically tell DASM the correct numeric value for all of the TIA register 'names'.
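As a sketch of the idea, you could give DASM the value yourself instead of including 'vcs.h'. This is NOT how vcs.h is actually written (it gets there in a more roundabout way); it only shows a symbol being given a value and then used. The value $02 comes from the text above, while the label 'Start' and the start address are assumptions for illustration.

```asm
        processor 6502

WSYNC   = $02               ; hand-written stand-in for the definition vcs.h provides

        org $F000           ; assumed start address, for illustration only
Start                       ; hypothetical label
        sta WSYNC           ; assembles to $85 $02, just as if we had written sta $02
        jmp Start
```

In real code it's much easier, and safer, to let vcs.h do this for every register at once.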

That's why, at the top of most files, just after the processor statement, we see…


     include "vcs.h"

You don't really need to know much about vcs.h at this stage—but be aware that a 'standardized' version of this file is distributed with the DASM assembler as the '2600 support files package. I would advise you to always use the latest and greatest version of this file. Standards help us all.

So now we know basically what DASM does with symbols—it keeps an internal list of symbols—and their values, if known. DASM will keep going through the code and 'resolving' the symbols into numeric values, until it is complete (or it couldn't find ANYTHING to resolve, in which case it gives an error). Once all symbols have been resolved, your code has been completely processed by the assembler, and it creates the binary image/file for you—and assembly is complete.

DASM Summary

To summarize: DASM converts source-code consisting of instructions (mnemonics) and symbols into a binary form which can be run by the 6502. The assembler converts mnemonics into opcodes (numbers), and symbols into numbers whose values it works out during the assembly process.

Command Line

DASM is a command-line program—that is, it runs under DOS (or whatever platform you happen to choose, provided you have a runnable version for that platform). DASM is provided with full source-code (it's written in C) so as long as you have a C-compiler handy, you can port it to just about any platform under the sun.

It does come with a manual—and it's always a good idea to familiarize yourself with its capabilities. In the interests of getting you up and running quickly, so you can actually assemble the sample kernel posted a session or two ago, here's what you need to type on the command-line…


 dasm kernel.asm -lkernel.txt -f3 -v5 -okernel.bin

This is assuming that the file to assemble is named 'kernel.asm' (.asm is a standard suffix, or file extension, for assembler files, but some prefer to use .s; you can use whatever you want, really, but I always use .asm). Anything prefixed with a minus-sign ('-') is a 'switch', which tells DASM something about what it is required to do. The -l switch we discussed very briefly, and that tells DASM to create a listing file; in this case, it will write a listing to the file 'kernel.txt'. The -o switch tells DASM what file to use for the output binary; in this case, the binary will be written to 'kernel.bin'. That file can be loaded into an emulator, or burned on an EPROM. It is the ROM file, in other words.

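The other switches control the output format and how chatty DASM is: '-f3' asks for a raw binary (just the bytes, with no header in front), and '-v5' sets the verbosity of the messages DASM prints. For now just assume you need these whenever you assemble with DASM. Remember, if you're curious you can always read the manual!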

Output

If all goes well, DASM will output something like this…


DASM V2.20.05, Macro Assembler (C)1988-2003

START OF PASS: 1

----------------------------------------------------------------------

SEGMENT NAME                 INIT PC  INIT RPC FINAL PC FINAL RPC

                             f000                            f000

RIOT                     [u] 0280                            0280

TIA_REGISTERS_READ       [u] 0000                            0000

TIA_REGISTERS_WRITE      [u] 0000                            0000

INITIAL CODE SEGMENT         0000 ????                       0000 ????

----------------------------------------------------------------------

1 references to unknown symbols.

0 events requiring another assembler pass.

--- Symbol List (sorted by symbol)

AUDC0                    0015

AUDC1                    0016

AUDF0                    0017

AUDF1                    0018

AUDV0                    0019

AUDV1                    001a

COLUBK                   0009              (R )

COLUP0                   0006

COLUP1                   0007

COLUPF                   0008

CTRLPF                   000a

CXBLPF                   0006

CXCLR                    002c

CXM0FB                   0004

CXM0P                    0000

CXM1FB                   0005

CXM1P                    0001

CXP0FB                   0002

CXP1FB                   0003

CXPPMM                   0007

ENABL                    001f

ENAM0                    001d

ENAM1                    001e

GRP0                     001b

GRP1                     001c

HMBL                     0024

HMCLR                    002b

HMM0                     0022

HMM1                     0023

HMOVE                    002a

HMP0                     0020

HMP1                     0021

INPT0                    0008

INPT1                    0009

INPT2                    000a

INPT3                    000b

INPT4                    000c

INPT5                    000d

INTIM                    0284

NUSIZ0                   0004

NUSIZ1                   0005

Overscan                 f02c              (R )

PF0                      000d

PF1                      000e

PF2                      000f

Picture                  f01d              (R )

REFP0                    000b

REFP1                    000c

RESBL                    0014

Reset                    f000              (R )

RESM0                    0012

RESM1                    0013

RESMP0                   0028

RESMP1                   0029

RESP0                    0010

RESP1                    0011

RSYNC                    0003

StartOfFrame             f000              (R )

SWACNT                   0281

SWBCNT                   0283

SWCHA                    0280

SWCHB                    0282

T1024T                   0297

TIA_BASE_ADDRESS         0000              (R )

TIM1T                    0294

TIM64T                   0296

TIM8T                    0295

TIMINT                   0285

VBLANK                   0001              (R )

VDELBL                   0027

VDELP0                   0025

VDELP1                   0026

VerticalBlank            f014              (R )

VSYNC                    0000              (R )

WSYNC                    0002              (R )

--- End of Symbol List.

Complete.

Here we can actually SEE the symbol table, and the numeric values that DASM has assigned to the symbols. If you look at the listing file, wherever any of these symbols is used, you will see the corresponding number in the symbol table has been inserted into the binary.

There are lots of symbols there, as the vcs.h file defines just about everything you'll ever need to do with the TIA. The symbols which are actually USED in your code are marked with a (R )—indicating 'referenced'.
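If you want to see the 'referenced' marker for yourself, here's a small sketch to try. The two register values are taken from the symbol list above; the label 'Start' and the start address are assumptions for illustration. I'd expect COLUBK and Start to show up with (R ) in the symbol list, and COLUPF without it, since nothing in the code ever uses it.

```asm
        processor 6502

COLUBK  = $09               ; defined AND used below
COLUPF  = $08               ; defined but never used

        org $F000           ; assumed start address, for illustration only
Start                       ; hypothetical label, used by the jmp below
        sta COLUBK          ; this is a reference to COLUBK
        jmp Start           ; this is a reference to Start
```

An unreferenced symbol costs nothing in the binary; it is just an entry in DASM's list that never gets used.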

Now you should be able to go and assemble the sample kernel I provided earlier. Don't be afraid to have a play with things, and see what happens! Experimenting is a big part of learning.

Summary

Soon we'll start playing with some TIA registers and seeing what happens to our screen when we do that! For now, though, make sure you are able to assemble and run the first kernel. If you have any problems, ask for assistance and I'm sure somebody will leap to your aid.