A.N.A.L.O.G. ISSUE 13 / SEPTEMBER 1983 / PAGE 66
A.N.A.L.O.G. ISSUE 72 / MAY 1989 / PAGE 27

Boot Camp

by Tom Hudson

With this issue, ANALOG begins a new column. Boot Camp will examine assembly language on the ATARI computer systems, while presenting useful subroutines to illustrate the techniques discussed in the column.

The Ground Rules

Before starting to learn assembly language, let’s lay down the ground rules.

First, you should have a good reference guide to assembly-language operation codes. I suggest 6502 Assembly-Language Programming by Lance Leventhal (OSBORNE/McGraw-Hill). Of course, there are many such books, and the final choice of what book to use is up to you. Just be sure it covers the 6502 operation codes clearly and completely.

Second, since many program concepts will be shown in BASIC, you should have a working knowledge of BASIC. Assembly language requires a solid background in programming logic, and working in BASIC helps develop this skill. In addition, assembly-programming concepts can be grasped more easily if they are first shown in a language the reader is familiar with, such as BASIC. This should not be a problem for most readers, since BASIC is usually the first language learned by personal-computer owners. Therefore, from this point on, I will assume that all readers of this column are fluent in BASIC.

Third, you will need an assembler/editor package. All assembly-language listings in this column will be compatible with the ATARI Assembler/Editor cartridge and OSS’s EASMD and MAC/65 assemblers. You can use other assemblers, but some code conversion may be necessary.

Fourth, you should be able to read flowcharts. Flowcharts are a good way to visualize a program’s operation before actually writing any code.

Numbering Systems

Everybody is familiar with the decimal numbering system. We all use this numbering system in everyday mathematical calculations. The word “decimal” is derived from the Latin decem, or ten. Therefore, this numbering system is known as “Base 10”, since there are ten digits, 0–9. Let’s take a closer look at the decimal numbering system. Figure 1 shows how a six-digit Base 10 number can be broken down into individual digits. Each digit can range from 0–9 in value.

10⁵ 10⁴ 10³ 10² 10¹ 10⁰  POSITION VALUE
 2   6   5   0   7   3
 │   │   │   │   │   │
 │   │   │   │   │   └─> 3 × 10⁰ =      3
 │   │   │   │   │
 │   │   │   │   └─────> 7 × 10¹ =     70
 │   │   │   │
 │   │   │   └─────────> 0 × 10² =      0
 │   │   │
 │   │   └─────────────> 5 × 10³ =   5000
 │   │
 │   └─────────────────> 6 × 10⁴ =  60000
 └─────────────────────> 2 × 10⁵ = 200000
                                    265073 (BASE 10)

                    Figure 1

Above each digit is that digit’s position value. The position value is the amount each digit is multiplied by to get the actual value of the digit. You will notice that the position values are shown in powers of 10, since we are working in Base 10. The 1’s position is shown as 10 to the 0 power. Any time a number is raised to the 0 power, the result is 1. Therefore, to get the value of the 3 in the last position of the number, we would calculate:

DIGIT × POSITION VALUE

In this case the calculation would be:

3 × 1 = 3

We would conclude that the last position in the number would have a value of 3.

The next position, containing the digit 7, has a position value of 10 to the first power, or 10. The calculation of this digit’s value would be:

7 × 10 = 70

When we repeat this calculation for each digit in the number and add all the values, we will obtain the value of the number, 265,073 (Base 10).
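
The digit-by-digit calculation can be sketched in a few lines of code. The sketch below uses Python, a language that is not part of this column, purely as an illustration; the function name is an assumption made for this example.

```python
def positional_value(digits, base=10):
    """Add up each digit times its position value (base ** position)."""
    total = 0
    # The rightmost digit sits in position 0, so walk the digits in reverse.
    for position, digit in enumerate(reversed(digits)):
        total += digit * base ** position
    return total


# The six digits of the number 265,073, leftmost digit first.
print(positional_value([2, 6, 5, 0, 7, 3]))  # prints 265073
```

The `base` argument is there so the same routine can be tried on numbering systems other than Base 10.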

Here’s another concept that we rarely think about, but that is very interesting. What happens to the number if we shift all the digits to the left, as shown in Figure 2?

10⁶ 10⁵ 10⁴ 10³ 10² 10¹ 10⁰  POSITION VALUE
 2   6   5   0   7   3   0
 │   │   │   │   │   │   │
 │   │   │   │   │   │   └─> 0 × 10⁰ =       0
 │   │   │   │   │   │
 │   │   │   │   │   └─────> 3 × 10¹ =      30
 │   │   │   │   │
 │   │   │   │   └─────────> 7 × 10² =     700
 │   │   │   │
 │   │   │   └─────────────> 0 × 10³ =       0
 │   │   │
 │   │   └─────────────────> 5 × 10⁴ =   50000
 │   │
 │   └─────────────────────> 6 × 10⁵ =  600000
 └─────────────────────────> 2 × 10⁶ = 2000000
                                        2650730 (BASE 10)

                    Figure 2

By looking at the final results, you can see that the number has effectively been multiplied by 10, with a result of 2,650,730 (Base 10)!

Why did this happen? The answer is actually very simple. When each digit is shifted to the left, its position value is increased by a power of 10. The resulting number will be ten times larger than if it were not shifted. Try shifting the number to the right and see what happens.

What Do We Care About All This?

Now that we know exactly how our normal numbering system works, let’s apply what we know to a different system, binary.

The word “binary” comes from the Latin bis, or “double”. As you may know, digital computers work with two electrical states, on and off. This situation is perfectly suited for the binary numbering system, or Base 2.

The binary numbering system uses only two digits, 0 and 1, but the principle of the numbering system is the same as Base 10. Figure 3 shows a number in Base 2 and how it can be converted to Base 10.

 2⁷  2⁶  2⁵  2⁴  2³  2²  2¹  2⁰  POSITION VALUE
 1   0   1   1   1   0   1   1
 │   │   │   │   │   │   │   │
 │   │   │   │   │   │   │   └─> 1 × 2⁰ =   1
 │   │   │   │   │   │   │
 │   │   │   │   │   │   └─────> 1 × 2¹ =   2
 │   │   │   │   │   │
 │   │   │   │   │   └─────────> 0 × 2² =   0
 │   │   │   │   │
 │   │   │   │   └─────────────> 1 × 2³ =   8
 │   │   │   │
 │   │   │   └─────────────────> 1 × 2⁴ =  16
 │   │   │
 │   │   └─────────────────────> 1 × 2⁵ =  32
 │   │
 │   └─────────────────────────> 0 × 2⁶ =   0
 └─────────────────────────────> 1 × 2⁷ = 128
                                           187 (BASE 10)

                    Figure 3

Once again, the number is shown with the position values above each digit in the number. In Base 2, you will notice that the position values are powers of 2. This means that, unlike the decimal progression of 1, 10, 100, etc., the binary system has a progression of 1, 2, 4, 8 and so on. As a result, the number 10111011 (Base 2) is 187 in Base 10. Figure 4 shows the binary equivalents of the numbers 0–19. Try using the method shown in Figure 3 to convert these numbers to the Base 10 equivalents shown.

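DECIMAL  BINARY     DECIMAL  BINARY
   0     00000        10     01010
   1     00001        11     01011
   2     00010        12     01100
   3     00011        13     01101
   4     00100        14     01110
   5     00101        15     01111
   6     00110        16     10000
   7     00111        17     10001
   8     01000        18     10010
   9     01001        19     10011

                    Figure 4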

Remember how a Base 10 number was multiplied by 10 when we shifted it left one digit? Let’s look at how a binary number is affected by such a shift. Figure 5a shows the number 7 in binary before the shift and Figure 5b shows the number after the shift.

 2³  2²  2¹  2⁰  POSITION VALUE
 0   1   1   1
 │   │   │   │
 │   │   │   └─> 1 × 2⁰ = 1
 │   │   │
 │   │   └─────> 1 × 2¹ = 2
 │   │
 │   └─────────> 1 × 2² = 4
 └─────────────> 0 × 2³ = 0
                           7 (BASE 10)

                    Figure 5a

 2³  2²  2¹  2⁰  POSITION VALUE
 1   1   1   0
 │   │   │   │
 │   │   │   └─> 0 × 2⁰ = 0
 │   │   │
 │   │   └─────> 1 × 2¹ = 2
 │   │
 │   └─────────> 1 × 2² = 4
 └─────────────> 1 × 2³ = 8
                          14 (BASE 10)

                    Figure 5b

The number has been multiplied by 2! By examining this result and the above shift in Base 10, we can see that by shifting the digits in a number left or right, we multiply or divide the number by its base. This concept will come in very handy in later installments of this column, so keep it in mind.
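
The shifting idea is easy to try out. Here is a minimal sketch in Python (not a language used in this column; the two function names are assumptions made for this example) that treats a number as a string of digits and shifts it one place in either direction.

```python
def shift_left(digits):
    """Shift a digit string one place left; a 0 fills the vacated position."""
    return digits + "0"


def shift_right(digits):
    """Shift a digit string one place right; the rightmost digit falls off."""
    return digits[:-1] or "0"


print(int(shift_left("265073"), 10))   # prints 2650730
print(int(shift_left("0111"), 2))      # prints 14
print(int(shift_right("1110"), 2))     # prints 7
print(int(shift_right("265073"), 10))  # prints 26507
```

Notice that a right shift discards the digit that falls off the end, so the division drops any remainder.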

“Funny” Numbers

The mechanics of the binary numbering system are extremely important, but they can cause some problems.

Let’s say you want to look at what is in Memory Location 33011, but don’t want to give the number in decimal for some reason. The most logical choice, as far as your computer is concerned, is binary. Unfortunately for us humans, this number comes out as 1000000011110011, and is cumbersome, to say the least. We don’t like to handle numbers like this—there are just too many chances to make a mistake.

Fear not! There is yet another numbering system that is compatible with both our friend the computer and our human limitation for handling large numbers. What is this system, you ask? It is called Base 16, or hexadecimal.

We have already noted that Base 10 uses ten digits, 0–9, and that Base 2 uses two digits, 0–1. Naturally, then, it follows that Base 16 uses sixteen digits.

But wait a minute! Since we humans normally use only the ten digits from the decimal system, we don’t have enough for Base 16—we’ll have to come up with six more. Rather than invent six new digit symbols, we’ll use the letters A–F, which are already in existence. Figure 6 shows the 16 digits used in hexadecimal and their decimal equivalents.

HEX      0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
DECIMAL  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

                    Figure 6

Once again, the principle of the hexadecimal (hex) numbering system is the same as the other systems we have examined so far; the only difference is that the letters A–F must be thought of as the numbers 10–15. Figure 7 shows the conversion of the hex number $F4BE (all hex numbers should be preceded by a “$”) to decimal.

16³ 16² 16¹ 16⁰  POSITION VALUE
 F   4   B   E
 │   │   │   │
 │   │   │   └─> 14 × 16⁰ =    14
 │   │   │
 │   │   └─────> 11 × 16¹ =   176
 │   │
 │   └─────────>  4 × 16² =  1024
 └─────────────> 15 × 16³ = 61440
                             62654 (BASE 10)

                    Figure 7

So how does expressing numbers in hex help us avoid binary monstrosities? It’s easy. The number 33011 (1000000011110011 in binary) is $80F3 in hex. Obviously, this is a much easier number to remember than its binary equivalent.

Another interesting fact is that binary numbers are very easy to convert to hex. First the binary number must be divided into groups of four digits, from right to left. Then each group of four digits (ranging in value from 0–15) can easily be converted to the corresponding hex digit 0–F. Figure 8 illustrates this technique.

BINARY NUMBER: 111100111011

SPLIT: 1111 0011 1011
         │    │    │
         │    │    └─> 11₁₀ = $B
         │    │
         │    └──────>  3₁₀ = $3
         └───────────> 15₁₀ = $F


                    Figure 8
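
The group-of-four technique can also be written as a short routine. This is only a sketch in Python (not a language used in this column); the names `HEX_DIGITS` and `binary_to_hex` are assumptions made for this example.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits):
    """Convert a string of 0s and 1s to hex, four bits at a time."""
    # Pad on the left with zeros so the length is a multiple of four.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    # Each group of four bits has a value from 0 to 15: one hex digit.
    return "$" + "".join(HEX_DIGITS[int(group, 2)] for group in groups)


print(binary_to_hex("111100111011"))      # prints $F3B
print(binary_to_hex("1000000011110011"))  # prints $80F3
```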

Bytes and Bits

All readers who have owned their computers for more than a few days have at least heard the terms “byte” and “bit”. These terms usually pop up when the memory capacity of the computer is being discussed.

The byte is the unit most often used when referring to memory size. If your system has “16K” of memory, it has 16 × 1024 bytes, or 16384 bytes total. Each byte is made up of eight bits (short for binary digits). Each bit can be either off or on, corresponding to the 0 and 1 digits in the binary numbering system. With eight digits, this means that each byte can have 2 to the 8th power (or 256) combinations. This is why BASIC limits values in the POKE command to the range of 0–255.

In the process of learning assembly language, we will learn to manipulate the memory of your computer to do the things we want. Study these concepts carefully as they will be used in almost every assembly-language program you write.

How Assembly Works

In BASIC, a programmer can simply type in a program, type RUN, and the computer will begin executing the program immediately. If there is a problem, the programmer presses Break, finds the error and runs the program again. This makes programming very easy, and almost everyone is happy.

Yes, I said almost everyone.

Unfortunately for budding game programmers, a kind of brick wall soon appears on the happy road to the ultimate game. These programmers soon find that BASIC is far too slow to handle the complex graphics and game logic necessary for an arcade-style game. At the very least, assembly-language subroutines are necessary to speed things up.

Why is BASIC so slow? Inside the computer is a device called a 6502 microprocessor. This little chip of silicon is what makes your computer work. It is capable of performing hundreds of thousands of operations per second, and does so every second your computer is powered on!

Sadly, all this computing power is lost as soon as a BASIC cartridge is inserted into your machine. You see, the microprocessor doesn’t understand a single word of English, and the BASIC cartridge must act as an interpreter.

All of this interpreting takes time, and instead of doing the work you want, the poor microprocessor winds up spending most of its time translating BASIC into a language it can understand: binary. And this translation doesn’t happen just once—it happens every time a BASIC command is executed! What a waste.

Assembly language, on the other hand, uses what is known as an assembler to perform this translation just once. The programmer writes a program in a special format. This is known as the source code. When ready to execute the program, the programmer processes it with an assembler, which translates the source code into object code, which is the actual binary machine-language. This code can be loaded at any time and executed as fast as the computer can go. It only has to be assembled once.

There are a few trade-offs involved when using assembly language, however.

First, the programmer must re-assemble a program each time a change is made. This can take quite a bit of time when a large program is involved. For this reason, it is a good idea to flowchart each program before writing any code. This helps reduce logic errors.

Second, the programmer must know where the program will be located in memory. Since the computer’s operating system has certain needs, the programmer must be aware of what memory locations are available.

Third, errors can be hard to find. When a program is executing at hundreds of thousands of operations per second, an error cannot always be easily traced to a certain instruction. For this reason, a good debugging package is a must.

Fourth, all arithmetic must be handled explicitly by the programmer. Assembly language does not have square root, sine or cosine functions. It cannot even multiply or divide! Unless the programmer specifies otherwise, the addition and subtraction instructions can only produce numbers from -128 to 127. In the course of this column, we will examine the arithmetic functions that are possible in assembly language and how they are coded.

This may sound like a lot of limitations, but the 6502 processor allows the programmer to use the computer’s built-in operating system directly, which BASIC has a hard time doing. And, of course, assembly language can be thousands of times faster than BASIC, allowing the programmer to write real-time simulations and arcade-style games.

Now that we’ve laid the groundwork for assembly-language programming, let’s look at the 6502 itself.

Chip Off the Old Block

The 6502 processor chip has six registers that we are concerned with. These registers hold specific information and provide work areas for the programmer, and are shown in Figure 9.

                 │                 │ ACCUMULATOR (A)
                 │                 │ INDEX REGISTER (X)
                 │                 │ INDEX REGISTER (Y)
 │                                 │ PROGRAM COUNTER (PC)
                 │                 │ STACK POINTER (SP)
                 │ N V   B D I Z C │ STATUS REGISTER (P)

                    Figure 9

The accumulator (A) is the most important register as far as the programmer is concerned. It is used for all arithmetic operations and most data manipulation. The accumulator is used more than any other register.

The index registers (X and Y) are used to hold memory indexes, counters, or offsets into tables. They can also be used as temporary storage areas.

The program counter (PC) is used by the 6502 to keep track of what instruction is being executed. This register is 16 bits long, enabling it to point to any byte in memory (up to 65535, or 64K). Since this register is maintained by the 6502, we will not be referencing it very often.

The stack pointer (SP) is used by the 6502 to keep track of a temporary storage region known as the stack. The stack holds subroutine return addresses and other temporary data. Since this register is maintained by the 6502, we will not be referencing it very often, either.

The processor status register (P) is made up of seven individual “flags”, or indicators, which inform the programmer of the 6502’s current status.

The sign flag (N) is 0 when the result of an operation is positive, and 1 when the result is negative.

The overflow flag (V) is set to 1 when an addition or subtraction produces a result outside the signed range of -128 to 127. Internally, it is the exclusive-or of the carry out of Bit 6 and the carry out of Bit 7. The exclusive-or will result in a TRUE result if either bit being evaluated is TRUE, but not if both are TRUE. The overflow flag is rarely used, and is not important at this point.

The break flag (B) is set to 1 when a BRK instruction is executed. We will be using the instruction during program testing to stop program execution.

The decimal mode (D) flag is used to tell the processor to use either binary (0) or binary-coded decimal (1) arithmetic. This flag is important, and the programmer must be aware of its setting at all times.

The interrupt flag (I) enables or disables system-interrupts, depending on its setting.

The zero flag (Z) is set to 1 when any arithmetic or logical operation produces a zero result. A non-zero result sets the flag to 0.

The carry flag (C) holds carries out of add, shift, and rotate instructions. It is also used as a borrow flag in subtraction operations. This is a very important flag, and will be discussed in detail later.
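
To see how the flags respond to a result, here is a rough sketch of an 8-bit addition in Python (not a language used in this column). It assumes decimal mode is off, the function name is made up for this example, and the overflow flag is modeled as signed overflow, meaning a result outside -128 to 127.

```python
def add_with_carry(a, operand, carry_in=0):
    """Model an 8-bit binary-mode add; return (result, flags)."""
    total = a + operand + carry_in
    result = total & 0xFF                  # keep only the low eight bits
    flags = {
        "N": (result >> 7) & 1,            # copy of Bit 7 of the result
        "Z": 1 if result == 0 else 0,      # 1 only when the result is zero
        "C": 1 if total > 0xFF else 0,     # carry out of Bit 7
        # Signed overflow: both inputs share a sign that the result lacks.
        "V": 1 if (a ^ result) & (operand ^ result) & 0x80 else 0,
    }
    return result, flags


result, flags = add_with_carry(0x7F, 0x01)
print(result, flags["N"], flags["V"])      # prints 128 1 1

result, flags = add_with_carry(0xFF, 0x01)
print(result, flags["Z"], flags["C"])      # prints 0 1 1
```

The break, decimal mode and interrupt flags are left out of this sketch because an addition does not change them.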

Next issue, we’ll cover the different ways instructions can address memory, and start studying arithmetic operations, subroutines, and several other areas. Until then, study what has been covered here until you understand it thoroughly.

A.N.A.L.O.G. ISSUE 14 / NOVEMBER 1983 / PAGE 125
A.N.A.L.O.G. ISSUE 73 / JUNE 1989 / PAGE 40

Boot Camp

by Tom Hudson

Last issue, you were introduced to the concept of various numbering systems, including base 2, base 10 and base 16. We also covered the basics of assembly language and the registers of the 6502 microprocessor.

In this issue, we’ll talk about the ways the 6502 can address memory and begin looking at the 6502 instruction set.

Address unknown?

In order to perform useful work for us, the 6502 microprocessor chip must be able to get numbers from memory, manipulate them, and place the results back in memory. Each memory location has its own number, or ADDRESS. The 6502 can reference up to 65536 bytes of memory ($0000–$FFFF).

If you’ve used the BASIC PEEK and POKE functions, you’ve used the 6502’s addressing ability already. Consider, for example, the BASIC command:

POKE 559,0

This command places a zero in address 559 ($22F), which turns off the computer’s screen display.

Luckily for us programmers, the designers of the 6502 gave us quite a bit of flexibility in how we reference memory locations. These ways are listed below.

Immediate addressing allows us to place one number we are working with (or OPERAND) right after the operation code. The operand must be preceded with the “#” symbol. For example, the assembly instruction:

LDA #23

places the number 23 in the accumulator. In this example we specified the number in decimal. If we wanted, we could have given the number in hexadecimal (base 16):

LDA #$17

Note that decimal numbers require no special marking, but hex numbers are always preceded by a “$” symbol.

Absolute addressing tells the computer we want to get the operand from a certain address somewhere in memory. For example, let’s say we want to turn off the screen as we did before in the above BASIC example. Instead of a POKE 559,0 command, we could use the following two assembly instructions:

LDA #0
STA 559

The first instruction, as we learned above, will load the accumulator with a zero. The second instruction uses the absolute addressing mode to store the contents of the accumulator into memory address 559. What could be easier?
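
For readers who like to see the idea modeled, here is a toy sketch in Python (not a language used in this column). The `Machine` class and its method names are assumptions made for this example, and the starting value placed in location 559 is only an illustration.

```python
class Machine:
    """A toy model: one accumulator and 64K of byte-sized memory cells."""

    def __init__(self):
        self.a = 0
        self.memory = bytearray(65536)

    def lda_immediate(self, value):
        # LDA #value : the operand itself goes into the accumulator.
        self.a = value & 0xFF

    def sta_absolute(self, address):
        # STA address : copy the accumulator into the memory cell.
        self.memory[address] = self.a


atari = Machine()
atari.memory[559] = 34      # assumed starting value, for illustration only
atari.lda_immediate(0)      # LDA #0
atari.sta_absolute(559)     # STA 559
print(atari.memory[559])    # prints 0
```

Note that the store leaves the accumulator unchanged; only the memory cell is altered.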

Implied addressing means that no addresses are used in the instruction. The CLC (clear carry) and RTS (return from subroutine) instructions are good examples of implied addressing instructions.

Accumulator addressing is used for those instructions that use only the accumulator, such as ASL A (arithmetic shift left).

Indexed addressing is a useful type of addressing which makes table operations very simple. In this mode, the X or Y register is used as an index. For example, in the following instruction:

LDA TABLE,X

If the X register contains a 7, the accumulator will be loaded with whatever is in the seventh byte after TABLE. It’s a very simple concept, and works the same with the Y register.

Indirect addressing is only used with the JMP (jump to location) instruction. In the following example:

JMP ($3000)

The JMP will NOT go to address $3000, but it will take the contents of $3000 and $3001 and jump to the address indicated by their contents. If, for example, $3000 contains $3F and $3001 contains $50, the program will jump to $503F. This instruction is rarely used, but it can be irreplaceable under certain circumstances.

Pre-indexed indirect addressing uses the X register and an operand byte to address a byte in the first 256 bytes of memory. In the following example:

LDA ($AF,X)

If the X register contains $12, the computer adds $AF and $12, giving a result of $C1. The computer then takes the contents of $C1 and $C2 and loads the accumulator from the address contained in these bytes. For example, if location $C1 contains $50 and location $C2 contains $3F, the accumulator will be loaded from location $3F50.

Post-indexed indirect addressing uses the Y register and an address in the first 256 bytes of memory to point to another address. In the following example:

LDA ($AF),Y

The computer takes the contents of bytes $AF and $B0 and adds the Y register to this address for a final address. If $AF contains $00 and $B0 contains $40 the computer first points to $4000, then uses the Y register as an offset. If the Y register contains $50, the accumulator would be loaded from $4050. This addressing mode is used fairly often.
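
The address arithmetic behind the indexed and indirect modes can be checked with a short sketch in Python (not a language used in this column). The function names are assumptions made for this example, the table address of $5000 is hypothetical, and hardware quirks at page boundaries are ignored.

```python
def read_address(memory, location):
    """Read a two-byte address stored low byte first."""
    return memory[location] | (memory[location + 1] << 8)


def indexed(base, index):
    """LDA TABLE,X style: base address plus an index register."""
    return (base + index) & 0xFFFF


def pre_indexed(memory, operand, x):
    """LDA (operand,X) style: add X first, then fetch the address."""
    return read_address(memory, (operand + x) & 0xFF)


def post_indexed(memory, operand, y):
    """LDA (operand),Y style: fetch the address first, then add Y."""
    return (read_address(memory, operand) + y) & 0xFFFF


memory = bytearray(65536)
memory[0x3000], memory[0x3001] = 0x3F, 0x50   # target for JMP ($3000)
memory[0xC1], memory[0xC2] = 0x50, 0x3F       # pointer used by ($AF,X)
memory[0xAF], memory[0xB0] = 0x00, 0x40       # pointer used by ($AF),Y

print(hex(read_address(memory, 0x3000)))      # prints 0x503f
print(hex(pre_indexed(memory, 0xAF, 0x12)))   # prints 0x3f50
print(hex(post_indexed(memory, 0xAF, 0x50)))  # prints 0x4050
print(hex(indexed(0x5000, 7)))                # prints 0x5007
```

In every case the low byte of a stored address comes first in memory, which is why $3F followed by $50 is read as $503F.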

Relative addressing is used in all branch-on-condition instructions in the 6502. Usually after a comparison the programmer will branch on a condition. This is the same as an IF/THEN statement in BASIC. In the following example:

BNE START

The computer will calculate the number of bytes between the branch instruction and the location referenced by START at assembly time. During execution, the image in memory may look like:

BNE $30

This indicates that START was 48 bytes from the branch instruction. If the branch is executed, the computer will skip 48 bytes and continue executing at the part of the program labeled START. There is only one drawback to this addressing mode: The branch cannot be farther than -126 or +129 bytes. Longer branches require the use of the JMP instruction.
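
The assembler’s branch calculation might be sketched as follows in Python (not a language used in this column). The function name and the sample addresses are assumptions made for this example. The sketch also assumes, as the -126 and +129 limits imply, that the stored offset is counted from the instruction following the two-byte branch.

```python
def branch_offset(branch_address, target_address):
    """Return the one-byte operand for a branch, or raise if out of range."""
    # The offset is measured from the instruction AFTER the two-byte
    # branch, so subtract the address of that next instruction.
    offset = target_address - (branch_address + 2)
    if not -128 <= offset <= 127:
        raise ValueError("branch out of range; use JMP instead")
    return offset & 0xFF   # negative offsets are stored in signed form


print(hex(branch_offset(0x3600, 0x3632)))  # prints 0x30
print(hex(branch_offset(0x3610, 0x3600)))  # prints 0xee
```

A backward branch produces an operand with its high-order bit set, which the 6502 treats as a negative offset.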

Assembler syntax.

Every computer language has a set of rules known as SYNTAX. These rules are established so that the programmer will enter program code in a way that the computer can understand. Assembly language has a very simple syntax, shown in Figure 1.

LABEL   OPERATION CODE   OPERAND   ;COMMENT

                    Figure 1.

If you have ever looked at assembly language source listings in A.N.A.L.O.G., you have probably noticed the neat columns of “gibberish.” This is the way assembly language is structured.

Each column of information in the assembly source listing is known as a FIELD. Each field is separated by one or more spaces.

The first field, or LABEL field, is optional. If the code you are writing will be referenced elsewhere in the program, you should place an appropriate label in the label field.

A label should give some idea of what the section of code does. For example, L0001 tells nothing about the code, whereas VBLANK tells us that the code is part of the vertical blank cycle. Meaningful labels should be included whenever possible.

Labels should start with a letter, but can contain numbers within them.

Many assemblers use only the first 5 or 6 characters of a label, so the labels we use will be limited to 6 characters. This will enable the readers with assemblers other than the ATARI cartridge to use the program listings with as little modification as possible.

The second field in an assembler statement is the OPERATION CODE. This is usually a three-character standard 6502 instruction, such as LDA, STA, or JMP.

Each assembler also has a set of DIRECTIVES, or PSEUDO-OPERATIONS. These operations are not commands to the 6502, but are processed by the assembler program at assembly time. The most common directives are “.BYTE,” “.WORD,” “EQU” or “=” and “ORG” or “*=.” These will be discussed in detail later.

The third field in an assembler statement is the OPERAND. This field contains data or addresses required by the operation code. Operands are not needed by all operation codes.

Operands are usually given in decimal or hexadecimal. Decimal numbers require no special prefix, but hex numbers must be preceded by the “$” character.

Operands can also be labels defined elsewhere in the program. For example, instead of:

  JMP $4000

we could have used the EQUATE directive to define a label called START and set it to the value of $4000 as follows:

START = $4000
  JMP START

By using labels in operands instead of absolute numbers, programs are easier to change if the need arises. Imagine having to change 50 “JMP $4000” instructions to “JMP $5000.” If we used “JMP START” instead, we’d only have to change the “START = $4000” to “START = $5000.” This would automatically change the 50 JMP instructions!

The last field in an assembler statement is the COMMENT. Comments are optional, but encouraged. Comments are like REMarks in BASIC—they help document what the programmer is doing. This is especially important in assembler programs, which are somewhat difficult to decipher.

Comments are preceded by a semicolon (;). Everything after the semicolon is ignored by the assembler. Comments should be used as often as possible, especially when a section of code is fairly complex. This will not only help others who use the program, but will help you if you need to make changes to the program at a later date.

Where to put the program?

In BASIC, the programmer doesn’t really care where the program is placed in memory. BASIC handles all these messy details for the programmer, who simply writes program code. This is one of the benefits of a high-level language like BASIC.

As mentioned last issue, the assembly language programmer must know at all times what locations a program is using. Without total knowledge of a program’s location, it is possible to overlap memory used by the system and cause an irrecoverable “lock-up.”

Let’s look at what memory locations are available to us in the ATARI computer system. This discussion will apply to users of the ATARI Assembler-Editor cartridge only.

Plug your cartridge into the computer and turn on the power. When the EDIT prompt appears, type SIZE and press RETURN. The cartridge will show three numbers, such as:

0700 0800 3C1F

The first number is the bottom of RAM, the second is the end of the text editor buffer, and the third is the highest available RAM address.

Since readers have different amounts of memory and since cassette and disk systems use different amounts of memory, each reader must decide where to place the object program in memory. To do this, subtract about $600 (1536 bytes) from the last number above. In this case, the number is $3C1F - $0600 = $361F. Round this down to the nearest 256 bytes and you have $3600. This will be the starting address of your object program. Use this address in the “*=” directive of the program in this column.
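
If you would rather let a program do the hex arithmetic, the calculation can be sketched in Python (not a language used in this column). The function name is an assumption made for this example.

```python
def object_origin(top_of_ram, reserve=0x600):
    """Pick a starting address on a 256-byte boundary below the top of RAM."""
    candidate = top_of_ram - reserve
    return candidate & 0xFF00       # round down to a multiple of 256


print(hex(object_origin(0x3C1F)))   # prints 0x3600
```

Replace $3C1F with the third number your own SIZE command reports.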

There are also 256 bytes available for use at $0600–$06FF, or PAGE 6. We will be using this area later for subroutines called by BASIC. The term “page” is used to refer to a 256-byte section of memory. The page number comes from the first two digits of the hex address. $0200–$02FF is page 2, $0800–$08FF is page 8, etc.

The last memory available to us has special significance. This memory lies on page 0, $0000–$00FF. When the 6502 knows a byte is on page 0, it only needs the last two hex digits to address it. This allows the 6502 to access the information faster, with a smaller program, since only one byte is needed in the operand instead of the usual two needed for an address.
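
The page idea can be expressed in a couple of lines. This is a sketch in Python (not a language used in this column), and the function names are assumptions made for this example.

```python
def page_of(address):
    """Page number = the high byte (first two hex digits) of the address."""
    return address >> 8


def operand_bytes(address):
    """Bytes needed to hold the address in an instruction's operand."""
    return 1 if page_of(address) == 0 else 2


print(page_of(0x0600))        # prints 6
print(page_of(0x08FF))        # prints 8
print(operand_bytes(0x00CB))  # prints 1
print(operand_bytes(0x3600))  # prints 2
```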

Since page 0 addresses can be accessed faster with less program memory, it is obviously good to use page 0 whenever possible. The problem is, the system uses some of page 0 for its own needs. The entire first half of page 0 ($0000–$007F) is always used by the system. The second half ($0080–$00FF) is available to assembly language programs if no cartridges are in use.

Unfortunately, the ATARI Assembler Editor cartridge only allows you to use locations $B0–$CF. These locations are probably sufficient for most testing purposes.

When writing assembly language programs to be called as subroutines by ATARI BASIC, only locations $CB–$D1 and $D4–$D5 can be used without conflict with BASIC’s work areas. If an assembly subroutine needs temporary work areas, locations $D6–$F1 can be used. These areas will probably be changed by BASIC after the assembly subroutine ends, but they will work fine as temporary storage locations.

A few instructions.

Now we’re ready to look at a few 6502 operation codes and see how they work. We’ll start with the most frequently used instructions and work our way up to the rarely used instructions.

Without a doubt, the most frequently used 6502 operation code is LDA, or LOAD ACCUMULATOR. This instruction places a desired number in the A register, or accumulator.

The accumulator is used in all addition and subtraction operations, as well as most other arithmetic that can be performed on the 6502. You must move numbers in and out of the accumulator constantly, keeping track of the results. At times, you’ll feel like a traffic cop trying to direct hundreds of cars through an ordinary doorway. After just a few hours of assembly programming, you’ll see how important the accumulator is.

The LDA instruction has eight different formats, each with its own addressing method:

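LDA #n      (IMMEDIATE)
LDA nn      (ABSOLUTE)
LDA n       (PAGE ZERO)
LDA nn,X    (ABSOLUTE INDEXED X)
LDA nn,Y    (ABSOLUTE INDEXED Y)
LDA n,X     (PAGE ZERO INDEXED X)
LDA (n,X)   (PRE-INDEXED INDIRECT)
LDA (n),Y   (POST-INDEXED INDIRECT)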

Each of these instructions works differently in order to load the accumulator. They find the address from which they are to get the number and place it in the accumulator, destroying whatever was there before. Once the number is placed in the accumulator, however, the instructions act alike.

Let’s assume the number loaded into the accumulator was $94, shown below in its binary form (note the “%” sign preceding the binary number).

$94 = %10010100

All LDA instructions take special information from the number loaded and set microprocessor status flags accordingly. The two flags changed are the SIGN flag and the ZERO flag.

The zero flag is set to 1 if the number loaded was zero, and is set to 0 if the number was not zero. This flag is mainly used for branching, which we will cover later.

The sign flag is set to the value of the high-order (or leftmost) bit of the number loaded. You should remember that an 8-bit byte can contain numbers from 0–255. This is true when we are considering the numbers to be UNSIGNED. The 6502 uses a signed numbering system that can be somewhat confusing.

Whenever a number’s high-order bit is a 1, the number is considered to be negative. Using this method, a byte can contain numbers from -128 to 127. How does this work? Let’s start with the positive numbers.

Positive numbers in the 6502 signed number scheme range from 0 (which is always considered positive) to 127. The upper limit of 127 is set because if the number goes to 128, the high-order bit will be set to 1 and the number is negative.

Negative numbers range from -1 to -128 in the 6502 system. If we subtract 1 from zero in the 8-bit byte format, the byte’s contents will “wrap around” to the bit pattern 11111111, which is 255. 255 corresponds to -1 in this scheme. An easy way to remember the relationship here is the following calculation:

SIGNED VALUE = UNSIGNED VALUE - 256

Using this formula with the unsigned number 255, we can see that 255 - 256 = -1, which is correct. We can easily find the signed counterpart to 128, or 128 - 256 = -128.
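As a quick check on the subtract-256 rule, the sketch below lists a few byte values with their signed readings worked out in the comments. The label SAMPLE and the choice of values are only for illustration; $94 is the byte used in the LDA example above.

```asm
0100 ;SIGNED READINGS OF SOME BYTES
0110 ;
0120 SAMPLE .BYTE $00   ;  0 IS 0
0130        .BYTE $7F   ;127 IS 127
0140        .BYTE $80   ;128-256 = -128
0150        .BYTE $94   ;148-256 = -108
0160        .BYTE $FF   ;255-256 = -1
```

Every byte from $80 up has its high-order bit set, so loading any of the last three would set the sign flag.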

Now you can see exactly how the sign flag works. This flag will be very important later when we perform comparisons.

The next instruction, which is used almost as much as the LDA instruction, is STA, or STORE ACCUMULATOR. This instruction does almost the same thing as LDA, but in reverse.

The STA instruction has the following formats:

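STA nn      (ABSOLUTE)
STA n       (PAGE ZERO)
STA nn,X    (ABSOLUTE INDEXED X)
STA nn,Y    (ABSOLUTE INDEXED Y)
STA n,X     (PAGE ZERO INDEXED X)
STA (n,X)   (INDEXED INDIRECT)
STA (n),Y   (INDIRECT INDEXED)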

You will notice that the STA instruction has the same formats as the LDA instruction except for the IMMEDIATE format. Think about it for a minute and the reason should be obvious.

The STA instruction simply places whatever number is in the accumulator into the address specified in the operand. The number in the accumulator will be unaffected, and will still be available for your use.

The STA instruction does not affect any status flags.
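To see that STA leaves the accumulator alone, here is a small sketch that loads a number once and stores it twice. The labels COPY1, COPY2 and DONE and the $0600 origin are my own choices for this example; the JMP on line 140 (described next) simply parks the program until you press BREAK.

```asm
0100        *= $0600    ;OR YOUR ORIGIN
0110 START  LDA #7      ;PUT A 7 IN A
0120        STA COPY1   ;COPY1 IS NOW 7
0130        STA COPY2   ;A STILL HOLDS 7
0140 DONE   JMP DONE    ;WAIT FOR BREAK
0150 ;
0160 COPY1  .BYTE 0
0170 COPY2  .BYTE 0
0180        .END
```

Both COPY1 and COPY2 should end up holding 7, with no second LDA needed.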

A third instruction that is widely used is the JMP instruction. This instruction is just like BASIC’s GOTO statement. Whenever this instruction is executed, the program will JUMP to the address specified and continue processing. The address jumped to MUST contain executable program statements, so take care.

The JMP instruction has two formats:

JMP nn      (ABSOLUTE)
JMP (nn)    (INDIRECT)

As noted above in the discussion of the indirect format, the indirect jump is rarely used, but can be very helpful in special situations.

The absolute jump instruction is the most-used form of the JMP operation code. The address specified can either be a hex or decimal number or a label that is defined elsewhere in the program.

The JMP instruction does not affect any status flags.
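For the curious, here is a rough sketch of the indirect format. The labels VECTOR, TARGET, RESULT and DONE and the $0600 origin are invented for this example. The two bytes at VECTOR hold the address of TARGET, low byte first, and the JMP goes wherever those bytes point.

```asm
0100        *= $0600
0110        JMP (VECTOR) ;GO WHERE VECTOR POINTS
0120 ;
0130 TARGET LDA #7       ;EXECUTION ARRIVES HERE
0140        STA RESULT
0150 DONE   JMP DONE     ;WAIT FOR BREAK
0160 ;
0170 VECTOR .WORD TARGET ;LOW BYTE, HIGH BYTE
0180 RESULT .BYTE 0
0190        .END
```

One caution: on the original 6502, a vector whose low byte sits in the last location of a page ($xxFF) is read incorrectly, so keep vectors away from page boundaries.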

Applying the instructions.

Now that we’ve described the LDA, STA and JMP instructions, let’s apply them in a short program.

The program in Figure 2 is essentially a “do-nothing.” It will simply move numbers around in memory until we stop it. Type the program into your computer, remembering to set your origin value (*= in line 140) as described above.

0110 ;
0130 ;
0140        *= $????    ;YOUR ORIGIN!
0150 ;
0160 START  LDA BYTE1   ;MOVE BYTE1
0170        STA BYTE2   ;TO BYTE2
0180        LDA #7      ;PUT A 7...
0190        STA BYTE3   ;IN BYTE3
0200        JMP PART2   ;JUMP!
0210 ;
0220 PART1  LDA BYTE2   ;MOVE BYTE2
0230        STA BYTE4   ;TO BYTE4
0240        JMP PART3   ;AND JUMP
0250 ;
0260 PART2  LDA RANDOM  ;MOVE RANDOM #
0270        STA BYTE1   ;TO BYTE1
0280        JMP PART1   ;AND JUMP!
0290 ;
0300 PART3  LDA BYTE4   ;MOVE BYTE4
0310        STA BYTE5   ;TO BYTE5
0320        JMP START   ;AND JUMP!
0330 ;
0350 ;
0360 BYTE1  .BYTE 1     ;NUMBER 1
0370 BYTE2  .BYTE 2     ;NUMBER 2
0380 BYTE3  .BYTE 3     ;NUMBER 3
0390 BYTE4  .BYTE 4     ;NUMBER 4
0400 BYTE5  .BYTE 5     ;NUMBER 5
0410 RANDOM =  $D20A    ;RANDOM NUMBER
0420 ;
0430        .END

            Figure 2.

When you have entered the program and set the origin at Line 140, type ASM and press RETURN. The program will be assembled into memory and is ready to execute.

Before executing the program, let’s look at Figure 2. The first thing you’ll notice in the listing is the presence of COMMENTS. I can’t overemphasize the importance of comments in an assembly language program. They’re simply a MUST whenever you’re writing programs, even for yourself. You’ll notice that some comment lines are simply semicolons with no comment. These are used as separators to break up sections of code. For example, each label group (i.e. START, PART1, PART2, etc.) is a distinct group in the listing.

Remember, comments don’t take up any program space in assembly language, so use them as often as possible!

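Line 160—labeled START, loads the accumulator from the location labeled BYTE1, wiping out whatever was previously in the accumulator. Remember that whenever the accumulator is loaded, the contents of the accumulator before the load will be lost.

Line 170—stores the value just loaded from BYTE1 into the location labeled BYTE2. Moving a byte from one location to another this way is a very common operation.

Line 180—loads the accumulator with the number 7, using the immediate format.

Line 190—stores the 7 just loaded into the accumulator at the location labeled BYTE3. This is also a very common operation.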

Line 200—jumps to PART2, and execution continues there.

Line 220—labeled PART1, loads the accumulator from the location marked BYTE2.

Line 230—stores the value just loaded from BYTE2 into the location labeled BYTE4.

Line 240—jumps to PART3.

Line 260—labeled PART2, loads a byte from the computer’s random number generator at $D20A. This location gives a random number from 0–255.

Line 270—stores the random number at the location labeled BYTE1.

Line 280—jumps to PART1.

Line 300—labeled PART3, loads the accumulator from the location labeled BYTE4.

Line 310—stores the number just loaded at location BYTE5.

Line 320—jumps to START. This causes the program to loop forever until you press the BREAK key.

Lines 360–400—define the bytes labeled BYTE1–BYTE5. The .BYTE directive is used to assign initial values to the locations. BYTE1 will contain 1, BYTE2 will contain 2, etc.

Line 410—uses the EQUATE directive to define the address of the label RANDOM. This location is $D20A (53770 decimal). Whenever the label RANDOM is referenced, the computer will use the value $D20A.

Line 430—uses the .END directive to tell the assembler the end of the source code has been reached. This directive is optional, but recommended.

Tracing the action.

Now you can execute the above program and see what it does. Note the address you used in Line 140. With the EDIT prompt on the screen, type BUG and press RETURN. The DEBUG prompt will appear.

Type L followed by the address you used in Line 140 and press RETURN. For example, if your Line 140 reads:

*= $4300

you should type L4300 and press RETURN. The computer will show how your program appears in memory, and the display should look something like Figure 3.

6000  AD 29 60  LDA $6029
6003  8D 2A 60  STA $602A
6006  A9 07     LDA #$07
6008  8D 2B 60  STA $602B
600B  4C 17 60  JMP $6017
600E  AD 2A 60  LDA $602A
6011  8D 2C 60  STA $602C
6014  4C 20 60  JMP $6020
6017  AD 0A D2  LDA $D20A
601A  8D 29 60  STA $6029
601D  4C 0E 60  JMP $600E
6020  AD 2C 60  LDA $602C
6023  8D 2D 60  STA $602D
6026  4C 00 60  JMP $6000
6029  01 02     ORA ($02,X)
602B  03        ???
602C  04        ???
602D  05 00     ORA $00
602F  00        BRK
6030  00        BRK

            Figure 3.

Your listing will probably vary from this illustration, which was assembled to location $6000. Note that the BYTE1–BYTE5 values appear in memory from $6029–$602D, and the computer tries to show the bytes as instructions (like ORA $00). Simply ignore such instructions whenever you know they are misinterpreted data.

If your program is at the proper location, you are ready to watch its execution. Type T followed by the address in Line 140 and press RETURN.

The computer will begin tracing the execution of your program one line at a time. Each instruction will be shown along with its address and the contents of the 6502 registers after the instruction executes. Page 40 of the ATARI Assembler Editor manual describes the trace operation in detail.

At any time in the execution you may stop the program with the BREAK key and examine the BYTE1–BYTE5 locations (note their addresses at assembly time) by using the Dnnnn command, described on page 36 of the Assembler Editor manual.

We are interested in seeing how the instructions we used are executed and how they affect memory. Figure 4 shows the lines of the program as they are executed and the status of the variables BYTE1–BYTE5 after each statement is executed. Note that the value present in RANDOM cannot be predicted and is indicated by “R#.”

                STATEMENT         A  BT. BT. BT. BT. BT.
                                 RG.  1   2   3   4   5
160   START     LDA  BYTE1       01  01  02  03  04  05
170             STA  BYTE2       01  01  01  03  04  05
180             LDA  #7          07  01  01  03  04  05
190             STA  BYTE3       07  01  01  07  04  05
200             JMP  PART2       07  01  01  07  04  05
260             LDA  RANDOM      R#  01  01  07  04  05
270             STA  BYTE1       R#  R#  01  07  04  05
280             JMP  PART1       R#  R#  01  07  04  05
220             LDA  BYTE2       01  R#  01  07  04  05
230             STA  BYTE4       01  R#  01  07  01  05
240             JMP  PART3       01  R#  01  07  01  05
300             LDA  BYTE4       01  R#  01  07  01  05
310             STA  BYTE5       01  R#  01  07  01  01
320             JMP  START       01  R#  01  07  01  01
160             LDA  BYTE1       R#  R#  01  07  01  01
170             STA  BYTE2       R#  R#  R#  07  01  01

                Figure 4.

As stated earlier, this is a “do-nothing” program, and will continue to execute forever unless it is stopped by the user. If you’d like a demonstration of this infinite execution, type G followed by the address in Line 140 and press RETURN. The computer will begin executing the do-nothing at unbelievable speed, and won’t stop until you press BREAK. You won’t see anything happen during the program’s execution, but you can rest assured that the computer is following your instructions to the letter.

Stay tuned.

Next issue, we’ll start digging into more 6502 operation codes, learn to add and subtract, and work with the index registers. Until then, make your own short programs using the instructions we’ve covered. I realize these three aren’t enough to create complex programs, but knowledge of their use is essential to future lessons.

A.N.A.L.O.G. ISSUE 15 / JANUARY 1984 / PAGE 124
A.N.A.L.O.G. ISSUE 74 / JULY 1989 / PAGE 58

Boot Camp

by Tom Hudson

Welcome to the third installment of Boot Camp. As promised last issue, we’re going to cover more 6502 instructions this time, and begin exploring the world of simple mathematical operations.

Before we start with the math operations, though, let’s look at an instruction that will help us during the testing of the programs we write in this column.

BREAKing away.

Remember the do-nothing program from last issue? When we executed it with the “G” (execute program) command with the Assembler Editor cartridge, it ran forever. This is hardly a good way to test programs. Imagine trying to stop the program at a specific instruction with the BREAK key when hundreds of thousands of operations are being executed each second. You can see that this would be nearly impossible.

Luckily for us, the 6502 has a handy instruction called BRK (or BREAK). This instruction does the same thing as the BREAK key on the keyboard when an assembly program is executing. The nice part is that it will stop the program EXACTLY where we want it to stop.

The short program in Figure 1 has a BRK instruction after the load accumulator (LDA) instruction. The accumulator will be loaded with $4F (79 decimal) and the computer will stop. Type the program into your computer and assemble it into memory with the ASM command.

10  *=   $0600   ;START ADDRESS
20  LDA  #$4F    ;PUT $4F IN A
30  BRK          ;AND STOP
40  .END

                    Figure 1.

After the program is assembled, go to the DEBUG mode with the BUG command. To execute the short program, type:

G 600

The program will execute in a fraction of a second and the computer will return with a display similar to Figure 2.

0602        A=4F X=00 Y=00 P=30 S=00

                    Figure 2.

Note that the accumulator (A) equals $4F. The X, Y, processor status and stack registers are also displayed, but have no significance to us at this time, since we didn’t change them.

Now you can see that the BRK instruction can be helpful in the debugging stage of a program. We will be using it to stop the computer when we want to check the results of certain operations.

Using index registers.

Index registers were mentioned briefly last issue. As you may recall, there are two index registers in the 6502, the X and Y registers. These two registers are built into the 6502 microprocessor chip. Each is made up of 8 bits, allowing a range of values from 0–255.

The first instructions we’ll look at are the LDX (load X) and LDY (load Y) instructions. These instructions are similar to the LDA (load Accumulator) instruction we examined last time. Their formats are:

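LDX #n      (IMMEDIATE)
LDX nn      (ABSOLUTE)
LDX n       (ZERO PAGE)
LDX nn,Y    (ABSOLUTE INDEXED Y)
LDX n,Y     (ZERO PAGE INDEXED Y)
LDY #n      (IMMEDIATE)
LDY nn      (ABSOLUTE)
LDY n       (ZERO PAGE)
LDY nn,X    (ABSOLUTE INDEXED X)
LDY n,X     (ZERO PAGE INDEXED X)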

The LDX and LDY instructions place a specified value in the X or Y register, respectively. For example, the following instruction will load the X register with $3A (58 decimal):

LDX #$3A

The following instruction will load the Y register with the contents of memory location $3F00:

LDY $3F00

The following instruction will load the X register from the page zero location $4D, which is the attract mode counter:

LDX $4D

Like the LDA instruction, both the LDX and LDY instructions set the sign and zero flags depending on the number loaded into the register.

Storing the contents of the X and Y registers is just as easy as loading them. The following addressing modes are available with the STX (store X) and STY (store Y) instructions:

STX nn      (ABSOLUTE)
STX n       (ZERO PAGE)
STX n,Y     (ZERO PAGE INDEXED Y)
STY nn      (ABSOLUTE)
STY n       (ZERO PAGE)
STY n,X     (ZERO PAGE INDEXED X)

Unfortunately for us, the designers of the 6502 decided to limit the indexed STX and STY instructions to page zero, even though the LDX and LDY instructions also have an absolute indexed form. This is simply something assembly programmers must live with.

Like the STA instruction, the STX and STY instructions do not affect any status flags.

The STX and STY instructions are very easy to use. For example, to store the X register at location $4FFB, simply use the instruction:

STX $4FFB

In addition to the LDX/LDY and STX/STY instructions, the 6502 provides four more instructions which help the programmer with X/Y operations. These are the TRANSFER instructions.

The transfer instructions allow quick movement of information from one register to another. They are TAX, TAY, TXA and TYA. Two other transfer instructions, TSX and TXS, are used in stack operations, and we’ll look at them in a later article.

The TAX and TAY instructions transfer the contents of the Accumulator (A) to the X or Y register, respectively. The A register is unchanged.

Figure 3 illustrates how the TAX instruction works. Type this short program into your computer and assemble it into memory.

10  *=   $0600  ;START ADDRESS
20  LDA  #$0F   ;PUT $0F IN A
30  TAX         ;PUT IN X, TOO
40  LDA  #$6A   ;PUT $6A IN A
50  TAY         ;NOW PUT IN Y, TOO
60  BRK         ;AND STOP!
70  .END

                    Figure 3.

Line 20 loads the accumulator with $0F (15 decimal).

Line 30 transfers the contents of the accumulator to the X register. At this point both the accumulator and the X register will contain $0F.

Line 40 loads the accumulator with $6A (106 decimal).

Line 50 transfers the contents of the accumulator to the Y register. Now the accumulator and the Y register will contain $6A. The X register will be unchanged.

Line 60 will BREAK the execution of the program.

After the program in Figure 3 is assembled into memory, go to DEBUG mode and execute it by typing:

G 600

After execution, the screen of your computer should look like Figure 4.

0606        A=6A X=0F Y=6A P=30 S=00

                    Figure 4.

You can see that the X register contains $0F and that the A and Y registers contain $6A. Try some different combinations and observe the results.

The two other transfer instructions we are concerned with here are the TXA and TYA instructions. As you may have guessed, these instructions do the opposite of the TAX and TAY instructions. That is, TXA will transfer the contents of the X register to the accumulator, and TYA will move the Y register’s contents to the accumulator.
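Here is a short sketch of TXA and TYA in the same style as Figure 3. The values $21 and $42 are arbitrary picks of mine.

```asm
10  *=   $0600  ;START ADDRESS
20  LDX  #$21   ;PUT $21 IN X
30  LDY  #$42   ;PUT $42 IN Y
40  TYA         ;A NOW HOLDS $42
50  TXA         ;A NOW HOLDS $21
60  BRK         ;AND STOP!
70  .END
```

If you assemble this and execute it with G 600, the register display should show A=21, X=21 and Y=42. As with TAX and TAY, the register being copied from is left unchanged.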

Here’s a small problem for you to solve using the instructions we’ve covered so far. This is a simple data manipulation operation using the A, X and Y registers and as many locations as necessary.

PROBLEM: Write a program which starts with A=$03, X=$07 and Y=$14. Then write the code necessary to change these registers so that when the program ends, the registers are A=$07, X=$14 and Y=$03.

The code necessary to perform this change is only four lines long, and there are many ways to do it. Next issue I’ll show several possible solutions.

This issue, we’ve only shown how to make the X and Y registers contain the values we want. In order to make the X and Y registers do some real work, we’ll need to cover the branch-on-condition instructions. These will be discussed next issue, along with X and Y register indexing techniques.

It all ADDs up.

I’m sure that just about every person reading this column by now wants to start working with something more interesting than loading and storing bytes, right? Well, let’s take a break from all that admittedly dull stuff and get on with something fun, actual addition.

We’ll start out with some simple addition, working with values from 0–255. This is known as single-byte integer arithmetic, and is the simplest kind of math on the 6502.

Why only integers from 0–255? Remember that all arithmetic operations must be processed through the accumulator, or A register. The accumulator is made up of only 8 bits, and can’t hold any number greater than 255. The accumulator doesn’t know what a decimal point is, either, so we are limited to integers for the time being.

Binary or BCD?

The 6502 microprocessor has the option of performing arithmetic instructions in two different modes, BINARY and BINARY CODED DECIMAL (BCD). Let’s look at how both these systems work.

Binary arithmetic, as we have noted before, produces numbers from 0–255 in one byte. All 8 bits are used for the number. These numbers can be considered either signed or unsigned by the programmer, but they are handled the same by the computer. Since all 8 bits are used to represent the number, the value of a byte is simply the byte’s decimal contents.

BCD arithmetic, on the other hand, is a more human approach to computer math, and easier to use in input-output operations.

In BCD math, the byte is split into two 4-bit sections, or NYBBLES. Each nybble contains one decimal digit, from 0–9. With this system, each byte contains two decimal digits, allowing easy base-10 number storage. Of course, the BCD numbering system requires more storage than binary, since the value of a byte can now only range from 0–99, rather than 0–255. The nice thing about BCD is that when looking at the hexadecimal representation of the byte, you see the decimal value of the byte. For example, $56 is 56 decimal.

We’ll cover BCD math later in this series, when we get into screen I/O. For now we’ll stick with binary math. Even though it may seem more difficult, binary math is much more important at this early stage.

Getting into BINARY.

The 6502 can handle two different types of math, so how does it know which one you want to use? The answer lies in a single-bit flag in the processor status register, called the DECIMAL MODE flag.

The decimal mode flag has two states. When set (1), the decimal mode is selected. When cleared (0), the binary mode is selected. This flag is extremely important! The following example illustrates this fact.

Let’s say you want to add two binary numbers, $23 and $18. A normal binary add would give a result of $3B.

What if the decimal mode flag was set by mistake? The add would give a result of $41, the sum of 23 and 18. If your program adds or subtracts numbers with the decimal mode incorrectly set, the results can be very confusing. Moral: ALWAYS know the setting of the decimal mode flag.

For our purposes, until further notice, we will always CLEAR the decimal mode with the CLD (clear decimal mode) instruction. The format of this instruction is:

CLD

This is a very simple instruction, but easy to forget. If you have trouble remembering things (like myself), I suggest that you tape an appropriate message to your monitor, computer, forehead, etc. This will save an incredible amount of debugging time.

Important: When writing assembly subroutines for BASIC programs, you must clear the decimal mode if you’re doing any arithmetic in the subroutine. BASIC uses the floating-point arithmetic package built into the computer, which sets the decimal mode. The first time I wrote a BASIC assembly subroutine with math, it took me two days to find the problem. Once again, write a note.
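If you’d like to see the $23 + $18 example for yourself, the sketch below performs the same add in both modes. It leans on two instructions explained in the next section (CLC and ADC) and on SED (set decimal mode), the opposite of CLD, which this column hasn’t covered. The labels BINSUM and BCDSUM are made up for this example.

```asm
10  *=   $0600   ;START ADDRESS
20  CLD          ;BINARY MODE
30  CLC          ;CLEAR CARRY
40  LDA  #$23    ;PUT $23 IN A
50  ADC  #$18    ;BINARY: A=$3B
60  STA  BINSUM  ;SAVE IT
70  SED          ;DECIMAL MODE
80  CLC          ;CLEAR CARRY
90  LDA  #$23    ;PUT $23 IN A
0100  ADC  #$18  ;BCD: A=$41
0110  STA  BCDSUM ;SAVE IT
0120  CLD        ;BACK TO BINARY!
0130  BRK        ;AND STOP
0140 BINSUM *=*+1
0150 BCDSUM *=*+1
0160  .END
```

After a G 600, BINSUM should hold $3B and BCDSUM should hold $41. Note the CLD on line 120, which puts things back the way we want them before stopping.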

Now that I’ve warned you about the evils of decimal mode ignorance, let’s get on with some actual addition!

Add ’em up!

First we’ll cover single-byte additions, the simplest kind. These types of additions are sufficient for general counters, changing color registers, or any operation in which the result will not exceed 255. The 6502 has only one add instruction, ADC (add with carry). This instruction has the following formats:

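ADC #n      (IMMEDIATE)
ADC nn      (ABSOLUTE)
ADC n       (ZERO PAGE)
ADC nn,X    (ABSOLUTE INDEXED X)
ADC nn,Y    (ABSOLUTE INDEXED Y)
ADC n,X     (ZERO PAGE INDEXED X)
ADC (n,X)   (INDEXED INDIRECT)
ADC (n),Y   (INDIRECT INDEXED)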

The ADC instruction adds the number at the memory location specified in the operand to the accumulator and places the result in the accumulator. Depending on the result, the 6502 will alter the sign, overflow, zero and carry flags.

Let’s look at a simple single-byte addition operation, using the immediate format. We will add 23 and 14 decimal and place the result in a location called ANSWER. Figure 5 shows the code needed to perform this operation.

10  LDA  #23     ;PUT 23 IN A
20  CLC          ;CLEAR CARRY
30  ADC  #14     ;AND ADD 14 TO IT!
40  STA  ANSWER  ;SAVE RESULT

                    Figure 5.

The first line in Figure 5 places the number 23 in the accumulator. Simple enough, right?

The second line introduces a new operation code, CLC (clear carry). The CLC instruction places a zero in the 6502 carry flag. This is an important instruction to remember, and should always be present in single-byte addition operations.

Why is the CLC instruction so important? The answer lies in the structure of the 6502 ADC instruction. Remember, ADC means “add with CARRY.” Whenever an addition is performed on the 6502, the result is set to ACCUMULATOR + OPERAND + CARRY.

Here’s an example of what can go wrong when the programmer is not sure of the contents of the carry flag. Let’s say the carry happens to be set to 1. Fred the careless programmer wants to add 1 + 1 to verify that the answer is indeed 2, so he writes the following code:

LDA #1
ADC #1

When Fred runs the program, he is astounded to find that one plus one is three! If Fred had only inserted a simple CLC instruction, his life would have been much happier, as well as more accurate. Suffice it to say that in any single byte addition operation, you should always clear the carry flag BEFORE the ADC instruction.
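Fred’s repaired code might look like the sketch below. The only change is the CLC at the top, which guarantees the carry is zero no matter what ran before.

```asm
10  CLC          ;CARRY = 0
20  LDA  #1      ;PUT 1 IN A
30  ADC  #1      ;A = 1 + 1 + 0 = 2
```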

The third line adds 14 to the accumulator, giving a result of 37 ($25 hex), which is, of course, correct. You can use any of the 8 addressing modes with the ADC instruction. All produce the same results; they just get their data with different methods.

The last line places the result in the location labeled ANSWER. The result will still be in the accumulator.

Earlier I mentioned the flags altered by the ADC instruction. These are the sign, overflow, zero and carry.

The SIGN flag indicates the sign of the result. The contents of the accumulator’s 7th bit are placed in this flag. If the flag is zero after an add, the result is considered positive. A one in this flag indicates a negative result. See Issue 13’s Boot Camp for an in-depth discussion of the sign flag.

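The OVERFLOW flag is set to one when the result, treated as a signed number, falls outside the range -128 to 127. Internally, this is the exclusive-or of the carries out of bits 6 and 7. The overflow flag is rarely used, but it’s a good idea to know what happens to it during processing.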

The ZERO flag is set to one if the result of the add was zero, and is set to zero if the result was NOT zero.

The CARRY flag is set to one if the result of the add is greater than 255. This flag is important in multi-byte addition (for numbers greater than 255). We’ll be examining multi-byte operations next issue.
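To watch the carry flag in action, try a sum that won’t fit in one byte. The following sketch (the numbers are my own choice) adds 200 and 100. The true answer is 300, but the accumulator can only hold the low 8 bits.

```asm
10  *=   $0600   ;START ADDRESS
20  CLD          ;NO DECIMAL MODE!
30  LDA  #200    ;PUT 200 IN A
40  CLC          ;CLEAR CARRY
50  ADC  #100    ;200 + 100 = 300
60  BRK          ;AND STOP
70  .END
```

After a G 600, the accumulator should show A=2C (44 decimal, which is 300 - 256), and the carry flag will be set to one to tell you that the result exceeded 255.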

All these flags are important in the computer’s decision-making process. Depending on the result of an operation, the programmer can go to other parts of the program using comparison and branch instructions (similar to IF/THEN statements in BASIC). We will also cover these operations next issue.

Starting with subtraction.

Now that we’ve covered simple addition, let’s do a little subtraction. Subtraction is just as easy as addition, with a couple of simple differences. Shown below are the formats of the 6502 subtraction instruction, SBC (subtract with borrow). You will notice that the SBC has the same formats as the ADC instruction.

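SBC #n      (IMMEDIATE)
SBC nn      (ABSOLUTE)
SBC n       (ZERO PAGE)
SBC nn,X    (ABSOLUTE INDEXED X)
SBC nn,Y    (ABSOLUTE INDEXED Y)
SBC n,X     (ZERO PAGE INDEXED X)
SBC (n,X)   (INDEXED INDIRECT)
SBC (n),Y   (INDIRECT INDEXED)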

The SBC instruction subtracts the number at the memory location specified in the operand from the accumulator and places the result in the accumulator. Like the ADC instruction, the sign, overflow, zero and carry flags will be altered.

For the time being we’ll work only with single-byte subtractions, since they’re the easiest to understand. We will subtract 14 from 23 decimal and place the answer in a location called ANSWER. Figure 6 shows the code needed for this operation.

10  LDA  #23      ;PUT 23 IN A
20  SEC           ;SET CARRY FOR SUB.
30  SBC  #14      ;AND SUB 14 FROM IT!
40  STA  ANSWER   ;SAVE RESULT

                    Figure 6.

The first line in Figure 6 simply places the number 23 in the accumulator.

The second line introduces another new operation code, SEC (set carry). This instruction sets the carry flag to one. Like the CLC instruction in single-byte additions, the SEC instruction is a must for all single-byte subtractions.

The SBC instruction is strange in that it subtracts the contents of the memory byte indicated in the operand and the complement of the carry flag from the accumulator, placing the result back in the accumulator. Here’s an example. Let’s say the accumulator contains 4 decimal, and we’re subtracting 3 decimal from this. Assume the carry flag is clear (0). The computer will subtract 3 from 4, then subtract 1 from this (the complement of the carry flag), giving a result of zero.

By setting the carry to 1, we make sure that the subtraction of our two numbers is unaffected by the subtraction of the carry’s complement, which in this case is zero. The carry flag acts as a borrow in multi-byte subtraction; no borrow is needed in single-byte operations, which is why we set the carry first.
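The 4 - 3 example can be tried both ways with a sketch like the one below. The labels WRONG and RIGHT are invented for this illustration; the first subtraction deliberately clears the carry to show the off-by-one result.

```asm
10  *=   $0600   ;START ADDRESS
20  CLD          ;NO DECIMAL MODE!
30  LDA  #4      ;PUT 4 IN A
40  CLC          ;CARRY=0 (OOPS!)
50  SBC  #3      ;4 - 3 - 1 = 0
60  STA  WRONG   ;SAVE THE 0
70  LDA  #4      ;PUT 4 IN A AGAIN
80  SEC          ;CARRY=1 THIS TIME
90  SBC  #3      ;4 - 3 - 0 = 1
0100  STA  RIGHT ;SAVE THE 1
0110  BRK        ;AND STOP
0120 WRONG  *=*+1
0130 RIGHT  *=*+1
0140  .END
```

After execution, WRONG should contain 0 and RIGHT should contain 1.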

The third line of Figure 6 performs the subtraction. The result will be 23-14-0 or 9.

The last line of the program places the result in the location labeled ANSWER. The result will still be in the accumulator.

Like the ADC instruction, the SBC instruction works the same with all 8 addressing modes available with the instruction. The SBC instruction affects the 6502 status flags in the same way as ADC.
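What happens when the answer is less than zero? The sketch below (my own numbers again) turns Figure 6 around and subtracts 23 from 14.

```asm
10  *=   $0600   ;START ADDRESS
20  CLD          ;NO DECIMAL MODE!
30  LDA  #14     ;PUT 14 IN A
40  SEC          ;SET CARRY FOR SUB.
50  SBC  #23     ;14 - 23 = -9
60  BRK          ;AND STOP
70  .END
```

The accumulator should show A=F7, which is 247 unsigned. Using the signed number rule, 247 - 256 = -9. The sign flag will be set to one, and the carry flag will be cleared to zero, showing that a borrow was needed.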

Applying what we’ve covered.

We’ve now progressed to the point where we can write simple math programs using addition and subtraction. Let’s write a program to solve the equation:

4+5+34-(8-7) = ?

Unlike BASIC, we can’t simply code this equation right into our computer. In assembly language, it’s up to the programmer to figure out the procedure needed to obtain the result and code it.

Let’s look at the equation shown above. In any mathematical equation, the expressions in parentheses must be solved before proceeding with the rest of the equation. If we simply solve the equation from left to right, we will get an incorrect answer:

4+5+34-8-7 = 28

In order to solve the equation correctly, we must solve it as follows:

4+5+34-(1) = 42

Now that we know how to proceed, let’s write a section of code to solve the equation. Figure 7 shows one possible solution.

10  *= $0600
20  CLD          ;NO DECIMAL MODE!
30  LDA #8       ;PUT 8 IN A
40  SEC          ;SET CARRY,
50  SBC #7       ;SUBTRACT 7 FROM 8
60  STA HOLD     ;AND SAVE RESULT
70  LDA  #4      ;NOW PUT 4 IN A
80  CLC          ;CLEAR CARRY,
90  ADC  #5      ;ADD 4 & 5
0100  CLC         ;CLEAR CARRY AGAIN
0110  ADC  #34    ;ADD 34 TO LAST #
0120  SEC         ;SET CARRY,
0130  SBC  HOLD   ;SUBTRACT (8-7)
0140  STA  ANSWER ;SAVE THE ANSWER
0150  BRK         ;ALL DONE!
0155 ;
0160 HOLD   *=*+1 ;TEMP. HOLD AREA
0170 ANSWER *=*+1 ;FINAL ANSWER
0180  .END

                    Figure 7.

Line 10 tells the assembler to place the program at location $0600, a safe location in computer memory.

Line 20 clears the decimal mode, to avoid any accidental BCD results.

Line 30 places the number 8 in the accumulator.

Line 40 sets the carry flag to get ready for a single-byte subtract.

Line 50 subtracts 7 from 8, leaving the result in the accumulator.

Line 60 stores the result of the expression in parentheses at a memory location called HOLD. This is done because we will need this number in a moment.

Line 70 places a 4 in the accumulator in order to start solving the first part of the equation.

Line 80 clears the carry flag to get ready for a single-byte add.

Line 90 adds 5 to the accumulator, leaving the result in the accumulator.

Line 100 clears the carry again for the next addition. In this case, the CLC is not necessary since we know the previous add did not exceed 255, but it’s a good idea to get into the CLC habit.

Line 110 adds 34 to the accumulator, once again leaving the result in the accumulator.

Line 120 sets the carry flag for the next subtract operation.

Line 130 subtracts the result of the expression in parentheses (stored in HOLD) from the accumulator and gets the final result.

Line 140 places the final result in the memory location called ANSWER.

Line 150 BREAKS the program execution. At this point the accumulator should equal 42 decimal ($2A hex).

Lines 160 and 170 set up the one-byte storage areas, HOLD and ANSWER. The assembler directive *=*+1 simply tells the assembler to reserve one byte for each label.

Line 180 tells the assembler that the end of the source code has been reached.

After this code is typed in and assembled into memory, execute the program from DEBUG mode with the command:

G 600

The program will execute very quickly and return with a screen similar to Figure 8.

0618        A=2A X=0F Y=6A P=31 S=80

                    Figure 8.

Note that the accumulator contains $2A (42 decimal). This is the correct answer to our equation. This example shows how you can perform simple add-subtract operations in assembly language. Of course, we’re limited to one-byte integers, but we’ll soon exceed these limitations.
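Figure 7 is only one way to attack the problem. Since subtracting (8-7) is the same as subtracting 8 and then adding 7 back, a rough alternative is to work straight through without a HOLD area. This is my own rearrangement, offered as a sketch rather than the “right” answer.

```asm
10  *= $0600
20  CLD          ;NO DECIMAL MODE!
30  LDA  #4      ;PUT 4 IN A
40  CLC          ;CLEAR CARRY,
50  ADC  #5      ;4 + 5 = 9
60  CLC          ;CLEAR CARRY,
70  ADC  #34     ;9 + 34 = 43
80  SEC          ;SET CARRY,
90  SBC  #8      ;43 - 8 = 35
0100  CLC        ;CLEAR CARRY,
0110  ADC  #7    ;35 + 7 = 42
0120  STA  ANSWER
0130  BRK        ;ALL DONE!
0140 ANSWER *=*+1
0150  .END
```

The accumulator should again end up with $2A (42 decimal). The CLC on line 100 is required here, because a subtraction that needs no borrow leaves the carry flag set.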

Until next time…

Try your own problems until you’re proficient with the 6502 add and subtract operations. Try using the various addressing modes to see how they work. In order to learn assembly language (or any other language, for that matter), you’ll have to roll up your sleeves and dig in.

Next issue will cover a lot of material, including the assembly equivalent of the BASIC IF/THEN statement, index register usage and multi-byte addition and subtraction.

A.N.A.L.O.G. ISSUE 16 / FEBRUARY 1984 / PAGE 108
A.N.A.L.O.G. ISSUE 75 / AUGUST 1989 / PAGE 36

Boot Camp

by Tom Hudson

I hope all Boot Camp readers have been practicing their addition, subtraction and X-Y register manipulations, because we’re moving on to bigger and better things. We’ll be dabbling with comparisons, branching and indexing this month, giving you even more tools to work with in assembly language.

First things first.

Last month, I gave you a simple data manipulation problem:

PROBLEM: Write a program which starts with A=$03, X=$07 and Y=$14. Then write the code necessary to change these registers so that when the program ends, the registers are A=$07, X=$14 and Y=$03.

As most readers know, there are hundreds of ways to solve any programming problem, and this one is no exception. The objective is not just to solve the problem, but to do it in the most efficient way possible. I’ll show you two ways to solve the above problem, and discuss the pros and cons of each.

10       STA AHOLD
20       STX XHOLD
30       STY YHOLD
40       LDA XHOLD
50       LDX YHOLD
60       LDY AHOLD
70       BRK
80 AHOLD *=*+1
90 XHOLD *=*+1
0100 YHOLD *=*+1
0110       .END
            Figure 1.

Figure 1 shows an easy-to-understand, straight-forward solution to our problem. It stores each register in hold areas, then loads the registers from the appropriate hold area. Lines 10–60 perform the register exchange function, and Lines 80–100 set up the one byte storage areas.

This solution is very easy to understand by simply looking at it, and is a solution that most beginners would probably use. However, from a memory usage standpoint, this routine requires 22 bytes. We can do the same exchange in only 10 bytes with the routine in Figure 2.

10       STY HOLD
20       TAY
30       TXA
40       LDX HOLD
50       BRK
60 HOLD  *=*+1
70       .END
        Figure 2.

As you can see, this code uses two of the transfer instructions, TAY and TXA, to eliminate two of the temporary storage areas used in Figure 1. Since each transfer instruction uses only one byte, versus the six bytes for an absolute-mode STA and LDA pair, this version of the exchange code uses less than half the memory of Figure 1.
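If you would like to check the register shuffle without an assembler handy, here is a minimal sketch in Python (not a language this column uses; the function name and the dictionary of registers are my own illustration). It simply mimics the four instructions of Figure 2 one at a time.

```python
def exchange(a, x, y):
    """Mimic Figure 2: STY HOLD / TAY / TXA / LDX HOLD."""
    regs = {"A": a, "X": x, "Y": y}
    hold = regs["Y"]        # STY HOLD  (save Y in the one-byte hold area)
    regs["Y"] = regs["A"]   # TAY       (copy A into Y)
    regs["A"] = regs["X"]   # TXA       (copy X into A)
    regs["X"] = hold        # LDX HOLD  (load X with the old Y)
    return regs["A"], regs["X"], regs["Y"]


a, x, y = exchange(0x03, 0x07, 0x14)
print(hex(a), hex(x), hex(y))
```

Note that the order matters: Y must be saved before TAY overwrites it, and A must be copied before TXA overwrites it.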

Although we gain memory savings by using the code in Figure 2, we do lose some readability. Let’s say you use the routine in Figure 1 in a program and don’t look at the program for a year. If you need to make a change, it’s easy to see what the routine does. The code in Figure 2 may not be so easy to decipher. Since you never know when you’ll have to make a change to a program, it’s a very good idea to COMMENT your code heavily, in order to let yourself know what you were doing.

What if…?

The great thing about computers is that they can perform calculations very quickly. Without the ability to make decisions, though, a computer would be almost useless.

For this reason, the 6502 microprocessor in your Atari is equipped with 14 comparison instructions. These instructions are designed to test the values contained in the Accumulator, X and Y registers. Each of these instructions compares the desired register with the memory byte specified in the operand and sets the 6502 status flags accordingly.

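The Accumulator comparison instructions are:

CMP #n      (IMMEDIATE)
CMP nn      (ABSOLUTE)
CMP n       (ZERO PAGE)
CMP n,X     (ZERO PAGE,X)
CMP nn,X    (ABSOLUTE,X)
CMP nn,Y    (ABSOLUTE,Y)
CMP (n,X)   (INDIRECT,X)
CMP (n),Y   (INDIRECT),Y

The X register comparison instructions are:

CPX #n      (IMMEDIATE)
CPX nn      (ABSOLUTE)
CPX n       (ZERO PAGE)

The Y register comparison instructions are:

CPY #n      (IMMEDIATE)
CPY nn      (ABSOLUTE)
CPY n       (ZERO PAGE)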

All comparison instructions affect only three status flags. These are the SIGN, ZERO and CARRY flags.

What happens in a comparison? Internally, the computer will subtract the operand byte from the register contents, set the status flags just like a subtract, but will NOT alter the register. Simple, right? Let’s look at a few examples.

Assume the accumulator contains $45, and we execute the instruction: CMP #$31

Inside the computer, the faithful 6502 would subtract $31 from $45 and obtain the following result:

$45 = 0 1 0 0 0 1 0 1
$31 = 0 0 1 1 0 0 0 1
      0 0 0 1 0 1 0 0 = $14

Since the result is not zero, the ZERO flag is set to 0. The SIGN flag is set to bit 7 of the result, which is 0. The CARRY flag is set to 1, since no borrow was required. The CARRY flag is always the inverse of the borrow status.

By looking at the result of this comparison, we can say that the accumulator is NOT EQUAL to $31, since the result of the compare was not zero. We can also say that the accumulator is GREATER THAN $31, since the CARRY flag is set (meaning greater than or equal) and the ZERO flag is clear (ruling out equal).

Assume the X register contains $7F and we want to compare it with $7F. We would use the following instruction:

CPX #$7F

The subtract operation inside the 6502 would look like:

$7F = 0 1 1 1 1 1 1 1
$7F = 0 1 1 1 1 1 1 1
      0 0 0 0 0 0 0 0 = $00

The result is zero, so the ZERO flag is set to 1. Bit 7 of the result is 0, so the SIGN flag is set to 0. No borrow was required, so the CARRY flag is set to 1.

After this comparison is complete, we can conclude that the register is EQUAL to $7F because the ZERO flag is set.

Assume the Y register contains $12 and we want to compare it to $4E. We would use the following instruction:

CPY #$4E

The subtract operation inside the 6502 would look like:

$12 = 0 0 0 1 0 0 1 0
$4E = 0 1 0 0 1 1 1 0
      1 1 0 0 0 1 0 0 = $C4

Before you get confused with the above binary operation, remember how subtraction works in base 10. If the number being subtracted (the subtrahend) is larger than the number it is subtracted from (the minuend), a BORROW is necessary from the next higher digit. This case of the compare requires a borrow.

In this case, the ZERO flag will be set to zero, indicating a non-zero result. The SIGN flag will be set to the contents of bit 7 of the result, which is a 1. The CARRY flag will be set to 0, the inverse of the borrow status.

From these flags, we can conclude that the Y register is less than $4E because the CARRY flag is cleared (0).

That’s all there is to using the compare instructions. They work the same way, regardless of the addressing mode.
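To double-check the three examples above, here is a small sketch in Python (my own illustration, not part of any assembler; the function name and the flag dictionary are assumptions). It models a compare as "subtract, set the flags, throw the answer away."

```python
def compare(register, operand):
    """Model CMP/CPX/CPY on one-byte values (0-255)."""
    result = (register - operand) & 0xFF            # keep only the low 8 bits
    return {
        "result": result,
        "ZERO": 1 if result == 0 else 0,            # 1 means the values are equal
        "SIGN": (result >> 7) & 1,                  # copy of bit 7 of the result
        "CARRY": 1 if register >= operand else 0,   # 1 means no borrow was needed
    }


for reg, op in [(0x45, 0x31), (0x7F, 0x7F), (0x12, 0x4E)]:
    print(hex(reg), "vs", hex(op), compare(reg, op))
```

Reading the flags the way the text does: ZERO = 1 means equal, CARRY = 0 means the register is less than the operand, and CARRY = 1 means it is greater than or equal.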

Comparisons are just about worthless without the ability to do something based on the result of a comparison, so next we’ll look at the 6502 branch-on-condition instructions.

Branches conveniently located.

So far, the only means of transferring program execution we’ve looked at has been the JMP (JUMP TO LOCATION) instruction. Now we’ll look at the 8 branch-on-condition instructions used by the 6502. The 8 formats are:

BCS n       (BRANCH IF CARRY = 1)
BCC n       (BRANCH IF CARRY = 0)
BEQ n       (BRANCH IF ZERO = 1)
BNE n       (BRANCH IF ZERO = 0)
BMI n       (BRANCH IF SIGN = 1)
BPL n       (BRANCH IF SIGN = 0)
BVC n       (BRANCH IF OFLOW = 0)
BVS n       (BRANCH IF OFLOW = 1)

Observant readers may note that the operand of the branch instructions consists of only one byte. As you may recall, the JMP instruction was able to jump to any memory location because its operand consisted of two bytes. Branches are another story altogether.

With only one byte in their operands, branch instructions are only able to branch backward 128 bytes or forward 127 bytes. This is known as RELATIVE addressing. Fortunately, most assemblers will calculate the distance of a branch for you. However, if a branch distance is more than the branch limit, you’ll have to restructure your branch by using a JMP or multiple branch instructions.
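Your assembler works out the branch operand for you, but it can help to see the arithmetic once. The sketch below is in Python and is only an illustration: the function name and the sample addresses are made up. It relies on the fact that the 6502 measures the distance from the instruction following the two-byte branch.

```python
def branch_offset(branch_addr, target_addr):
    """Return the one-byte operand for a branch located at branch_addr."""
    # The distance is counted from the instruction AFTER the 2-byte branch.
    distance = target_addr - (branch_addr + 2)
    if not -128 <= distance <= 127:
        raise ValueError("target out of branch range; use a JMP instead")
    return distance & 0xFF      # backward branches are stored as two's complement


print(hex(branch_offset(0x0605, 0x0600)))   # a short backward branch
print(hex(branch_offset(0x0600, 0x0610)))   # a short forward branch
```

A backward branch produces an operand of $80 or higher, which is how the 6502 tells a backward branch from a forward one.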

Let’s look at a few typical branch applications. Figure 3 shows the comparison/branch structure for the condition:

        IF X=7 THEN GOTO START

        CPX #7
        BEQ START

        Figure 3.

As you can see, the CPX instruction is followed by a branch instruction. In this case, if the X register is EQUAL TO 7, the program will go to the location labeled START.

For the condition:

        IF A<>52 THEN GOTO POINTA

we would use the code in Figure 4.

        CMP #52
        BNE POINTA

        Figure 4.

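Multiple conditions may require some extra effort, such as the condition:

        IF Y<=242 THEN GOTO MAIN

The code for this condition is shown in Figure 5.

        CPY #242
        BEQ MAIN
        BCC MAIN

        Figure 5.

These multiple conditions are really quite easy; you just have to use the instructions provided. Here, BEQ handles the “equal” half of the test and BCC handles the “less than” half, since a cleared CARRY flag after a compare means the register was less than the operand.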

The nice thing about branch instructions is that they don’t have to follow a compare instruction. You can place them anywhere in a program. For example, addition and subtraction instructions set the status flags just like a compare, so a zero result will set the ZERO flag for any branch that follows. Look at the following code:

        LDA BYTE1
        SBC BYTE2
        CMP #0
        BEQ ZERO

The CMP #0 instruction is not necessary, since the SBC operation sets the flags for us! The optimized code would look like:

        LDA BYTE1
        SBC BYTE2
        BEQ ZERO

Remember, branches can be done anywhere the status flags are altered, giving you incredible flexibility in program design.

“I wish I was indexing…”

Now we can start combining some of our new programming tools to do meaningful work. With the added function of branching, we can start using the X and Y registers as counters and indexes.

Indexing was discussed in the second installment of Boot Camp in ANALOG #14, so I won’t repeat all the basics. The first example I’ll show is the use of the X and Y registers as counters.

Let’s say we want to execute a section of code ten times. Since the program uses the Accumulator and X register in the loop, we’ll use the Y register as a counter to control the loop.

In order to use the X and Y registers as indexes, we have been given four instructions:

INX         (INCREMENT X BY 1)
INY         (INCREMENT Y BY 1)
DEX         (DECREMENT X BY 1)
DEY         (DECREMENT Y BY 1)

These four instructions simply add or subtract one from the X or Y registers, allowing you to use the registers as indexes easily. These instructions affect the ZERO and SIGN flags.

Figure 6 shows the code necessary to perform a loop ten times.

        LDY #10
LOOP     .
         .
         .
        DEY
        BNE LOOP

        Figure 6.

This is a very simple counter example. Note that, in this case, we have set up the Y register as a countdown counter, from 10 to 0. After the DEY instruction is executed, we BNE LOOP. If the Y register decremented to zero, the program will not take the branch, and the loop is finished. No CPY #0 instruction was needed, since the DEY instruction set the zero flag for us.

We could have used the Y register as a count-up counter, from 0 to 10, as in Figure 7.

        LDY #0
LOOP     .
         .
         .
        INY
        CPY #10
        BNE LOOP

        Figure 7.

Note that in the count-up example an extra compare is needed (CPY #10) to see if the Y register has reached ten yet. If it has not, the program will take the BNE LOOP branch to continue looping.
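If you want to convince yourself that both loop styles run the same number of times, here is a rough model in Python (an illustration only; the function names are mine). Each function counts how many times the loop body would execute, keeping the Y register within eight bits just as the 6502 does.

```python
def count_down(start):
    """LDY #start / LOOP: body / DEY / BNE LOOP"""
    y, passes = start, 0
    while True:
        passes += 1             # the loop body would run here
        y = (y - 1) & 0xFF      # DEY, wrapping within 8 bits
        if y == 0:              # ZERO flag set, so BNE falls through
            return passes


def count_up(limit):
    """LDY #0 / LOOP: body / INY / CPY #limit / BNE LOOP"""
    y, passes = 0, 0
    while True:
        passes += 1             # the loop body would run here
        y = (y + 1) & 0xFF      # INY, wrapping within 8 bits
        if y == limit:          # CPY sets ZERO when Y equals the limit
            return passes


print(count_down(10), count_up(10))
```

One thing the model makes obvious: starting the countdown at zero does not skip the loop. The first DEY wraps Y around to 255, so the body runs 256 times.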

Using the X and Y registers for indexing is similar to using them for counters. The main difference is that the register is used inside the loop to point to varying places in memory. Figure 8 shows an example of indexing that will copy the six bytes of TABLE1 into TABLE2.

10       LDX #5
20 COPY  LDA TABLE1,X
30       STA TABLE2,X
40       DEX
50       BPL COPY
60       BRK
70 TABLE1 .BYTE 10,20,30,40,50,60
80 TABLE2 *=*+6
90       .END

        Figure 8.

The program in Figure 8 begins with the X register set at 5. Remember, when referencing individual elements in a table, the indexes for the elements range from zero to one less than the number of elements. In this case, the element numbers range from 0–5. As the loop (labeled COPY) executes, each byte of TABLE1 will be moved to TABLE2. This looping will continue until the X register is decremented past zero, where it will equal 255 due to wraparound. At this point, the SIGN flag will be 1, indicating a negative number. When this happens, the BPL COPY instruction will be ignored and the looping will end. Try assembling this routine into memory and tracing its execution.
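Here is a sketch of the same copy loop in Python, for readers who want to trace it on paper first. It is only a model: the function name is mine, and Python lists stand in for the two tables. It returns the copied table along with the final value of X so you can see the wraparound.

```python
def copy_table(table1):
    """LDX #n-1 / COPY: LDA TABLE1,X / STA TABLE2,X / DEX / BPL COPY"""
    table2 = [0] * len(table1)
    x = len(table1) - 1
    while True:
        table2[x] = table1[x]   # LDA TABLE1,X followed by STA TABLE2,X
        x = (x - 1) & 0xFF      # DEX: decrementing 0 wraps around to 255
        if x & 0x80:            # bit 7 set means SIGN = 1, so BPL is not taken
            return table2, x


print(copy_table([10, 20, 30, 40, 50, 60]))
```

Because BPL looks only at bit 7, this style of loop suits short tables. A table of 128 entries or fewer is safe; with a much larger starting index, the first DEX would already leave a "negative" value in X and the loop would quit early.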

What if we want to copy TABLE1 into TABLE2 in REVERSE ORDER? This is a nifty little problem that will help you understand X-Y indexing more thoroughly. Try writing the code necessary, using as many memory locations as necessary. Next issue, I’ll show a way to do this with only three changes to Figure 8.

No more time.

I had wanted to cover multi-byte math this issue, but due to space limitations I’ll have to delay this until next issue. Until then, play around with comparisons and branching, and try to find a solution to the above problem.

A.N.A.L.O.G. ISSUE 17 / MARCH 1984 / PAGE 96
A.N.A.L.O.G. ISSUE 76 / SEPTEMBER 1989 / PAGE 44

Boot Camp

by Tom Hudson

In this month’s Boot Camp, we’re going to finish our discussion of X and Y register indexing and become proficient in multi-byte addition.

Regular Boot Camp readers will be happy to know that the introductory material will be completely covered in the next few issues. After that, we can start applying all the 6502 instructions to useful subroutines and full-scale programs!

Solution #2.

I hope everyone at least tried to solve the indexing problem presented last issue. This problem asked readers to write the code necessary to copy the contents of the 6-byte TABLE1 to TABLE2 in reverse order. This little brain-teaser is an excellent opportunity to gain more experience with the 6502 index registers.

Figure 1 shows the code necessary to copy TABLE1 to TABLE2 in normal order. This figure was shown last month.

10      *= $600
20      LDX #5
30 COPY LDA TABLE1,X
40      STA TABLE2,X
50      DEX
60      BPL COPY
70      BRK
80 TABLE1 .BYTE 10,20,30,40,50,60
90 TABLE2 *=*+6
0100      .END

            Figure 1.

I told you that only three changes to this code would allow it to copy the table in reverse order. The changed code is shown in Figure 2.

10      *= $600
20      LDX #5
30      LDY #0
40 BACKWD LDA TABLE1,X
50      STA TABLE2,Y
60      INY
70      DEX
80      BPL BACKWD
90      BRK
0100 TABLE1 .BYTE 10,20,30,40,50,60
0110 TABLE2 *=*+6
0120        .END

            Figure 2.

How does it work? Let’s step through the code and see.

Line 20 sets the X register to 5. This register will be used to point to different parts of TABLE1. With the index starting at 5, the register will point to the last byte of TABLE1.

Line 30 sets the Y register to 0. This register will be used to point to varying places in TABLE2. Unlike the X register, the Y register will start pointing at the first byte of TABLE2.

Lines 40–80 perform the table data move function.

Line 40 loads the accumulator with a byte from TABLE1, indicated by the X register.

Line 50 stores the byte just loaded into a byte of TABLE2, indicated by the Y register.

Lines 60 and 70 are the heart of this routine. Note that the Y register is INCREMENTED each time the BACKWD loop is executed, while the X register is DECREMENTED. Figure 3 shows the X and Y register contents for each iteration of the loop.

        TABLE1  TABLE2
         (X)     (Y)
          5       0
          4       1
          3       2
          2       3
          1       4
          0       5

           Figure 3.

By looking at Figure 3, you can see that the 6th byte (5+1) of TABLE1 will be moved to the 1st byte (0+1) of TABLE2, the 5th byte of TABLE1 to the 2nd byte of TABLE2, and so on.

Line 80 loops back to the BACKWD label if the X register is positive (0–127). Once the X register is decremented past 0, it “wraps around” to binary 11111111 ($FF), or -1 decimal, and the program stops at the BRK instruction in line 90.

Line 100 sets up the initial values contained in TABLE1.

Line 110 tells the assembler to reserve 6 bytes for TABLE2. Remember, the “*=*+” directive allows you to set aside any number of bytes for tables, working areas, etc.

As a further example of the “reverse table” problem, Figure 4 shows the BASIC equivalent of the assembly code in Figure 2.

10 DIM TABLE1(5),TABLE2(5)
15 TABLE1(0)=10:TABLE1(1)=20:TABLE1(2)=30:TABLE1(3)=40:TABLE1(4)=50:TABLE1(5)=60
20 X=5
30 Y=0
40 A=TABLE1(X)
50 TABLE2(Y)=A
60 Y=Y+1
70 X=X-1
80 IF X>=0 THEN 40
90 END

            Figure 4.

Note that, in BASIC, it is necessary to initialize the TABLE1 array (line 15). This does the same thing as the .BYTE directive in line 100 of Figure 2.

This should give you a good idea of how indexing works. If you still have trouble, re-read last month’s discussion of indexing and try developing your own simple problems.

Math revisited.

As promised last month, we’re going to start looking at multi-byte math operations, both in binary and binary coded decimal (BCD).

Why do we want to bother with multi-byte math? If you’re only working with numbers from 0–255, then single-byte math is fine. But what happens when you’re writing the ultimate game program and need to show scores into the hundreds of thousands of points? Multi-byte math is the answer.

The simplest form of multi-byte math is probably two-byte address storage. The 6502 can address 65536 (or 2^16) bytes of memory. Observant readers will note that any address in this range (0–65535) will fit into two eight-bit bytes.

You’ve probably encountered two-byte addresses in BASIC. For example, if you need to know where your computer’s display list is located, you can use the BASIC command:

DL=PEEK(560)+256*PEEK(561)

How does this work? Normally, we think of a byte as having bit values from 128 down to 1 (left to right). In order to represent larger numbers, we add a second, high-order byte to the first, low-order byte. The high-order byte contains bit values from 2^8 (256) to 2^15 (32768). This relationship is shown in Figure 5.

┌─────┬─────┬────┬────┬────┬────┬───┬───┐   ┌───┬──┬──┬──┬─┬─┬─┬─┐
│32768│16384│8192│4096│2048│1024│512│256│   │128│64│32│16│8│4│2│1│
└─────┴─────┴────┴────┴────┴────┴───┴───┘   └───┴──┴──┴──┴─┴─┴─┴─┘
                HIGH BYTE                          LOW BYTE

            Figure 5.

In order for BASIC to reconstruct the number, it must multiply each byte by the value of its lowest-order bit. In the 2-byte case, the low-order byte is multiplied by 1, and the high-order byte is multiplied by 256. When the resulting numbers are added together, you have the value of the 2-byte number.
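The same arithmetic can be sketched in a couple of lines of Python (an illustration only; the function names are mine). One function joins a low byte and a high byte into a single value, and the other splits a value back into its two bytes.

```python
def join(low, high):
    """Rebuild a two-byte value: low byte times 1 plus high byte times 256."""
    return low * 1 + high * 256


def split(value):
    """Break a value (0-65535) into its (low, high) bytes."""
    return value % 256, value // 256


print(join(0x00, 0x01))   # the bytes for 256
print(split(65534))       # the bytes for 65534
```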

Figure 6 shows some decimal numbers along with their two-byte binary equivalents.

DECIMAL   HIGH BYTE   LOW BYTE
───────   ─────────   ────────
  128      00000000   10000000
  255      00000000   11111111
  256      00000001   00000000
  257      00000001   00000001
  511      00000001   11111111
  512      00000010   00000000
65534      11111111   11111110
    0      00000000   00000000

            Figure 6.

You don’t have to stop with two bytes, either. For example, by using 3 bytes, you can store numbers up to 2^24 - 1, or 16,777,215. Four bytes will hold numbers up to 2^32 - 1, or over 4 billion, and so on.
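The two-byte rule extends naturally to any number of bytes: each byte further up is worth 256 times the one before it. The Python sketch below shows the idea (my own illustration; the function name is hypothetical, and it assumes the bytes are listed low-order first).

```python
def join_bytes(data):
    """Combine any number of bytes, given low-order byte first."""
    total = 0
    for position, byte in enumerate(data):
        total += byte * 256 ** position   # byte 0 is x1, byte 1 is x256, ...
    return total


print(join_bytes([0xFF, 0xFF, 0xFF]))          # largest three-byte value
print(join_bytes([0xFF, 0xFF, 0xFF, 0xFF]))    # largest four-byte value
```

With n bytes there are 256^n different values, running from 0 through 256^n - 1.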

CARRYing on.

How is multi-byte math handled in 6502 assembly language? It’s the same as single-byte, but with one difference. In multi-byte math, the CARRY flag is used to handle carries and borrows between bytes.

You’ve used carries and borrows all your life, but you probably don’t think about them. Consider the addition of 13+9. When you add 3+9, you get 12. Since 12 is greater than the maximum digit value of 9, you place the units portion (2) in the units position of the result and CARRY the 10 to the next digit. This adds to the tens digit of 13, giving 20. When this is added to the units portion calculated earlier, we get a result of 22.

In subtraction, if you’re subtracting 7 from 20, 7 is larger than 0, so a borrow from the next digit is necessary. The 2 in the tens position becomes a 1, and the 7 is subtracted from the borrowed 10, giving a result of 3 in the units position. The final result is 13.

These same principles apply in multi-byte math operations. The only difference is the base we are operating in. As you recall from the Issue #15 Boot Camp, the CARRY flag is set to 1 if the result of an addition operation is greater than 255. In single-byte addition, we always CLEAR the CARRY flag before the ADC operation. In multi-byte adds, the CARRY is only cleared before the FIRST addition operation. This prevents any unwanted carries from giving incorrect results.

    HIGH        LOW
    ────        ───
  00000000   11111111  (255)
+ 00000000   00000001  (  1)
  ────────   ────────  ─────
  00000001   00000000  (256)

        Figure 7.

Figure 7 shows how carries work in binary. When 1 is added to 255, the resulting value of 256 is too large to fit in one byte. The low-order byte wraps around to zero and the carry flag is set. The high-order bytes are then added, along with the carry flag (1). This gives the high-order result a value of 1. Remember that the high-order byte of a two-byte value is always multiplied by 256. This gives us a final value of (1 × 256) + 0 = 256.

Figure 8 shows the code necessary for this addition operation in 6502 assembly code.

01    *=  $600
10    CLD           ;BINARY MODE
20    LDA #255      ;GET OP1 LOW (255)
30    CLC           ;FIRST ADD!
40    ADC #1        ;ADD 1 TO 255
50    STA RESLO     ;SAVE LOW RESULT
60    LDA #0        ;GET OP1 HIGH
70    ADC #0        ;ADD OP2 HIGH
80    STA RESHI     ;SAVE HIGH RESULT
90    BRK           ;ALL DONE!
0100 RESLO *=*+1    ;LOW RESULT BYTE
0110 RESHI *=*+1    ;HIGH RESULT BYTE
0120  .END          ;END OF ASSEMBLY

                Figure 8.

Line 10 clears the decimal mode, to make sure we’re working with binary numbers.

Line 20 loads 255, the low byte of the first operand, into the accumulator.

Line 30 clears the carry flag for the first add operation. ALWAYS remember to clear the carry flag for the first add of a multi-byte add operation.

Line 40 adds 1, the low byte of the second operand, to the low byte of the first operand. This operation will leave a zero in the accumulator, and the carry flag will be set (1).

Line 50 stores the result of the low byte add in the location labeled RESLO.

Line 60 loads 0, the high byte of the first operand, into the accumulator.

Line 70 adds 0, the high byte of the second operand, to the high byte of the first operand. Note that we DID NOT clear the carry before this operation, since we want the carry status to be taken into account for all adds after the first one. In this case, with the carry set, our result is 0+0+1, or 1.

Line 80 stores the result of the high byte addition in the location labeled RESHI. Line 90 stops the execution of the program with the BRK (BREAK) instruction.

Lines 100 and 110 set up the RESLO and RESHI storage areas. Note that these areas are set up with the low byte first, followed by the high byte. This is the standard 6502 storage format for two-byte values, and it’s a good idea to get accustomed to it.
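The walk-through above can be modeled in a few lines of Python. This is a sketch under my own assumptions (the function name is made up, and the operands are lists of bytes stored low byte first), but it follows the same rule: clear the carry once, then let each ADC pass its carry to the next byte.

```python
def add_bytes(op1, op2):
    """Add two equal-length values stored low byte first."""
    carry = 0                               # CLC, before the first add only
    result = []
    for a, b in zip(op1, op2):
        total = a + b + carry               # ADC adds the carry flag as well
        result.append(total & 0xFF)         # the byte that STA would save
        carry = 1 if total > 255 else 0     # carry is passed to the next byte
    return result, carry


print(add_bytes([255, 0], [1, 0]))   # 255 + 1, as in the listing
```

The second value returned is the carry left over after the last byte. If it is 1, the sum did not fit in the number of bytes provided.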

Multi-byte subtraction also works the same way as the single-byte version, except that the first subtract operation is preceded by a SEC (SET CARRY) instruction. Figure 9 shows an example of the three-byte subtract operation $4203F5 - $2E45FF. When finished, the result will be placed in RESL (LOW ORDER), RESM (MIDDLE) and RESH (HIGH ORDER). Try executing this code and observe that the resulting number is $13BDF6.

01    *=  $600
10    CLD               ;BINARY MODE
20    LDA #$F5          ;GET OP1 LOW
30    SEC               ;FIRST SUBTRACT
40    SBC #$FF          ;SUB OP2 LOW
50    STA RESL          ;SAVE LOW RESULT
60    LDA #$03          ;GET OP1 MIDDLE
70    SBC #$45          ;SUB OP2 MIDDLE
80    STA RESM          ;SAVE MID RESULT
90    LDA #$42          ;GET OP1 HI
0100    SBC #$2E        ;SUB OP2 HIGH
0110    STA RESH        ;SAVE HIGH RESULT
0120    BRK             ;ALL DONE!
0130 RESL *=*+1         ;LOW RESULT BYTE
0140 RESM *=*+1         ;MID RESULT BYTE
0150 RESH *=*+1         ;HIGH RESULT BYTE
0160 .END               ;END OF ASSEMBLY

                Figure 9.
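If you would like to check the $13BDF6 answer by hand before running the listing, the following Python sketch models the same byte-by-byte subtraction (the function name is my own, and the operands are assumed to be lists of bytes stored low byte first). The carry starts at 1, just as SEC leaves it, and a carry of 0 after any byte means a borrow is owed to the next one.

```python
def sub_bytes(op1, op2):
    """Subtract op2 from op1, both stored low byte first."""
    carry = 1                               # SEC, before the first subtract only
    result = []
    for a, b in zip(op1, op2):
        diff = a - b - (1 - carry)          # SBC also subtracts any borrow
        result.append(diff & 0xFF)          # the byte that STA would save
        carry = 1 if diff >= 0 else 0       # carry clear means a borrow occurred
    return result, carry


low_first, carry = sub_bytes([0xF5, 0x03, 0x42], [0xFF, 0x45, 0x2E])
print([hex(b) for b in low_first], carry)
```

Remember that the result list is low byte first, so it reads "backward" compared with the way we write the number.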

What about the decimal mode?

Remember how the 6502 uses two different methods of storing numbers? We have been looking at multi-byte operations in the binary mode. Multi-byte decimal mode math works exactly like binary, but the data is stored in binary-coded decimal (see Issue 15 for a discussion of BCD). All you have to do to select BCD math is use the SED (SET DECIMAL MODE) instruction at the start of your program. You can return to binary math at any time by using the CLD (CLEAR DECIMAL MODE) instruction.

Now that we’ve looked at the basics of multi-byte math, let’s make a few generalizations about the process.

10    LDA BYTE1A        ;BYTE 1
15    CLC               ;ON FIRST ONLY
20    ADC BYTE1B
25    STA RES1          ;SAVE RESULT 1
30    LDA BYTE2A        ;BYTE 2
35    ADC BYTE2B
40    STA RES2          ;SAVE RESULT 2
45     .
50     .                ;ETC.
55     .
60    LDA BYTEnA        ;BYTE n
65    ADC BYTEnB
70    STA RESn          ;SAVE RESULT n

                Figure 10.

Figure 10 shows the procedure for a multi-byte add, where n is the number of bytes in the value. Note that the CLC instruction is used only for the first add of the group.

10    LDA BYTE1A        ;BYTE 1
15    SEC               ;ON FIRST ONLY!
20    SBC BYTE1B
25    STA RES1          ;SAVE RESULT 1
30    LDA BYTE2A        ;BYTE 2
35    SBC BYTE2B
40    STA RES2          ;SAVE RESULT 2
45     .
50     .                ;ETC.
55     .
60    LDA BYTEnA        ;BYTE n
65    SBC BYTEnB
70    STA RESn          ;SAVE RESULT n

                Figure 11.

Figure 11 shows the procedure for a multi-byte subtract, where n is the number of bytes in the value. The subtract procedure is similar to the add in that the SEC instruction is only used for the first subtract.

What happens when you want to add or subtract two values of different length, such as adding a one-byte value to a three-byte value? Figure 12 shows how this is done.

10    *= $600
15    CLD               ;BINARY MODE
20    LDA SCORE         ;GET SCORE LO
25    CLC               ;CLEAR 1ST TIME
30    ADC POINTS        ;ADD POINTS
35    STA SCORE         ;SAVE SCORE LO
40    LDA SCORE+1       ;GET SCORE MID
45    ADC #0            ;ADD DUMMY ZERO
50    STA SCORE+1       ;SAVE SCORE MID
55    LDA SCORE+2       ;GET SCORE HI
60    ADC #0            ;ADD DUMMY ZERO
65    STA SCORE+2       ;SAVE SCORE HI
70    BRK               ;ALL DONE!
75 POINTS *=*+1         ;ONE BYTE
80 SCORE *=*+3          ;THREE BYTES
85    .END              ;END OF ASSEMBLY

                Figure 12.

The program in Figure 12 adds the one-byte value POINTS to the three-byte value SCORE. In this example the three bytes of SCORE are not individually labeled, but are referenced as SCORE (LOW ORDER), SCORE+1 (MIDDLE) and SCORE+2 (HIGH ORDER). The +1 and +2 added to the label SCORE simply indicate that the assembler is to add 1 and 2 to the address of SCORE for these operations. For example, if SCORE is located at $4000, SCORE+1 is address $4001, and SCORE+2 is $4002. If we had indicated SCORE-1, the address used would be $3FFF.

By looking at Figure 12, you will see that the first ADC operation adds the low byte of SCORE to POINTS, placing the result in SCORE. This is a typical first add, with a CLC operation before the addition.

The second and third adds are special in this case. Since POINTS is a one-byte field and SCORE is a three-byte field, we must complete the last two additions as if POINTS were three bytes long. As you can see from the example, the second and third adds simply add zeros to the second and third bytes of SCORE. This ensures that any carries out of the low bytes of SCORE will be properly taken care of. By adding zeros, the only factor affecting the result is the carry flag.
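To see why the dummy zeros matter, here is a small Python model of the score update (an illustration only; the function name is mine, and SCORE is represented as a list of three bytes, low byte first). POINTS is padded out with two zero bytes so that the carry has somewhere to go.

```python
def add_points(score, points):
    """Add a one-byte POINTS value to a three-byte SCORE (low byte first)."""
    padded = [points, 0, 0]         # treat POINTS as if it were three bytes long
    carry = 0                       # CLC, before the first add only
    new_score = []
    for s, p in zip(score, padded):
        total = s + p + carry       # ADC POINTS, then ADC #0 twice
        new_score.append(total & 0xFF)
        carry = total >> 8          # 1 if the byte overflowed, otherwise 0
    return new_score


print(add_points([0xFF, 0xFF, 0x00], 1))   # the carry ripples through two bytes
```

If the second and third adds were left out, a score of 255 plus 1 point would wrap around to 0 instead of becoming 256.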

The challenge.

No tutorial would be complete without a challenge to the readers. For next month try to solve the following problems.

PROBLEM 1: Subtract the two-byte field WITHD (withdrawals) from the three-byte field OLDBAL (old balance), placing the result in the three-byte field NEWBAL (new balance). All fields should be stored in BCD, with standard data storage formats. Start with OLDBAL = 108673 and WITHD = 4285. After the subtraction is complete, check NEWBAL to be sure it contains 104388.

PROBLEM 2: Start with three 10-byte tables. Label these tables TABLE1, TABLE2 and TABLE3. Initialize TABLE1 and TABLE2 as follows:

TABLE1 .BYTE $10,$18,$40,$86,$9A
       .BYTE $10,$BC,$C0,$F0,$F8
TABLE2 .BYTE $00,$08,$14,$2F,$9A
       .BYTE $90,$0B,$22,$65,$78

Write the code necessary to subtract each byte of TABLE2 from the corresponding byte of TABLE1, placing the result in TABLE3. That is, subtract the first byte of TABLE2 from the first byte of TABLE1 and place it in the first byte of TABLE3. Repeat this process for each of the ten bytes in the tables. When complete, TABLE3 should contain the values:

$10,$10,$2C,$57,$00,$80,$B1,$9E,$8B,$80

These problems should get you thinking about multi-byte operations more deeply. Whatever you do, don’t give up! Stick with it and you’ll soon get the hang of it.

Next month, we’ll start looking at the many ways to manipulate our friend, the eight-bit byte.

A.N.A.L.O.G. ISSUE 18 / MAY 1984 / PAGE 28
A.N.A.L.O.G. ISSUE 77 / OCTOBER 1989 / PAGE 32

Boot Camp

by Tom Hudson

Before beginning my regular Boot Camp material, I’d like any users of the MAC/65 assembler to take a look at this issue’s HBUG debug package (see page 78).

I received a letter from Allen J. Henninger of Linden, PA in January. He informed me that most of the Boot Camp examples failed to operate properly when he used MAC/65’s debug utility, BUG/65. I looked into the problem and, sure enough, Mr. Henninger was right.

When using BUG/65, BRK instructions cause a fatal system crash. Programs executing infinite loops can only be stopped via the SYSTEM RESET key. There are ways to circumvent the BRK lockup problem, but there’s no way to stop an infinite loop and find where the program was executing.

If you use MAC/65, I strongly suggest that you type in HBUG. It’ll help you check the operation of the programs shown in Boot Camp, avoiding nasty lock-ups.

The solutions.

If you solved last issue’s multi-byte math problems, give yourself a pat on the back. Successful completion of these programming puzzles indicates that you’re well on your way to becoming proficient in 6502 assembly language.

Whether you solved the problems or not, take a look at the following possible solutions. There are many ways to solve any programming problem, and these examples may show you a different approach.

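10  *=$600
20  SED              ;DECIMAL MODE
30  LDA OLDBAL       ;GET OLDBAL LOW
40  SEC              ;FIRST SUBTRACT
50  SBC WITHD        ;SUB WITHD LOW
60  STA NEWBAL       ;SAVE LOW RESULT
70  LDA OLDBAL+1     ;GET OLDBAL MID
80  SBC WITHD+1      ;SUB WITHD HIGH
90  STA NEWBAL+1     ;SAVE MID RESULT
0100  LDA OLDBAL+2   ;GET OLDBAL HIGH
0110  SBC #0         ;SUBTRACT DUMMY
0120  STA NEWBAL+2   ;SAVE HIGH RESULT
0130  BRK            ;ALL DONE!
0140 OLDBAL .BYTE $73,$86,$10
0150 WITHD  .BYTE $85,$42
0160 NEWBAL *=*+3
0170  .END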

                Figure 1.

Figure 1 shows the solution to the first problem given last month. You were asked to subtract the two-byte BCD variable WITHD from the three-byte variable OLDBAL, placing the result in the three-byte variable NEWBAL; OLDBAL = 108673 and WITHD = 4285.

As you can see from Figure 1, both OLDBAL and WITHD are defined using the .BYTE directive. Standard data storage formats are used, so the values are defined from low-order to high-order. That is, 108673 is defined as .BYTE $73,$86,$10. The variable NEWBAL is simply set up as *=*+3, reserving three bytes for the result of the operation.

The program itself uses the usual multi-byte subtract structure for the first two subtract operations. The third subtract uses a “dummy” value of zero for the third byte of WITHD, since it is one byte shorter than OLDBAL. This ensures that any borrows from lower-order bytes will be processed properly.

Try executing this program on your computer. After it is finished, examine the three-byte NEWBAL to be sure it contains 104388 (108673 - 4285). NEWBAL is located at memory locations $0622–$0624. If you display these locations, you will see something like Figure 2.

        0622 88 43 10

            Figure 2.

You will note that the number 104388 contained in NEWBAL is stored in low-order to high-order format, just like OLDBAL and WITHD.
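For readers who want to check the BCD arithmetic away from the computer, here is a rough Python model (my own sketch; the function names are hypothetical). It treats each packed BCD byte as a two-digit decimal number, subtracts with a borrow, and packs the answer back into BCD. It models only the arithmetic result for valid BCD digits, not the 6502’s internal workings.

```python
def bcd_to_int(byte):
    """Packed BCD byte ($00-$99) to a number 0-99."""
    return (byte >> 4) * 10 + (byte & 0x0F)


def int_to_bcd(value):
    """Number 0-99 back to a packed BCD byte."""
    return ((value // 10) << 4) | (value % 10)


def bcd_sub(op1, op2):
    """Subtract op2 from op1; both are packed BCD, low byte first."""
    borrow = 0                                  # SEC means "no borrow" to start
    result = []
    for a, b in zip(op1, op2):
        diff = bcd_to_int(a) - bcd_to_int(b) - borrow
        borrow = 1 if diff < 0 else 0           # owe 1 to the next byte up
        result.append(int_to_bcd(diff % 100))   # each byte holds two digits
    return result


newbal = bcd_sub([0x73, 0x86, 0x10], [0x85, 0x42, 0x00])   # WITHD padded with $00
print([hex(b) for b in newbal])
```

The third byte of WITHD is the same dummy zero used in the assembly solution.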

Solution two.

The second problem I assigned last month asked you to subtract each byte of the ten-byte TABLE2 from the corresponding byte of TABLE1, placing the results in the ten-byte TABLE3. The initial values for TABLE1 and TABLE2 are:

TABLE1 .BYTE $10,$18,$40,$86,$9A
       .BYTE $10,$BC,$C0,$F0,$F8
TABLE2 .BYTE $00,$08,$14,$2F,$9A
       .BYTE $90,$0B,$22,$65,$78

If done properly, TABLE3 should contain the following values when the program is finished:

$10,$10,$2C,$57,$00,$80,$B1,$9E,$8B,$80

A possible solution to this problem is shown in Figure 3.

10       *=$600
20       CLD          ;BINARY MODE!
30       LDX #9       ;10 BYTES TO DO
40 SUBLP LDA TABLE1,X ;GET TABLE1 BYTE
50       SEC          ;SINGLE-BYTE!
60       SBC TABLE2,X ;SUB TABLE2 BYTE
70       STA TABLE3,X ;SAVE IN TABLE3
80       DEX          ;NEXT BYTE
90       BPL SUBLP    ;DO ALL 10 BYTES
0100      BRK         ;ALL DONE!
0110 TABLE1 .BYTE $10,$18,$40,$86,$9A
0120        .BYTE $10,$BC,$C0,$F0,$F8
0130 TABLE2 .BYTE $00,$08,$14,$2F,$9A
0140        .BYTE $90,$0B,$22,$65,$78
0150 TABLE3 *=*+10
0160 .END

                Figure 3.

As you can see from Figure 3, this problem can be solved by simply indexing through all ten bytes of the tables in the loop SUBLP. Within this loop, the X register points to the desired byte of each table. Each time the loop is executed, the byte from TABLE2 is subtracted from the corresponding byte of TABLE1, and the result is placed in the proper location in TABLE3. Note that each subtract is preceded by the SEC (set carry) instruction, so that the subtracts will be treated as single-byte operations.
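The loop can also be traced with a short Python sketch (an illustration only; the function name is mine, and Python lists stand in for the three tables). Each pass does an independent one-byte subtract, so a borrow in one entry never affects its neighbor.

```python
TABLE1 = [0x10, 0x18, 0x40, 0x86, 0x9A, 0x10, 0xBC, 0xC0, 0xF0, 0xF8]
TABLE2 = [0x00, 0x08, 0x14, 0x2F, 0x9A, 0x90, 0x0B, 0x22, 0x65, 0x78]


def subtract_tables(table1, table2):
    """Index from the last entry down to entry 0, as the X register does."""
    table3 = [0] * len(table1)
    x = len(table1) - 1
    while x >= 0:                                   # BPL: loop while X is positive
        table3[x] = (table1[x] - table2[x]) & 0xFF  # SEC / SBC, one byte at a time
        x -= 1                                      # DEX
    return table3


print(", ".join("$%02X" % b for b in subtract_tables(TABLE1, TABLE2)))
```

Note the sixth entry: $10 minus $90 needs a borrow, so the one-byte result wraps around to $80.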

If you’re still having trouble with multi-byte math, go back and re-read last issue’s column. It may also be a good idea to review the math basics from ANALOG #15’s Boot Camp.

Ups and downs.

There are two handy instructions we haven’t covered yet that can sometimes be considered math instructions. These are INC (increment memory by 1) and DEC (decrement memory by 1).

INC n       (ZERO PAGE)
INC nn      (ABSOLUTE)
DEC n       (ZERO PAGE)
DEC nn      (ABSOLUTE)

The INC instruction simply adds 1 to the value contained in the memory byte referenced and places the result back into the memory location. The accumulator is not affected, but the SIGN and ZERO flags reflect the result of the operation. Figure 4 shows an example of the INC operation.

10       *=$0600
20       LDA #5      ;5 IN ACCUMULATOR
30       STA VALUE   ;VALUE = 5
40       INC VALUE   ;VALUE = 6
50       INC VALUE   ;VALUE = 7
60       INC VALUE   ;VALUE = 8
70       BRK         ;ALL DONE!
80 VALUE *=*+1
90       .END

            Figure 4.

This program will place the value 5 in the accumulator and the location labeled VALUE. It then increments VALUE 3 times. When finished, the accumulator will still contain 5, but VALUE will contain 8.

If the INC operation is performed on a byte containing $FF, the byte’s value will “wrap around” to zero. Note that this instruction is not a true math instruction because the carry resulting from the byte wraparound is NOT shown in the status flags.
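The wraparound still leaves a usable signal: the ZERO flag is set when a byte rolls over from $FF to $00. The following is a minimal sketch (not one of the column's listings) of how that can be used to increment a two-byte counter. The label COUNTR and the starting value $01FF are my own assumptions, chosen only so that the low byte wraps.

```asm
10       *=$0600
20       LDA #$FF      ;LOW BYTE = $FF
30       STA COUNTR
40       LDA #$01      ;HIGH BYTE = $01
50       STA COUNTR+1  ;COUNTR = $01FF
60       INC COUNTR    ;LOW BYTE WRAPS TO $00
70       BNE DONE      ;NO WRAP? SKIP HIGH
80       INC COUNTR+1  ;WRAPPED: BUMP HIGH
90 DONE  BRK           ;COUNTR = $0200
0100 COUNTR *=*+2
0110     .END
```

If the low byte does not wrap, the BNE skips the second INC and only the low byte changes.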

The DEC instruction is similar to the INC instruction, but operates in reverse. Instead of adding 1 to the value of the byte, DEC subtracts 1. Figure 5 shows an example of the use of the DEC instruction.

10      *=$600
20      CLD          ;BINARY MODE
30      LDA #5       ;SET COUNTER...
40      STA COUNT    ;TO 5
50      LDA #7       ;SET ADDVAL...
60      STA ADDVAL   ;TO 7
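70 LOOP LDA ADDVAL   ;GET ADDVAL
80      CLC          ;SINGLE-BYTE ADD
90      ADC ADDVAL   ;ADD TO ITSELF
0100      STA ADDVAL ;SAVE RESULT
0110      DEC COUNT  ;DONE 5 TIMES?
0120      BNE LOOP   ;NO! LOOP BACK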
0130      BRK        ;ALL DONE!
0140 ADDVAL *=*+1
0150 COUNT *=*+1
0160 .END

                Figure 5.

In Figure 5, we’re using the variable COUNT as a simple counter to control the addition of ADDVAL. We will add ADDVAL to itself 5 times. When finished, ADDVAL will be multiplied by 32. Let’s walk through this example.

Line 20 clears the decimal mode so that we’ll be working in binary mode.

Lines 30–40 initialize COUNT to 5.

Lines 50–60 initialize ADDVAL to 7. When complete, this program will multiply 7 by 32, with a result of 224 ($E0) in the accumulator.

Lines 70–100 add ADDVAL to itself, placing the result back in ADDVAL. This has the effect of multiplying ADDVAL by 2 each time it is done.

Line 110 decrements COUNT by 1. When COUNT reaches zero, the ZERO flag will be set. This will be our signal to stop.

Line 120 checks the ZERO flag to see if all five multiplies have been done. If the ZERO flag is NOT set, the program will branch (BNE) back to the label LOOP.

Line 130 BREAKS the program when all five iterations of the loop are complete.

Lines 140–150 define the one-byte storage areas ADDVAL and COUNT.

As you can see, the INC and DEC instructions can come in handy when you need a counter or want to add or subtract without affecting the accumulator. We have used the X and Y registers to perform counter functions, but if these registers are in use, you can always set up a byte and use the INC and DEC instructions instead.


When you get deeper into assembly language, you’ll need to manipulate bytes in ways that BASIC can’t. Now we’ll look at four instructions that allow a wide variety of ways to manipulate and test the contents of the accumulator. These instructions are AND, BIT, ORA and EOR.

    BYTE 1: 0 1 1 0 1 0 1 1
AND BYTE 2: 1 0 1 1 0 0 0 1
    RESULT: 0 0 1 0 0 0 0 1

            Figure 6.

Figure 6 shows how the AND function works. As you can see, two bytes are used as inputs to the function. The corresponding bits of these two bytes are examined. If the bit of the first byte is 1 AND the bit of the second byte is 1, the result for that bit will be 1. Otherwise, that bit of the result will be set to 0. This process is repeated for all eight bits.

In 6502 assembly language, the AND function has the following eight formats:

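AND #n      (IMMEDIATE)
AND nn      (ABSOLUTE)
AND n       (ZERO PAGE)
AND (n,X)   (INDEXED INDIRECT)
AND (n),Y   (INDIRECT INDEXED)
AND n,X     (ZERO PAGE,X)
AND nn,X    (ABSOLUTE,X)
AND nn,Y    (ABSOLUTE,Y)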

In each of these formats, the accumulator is ANDed with the memory byte indicated in the operand. The result of the AND function is placed in the accumulator. The SIGN and ZERO flags are set according to the result.

The AND function is most often used to mask off certain bits of the accumulator or test bits to see if they are on.

Let’s say you want to get a random number that does not exceed 7. You could use the code:

GETRND LDA $D20A   ;GET RANDOM BYTE
       CMP #8      ;GREATER THAN 7?
       BCS GETRND  ;YES, TRY AGAIN

This code gets a random number and checks to see if it is greater than 7. If it is, the program loops back to GETRND and tries again. This routine works, but it may need to try several times before it gets a good value.

We can perform the same function easily with the AND instruction. By using the AND instruction, only one try is necessary. It even takes less memory than the previous example. The code is:

       LDA $D20A   ;GET RANDOM BYTE
       AND #7      ;KEEP LOW 3 BITS

This code MASKS the contents of the accumulator with the value 7. Figure 7 shows three possible outcomes of the procedure. As you can see, none of them exceed 7.

      BYTE: 1 0 0 1 1 1 0 1
  AND MASK: 0 0 0 0 0 1 1 1
    RESULT: 0 0 0 0 0 1 0 1 = 5

      BYTE: 1 1 1 1 0 1 1 1
  AND MASK: 0 0 0 0 0 1 1 1
    RESULT: 0 0 0 0 0 1 1 1 = 7

      BYTE: 0 0 0 1 0 0 0 0
  AND MASK: 0 0 0 0 0 1 1 1
    RESULT: 0 0 0 0 0 0 0 0 = 0

              Figure 7.

This is just one example of the use of the AND operation. We’ll cover more uses in the future.
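The bit-testing use mentioned earlier can be sketched in the same way. This is only an illustration under my own assumptions: the labels FLAGS, RESULT and DONE are hypothetical, and the mask $08 (the 8 bit) is an arbitrary choice.

```asm
10       *=$0600
20       LDA #$5A      ;01011010 BINARY
30       STA FLAGS
40       LDA #0        ;ASSUME BIT IS OFF
50       STA RESULT
60       LDA FLAGS     ;GET BYTE TO TEST
70       AND #$08      ;KEEP ONLY THE 8 BIT
80       BEQ DONE      ;ZERO? BIT IS OFF
90       INC RESULT    ;BIT IS ON: RESULT=1
0100 DONE BRK
0110 FLAGS *=*+1
0120 RESULT *=*+1
0130     .END
```

Because the 8 bit of $5A is on, the AND leaves $08 in the accumulator, the BEQ is not taken, and RESULT ends up as 1.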

A companion to the AND function is the BIT (bit test) instruction. It performs almost the same function as AND, but changes only the status flags. BIT does not affect the contents of the accumulator. The primary function of the BIT operation is to test the contents of the accumulator. BIT has the following formats:

BIT nn      (ABSOLUTE)
BIT n       (ZERO PAGE)

Besides not changing the accumulator as a result of the AND operation, BIT handles the status flags differently. The ZERO flag is handled the same as AND. The SIGN and OVERFLOW flags are set to bits 7 and 6 of the operand, respectively. This is a strange twist, and I’ve not yet encountered a situation where I’ve used this odd flag setting. The following code shows a typical use of the BIT instruction.

        LDA BYTE
        BIT TESTBT
        BNE BITON
        .
        .
BITON   .
        .
        .
BYTE    *=*+1
TESTBT  .BYTE $01

This code uses the bit mask TESTBT to see if the 1 bit of the memory location labeled BYTE is set. The value contained in BYTE is placed in the accumulator, then the BIT instruction is executed. Since TESTBT is the location used by the BIT operand, the accumulator will be ANDed with $01. If the 1 bit of the accumulator is set, the result of the BIT operation will be a NOT EQUAL condition. In this case, the BNE instruction would cause the program to branch to the location BITON. Otherwise, the program would fall through to the code after the BNE instruction.

I personally don’t use BIT instructions much. Unfortunately, the designers of the 6502 didn’t allow for an immediate format of this instruction. As a result, you must set up all the masks you use somewhere in memory, making the operation a bit more cumbersome.
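For readers curious about the SIGN and OVERFLOW behavior described above, here is one hedged sketch of how it could be put to work: testing bits 7 and 6 of a byte without setting up any mask at all. The labels STATUS, SEVEN, SIX, TEST6 and DONE are my own, and BVC (branch if OVERFLOW clear) is an instruction this column has not formally covered yet.

```asm
10       *=$0600
20       LDA #$C0      ;11000000 BINARY
30       STA STATUS    ;BITS 7 AND 6 ON
40       LDA #0
50       STA SEVEN     ;CLEAR BOTH...
60       STA SIX       ;RESULT BYTES
70       BIT STATUS    ;SIGN=BIT 7, OVERFLOW=BIT 6
80       BPL TEST6     ;BIT 7 OFF? SKIP
90       INC SEVEN     ;BIT 7 WAS ON
0100 TEST6 BVC DONE    ;BIT 6 OFF? SKIP
0110     INC SIX       ;BIT 6 WAS ON
0120 DONE BRK
0130 STATUS *=*+1
0140 SEVEN *=*+1
0150 SIX *=*+1
0160     .END
```

The INC in line 90 changes the SIGN and ZERO flags but leaves OVERFLOW alone, so the BVC in line 100 still sees the value set by BIT. With $C0 in STATUS, both SEVEN and SIX finish as 1.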

This OR that.

Another bit-manipulating instruction used fairly often is the ORA (OR accumulator) operation. The formats of this instruction are:

ORA nn      (ABSOLUTE)
ORA n       (ZERO PAGE)

Unlike the AND operator, which only sets the result bit when both input bits are 1, the OR operator sets the result bit when EITHER input bit is 1. Figure 8 shows how the OR function works.

      BYTE 1: 1 0 1 1 0 1 1 0
   OR BYTE 2: 0 1 0 1 0 0 1 0
      RESULT: 1 1 1 1 0 1 1 0

            Figure 8.

As you can see, the OR operation sets the result bit if the corresponding bit of either byte 1 OR byte 2 is set. If both of the bits are off, the result bit will also be off. Like the AND operation, the ORA operation affects only the SIGN and ZERO flags.

The OR operation is used to turn on specific bits in a byte, most often in graphics handlers. The following code demonstrates how the OR instruction works.

10     *=$600
20     LDA #$4C     ;$4C IN ACCUM.
30     ORA #$11     ;OR WITH $11
40     ORA OR3      ;OR WITH $80
50     BRK          ;ALL DONE!
60 OR3 .BYTE $80
70     .END

                Figure 9.

Line 20 loads the accumulator with $4C (01001100 binary).

Line 30 ORs the accumulator with $11 (00010001 binary). After this OR operation, the accumulator will contain $5D (01011101 binary).

Line 40 ORs the accumulator with the contents of the memory location OR3. Since OR3 is defined as $80, the accumulator will be OR’d with 10000000 binary. After this instruction is executed, the accumulator will contain $DD (11011101 binary).

Line 50 stops the execution of the program. At this point you can see that the accumulator contains $DD.

An ANALOG exclusive.

The last accumulator manipulation instruction we’re going to look at this time is EOR (exclusive-OR). This instruction works like OR except that when BOTH input bits are set, the result bit will be turned off. The following example shows how EOR works:

      BYTE 1: 1 0 1 1 0 0 1 1
  EOR BYTE 2: 1 0 0 1 1 0 1 0
      RESULT: 0 0 1 0 1 0 0 1

The EOR instruction is commonly used in graphics routines, and also for flipping the setting of bits in program flags. Let’s see how the EOR instruction lets us flip bits. The following example shows the EOR function flipping all the bits of a byte to the opposite binary settings:

      BYTE 1: 1 0 1 1 0 0 0 1
  EOR BYTE 2: 1 1 1 1 1 1 1 1
      RESULT: 0 1 0 0 1 1 1 0

No matter what the contents of byte 1, if it is exclusive-OR’d with $FF (binary 11111111), the result of the operation will be the complement of the first byte: every 1 bit becomes 0 and every 0 bit becomes 1. The 6502 code necessary for this operation is:

LDA #$B1
EOR #$FF

What if we only want to flip a certain bit? The following example shows the flipping of only the 4 bit of byte 1:

      BYTE 1: 1 0 1 1 0 0 0 1
  EOR BYTE 2: 0 0 0 0 0 1 0 0
      RESULT: 1 0 1 1 0 1 0 1

As you can see, the bit has been flipped to a 1. The equivalent 6502 code for this example is:

LDA #$B1
EOR #$04

The EOR operation is easy to use. All you need to do is determine which bits you want to flip and exclusive-OR the accumulator with the appropriate byte. Like the AND and ORA operation codes, EOR sets the SIGN and ZERO flags according to the result of the operation.
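To see the three operations side by side, here is a small sketch (not from the original listings) that turns one bit off with AND, turns one on with ORA and flips one with EOR. The starting value $B1 is borrowed from the examples above; the label RESULT and the three masks are my own choices.

```asm
10       *=$0600
20       LDA #$B1      ;10110001 BINARY
30       AND #$FE      ;TURN OFF 1 BIT: $B0
40       ORA #$02      ;TURN ON 2 BIT: $B2
50       EOR #$80      ;FLIP 128 BIT: $32
60       STA RESULT    ;SAVE $32
70       BRK           ;ALL DONE!
80 RESULT *=*+1
90       .END
```

Note the pattern: an AND mask has 0 in the positions to turn off, while ORA and EOR masks have 1 in the positions to turn on or flip.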

Problem time.

Here are some good bit-manipulation problems for you to solve for next month.

In each of the following problems, you are given bit patterns before and after a bit manipulation operation. You must determine (1) the operation (AND, ORA, EOR) and (2) the second bit pattern used to obtain the result. Some problems have 2 possible answers. These are indicated with a (2) to the right of the problem. If you’ve read carefully, these should be a snap to solve.

BEFORE   OP  BYTE 2   AFTER
──────── ─── ──────── ──────── ───
01000011              01000001 (2)
11001011              10100010
11110000              01000000 (2)
01010101              11111111 (2)
11001000              01111100
11111111              11110001 (2)
00100100              10111000
01000111              00010010

Until next time, try developing some problems of your own. It’s a good idea to try some addressing modes other than the ones used in this column. Next month, we’ll find out how to do simple multiplication and division!

Address all letters to:

 Boot Camp
 c/o ANALOG Computing
 P.O. Box 23
 Worcester, MA 01603
A.N.A.L.O.G. ISSUE 19 / JUNE 1984 / PAGE 68
A.N.A.L.O.G. ISSUE 78 / NOVEMBER 1989 / PAGE 45

Boot Camp

by Tom Hudson

It’s hard to believe, but here we are in the seventh installment of Boot Camp. We’ve only got a few more 6502 operation codes to cover before we begin writing full-scale programs, so hang in! The best is yet to come.

Old Business

Last issue’s assignment asked you to solve eight bit-manipulation problems. You were given “before” and “after” bit patterns and asked to find what operation codes and operands were used to get the results. Figure 1 shows the completed assignment. Some of the problems had two possible answers. These are so noted, with both solutions.

Clever readers have probably noticed that the fourth problem actually has far more than two possible answers. In fact, by using the ORA instruction, Byte 2 could be any value with bits 1, 3, 5 and 7 set! Try it yourself with a short program.

  01000011 AND 01000001 01000001 (1)
  01000011 EOR 00000010 01000001 (2)
  11001011 EOR 01101001 10100010
  11110000 AND 01000000 01000000 (1)
  11110000 EOR 10110000 01000000 (2)
  01010101 ORA 10101010 11111111 (1)
  01010101 EOR 10101010 11111111 (2)
  11001000 EOR 10110100 01111100
  11111111 AND 11110001 11110001 (1)
  11111111 EOR 00001110 11110001 (2)
  00100100 EOR 10011100 10111000
  01000111 EOR 01010101 00010010
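For the fourth problem, a short test program along the following lines can confirm that more than one ORA operand gives the same answer. This is only a sketch: the labels TRY1 and TRY2 are hypothetical, and $EB is just one of the many values that have bits 1, 3, 5 and 7 set.

```asm
10       *=$0600
20       LDA #$55      ;01010101 BINARY
30       ORA #$AA      ;10101010: A = $FF
40       STA TRY1
50       LDA #$55      ;START OVER
60       ORA #$EB      ;11101011: A = $FF
70       STA TRY2
80       BRK           ;BOTH ARE $FF
90 TRY1  *=*+1
0100 TRY2 *=*+1
0110     .END
```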

Simple Multiplication

As you may recall, by shifting a binary number left one bit, we effectively multiply it by two. Shifting it left two bits multiplies it by four. This principle is very handy, allowing us to multiply integers quickly and easily.

How do we perform this left-shift operation in 6502 assembly language? With the ASL (Arithmetic Shift Left) instruction, of course. This operation shifts the contents of the accumulator or a selected memory byte left one bit, and has the following formats:

ASL A       (ACCUMULATOR)
ASL nn      (ABSOLUTE)
ASL n       (ZERO PAGE)
ASL n,X     (ZERO PAGE,X)
ASL nn,X    (ABSOLUTE,X)


When an ASL instruction is executed, the accumulator or memory byte is shifted one bit to the left. Figure 2 shows how the operation is handled internally.

         ┌─┐ ┌─┬─┬─┬─┬─┬─┬─┬─┐
BEFORE:  │0│ │0│0│1│1│0│0│0│1│
         └─┘ └─┴─┴─┴─┴─┴─┴─┴─┘
        CARRY    BYTE=49
         ┌─┐ ┌─┬─┬─┬─┬─┬─┬─┬─┐
AFTER:   │0│←│0│1│1│0│0│0│1│0│←0
         └─┘ └─┴─┴─┴─┴─┴─┴─┴─┘
        CARRY    BYTE=98

As you can see from the “before” and “after” images in Figure 2, each bit of the selected byte is shifted to the left one place. Since Bit 7 has no other place to go, it is shifted into the 6502 Carry flag. This is done to allow for multiple-byte shifts, which we’ll look at in a moment. A 0 is shifted into Bit 0 (the 1 bit). As you can see, the value of the byte has been multiplied by two!

As long as the results of your shift-multiplies do not exceed 255 decimal, you will find the ASL instruction works fine. Problems begin, though, when you get into multiple-byte values.

Figure 3 shows an example of a multiple-byte shift. As you can see, the contents of Bit 7 of the low byte must shift into Bit 0 of the high byte. In order to do this, we must use the ASL instruction to shift the low byte, and a new instruction, ROL (Rotate Left through carry), for the high byte. ROL has the following formats:

ROL A       (ACCUMULATOR)
ROL nn      (ABSOLUTE)
ROL n       (ZERO PAGE)
ROL n,X     (ZERO PAGE,X)
ROL nn,X    (ABSOLUTE,X)

   ┌─┬─┬─┬─┬─┬─┬─┬─┐ ┌─┬─┬─┬─┬─┬─┬─┬─┐
   │0│0│1│1│0│1│0│0│ │1│0│0│1│1│1│1│0│
   └─┴─┴─┴─┴─┴─┴─┴─┘ └─┴─┴─┴─┴─┴─┴─┴─┘
           VALUE = 13470

   ┌─┬─┬─┬─┬─┬─┬─┬─┐ ┌─┬─┬─┬─┬─┬─┬─┬─┐
   │0│1│1│0│1│0│0│1│ │0│0│1│1│1│1│0│0│
   └─┴─┴─┴─┴─┴─┴─┴─┘ └─┴─┴─┴─┴─┴─┴─┴─┘
           VALUE = 26940

The ROL instruction performs the same function as ASL, except that it puts the contents of the Carry flag in the low-order bit instead of a zero.

Both ASL and ROL set the Sign, Zero and Carry flags according to the result of the operation.

Let’s look at a few examples of multiplication using the ASL and ROL instructions.

10  *=  $0600
20  LDA #$07    ;PLACE 7 IN ACCUM.
30  ASL A       ;TIMES 2
40  ASL A       ;TIMES 4
50  ASL A       ;TIMES 8
60  STA TIMES8  ;SAVE RESULT
70  BRK         ;AND STOP!
80 TIMES8 *=*+1
90  .END

The above shows an example of single-byte multiplication using the ASL instruction. In this example, we’re multiplying the contents of the accumulator (7) by eight and storing the result in the location labeled TIMES8.

Line 20 loads the accumulator with the number 7 (00000111 binary). You can try different values here to test the multiply. Remember that since this is only a single-byte multiply, the result cannot exceed 255. Therefore, don’t use any values greater than 31 decimal here.

Line 30 shifts the accumulator to the left one bit, multiplying the accumulator by two. After this instruction executes, the accumulator will contain 14 decimal (00001110 binary).

Line 40 shifts the accumulator left another bit. At this point, the accumulator is four times the starting value of 7, or 28 (00011100 binary).

Line 50 shifts the accumulator left a third time, giving us eight times the starting value, or 56 (00111000 binary).

Line 60 stores the final value of 56 decimal ($38 hex) in the location labeled TIMES8. If you change the value in line 20, the value you enter will be multiplied by eight and placed in TIMES8.

Line 70 stops the program execution.

Line 80 reserves one byte for the result of the multiplication, labeled TIMES8.

The above example shows how easy the ASL instruction makes it to multiply a number by a power of two, but what if you want to multiply a number by five?

In such cases, it’s good to break the multiplier down into “bite-sized” pieces. For example, a multiply by five can be broken down into:

    (NUMBER * 4)
  + (NUMBER    )
    (NUMBER * 5)

The 6502 code required for this operation is shown below.

10  *=  $0600
15  LDA #23     ;PLACE 23 IN ACCUM.
20  ASL A       ;TIMES 2
25  ASL A       ;TIMES 4
30  CLC         ;CLEAR FOR ADD
35  ADC #23     ;ADD 23 = TIMES 5!
40  STA TIMES5  ;SAVE RESULT
45  BRK         ;ALL DONE!
50 TIMES5 *=*+1
55  .END

Similarly, a multiply by ten can be broken down to:

   (NUMBER * 8)
 + (NUMBER * 2)
   (NUMBER * 10)

With its 6502 code shown here:

10  *=  $0600
15  LDA #23     ;PLACE 23 IN ACCUM.
20  ASL A       ;TIMES 2
25  STA TIMES2  ;SAVE # TIMES 2
30  ASL A       ;TIMES 4
35  ASL A       ;TIMES 8
40  CLC         ;CLEAR FOR ADD
45  ADC TIMES2  ;*8 + *2 = *10!
50  STA TIMES10 ;SAVE RESULT
55  BRK         ;AND STOP!
60 TIMES2 *=*+1
65 TIMES10 *=*+1
70  .END

As you can see, you can multiply a number by almost any value through a combination of left-shifts and add/subtract operations. It’s just a matter of careful planning when writing a program.
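As a sketch of the subtract variation, here is a multiply by seven, worked as (NUMBER * 8) minus (NUMBER). It is my own illustration rather than one of the original listings, and the labels TIMES1 and TIMES7 are hypothetical.

```asm
10  *=  $0600
15  CLD          ;BINARY MATH
20  LDA #23      ;PLACE 23 IN ACCUM.
25  STA TIMES1   ;SAVE # TIMES 1
30  ASL A        ;TIMES 2
35  ASL A        ;TIMES 4
40  ASL A        ;TIMES 8
45  SEC          ;SET CARRY FOR SUBTRACT
50  SBC TIMES1   ;*8 - *1 = *7!
55  STA TIMES7   ;SAVE RESULT
60  BRK          ;ALL DONE!
65 TIMES1 *=*+1
70 TIMES7 *=*+1
75  .END
```

With 23 as the starting value, TIMES7 should hold 161 ($A1 hex). As before, the starting value must be small enough that the largest intermediate result (here, the number times eight) still fits in one byte.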

Multi-Byte Multiplication

Now that we’ve looked at single-byte multiplication, we can go on to bigger and better things, such as multiplying two-byte values. The figure below shows the procedure for multiplying the two-byte value TOTAL by 16. Note that the low-order byte is always SHIFTed, and the high byte is always ROTATEd.

10  *=  $0600
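15  LDA #$02     ;PLACE 02...
20  STA TOTAL+1  ;IN HIGH BYTE
25  LDA #$4F     ;PLACE 4F...
30  STA TOTAL    ;IN LOW BYTE
35  ASL TOTAL    ;SHIFT LOW BYTE
40  ROL TOTAL+1  ;ROTATE HIGH: *2
45  ASL TOTAL
50  ROL TOTAL+1  ;*4
55  ASL TOTAL
60  ROL TOTAL+1  ;*8
65  ASL TOTAL
70  ROL TOTAL+1  ;*16
75  BRK          ;ALL DONE!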
80 TOTAL *=*+2
85  .END

Lines 15–30 initialize the variable TOTAL to $024F (0000001001001111 binary). Note that the label TOTAL is the low-order byte and TOTAL+1 is the high-order byte.

Line 35 shifts the low byte of TOTAL left one bit, multiplying it by two. This operation places the contents of Bit 7 of the low byte in the Carry flag so that it can be shifted into the high byte by the next operation.

Line 40 rotates the high byte of TOTAL left, placing the Carry flag’s contents in Bit 0. Like the shift operation, the rotate places the contents of the high byte’s Bit 7 in the Carry flag. After this instruction executes, TOTAL contains $049E (0000010010011110 binary), or two times the original value.

Lines 45–50 multiply TOTAL by two a second time, resulting in a value of $093C (0000100100111100 binary), or four times the original value.

Lines 55–60 multiply TOTAL by two again, giving a value of $1278 (0001001001111000 binary), or eight times the original value.

Lines 65–70 multiply TOTAL by two a final time, giving a final result of $24F0 (0010010011110000 binary), which should be $024F*16. Checking, we find that $024F is 591 decimal. 591 times 16 is 9456 decimal, or $24F0, and our answer in TOTAL is correct.
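Since the same shift-and-rotate pair is repeated four times, the multiply can also be written as a loop. The following is a minimal sketch under my own assumptions (the X register as the counter and the label SHIFT), not a listing from the original column.

```asm
10  *=  $0600
15  LDA #$02     ;PLACE 02...
20  STA TOTAL+1  ;IN HIGH BYTE
25  LDA #$4F     ;PLACE 4F...
30  STA TOTAL    ;IN LOW BYTE
35  LDX #4       ;4 SHIFTS = TIMES 16
40 SHIFT ASL TOTAL ;SHIFT LOW BYTE
45  ROL TOTAL+1  ;ROTATE HIGH BYTE
50  DEX          ;ONE LESS TO DO
55  BNE SHIFT    ;MORE? LOOP BACK
60  BRK          ;TOTAL = $24F0
65 TOTAL *=*+2
70  .END
```

Changing the count in line 35 changes the multiplier: 1 gives times 2, 2 gives times 4, and so on.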

These examples show the basics of 6502 multiplication, but don’t stop here. Study the above code and try creating your own programming puzzles. I’ve given you the ball, now run with it!

Divide and Conquer

Now that we’ve covered simple multiplication, let’s look at basic division. You know how bit-shifting works, so picking up the finer points of binary division should be easy.

Remember how shifting the value 49 decimal (00110001 binary) left one bit gave us 98 (01100010 binary)? What happens if we shift the value RIGHT one bit? Figure 4 gives us the answer.

          ┌─┬─┬─┬─┬─┬─┬─┬─┐ ┌─┐
BEFORE:   │0│0│1│1│0│0│0│1│ │0│
          └─┴─┴─┴─┴─┴─┴─┴─┘ └─┘
            BYTE=49        CARRY

          ┌─┬─┬─┬─┬─┬─┬─┬─┐ ┌─┐
AFTER:  0→│0│0│0│1│1│0│0│0│→│1│
          └─┴─┴─┴─┴─┴─┴─┴─┘ └─┘
            BYTE=24        CARRY

As you can see, we’ve just discovered the first limitation of binary division—we can’t handle decimals! Using real numbers instead of integers, 49/2 = 24.5. Shifting the value 49 right one bit divided it by two, all right, but we lost the decimal portion of the result. We’ll look at real-number division in later installments of Boot Camp, but for now the loss of the precision does not matter. I mentioned the problem because it’s good for you to be aware of this limitation.

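In the 6502 instruction set, the operation that performs this right-shift is the LSR (Logical Shift Right) instruction. Its formats are:

LSR A       (ACCUMULATOR)
LSR nn      (ABSOLUTE)
LSR n       (ZERO PAGE)
LSR n,X     (ZERO PAGE,X)
LSR nn,X    (ABSOLUTE,X)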


As Figure 4 shows, the LSR instruction shifts all the bits of the indicated byte right one position. A zero is placed in the high-order, or 128, bit. The low-order, or 1, bit is shifted into the Carry flag. This allows us to perform multi-byte right-shifts, similar to multi-byte left-shifts.

Before we look at multiple-byte division, let’s look at a single-byte example.

10  *=  $0600
20  LDA #184  ;PUT 184 IN ACCUM.
30  LSR A     ;DIVIDE BY 2
40  LSR A     ;DIVIDE BY 4
50  LSR A     ;DIVIDE BY 8
60  STA DIV8  ;SAVE RESULT
70  BRK       ;AND STOP!
80 DIV8 *=*+1
90  .END

The above shows an example of dividing a single-byte value by eight. Like multiplication by eight, this operation requires three shifts, but in the opposite direction. In this example, we divide the number 184 decimal by eight, placing the result in the location DIV8.

Line 20 places the number 184 (10111000 binary) in the accumulator.

Line 30 shifts the accumulator contents right one bit, dividing the value there by two. After this instruction, the accumulator contains 92 (01011100 binary).

Line 40 shifts the accumulator right another bit, dividing the value by two again. At this point the accumulator is divided by four and contains 46 (00101110 binary).

Line 50 shifts the accumulator right a final time, leaving the accumulator containing the original value divided by eight. At this point it contains 23 (00010111 binary).

Line 60 stores the contents of the accumulator in the location labeled DIV8. If you examine this location after the program executes, you will see that it contains 23 decimal ($17 hex). Checking, you will find that this is 184 divided by eight.

Line 70 BREAKS the program, stopping execution.

Line 80 reserves one byte for the value DIV8.

Now you see how simple single-byte division is. If you want to divide any integer up to 255 by a power of two, this process works fine.
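The bits that fall off the right end are the remainder of the division, and they can be recovered with the AND instruction from an earlier installment. This is a sketch under my own assumptions: the value 189 and the label REM8 are not from the original listings.

```asm
10  *=  $0600
20  LDA #189  ;PUT 189 IN ACCUM.
30  AND #$07  ;KEEP LOW 3 BITS
40  STA REM8  ;REMAINDER = 5
50  LDA #189  ;GET 189 AGAIN
60  LSR A     ;DIVIDE BY 2
70  LSR A     ;DIVIDE BY 4
80  LSR A     ;DIVIDE BY 8
90  STA DIV8  ;QUOTIENT = 23
0100  BRK     ;AND STOP!
0110 REM8 *=*+1
0120 DIV8 *=*+1
0130  .END
```

Checking, 23 times 8 is 184, plus the remainder of 5 gives the original 189. For a divide by any power of two, the mask is simply the divisor minus one.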

Shifting Into High

Up till now, we’ve limited ourselves to simple, single-byte division. Now let’s see how we do it with more than one byte.

Figure 5 shows the division of the two-byte value 28008 by two. As you can easily calculate, the result is 14004. If you compare this example with the multi-byte multiplication shown in Figure 3, you will notice an interesting difference.

   ┌─┬─┬─┬─┬─┬─┬─┬─┐ ┌─┬─┬─┬─┬─┬─┬─┬─┐
   │0│1│1│0│1│1│0│1│ │0│1│1│0│1│0│0│0│
   └─┴─┴─┴─┴─┴─┴─┴─┘ └─┴─┴─┴─┴─┴─┴─┴─┘
         VALUE = 28008

   ┌─┬─┬─┬─┬─┬─┬─┬─┐ ┌─┬─┬─┬─┬─┬─┬─┬─┐
   │0│0│1│1│0│1│1│0│ │1│0│1│1│0│1│0│0│
   └─┴─┴─┴─┴─┴─┴─┴─┘ └─┴─┴─┴─┴─┴─┴─┴─┘
         VALUE = 14004

In multiplication, the low byte is shifted and the high byte(s) is (are) rotated. This is because the bit-shift proceeds from right to left.

In division, however, things are reversed. Since we are shifting all the bits to the right, the highest byte is shifted and the remaining bytes are rotated. This allows the low-order bits of the bytes being divided to shift into the lower-order bytes.

Let’s look at an example of the three-byte value SCORE being divided by four. The code necessary is shown below:

10  *=  $0600
15  LDA #$49     ;SET UP...
20  STA SCORE+2  ;3-BYTE...
25  LDA #$23     ;VALUE...
30  STA SCORE+1  ;SCORE...
35  LDA #$F8     ;= $4923F8
40  STA SCORE
45  LSR SCORE+2  ;DIVIDE...
50  ROR SCORE+1  ;SCORE...
55  ROR SCORE    ;BY 2
60  LSR SCORE+2  ;DIVIDE...
65  ROR SCORE+1  ;SCORE...
70  ROR SCORE    ;BY 4
75  BRK          ;AND STOP!
80 SCORE *=*+3
85  .END

Lines 15–40 initialize the three-byte value SCORE to $4923F8. Remember that multi-byte values are always stored in low-byte/high-byte order. In this case SCORE is the lowest-order byte and SCORE+2 is the highest-order byte.

Line 45 shifts the highest-order byte of SCORE right one bit. The 1-bit of SCORE+2 is placed in the Carry flag, ready to be rotated into the next byte of SCORE.

Line 50 rotates the middle-order byte right one bit. The bit carried from the highest-order byte is shifted into SCORE+1’s 128-bit, and the 1-bit of SCORE+1 is placed in the Carry flag for the next rotate.

Line 55 rotates the low-order byte of SCORE right one bit. Once again, the Carry status is placed in the 128-bit, and the 1-bit is shifted into the Carry. This final Carry is not used and can be ignored. After this instruction executes, the value in SCORE is divided by two and contains $2491FC. As an exercise, you can calculate the binary and decimal values.

Lines 60–70 perform the same function as Lines 45–55, leaving SCORE with the original value divided by four, or $1248FE. Calculate the decimal and binary values for this result, and you will see that the original value has been divided by four.

Line 75 BREAKS the execution of the program. At this point, you can examine the three bytes of SCORE and see that they contain the proper result.

Line 80 reserves three bytes for the variable SCORE.
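As with multiplication, the repeated shift-and-rotate group can be placed in a loop. The following is only a sketch of that idea; the use of the X register as a counter and the label HALVE are my own assumptions.

```asm
10  *=  $0600
15  LDA #$49     ;SET UP...
20  STA SCORE+2  ;3-BYTE...
25  LDA #$23     ;VALUE...
30  STA SCORE+1  ;SCORE...
35  LDA #$F8     ;= $4923F8
40  STA SCORE
45  LDX #2       ;2 SHIFTS = DIVIDE BY 4
50 HALVE LSR SCORE+2 ;SHIFT HIGH BYTE
55  ROR SCORE+1  ;ROTATE MIDDLE BYTE
60  ROR SCORE    ;ROTATE LOW BYTE
65  DEX          ;ONE LESS TO DO
70  BNE HALVE    ;MORE? LOOP BACK
75  BRK          ;SCORE = $1248FE
80 SCORE *=*+3
85  .END
```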

Well, now you have the basics of integer-binary multiplication and division under your belt. The principle is simple; you just have to work with it until you feel comfortable. In order to do that, create your own problems to solve.

Here it Comes

For those of you who need some prompting to get started with problems, here’s one that shouldn’t be too hard if you’ve read carefully.

Write a program that multiplies the value 5 by 27. Use any of the techniques we have discussed so far. There are several possible solutions to this problem, so give it your best shot. When you solve it, I’d like to see the technique you used. Send listings of your solutions to:

Boot Camp
c/o ANALOG Computing
P.O. Box 23
Worcester, MA 01603

Next issue, we’ll look at a couple of possible solutions. We’ll also find out what the stack is and how it helps us write subroutines.

A.N.A.L.O.G. ISSUE 20 / JULY 1984 / PAGE 76
A.N.A.L.O.G. ISSUE 79 / DECEMBER 1989 / PAGE 118

Boot Camp

by Tom Hudson

Welcome back! As I mentioned last issue, there are only a few more 6502 instructions left for us to cover, and we’ll talk about them in the next two installments. There are also a couple of instructions we’re going to skip until later. They are for more advanced uses and won’t make much sense until you’ve got more experience with assembly language.

Several people have written lately, asking if we’ll get into full-scale programs using the Atari’s powerful operating system. The answer: You bet! We’re going to find out how to access the disk, cassette, graphics, keyboard and just about anything else you’d like to hear about. We’ll study routines for high-speed math, player/missile graphics and more. Boot Camp is here not only to teach you what 6502 assembly instructions do, but how to apply them.

TWO Solutions

Last issue, I asked you to write a program that multiplied the number 5 by 27. There are an almost infinite number of ways to do this, and I’ll show you two of them now. Remember, these aren’t the only possibilities, and even though your solution may not be as efficient, getting the correct answer is what counts most.

      10  *= $600
      20  CLD          ;BINARY MATH!
      30  LDA #5       ;GET # TO MULT.
      40  STA TIMES1   ;SAVE # TIMES 1
      50  ASL A        ;*2
      60  STA TIMES2   ;SAVE # TIMES 2
      70  ASL A        ;*4
      80  ASL A        ;*8
      90  STA TIMES8   ;SAVE # TIMES 8
      0100  ASL A      ;*16
      0110  CLC        ;CLEAR FOR ADD
      0120  ADC TIMES8 ;*24
      0130  CLC        ;CLEAR AGAIN
      0140  ADC TIMES2 ;*26
      0150  CLC        ;CLEAR AGAIN
      0160  ADC TIMES1 ;*27
      0170  STA RESULT ;SAVE # TIMES 27
      0180  BRK        ;WE'RE DONE!
      0190 TIMES1 *=*+1
      0200 TIMES2 *=*+1
      0210 TIMES8 *=*+1
      0220 RESULT *=*+1
      0230  .END

The first solution I’m going to cover is shown above. This program uses the principle of breaking a multiply into bite-sized pieces, as shown last issue. In this case, I broke the multiply by 27 down into the following group of adds:

  (number * 16)
  (number *  8)
  (number *  2)
+ (number     )
  (number * 27)

Let’s step through the program and see how it works.

Line 20—clears the decimal mode. Always remember to be sure of the setting of the decimal flag before doing any arithmetic.

Line 30—loads the accumulator with the number 5. When the routine is finished, this number will be multiplied by 27 and stored in the memory location labeled RESULT.

Line 40—stores the accumulator’s contents in the memory location labeled TIMES1 (5*1). We need to save this value for later, when we add the bite-sized pieces together.

Line 50—shifts the accumulator contents left one bit, multiplying it by two.

Line 60—saves the accumulator (now 5*2) in the location TIMES2. This value is also needed for our final result.

Line 70—shifts the accumulator left one bit again, leaving the accumulator with the value 5*4.

Line 80—performs another left shift on the accumulator. The accumulator now contains 5*8.

Line 90—saves the accumulator’s contents in the location TIMES8.

Line 100—performs a final left shift on the accumulator, leaving the accumulator with the value 5*16. At this point, we have all the bite-sized pieces we need to get our answer and are ready to add them up.

Line 110—clears the carry flag for the first add in the group. Remember, this is a necessary instruction before any single-byte addition.

Line 120—adds the accumulator (5*16) to TIMES8 (5*8), leaving the result (5*24) in the accumulator for the next add.

Line 130—clears the carry for the next add.

Line 140—adds the accumulator (5*24) to TIMES2 (5*2), with the result (5*26) left in the accumulator.

Line 150—clears the carry again, for the final addition operation.

Line 160—adds the accumulator (5*26) to TIMES1 (5*1), leaving the accumulator holding the final value, 5 times 27!

Line 170—saves the final answer in the location labeled RESULT.

Line 180—BREAKs the execution of the program. At this point, you can check the location RESULT to be sure it contains 5*27, or 135 ($87 hex).

Lines 190–220—reserve one byte for each of the four data areas used by the program.

Solution #2

The second solution is a modification of the first technique. In this program, I break the multiply down into smaller pieces again, but structure it so that subtracts are used instead of adds:

  (number * 32)
- (number *  4)
- (number     )
  (number * 27)

As you can see, we get the same result as with adds, but with only three pieces to combine instead of four. The figure below shows the 6502 code necessary to implement this method.

10  *=  $0600
20  CLD          ;BINARY MATH
30  LDA #5       ;GET # TO MULT.
40  STA TIMES1   ;SAVE # TIMES 1
50  ASL A        ;*2
60  ASL A        ;*4
70  STA TIMES4   ;SAVE # TIMES 4
80  ASL A        ;*8
90  ASL A        ;*16
0100  ASL A      ;*32
0110  SEC        ;SET FOR SUBTRACT
0120  SBC TIMES4 ;*28
0130  SEC        ;SET AGAIN
0140  SBC TIMES1 ;*27
0150  STA RESULT ;SAVE # TIMES 27
0160  BRK        ;ALL DONE!
0170 TIMES1 *=*+1
0180 TIMES4 *=*+1
0190 RESULT *=*+1
0200  .END

Let’s walk through this program and see what’s going on.

Line 20—clears the decimal mode for binary arithmetic. I can’t overemphasize the importance of knowing the status of the decimal mode flag. If you’re in doubt, set or clear it as needed.

Line 30—loads the accumulator with the number 5. When this program is finished, the number 5 will be multiplied by 27.

Line 40—saves the contents of the accumulator in the location labeled TIMES1, for later use.

Line 50—shifts the accumulator left one bit, multiplying it by 2.

Line 60—shifts the accumulator left again, leaving the accumulator with the value 5*4.

Line 70—saves the contents of the accumulator (5*4) in the memory location TIMES4.

Line 80—shifts the accumulator left again, leaving the value 5*8 in the accumulator.

Line 90—performs another left shift. At this point the accumulator contains 5*16.

Line 100—shifts the accumulator left a final time. The accumulator now contains the value 5*32. We are now ready to perform the subtract operations as shown above.

Line 110—sets the carry flag for the first subtract operation. Remember, the carry flag should always be set before a single-byte subtract to ensure correct results.

Line 120—subtracts the value TIMES4 (5*4) from the accumulator (5*32), leaving the accumulator containing the value 5*28.

Line 130—sets the carry flag for the next subtract.

Line 140—subtracts the value TIMES1 (5*1) from the accumulator (5*28), leaving the accumulator with the value 5*27!

Line 150—saves the answer in the location labeled RESULT.

Line 160—stops the program’s execution with the BRK instruction. At this point, you can verify that the location RESULT (and the accumulator) contains 5*27, or 135 ($87 hex).

Lines 170–190—reserve one byte for each of the three data fields used by the program.
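
For readers who want to check the arithmetic away from the Atari, here is a minimal sketch in Python (not 6502 code) that mirrors the walk-through step by step. The function name times27_shift_subtract is made up for this illustration, and every result is masked to 8 bits to stand in for the one-byte accumulator.

```python
def times27_shift_subtract(n):
    """Mirror the walk-through: 32*n - 4*n - n, kept to 8 bits."""
    acc = n & 0xFF
    times1 = acc                  # line 40: STA TIMES1
    acc = (acc << 2) & 0xFF       # lines 50-60: two ASL A give n*4
    times4 = acc                  # line 70: STA TIMES4
    acc = (acc << 3) & 0xFF       # lines 80-100: three more ASL A give n*32
    acc = (acc - times4) & 0xFF   # lines 110-120: SEC / SBC TIMES4 give n*28
    acc = (acc - times1) & 0xFF   # lines 130-140: SEC / SBC TIMES1 give n*27
    return acc

result = times27_shift_subtract(5)
print(result, hex(result))  # 135 0x87
```

Because every step is kept to one byte, the sketch agrees with true multiplication only while 27 times the input still fits in a byte, that is, for inputs 0 through 9.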

Obviously, these are just two of the thousands of solutions possible for this problem.

Stacking the Deck

The last topic we’re going to cover before going on to bigger and better things is the 6502 stack. This is an important feature of the 6502, as it allows us to write subroutines. Since the stack concept is important, we’re going to cover it in detail starting with this issue and finish it with assembly examples next time. Let’s get started finding out what the stack is and how it works.

The 6502 reserves 256 bytes of memory from $0100–$01FF (also called page 1) for a temporary storage area. We call this area the stack. The 6502 maintains this area automatically, but we can use it for short-term storage, too.

We call the stack a “last-in, first-out” structure. The last number placed on the stack is always the first to be pulled off. A good way to remember this is to think of a stack of pancakes. When you pile them up, the last one put on the stack is on top. When you take them off one at a time, the last one you put on comes off first. Using this analogy, the computer could keep track of 256 pancakes, each with a number written on it.

The computer keeps track of the stack’s contents by using the Stack Pointer register inside the 6502. This pointer ranges from $00–FF. When the stack pointer contains $00, it is pointing to the memory location $0100. When it contains $FF, the location $01FF is indicated.

Interestingly, the stack works backwards from the way we would expect. When the stack is empty, the stack pointer is set to $FF. Figure 1 shows an empty stack.

       Empty Stack
$01FF │           │ <──┐
      ├───────────┤    │
      │           │    │
      ├───────────┤    │  SP
            .          │ ┌──┐
            .          └─┤FF│
            .            └──┘
      │           │
$0100 │           │

As the stack is filled with more and more values, the stack pointer is decremented, pointing to lower areas of page 1. When completely filled, the stack pointer will contain $00, as shown in Figure 2.

       Full Stack
$01FF │     42    │
      │     1B    │
      ├───────────┤       SP
            .            ┌──┐
            .          ┌─┤00│
            .          │ └──┘
      ├───────────┤    │
      │     01    │    │
      ├───────────┤    │
$0100 │     02    │ <──┘

Since the computer has only reserved 256 bytes for a stack, there are obviously limitations in its use. If the stack is filled with too many values, the stack pointer will wrap around back to $FF and begin wiping out earlier stack entries! There is no error message for this, so you must be careful when working with the stack.

When entries are removed from the stack, the process is reversed. As each byte is pulled off the stack, the pointer is incremented, pointing to progressively higher locations of the stack.
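
The wrap-around is easy to see in a small model. The Python sketch below is only an illustration, not an emulator: the class name PageOneStack is invented, a 256-entry list stands in for page 1, and the push is assumed to store the byte first and then decrement the pointer, starting from the empty stack of Figure 1.

```python
class PageOneStack:
    """Toy model of the 256-byte stack in page 1 (an illustration only)."""

    def __init__(self):
        self.memory = [0] * 256   # stands in for $0100-$01FF
        self.sp = 0xFF            # empty stack, as in Figure 1

    def push(self, value):
        self.memory[self.sp] = value & 0xFF   # store at $0100 + SP
        self.sp = (self.sp - 1) & 0xFF        # then move down; $00 wraps to $FF

stack = PageOneStack()
stack.push(0x42)                 # the first entry lands at $01FF
for _ in range(255):
    stack.push(0x11)             # fill the rest of page 1
print(hex(stack.sp))             # 0xff: the pointer has wrapped around
stack.push(0x99)                 # one push too many...
print(hex(stack.memory[0xFF]))   # 0x99: ...and the first entry is gone
```

Nothing in the model complains when the oldest entry is overwritten, which is exactly the danger described above.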

How Subroutines Work

In BASIC, subroutines are easy to write. You simply set up the necessary BASIC code, put a RETURN instruction at the end of it, and call it with the GOSUB statement whenever you need it. The subroutine code is performed, and BASIC resumes execution at the next statement after the GOSUB. Neat, huh?

In order for a BASIC subroutine to work, the computer has to know how to get back to the instruction after the GOSUB. It does this by using a stack. Let’s look at a simplified example of how a BASIC subroutine is executed.

10 GOSUB 100
20 END
100 GOSUB 200
110 RETURN
200 A=A+1
210 RETURN

The above is a short BASIC program using the BASIC subroutine statements, GOSUB and RETURN. We’re going to step through it and watch what happens to the BASIC stack, a special area similar to the 6502 stack.

Before execution, the stack is empty, and the stack pointer is pointing to the first available position.

       BASIC Stack
      │           │ <── POINTER
      │           │
      │           │

Line 10—GOSUB to Line 100 is executed. First, the computer finds the next statement after GOSUB. The next statement is in Line 20, so the computer pushes that line number onto the first location on the stack, and changes the stack pointer to point to the next available location. Execution then proceeds at Line 100. At this point, the stack looks like:

       BASIC Stack
      │     20    │
      │           │ <── POINTER
      │           │

Line 100—This line executes a GOSUB to Line 200. The next statement after this GOSUB is Line 110, so this number is placed on the stack, and the stack pointer is advanced to the next available position. Execution continues at Line 200. The stack now looks like:

       BASIC Stack
      │     20    │
      │    110    │
      │           │ <── POINTER

Line 200—The computer adds one to the variable A. The stack is not affected.

Line 210—The computer encounters a RETURN statement. At this point, the computer increments the stack pointer, like so:

       BASIC Stack
      │     20    │
      │    110    │ <── POINTER
      │           │

Now the computer takes the line number 110 from the stack. As you can see, the computer can now go back to the instruction after the last GOSUB. Execution continues at Line 110.

Line 110—Another RETURN is encountered, and the stack pointer is incremented again. Now the stack looks like this:

       BASIC Stack
      │     20    │ <── POINTER
      │    110    │
      │           │

The computer gets the line number from the stack and completes the RETURN by resuming execution at Line 20.

Line 20—This line terminates execution with the END statement. The stack is back to its original condition, with the pointer indicating the first stack location. The line numbers are still in the stack itself, but since the stack pointer no longer points to them, they are no longer active. They will be wiped out by new stack entries.

Now do you see how the stack works? It’s a great way to handle subroutines, where the computer must be able to find its way back to the code which called up the subroutine.
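
The same trace can be followed in a few lines of Python. This is a simplified model rather than a description of how Atari BASIC is built internally: a Python list plays the part of the BASIC stack, and the helper names gosub and basic_return are invented for the illustration.

```python
return_stack = []                    # stands in for the BASIC stack

def gosub(next_line):
    return_stack.append(next_line)   # remember where to resume

def basic_return():
    return return_stack.pop()        # the most recent entry comes off first

gosub(20)                  # line 10: GOSUB 100, resume at line 20
gosub(110)                 # line 100: GOSUB 200, resume at line 110
print(return_stack)        # [20, 110]
print(basic_return())      # 110 (the RETURN at line 210)
print(basic_return())      # 20 (the RETURN at line 110)
print(return_stack)        # []
```

Unlike the stack in the walk-through, a Python list discards an entry when it is popped, so the old line numbers do not linger in memory.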

Until Next Time

If you think Boot Camp looks more like BASIC Training this issue, hold on! I wanted to explain the subroutine process in a language you’re familiar with, like BASIC. Next issue we’ll examine the operation of the 6502 subroutine process and learn how to use the stack for our own programs.

10 GOSUB 10
20 END

Until we meet again, the above is a little program to get you thinking. Type in the BASIC program and run it. It may take a while, but something will happen, and I want you to see if you can find the cause. Use the stack illustration method I used in the BASIC example to get the answer.

Also, if you haven’t already, try to find more alternate methods for multiplying five by 27!

Address all letters to:

 P.O. Box 23
 Worcester, MA 01603
A.N.A.L.O.G. ISSUE 21 / AUGUST 1984 / PAGE 90

Boot Camp

by Tom Hudson

Well, for the last week or so I’ve been receiving your solutions to the 5 times 27 multiply problem, and it looks like everybody’s got the hang of it. Some people tried to cheat by multiplying 27 by 5. This is a much simpler operation, but we’ll see later why this type of shortcut is not always possible.

What happened?!!

Figure 6 from last issue’s column was a simple BASIC program that looked like this:

10 GOSUB 10
20 END

I told you to execute it and see if you could determine what went wrong. If you look at the code, you’ll see that the program places itself in an infinite loop with the GOSUB 10 statement. If you let the program run for a few minutes, you’ll eventually see an ERROR 2 message. What happened? Let’s step through the program and find out.

Line 10 executes a GOSUB 10 statement. The next executable statement is Line 20, so the line number 20 is placed on the stack. The program then branches to Line 10. The stack now looks like this:

       BASIC Stack
      │     20    │
      │           │ <── POINTER
      │           │
      │           │

Line 10 executes GOSUB 10 again, with the same results as above. The line number 20 is placed on the stack again, and execution continues at Line 10 again. Now the stack looks like this:

       BASIC Stack
      │     20    │
      │     20    │
      │           │ <── POINTER
      │           │

Line 10 performs the same set of operations again, and you can see that the program is in an infinite loop. Each time the GOSUB 10 statement is executed, the BASIC stack gets larger and larger…until there is no more memory available. When this happens, the computer stops with the ERROR 2 AT LINE 10 message. Obviously, one must take care that all subroutines are terminated by a RETURN. Each subroutine must contain at least one RETURN statement, otherwise you’ll find yourself running out of memory far faster than you ever dreamed!
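
A rough Python model shows why the program can never recover. The capacity STACK_LIMIT is a hypothetical figure chosen for the sketch (the real limit depends on how much memory is free), and the function name run_gosub_10 is invented.

```python
STACK_LIMIT = 1000   # hypothetical capacity; the real limit depends on free memory

def run_gosub_10(limit=STACK_LIMIT):
    """Model 10 GOSUB 10: every pass pushes line 20 and never pops it."""
    return_stack = []
    while True:
        if len(return_stack) >= limit:
            return "ERROR 2 AT LINE 10", len(return_stack)
        return_stack.append(20)   # GOSUB saves the return line...
        # ...and jumps straight back to line 10, so no RETURN ever runs

message, depth = run_gosub_10()
print(message, depth)   # ERROR 2 AT LINE 10 1000
```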

Assembly subroutines.

Last issue, as you recall, we found out what a stack is and how BASIC uses a stack to execute subroutines. There is a lot of “housekeeping” done by the system to keep track of subroutines, and we don’t want to write all those routines ourselves, do we?

Luckily for us, the 6502 microprocessor has its own set of subroutine instructions. They are: JSR (jump to subroutine), which corresponds to the BASIC GOSUB statement; and RTS (return from subroutine), which performs the same function as the BASIC RETURN statement.

The format of the JSR instruction is:

JSR nn      (ABSOLUTE)

The operand of the JSR instruction can be any address, such as JSR $4000, or a program label, such as JSR PRINT.

When the JSR instruction executes, things happen a little differently than they did in our BASIC example, last issue. Instead of a line number being placed on the stack, a two-byte address is used. More on that in a moment.

The format of the RTS instruction is:

RTS         (IMPLIED)

Like the RETURN statement in BASIC, the RTS instruction will continue execution at the instruction following the JSR which called the subroutine.

Let’s look at an assembly program which uses the JSR and RTS instructions. For purposes of illustration, we’ll duplicate the function of the BASIC program we used last time. Figure 1 is a listing of the assembly program, with the addresses and hex codes of the instructions shown to the left of the line numbers. The corresponding BASIC statements are shown in the comment fields.

0000        10        *=   $600
0600 D8     15        CLD
0601 200506 20        JSR  SUB1     ;GOSUB 100
0604 00     25        BRK
0605 200906 30 SUB1   JSR  SUB2     ;GOSUB 200
0608 60     35        RTS           ;RETURN
0609 AD1306 40 SUB2   LDA  VARA     ;VARA=VARA+1
060C 18     45        CLC
060D 6901   50        ADC  #1
060F 8D1306 55        STA  VARA
0612 60     60        RTS           ;RETURN
0613        65 VARA   *=   *+1
0614        70        .END

                      Figure 1.

Let’s walk through this program and watch what happens to the stack. Remember, the 6502 does all the stack handling for us, and this walk-through is just to familiarize you with what’s happening inside the machine.

Line 15 clears the decimal mode for the binary arithmetic the program will do later. At the start of the program, the stack pointer will be at some arbitrary location. We’ll assume that it’s set to $00 for this demonstration. The stack at this point looks like this:

       6502 stack   <───┐
      ┌───────────┐     │
$01FF │           │     │
      ├───────────┤     │
      │           │     │    SP
      ├───────────┤     │   ┌──┐
      │           │     └───┤00│
      ├───────────┤         └──┘
      │           │
      │           │
      │           │

Line 20 performs a JSR to the location labeled SUB1. Before going to the subroutine, the 6502 must save the return address on the stack. The next instruction after the JSR is at $0604, so the 6502 takes this address and subtracts 1 from it, resulting in a return address of $0603. The stack pointer is decremented by 1, and contains $FF. The high byte of the return address ($06) is placed at location $01FF. The stack pointer is decremented again, and now contains $FE. Now the 6502 stores the low byte of the return address ($03) on the stack at location $01FE. The return address is now properly stored, and execution continues at location $0605, the address of SUB1. At this point, the stack looks like this:

       6502 stack
$01FF │    06     │
      │    03     │ <───┐    SP
      ├───────────┤     │   ┌──┐
      │           │     └───┤FE│
      ├───────────┤         └──┘
      │           │
      │           │
      │           │

Line 30—Execution continues here after the JSR process is complete. This is another JSR, this time to the subroutine labeled SUB2. As in the previous JSR, the return address minus 1 ($0607 this time) is stored in the next two stack locations, and execution continues at the subroutine. The stack pointer now contains $FC, and the stack looks like this:

       6502 stack
$01FF │    06     │
      │    03     │          SP
      ├───────────┤         ┌──┐
      │    06     │     ┌───┤FC│
      ├───────────┤     │   └──┘
      │    07     │ <───┘
      │           │
      │           │

Lines 40–55 add 1 to the contents of location VARA, placing the result back into VARA. The stack is unchanged by this operation.

Line 60—Now we encounter our first RTS instruction. It functions almost like the BASIC RETURN statement, but with a small difference. When executed, the RTS gets the byte from the stack location indicated by the stack pointer and places it in the low byte of the program counter. Remember that the program counter is where the 6502 stores the address of the instruction that is currently being executed. The stack pointer is then incremented (to $FD), the next byte in the stack is placed in the high byte of the program counter, and the stack pointer is incremented again (to $FE). At this point, the program counter contains the return address minus 1, so the program counter is incremented by 1 to get the proper return address. In this case, the return address is $0608, and the program continues there (Line 35). After this instruction executes, the stack will look like this:

       6502 stack
$01FF │    06     │
      │    03     │ <───┐    SP
      ├───────────┤     │   ┌──┐
      │    06     │     └───┤FE│
      ├───────────┤         └──┘
      │    07     │
      │           │
      │           │

Line 35 executes another RTS instruction. This time, the program will return to location $0604 (1 byte higher than the location in the last two bytes of the stack). The stack pointer will be incremented twice, and when the program is complete, the stack pointer will contain $00. After this RTS, execution continues at Line 25, and the stack looks like this:

       6502 stack   <───┐
      ┌───────────┐     │
$01FF │    06     │     │
      ├───────────┤     │
      │    03     │     │    SP
      ├───────────┤     │   ┌──┐
      │    06     │     └───┤00│
      ├───────────┤         └──┘
      │    07     │
      │           │
      │           │

Line 25 stops the execution of the program with the BRK instruction. The stack is unchanged.
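
The whole walk-through can be replayed with a small Python model. This is a sketch that follows the description above (decrement the pointer, then store, starting from $00), not an emulator, and the class and method names are invented for the illustration. One caveat: an actual 6502 stores the byte first and decrements afterwards, so its register reads one lower than the figures, though the bytes land at the same addresses.

```python
class Cpu:
    """Toy model of JSR and RTS as the walk-through describes them."""

    def __init__(self):
        self.page1 = [0] * 256   # stands in for $0100-$01FF
        self.sp = 0x00           # starting value assumed in the walk-through

    def push(self, byte):
        self.sp = (self.sp - 1) & 0xFF   # decrement first, as described
        self.page1[self.sp] = byte

    def pull(self):
        byte = self.page1[self.sp]
        self.sp = (self.sp + 1) & 0xFF
        return byte

    def jsr(self, jsr_address, target):
        ret = jsr_address + 2            # next instruction's address minus 1
        self.push(ret >> 8)              # high byte goes on first
        self.push(ret & 0xFF)            # then the low byte
        return target                    # new program counter

    def rts(self):
        low = self.pull()
        high = self.pull()
        return ((high << 8) | low) + 1   # add 1 to reach the next instruction

cpu = Cpu()
pc = cpu.jsr(0x0601, 0x0605)     # line 20: JSR SUB1
pc = cpu.jsr(0x0605, 0x0609)     # line 30: JSR SUB2
print(hex(cpu.sp))               # 0xfc
pc = cpu.rts()                   # line 60
print(hex(pc))                   # 0x608
pc = cpu.rts()                   # line 35
print(hex(pc), hex(cpu.sp))      # 0x604 0x0
```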

Don’t panic!

Remember, the 6502 performs all of the stack maintenance functions for you. Writing a subroutine in assembly is just as easy as writing one in BASIC. I’ve just explained the details of the stack, so that you’ll be prepared for next issue’s stack-manipulation instructions.

Later on, when you’re more comfortable with assembly language and the stack, we’ll see how we can use the stack for some fancy control structures.

Simple subroutines.

Right now, let’s see how simple assembly subroutines can be. Let’s write a subroutine that will add 1 to a two-byte counter for us.

Let’s assume the counter is labeled COUNTL (low byte) and COUNTH (high byte). The normal code we’d use to add 1 to this two-byte counter is shown in Figure 2.

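       LDA COUNTL    ;GET LOW BYTE
       CLC           ;CLEAR CARRY
       ADC #1        ;ADD 1
       STA COUNTL    ;SAVE LOW BYTE
       LDA COUNTH    ;GET HIGH BYTE
       ADC #0        ;ADD WITH CARRY
       STA COUNTH    ;SAVE HIGH BYTE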

                   Figure 2.

Clearly, this is just a simple two-byte add operation (if you have problems with addition, review issue 17’s Boot Camp).

Let’s say you’re writing a program which needs to increment this counter in several different places. You could re-type the addition code each time you need it, but this would waste quite a bit of memory. Luckily, you know all about the 6502 JSR and RTS instructions, so you write a simple subroutine to do the job. Figure 3 shows the code necessary.

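INCCTR LDA COUNTL  ;GET LOW BYTE
       CLC         ;CLEAR CARRY
       ADC #1      ;ADD 1
       STA COUNTL  ;SAVE LOW BYTE
       LDA COUNTH  ;GET HIGH BYTE
       ADC #0      ;ADD W/CARRY
       STA COUNTH  ;SAVE HIGH BYTE
       RTS         ;RETURN!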

                Figure 3.

If you look at the subroutine closely, you’ll see only two changes from Figure 2! The first line of the subroutine contains the label INCCTR (INCrement CounTeR). This allows us to reference the subroutine with an easy-to-remember name. The other change is the addition of an RTS instruction at the end of the routine. See? Writing assembly subroutines isn’t so hard, after all.

To call this subroutine, all we need is the statement:

       JSR INCCTR

I’m sure you’ll agree that this is much easier than retyping the addition code each time you need to increment the counter. Figure 4 shows a complete program which uses the subroutine in three places.

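10          *= $0600
20          CLD        ;BINARY MATH
30          LDA #0     ;ZERO OUT...
40          STA COUNTL ;COUNTER LOW
50          STA COUNTH ;COUNTER HIGH
60          JSR INCCTR ;ADD 1 TO COUNTER
70          LDX #4     ;5 TIMES...
80   LOOP1  JSR INCCTR ;ADD 1 TO COUNTER
90          DEX        ;NEXT X
0100        BPL LOOP1  ;LOOP IF POS.
0110        LDA #$50   ;GET # IN ACC.
0120        JSR INCCTR ;ADD 1 TO COUNTER
0130        STA ACCUM  ;SAVE ACCUM.
0140        BRK        ;ALL DONE!
0150 INCCTR LDA COUNTL ;GET LOW BYTE
0160        CLC        ;CLEAR CARRY
0170        ADC #1     ;ADD 1
0180        STA COUNTL ;SAVE LOW BYTE
0190        LDA COUNTH ;GET HIGH BYTE
0200        ADC #0     ;ADD W/CARRY
0210        STA COUNTH ;SAVE HIGH BYTE
0220        RTS        ;RETURN!
0230 COUNTL *=*+1
0240 COUNTH *=*+1
0250 ACCUM  *=*+1
0260        .END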

                Figure 4.

Line 20 clears the decimal mode for binary arithmetic.

Lines 30–50 set the counter (COUNTL and COUNTH) to zero.

Line 60 increments the counter using the JSR INCCTR instruction.

Lines 70–100 increment the counter five times using the X register as a loop counter. The count starts at 4, and the routine loops back to LOOP1 until the X register is less than zero.

Line 110 loads the accumulator with $50.

Line 120 JSR’s to INCCTR to increment the counter a final time.

Line 130 stores the contents of the accumulator at the location labeled ACCUM. Note that this will not be the value $50 loaded in Line 110, but will be whatever value the subroutine left there! This is an important point: You must remember which registers are altered by a subroutine, because the values in those registers will be lost when the subroutine is called! In this case, only the accumulator is used by the subroutine, so the X and Y registers can be used without concern.

Line 140 stops the program with the BRK instruction. At this point, you can examine the counter (COUNTL and COUNTH) and see that it contains the value $0007. The location ACCUM will contain $00, not the value $50 loaded in Line 110.

Lines 150–220 are the INCCTR subroutine.
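
To see why the counter ends at $0007 while ACCUM ends at $00, the program can be modeled in Python. This is an illustrative sketch only: the dictionary keys reuse the program's labels, the function name incctr is borrowed from the subroutine, and the return value stands in for whatever the subroutine leaves in the accumulator.

```python
def incctr(state):
    """Model of INCCTR: add 1 to the two-byte counter."""
    total = state["COUNTL"] + 1              # CLC / ADC #1
    state["COUNTL"] = total & 0xFF
    carry = total >> 8                       # 1 only when the low byte rolled over
    acc = (state["COUNTH"] + carry) & 0xFF   # ADC #0 adds just the carry
    state["COUNTH"] = acc
    return acc                               # value left in the accumulator

state = {"COUNTL": 0, "COUNTH": 0, "ACCUM": 0}
incctr(state)                    # line 60
for _ in range(5):               # lines 70-100: X counts 4, 3, 2, 1, 0
    incctr(state)
acc = 0x50                       # line 110
acc = incctr(state)              # line 120 overwrites the accumulator
state["ACCUM"] = acc             # line 130
print(state)  # {'COUNTL': 7, 'COUNTH': 0, 'ACCUM': 0}
```

The last thing the subroutine loads into the accumulator is the high byte of the counter, which is why the $50 is lost.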

Flexible subroutines.

The INCCTR subroutine showed how a subroutine could be written to perform the same function each time. Now we’re going to write a subroutine that will perform a function on a value passed to the subroutine in one of the registers. We’ll use another familiar routine, multiplication by 27.

We’ll write a subroutine which will multiply the contents of the accumulator by 27 and return with the value times 27 in the accumulator.

Those people who took the multiply 27 by 5 shortcut are in for a little surprise! In order for this subroutine to work, the multiply by 27 approach must be used. Take that!

Figure 5 shows the subroutine necessary to multiply the accumulator by 27 and return the result in the accumulator. Only the accumulator is altered; the X and Y registers are untouched. The subroutine requires three one-byte storage locations, TIMES1, TIMES2 and TIMES8.

MULT27 STA TIMES1 ;SAVE # TIMES 1
       ASL A      ;* 2
       STA TIMES2 ;SAVE # TIMES 2
       ASL A      ;* 4
       ASL A      ;* 8
       STA TIMES8 ;SAVE # TIMES 8
       ASL A      ;* 16
       CLC        ;CLEAR CARRY
       ADC TIMES8 ;*16 + *8 = *24
       CLC        ;CLEAR CARRY
       ADC TIMES2 ;*24 + *2 = *26
       CLC        ;CLEAR AGAIN
       ADC TIMES1 ;*26 + *1 = *27
       RTS        ;ALL DONE!

                Figure 5.

This routine is essentially the same as the multiply by 27 solution shown last issue. The accumulator is assumed to contain the number to be multiplied upon entry into the subroutine. After the multiply is complete, the result is left in the accumulator. The RTS instruction at the end of the routine lets us know that this is a subroutine. The subroutine is labeled MULT27 and is called with the statement:

       JSR MULT27

Let’s put this subroutine to work, using a program which will multiply the numbers 3, 7 and 9 by 27. We will place the results in locations labeled THREE, SEVEN and NINE, respectively. Figure 6 shows one possible solution.

10          *= $0600
20          CLD        ;BINARY MATH
30          LDA #3     ;GET 3,
40          JSR MULT27 ;MULT BY 27,
50          STA THREE  ;SAVE RESULT
60          LDA #7     ;GET 7,
70          JSR MULT27 ;MULT BY 27,
80          STA SEVEN  ;SAVE RESULT
90          LDA #9     ;GET 9,
0100        JSR MULT27 ;MULT BY 27
0110        STA NINE   ;SAVE RESULT
0120        BRK        ;AND STOP!
0130 MULT27 STA TIMES1 ;SAVE # TIMES 1
0140        ASL A      ;* 2
0150        STA TIMES2 ;SAVE # TIMES 2
0160        ASL A      ;* 4
0170        ASL A      ;* 8
0180        STA TIMES8 ;SAVE # TIMES 8
0190        ASL A      ;* 16
0200        CLC        ;CLEAR CARRY
0210        ADC TIMES8 ;*16 + *8 = *24
0220        CLC        ;CLEAR CARRY
0230        ADC TIMES2 ;*24 + *2 = *26
0240        CLC        ;CLEAR AGAIN
0250        ADC TIMES1 ;*26 + *1 = *27
0260        RTS        ;ALL DONE!
0270 TIMES1 *=*+1
0280 TIMES2 *=*+1
0290 TIMES8 *=*+1
0300 THREE  *=*+1      ;3*27 RESULT
0310 SEVEN  *=*+1      ;7*27 RESULT
0320 NINE   *=*+1      ;9*27 RESULT
0330        .END

                Figure 6.

Line 20 clears the decimal mode for binary arithmetic.

Line 30 places the number 3 in the accumulator, so that it can be multiplied by 27.

Line 40 performs a JSR to the subroutine MULT27, which multiplies the accumulator by 27. The result of the multiply will be in the accumulator when the subroutine is finished.

Line 50 stores the contents of the accumulator in the location THREE. This is the value 3*27.

Lines 60–80 multiply the number 7 by 27 and place the result in the location SEVEN.

Lines 90–110 multiply the number 9 by 27 and place the result in the location NINE.

Line 120 stops the program’s execution. At this point, you can examine the locations THREE, SEVEN and NINE to be sure they contain 81 ($51), 189 ($BD) and 243 ($F3), respectively.

Lines 130–260 are the multiply by 27 subroutine.
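
The three results can be double-checked with a short Python model of the subroutine. It is a sketch for verification only, with the function name mult27 borrowed from the subroutine's label; each value is masked to 8 bits to imitate the one-byte accumulator and storage locations.

```python
def mult27(acc):
    """Model of MULT27: 16n + 8n + 2n + n, truncated to 8 bits."""
    times1 = acc & 0xFF                      # STA TIMES1
    times2 = (times1 << 1) & 0xFF            # after one ASL A
    times8 = (times1 << 3) & 0xFF            # after three ASL A
    acc = (times1 << 4) & 0xFF               # after four ASL A
    for term in (times8, times2, times1):    # CLC / ADC for each saved value
        acc = (acc + term) & 0xFF
    return acc

results = {name: mult27(n) for name, n in (("THREE", 3), ("SEVEN", 7), ("NINE", 9))}
print(results)      # {'THREE': 81, 'SEVEN': 189, 'NINE': 243}
print(mult27(10))   # 14, because 270 does not fit in one byte
```

The last line is a reminder that a one-byte result limits this routine to inputs of 9 or less.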


Now you know how to write subroutines in 6502 assembly language. Subroutines are a powerful programming technique, and open doors into the Atari operating system (OS). Future installments of Boot Camp will show how to access these OS routines.

Until next time, write a subroutine that will add the X register to the Y register, placing the result in the accumulator. If the result of the add is greater than 255 (carry flag set), put the value $FF in the X register. Otherwise, set the X register to $00. Good luck!

Send all letters to:

 P.O. Box 23
 Worcester, MA 01603
A.N.A.L.O.G. ISSUE 22 / SEPTEMBER 1984 / PAGE 74

Boot Camp

by Tom Hudson

Welcome to Boot Camp, the beginner’s assembly language column. With this issue, we will have completed our introduction to the world of 6502 assembly operation codes. Starting next issue, we’ll find out exactly how to apply these instructions in BASIC subroutines, games, utilities and other programs.

Fun with subroutines.

Last issue’s homework was for you to write a subroutine that would add the X and Y registers, placing the result in the accumulator. If the result of the add was greater than 255, you were to put the value $FF in the X register. If not, you were to set the X register to zero. Figure 1 shows one possible solution. Let’s step through it and see how it works.

12 ;
18 ;IF RESULT  > 255, X REG = $FF
20 ;IF RESULT <= 255, X REG = $00
22 ;
24        *=  $0600
26 ADDXY  CLD        ;BINARY MATH
28        STX TEMP   ;SAVE X REG,
30        TYA        ;PUT Y IN ACC.
32        CLC        ;CLEAR FOR ADD
34        ADC TEMP   ;ADD X REG VALUE
36        BCS GTR255 ;BRANCH IF > 255
38        LDX #$00   ;ZERO X REGISTER
40        RTS        ;AND RETURN!
42 GTR255 LDX #$FF   ;FLAG OVERFLOW
44        RTS        ;AND RETURN!
46 TEMP   *=*+1
48        .END

                Figure 1

Lines 10–22 are the subroutine documentation lines. They tell what the subroutine does and how to use it. This can help refresh your memory if you need to change a program several years after you write it.

Line 26 is the entry point for the subroutine. I have labeled this one ADDXY, for “Add X and Y registers.” It’s a good idea to use descriptive labels in your programs. I could have called the subroutine DOG, but this wouldn’t help me remember what the subroutine does. This line clears the decimal mode, so that we’re sure the subroutine is operating in binary math mode.

Line 28 stores the X register at the location TEMP, a temporary hold area.

Line 30 transfers the Y register to the accumulator with the TYA instruction. This is done because the 6502 add instruction (ADC) only works with the accumulator.

Line 32 clears the carry flag for the add operation.

Line 34 adds the accumulator (which now contains the Y-register value) to the location TEMP (which contains the X-register value). After this instruction executes, we have completed the first part of the homework, adding the X and Y registers with the result in the accumulator.

Line 36 branches to the label GTR255 (Greater than 255) if the carry flag is set (BCS). If the carry is not set, execution continues at Line 38. Remember that the carry flag is set if the result of an add operation is greater than 255. Review the issue 17 Boot Camp if you’re not sure of the carry flag’s function.

Line 38 places a zero in the X register if the add result was not greater than 255. The X register in this case is used as an indicator to tell the code which called the subroutine that the addition result fits in the accumulator. If the carry flag had been set, the result was greater than 255 and would not have fit in the 8-bit accumulator.

Line 40 is an RTS instruction. This will return control to the code which called the subroutine.

Line 42, labeled GTR255, is the code that will be executed if the add result is too large for the accumulator. It loads the X register with the value $FF. Once again, after the subroutine has been executed, the calling routine can test the X register. If the X register contains $FF, the calling routine can take the appropriate action.

Line 44 is another RTS instruction, and will return control to the calling code.

Line 46 defines a one-byte temporary storage location, labeled TEMP.

How would we use this subroutine? Figure 2 shows an example of the code necessary to call the subroutine ADDXY.

       LDX ADD1   ;GET ADD #1
       LDY ADD2   ;GET ADD #2
       JSR ADDXY  ;ADD X & Y
       CPX #$00   ;ADD OK?
       BNE BADADD ;NO!
       STA RESULT ;YES, SAVE RESULT
       JMP OK     ;AND CONTINUE

            Figure 2.

As you can see, this code first loads the X and Y registers with the desired add values, then JSRs to the subroutine.

The first instruction after the JSR tests the X register to see if it’s zero. If not, the add was too large for the accumulator, and we branch to the label BADADD. If the add was okay, we store the accumulator in the location labeled RESULT and jump to another part of the program, labeled OK.

Of course, the use of the X register as an overflow flag was not really necessary in this problem. We could have simply tested the carry flag after the JSR and taken the appropriate action then. Still, I thought this would be a good time to introduce you to the technique of using subroutine result indicators.
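
The contract of the subroutine, a result in the accumulator plus an indicator in the X register, can be summarized in a short Python model. It is a sketch of the behavior only, with the function name addxy borrowed from the subroutine's label and the two return values standing in for the accumulator and the X register.

```python
def addxy(x, y):
    """Model of ADDXY: returns (accumulator, X register indicator)."""
    total = x + y
    acc = total & 0xFF                        # only 8 bits fit in the accumulator
    x_flag = 0xFF if total > 0xFF else 0x00   # carry set means the add overflowed
    return acc, x_flag

print(addxy(100, 50))    # (150, 0)
print(addxy(200, 100))   # (44, 255)
```

In the second call the true sum is 300, so the accumulator holds only the low 8 bits (44) and the indicator tells the caller not to trust it.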

So there you have it. Just one of the many ways in which the homework assignment can be solved. I’m sure most of you came up with other ways to accomplish the objective, and—as long as they work—it doesn’t matter which approach you take. Just remember to thoroughly test each subroutine you write, to be sure they’ll return the proper results.

Getting pushy.

Up till now, all our stack usage has been handled by the 6502 itself, in the JSR and RTS instructions. Now we’re going to find out how to use the stack for our own purposes.

The first two stack instructions we’re going to investigate are the PHA (Push accumulator onto stack) and PLA (Pull accumulator from stack). The format of the PHA instruction is:

PHA         (IMPLIED)

The PHA instruction is used to place the accumulator on the “top” of the stack. It doesn’t affect any status flags. Let’s see what happens when a PHA instruction executes.

       6502 stack   <───┐
      ┌───────────┐     │
$01FF │           │     │    SP
      ├───────────┤     │   ┌──┐
      │           │     └───┤00│
      ├───────────┤         └──┘
      │           │
      │           │

            Figure 3.

Figure 3 shows how the stack looks when it’s empty. The stack pointer (SP) contains $00. As you recall from the last two Boot Camp installments, the 6502 stack resides in the memory from $0100–01FF. Let’s assume the following two instructions are executed:

       LDA #$40
       PHA

The first instruction loads the accumulator with the value $40. The second instruction “pushes” this value onto the stack. The 6502 decrements the stack pointer (to $FF), then stores the accumulator’s contents at the indicated memory location. Figure 4 shows how the stack looks after the PHA instruction.

       6502 stack
$01FF │    40     │ <───┐    SP
      ├───────────┤     │   ┌──┐
      │           │     └───┤FF│
      ├───────────┤         └──┘
      │           │
      │           │

            Figure 4.

If we like, we can push another value onto the stack. Let’s push the value $6D onto the stack this time. Here’s the code:

       LDA #$6D
       PHA

This time, the stack pointer will be decremented (to $FE), and the value $6D stored at the indicated location. Figure 5 shows how the stack looks now.

       6502 stack
$01FF │    40     │          SP
      ├───────────┤         ┌──┐
      │    6D     │ <───────┤FE│
      ├───────────┤         └──┘
      │           │
      │           │

            Figure 5.

See how simple the PHA instruction is? No registers except the stack pointer are affected, and the numbers are sitting on the stack, ready for you to use them. How do we get them back? With the PLA instruction, of course!

Not like pulling teeth.

Once you have numbers stored on the stack, they’re incredibly easy to retrieve. We simply use the PLA instruction. Its format is:

PLA         (IMPLIED)

The PLA instruction takes the first number on the stack, places it in the accumulator, sets the SIGN and ZERO flags accordingly, and increments the stack pointer so that the next value is ready to be pulled from the stack. Let’s see how this works with the numbers we placed on the stack earlier.

Figure 5 shows the stack as it appears now. We want to pull a value off the stack, so we write the following code:

       PLA

The 6502 loads the accumulator from the indicated byte of the stack ($6D) and increments the stack pointer. At this point, the accumulator contains $6D, and the stack looks like Figure 6.

       6502 stack
$01FF │    40     │ <───┐    SP
      ├───────────┤     │   ┌──┐
      │    6D     │     └───┤FF│
      ├───────────┤         └──┘
      │           │
      │           │

            Figure 6.

Simple, right? We’ve just retrieved the last number placed on the stack. Let’s do it again. We use the code:

       PLA

When complete, the accumulator contains $40, and the stack looks like Figure 7.

       6502 stack   <───┐
      ┌───────────┐     │
$01FF │    40     │     │    SP
      ├───────────┤     │   ┌──┐
      │    6D     │     └───┤00│
      ├───────────┤         └──┘
      │           │
      │           │

            Figure 7.

Now you see how easy stack usage is. All you need to do is push and pull the desired values, and the computer takes care of all necessary overhead. However, there are a few things you need to remember when using the stack.

Stack logic.

The first thing you must remember about the stack is that it is a LIFO (Last-In, First-Out) structure. That is, the last number you place onto the stack will be the first number that you pull off. This sometimes takes getting used to, but you’ll get the hang of it if you diagram your stack logic on paper first.

Second, the stack can only hold up to 256 numbers, and some space on the stack is used by the system. A good rule of thumb is to use the stack only when you need to, like in BASIC USR calls or when you’re running out of memory (a PHA only takes one byte; an STA can take up to three bytes).
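
If you have another machine handy, the LIFO behavior is easy to try out away from the Atari. What follows is only a rough Python model of the stack as Figures 5 through 7 draw it, not 6502 code. The class name Stack6502 and its pha and pla methods are made up for this sketch, and the pointer starts at $00 purely to match the figures. (The real chip orders the store and the pointer update a little differently internally, but the last-in, first-out result is the same.)

```python
class Stack6502:
    """Toy model of the 6502 stack as drawn in the figures (assumed layout)."""

    def __init__(self, sp=0x00):
        self.page1 = [0] * 256   # the stack page, $0100-$01FF
        self.sp = sp             # 8-bit stack pointer

    def pha(self, a):
        self.sp = (self.sp - 1) & 0xFF   # pointer moves down one byte
        self.page1[self.sp] = a & 0xFF   # value lands at $0100 + SP

    def pla(self):
        a = self.page1[self.sp]          # read the byte SP points at
        self.sp = (self.sp + 1) & 0xFF   # pointer moves back up
        return a


stack = Stack6502()
stack.pha(0x40)          # first value pushed
stack.pha(0x6D)          # second value pushed
first = stack.pla()      # comes back first: $6D
second = stack.pla()     # comes back second: $40
print(hex(first), hex(second), hex(stack.sp))   # 0x6d 0x40 0x0
```

The value pushed last ($6D) is the one pulled first, and after two pushes and two pulls the pointer is back where it started.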

Using the stack.

What can you use the stack for? Most people use it to store numbers temporarily or as a small table that automatically maintains pointers.

Here’s an example of using the stack to save the accumulator’s contents when a subroutine is executed. Remember that when a subroutine is executed, if it uses any registers, the values that were in those registers are lost.

Figure 8 shows how to save the accumulator so that you can be sure it is unchanged after a subroutine executes.

10        PHA         ;SAVE ACCUMULATOR
20        JSR SUBRTN  ;CALL SUBROUTINE
30        PLA         ;RESTORE ACCUMULATOR

                Figure 8.

Line 10 pushes the accumulator’s contents onto the stack. Now, no matter what the subroutine does with the accumulator, we can always restore the accumulator to its original value.

Line 20 calls the subroutine SUBRTN with the JSR instruction. We assume that the subroutine manipulates the accumulator, changing it to some unknown value.

Line 30 pulls the old accumulator value off the stack, making sure that we have the accumulator restored to the desired value.

Unfortunately, the designers of the 6502 did not allow for the PUSHing of the X and Y registers, so we have to write a little extra code.

To push the X register, we use the code:

       TXA          ;MOVE X TO ACCUM.
       PHA          ;AND PUSH IT!

This transfers the X register to the accumulator, then pushes the value onto the stack. Similarly, the Y register can be pushed with the sequence:

       TYA          ;MOVE Y TO ACCUM.
       PHA          ;AND PUSH IT!

To pull the X or Y registers from the stack, use one of the following code sequences:

       PLA          ;PULL THE VALUE,
       TAX          ;AND PUT IN X!

       PLA          ;PULL THE VALUE,
       TAY          ;AND PUT IN Y!

These routines are simple enough, but you should remember that the accumulator will be lost in all of these operations unless you save it somewhere first.
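
To see why the order of these pushes and pulls matters, here is a small Python sketch of saving all three registers around a subroutine. It is a model only: the dictionary of registers, the function names and the register-trashing subroutine are hypothetical stand-ins, and each line is commented with the 6502 instruction it imitates. The accumulator goes onto the stack first because TXA and TYA overwrite it, and the pulls come back in the reverse order of the pushes.

```python
def call_with_saved_registers(regs, subroutine):
    """Imitate PHA/TXA/PHA/TYA/PHA ... PLA/TAY/PLA/TAX/PLA around a call."""
    stack = []

    stack.append(regs["A"])      # PHA  (save A before TXA overwrites it)
    regs["A"] = regs["X"]        # TXA
    stack.append(regs["A"])      # PHA  (save X)
    regs["A"] = regs["Y"]        # TYA
    stack.append(regs["A"])      # PHA  (save Y)

    subroutine(regs)             # JSR SUBRTN (may change any register)

    regs["A"] = stack.pop()      # PLA
    regs["Y"] = regs["A"]        # TAY  (last pushed, first pulled)
    regs["A"] = stack.pop()      # PLA
    regs["X"] = regs["A"]        # TAX
    regs["A"] = stack.pop()      # PLA  (A comes back last)
    return regs


def trash_everything(regs):
    """Hypothetical subroutine that destroys all three registers."""
    regs["A"] = regs["X"] = regs["Y"] = 0


registers = {"A": 0x11, "X": 0x22, "Y": 0x33}
call_with_saved_registers(registers, trash_everything)
print(registers)   # {'A': 17, 'X': 34, 'Y': 51}
```

Swap the order of the pulls and X and Y come back exchanged, which is exactly the kind of bug that diagramming the stack on paper helps you catch.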

Saving your status.

Sometimes you’ll want to save the processor status register before a subroutine or comparison operation so that you can test certain flags later. This can be done by using the PHP (Push processor status register onto stack) and PLP (Pull processor status register from stack) instructions. Their formats are:

       PHP
       PLP

The PHP and PLP instructions work just like the PHA and PLA instructions, except that they push and pull the status flags instead of the accumulator.

The PHP instruction does not affect any flags, but the PLP instruction changes all the flags, since it is actually loading the flags from the stack.

We’ll explore the use of the PHP in more detail later, when the need arises.

Which way to the stack?

Someday, you may need to know where the stack pointer is currently pointing, or you may need to change the stack pointer to point to a particular location. This is usually a rare occurrence, but I needed to do this in my debug utility, HBUG, in issue 18.

The 6502 has two instructions that will allow us to examine and change the stack pointer. These are TSX (Transfer stack pointer to X) and TXS (Transfer X to stack pointer). The formats of these instructions are:

       TSX
       TXS

The TSX instruction simply loads the X register with whatever happens to be in the stack pointer at the time. The sign and zero flags reflect the result of the load.

Figure 9 shows an example of the use of the TSX instruction.

10        *=  $0600
12        LDA #$F0    ;PUT # IN ACCUM.
14        TSX         ;GET STACK PTR
16        STX STACK1  ;SAVE STACK #1
18        PHA         ;PUSH ACCUM.
20        TSX         ;GET STACK PTR
22        STX STACK2  ;SAVE STACK #2
24        PLA         ;PULL ACCUM.
26        TSX         ;GET STACK PTR
28        STX STACK3  ;SAVE STACK #3
30        BRK         ;ALL DONE!
32 STACK1 *=*+1
34 STACK2 *=*+1
36 STACK3 *=*+1
38        .END

                Figure 9.

Let’s walk through this code and see what happens.

Line 12 loads the accumulator with $F0.

Line 14 transfers the current contents of the stack pointer to the X register.

Line 16 stores the X register (which now contains the stack pointer value) in the location STACK1. This records the original stack location, so we can observe it later.

Line 18 pushes the accumulator onto the stack. As we now know, the stack pointer will be decremented by 1 after this operation.

Line 20 transfers the stack pointer to the X register again.

Line 22 stores the X register (containing the stack pointer value) in the location STACK2. This will record the stack’s position after the PHA instruction.

Line 24 pulls the accumulator from the stack.

Line 26 transfers the stack pointer to the X register a final time.

Line 28 stores the stack pointer contained in the X register at the location STACK3.

Line 30 stops the program’s execution.

Type this program into your computer and assemble it. Note the locations of STACK1, STACK2 and STACK3 during the assembly. When the program is assembled, execute it.

After execution, examine the memory locations at STACK1, STACK2 and STACK3. STACK1 contains the stack’s location at the beginning of the program. STACK2 contains the stack’s location after the PHA instruction. Since the PHA decrements the stack pointer, STACK2 should be one less than STACK1.

STACK3 contains the stack pointer’s contents after the PLA instruction. A PLA instruction increments the stack pointer, so STACK3 will be one more than STACK2.
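
If you’d like to predict the three values before running Figure 9, the bookkeeping can be sketched in a few lines of Python. This is only a model of the pointer arithmetic: the function name trace_figure_9 is invented here, and the starting pointer value is whatever your system happens to have at the time, so the $F5 below is just an assumed example.

```python
def trace_figure_9(initial_sp):
    """Return the values Figure 9 would leave in STACK1, STACK2 and STACK3."""
    sp = initial_sp & 0xFF      # the stack pointer is an 8-bit register
    stack1 = sp                 # lines 14-16: TSX / STX STACK1
    sp = (sp - 1) & 0xFF        # line 18: PHA moves the pointer down one
    stack2 = sp                 # lines 20-22: TSX / STX STACK2
    sp = (sp + 1) & 0xFF        # line 24: PLA moves it back up one
    stack3 = sp                 # lines 26-28: TSX / STX STACK3
    return stack1, stack2, stack3


print([hex(v) for v in trace_figure_9(0xF5)])   # ['0xf5', '0xf4', '0xf5']
```

Because the pointer is only one byte wide, a starting value of $00 would wrap around and give $FF for STACK2.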

The TXS instruction does the opposite of TSX. That is, you can move the contents of the X register to the stack pointer. To do this, you simply load the X register with the desired value and execute a TXS instruction, like so:

        LDX #$40    ;STACK AT $0140
        TXS         ;POINT THERE!

I strongly suggest that you leave this instruction alone for the time being. Incorrect setting of the stack pointer can cause a system lockup, so hold on until we get a chance to use it safely in a Boot Camp program.

All for now.

Well, we’ve covered all the major 6502 instructions, and we’re ready to learn some system-specific material. Starting next issue, we’ll go full speed ahead into the world of the Atari’s innards.

Send all letters to:

Boot Camp

 c/o ANALOG Computing
 P.O. Box 23
 Worcester, MA 01603
A.N.A.L.O.G. ISSUE 23 / OCTOBER 1984 / PAGE 80

Boot Camp

by Tom Hudson

As of now, all Boot Camp readers have been exposed to the instructions most assembly programmers consider important. Sure, there are a few we skipped, but they’re primarily used for advanced applications, such as interrupt handling. We’ll discuss them later.

This month’s column presents some information that will prepare you for next issue’s subject: BASIC USR calls.

I’m assuming that most readers of Boot Camp want to speed up their BASIC programs with ultra-fast machine code. If you don’t, you could probably skip this issue’s column, but it’s best that you read it and understand it. If you gain more knowledge of assembly language now, you’ll have fewer “unsolvable” problems later.

Tailor-made or off-the-rack.

From time to time, while reading ANALOG or other magazines dealing with assembly language, you may have heard the term relocatable used to describe an assembly program. What does this term mean? I find a good analogy in the clothing industry, suits in particular.

When a wealthy executive goes out to buy a suit, he probably won’t go to the local self-service “Bargain Barn” to find one. No, he’ll usually see a tailor in order to have one custom-made. More than likely, the suit he has made will fit him perfectly, and nobody else.

When I, on the other hand, go out to buy a suit, I like to get it over with as soon as possible, since I have more important things to do than buy something I’ll use at most a dozen times a year. I’ll go straight to the “Bargain Barn” and pick out an off-the-rack, stock suit. The suit would probably fit thousands of other people fairly well. If I’m lucky, it’ll look just about as good as a tailor-made suit. Usually, though, you’ll have to compromise in some area, such as “perfect” fit.

Assembly programs are sort of like suits. Some programs are written to run only in a specific area of memory and are known as non-relocatable.

Here’s an example: my program Retrofire (issue 14) was written to reside in the area of memory starting at $0800. If you try to load it at $6000, it just won’t work. At best, the screen may change color, and the system will crash. Even if you place it as little as one byte off, at $0801, it will crash. That’s because this program is tailor-made to work only at $0800, and no amount of work (short of re-assembly) will make it operate elsewhere. Retrofire is non-relocatable.

On the other hand, let’s say you write a short assembly routine that is going to be used by a BASIC USR call. You’ve placed the object code bytes in the BASIC string ML$ and are going to call it with the statement:

A=USR(ADR(ML$))

Since you don’t know where in memory BASIC will put the string ML$, this routine must be relocatable. It must be able to operate wherever BASIC puts it.

Just like you would with an off-the-rack suit, you’ll have to be willing to compromise to a certain degree, by writing your code so that it can be placed anywhere in memory.

Let’s take a look at how relocatable routines are written.

The ABCs of relocatability.

When you’re writing normal, non-relocatable code, you don’t have to worry about anything. You simply write to your heart’s content and let the computer do the rest.

Not so with relocatable code. There is one rule that must be followed without exception: never refer to a location or label within the relocatable code with an absolute format instruction.

What does this mean? Take a look at Figure 1.

0000        10          *=   $0600
0600 A901   20          LDA  #$01
0602 200606 30          JSR  SUBR01
0605 60     40          RTS
0606 8D0A06 50 SUBR01   STA  TEMP
0609 60     60          RTS
060A        70 TEMP     *=   *+1

                Figure 1.

Let’s assume we want to use the code in Figure 1 as a relocatable subroutine. We’ve got two problems.

First, the JSR instruction is an absolute addressing instruction and it is referring to the label SUBR01, which is within our routine. What does the relocation rule say? We cannot use an absolute addressing instruction which refers to a label within the code to be relocated. This JSR is a definite no-no.

Second, the STA TEMP instruction is also absolute and it refers to TEMP, a label within the routine. Sorry, but you can’t do this, either!

Let’s see what happens if this routine is relocated to $6000, instead of $0600, where it was assembled. Figure 2 shows the program image stored in memory at $6000, with the source code shown to the right.

(6000) A901   =  LDA  #$01
(6002) 200606 =  JSR  $0606
(6005) 60     =  RTS
(6006) 8D0A06 =  STA  $060A
(6009) 60     =  RTS
(600A)        =  *=   *+1

            Figure 2.

First, the LDA #$01 is executed. Since this is an immediate format instruction, all is well so far.

Next, the JSR SUBR01 instruction executes. If you look at Figure 1, you’ll see that SUBR01 is supposed to be at location $0606, but the program has been relocated to $6000! The code at SUBR01 is now at $6006, yet the 6502 has no alternative but to follow its instructions. It JSRs to $0606!

What happens next is anybody’s guess. Location $0606 may contain BRK instructions, garbage or even Aunt Mary’s recipe program. There’s simply no way of telling, and the system will probably crash.
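
The heart of the problem is that moving the code copies its bytes unchanged, addresses and all. Here is a rough Python illustration using the object bytes shown in Figure 2. The helper names (absolute_operand, relocate) and the SUBR01_OFFSET constant are made up for this sketch, which only reads the bytes; it does not execute them.

```python
# Object bytes as shown in Figure 2 (assembled for $0600).
OBJECT = bytes([
    0xA9, 0x01,        # LDA #$01
    0x20, 0x06, 0x06,  # JSR $0606
    0x60,              # RTS
    0x8D, 0x0A, 0x06,  # STA $060A
    0x60,              # RTS
])
SUBR01_OFFSET = 6      # SUBR01 sits six bytes into the routine


def absolute_operand(image, offset):
    """Read the two-byte address (low byte first) after the opcode at offset."""
    return image[offset + 1] | (image[offset + 2] << 8)


def relocate(load_address):
    """Return (address the JSR calls, address where SUBR01 really is)."""
    image = bytes(OBJECT)                     # moving the code copies it unchanged
    jsr_target = absolute_operand(image, 2)   # the JSR opcode is at offset 2
    return jsr_target, load_address + SUBR01_OFFSET


for base in (0x0600, 0x6000):
    target, actual = relocate(base)
    print("loaded at $%04X: JSR goes to $%04X, SUBR01 is at $%04X"
          % (base, target, actual))
```

At $0600 the two addresses agree. At $6000 the JSR still heads for $0606 while the subroutine is sitting at $6006.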

How do we avoid such a catastrophe? It takes a little work, but it can be done. Rethink your program so that it does not use absolute addressing instructions. Sometimes this is easier said than done, but if you want it relocatable, you’ve got to work a little harder.

The most common problem in relocating comes when you need to JMP to another part of the routine. Remember, the most common JMP instruction is (you guessed it) absolute! Here’s an uncomplicated solution…

All of the 6502 branch instructions use relative addressing. This isn’t absolute, so we can use all the branch instructions in our relocatable routines. The only problem is that all the branch instructions are conditional. In order to branch each time the branch instruction is executed, we’ll have to make sure its branch condition is true. All the following combinations will replace the JMP instruction:

        CLC
        BCC LABEL

        SEC
        BCS LABEL

        LDA #0
        BEQ LABEL

        LDA #1
        BNE LABEL

        LDA #$FF
        BMI LABEL

All of these branch instructions replace the JMP instruction, but their branch range is limited to about 128 bytes. That is, if your relocatable routine is 200 bytes long, and you need to branch from the end to the beginning, one branch won’t go far enough. You’ll have to set up a “bucket brigade” branch. This is accomplished by branching to a second branch, which, in turn, branches to the final destination label. We’ll look at this process in detail in another installment.
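
The “about 128 bytes” limit comes from the branch’s one-byte signed offset, which is counted from the instruction after the branch. The Python sketch below is only an illustration of that arithmetic: branch_offset is a made-up helper, and the addresses ($0600 for the start of a 200-byte routine, $0664 for a relay branch in the middle) are assumed for the example.

```python
def branch_offset(branch_address, target_address):
    """Signed offset a 6502 branch at branch_address needs to reach the target."""
    # The offset is measured from the byte after the two-byte branch.
    offset = target_address - (branch_address + 2)
    if not -128 <= offset <= 127:
        raise ValueError("target out of branch range: %d bytes" % offset)
    return offset


# One branch from the end of a 200-byte routine back to its start: too far.
try:
    branch_offset(0x06C6, 0x0600)
    direct_ok = True
except ValueError:
    direct_ok = False

# "Bucket brigade": branch to a relay branch, which branches to the start.
hop1 = branch_offset(0x06C6, 0x0664)
hop2 = branch_offset(0x0664, 0x0600)
print(direct_ok, hop1, hop2)   # False -100 -102
```

Each hop stays inside the one-byte range, so the pair does the job the single branch could not.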

Where to put data?

Another common problem in relocatable routines is being uncertain about where to place data values. They can’t be placed in the routine itself, because to load and store the data requires the use of absolute addressing.

If your relocatable routine is for Atari BASIC, you can use the zero page locations $CB through $D1. Page 6 ($0600–$06FF) is also available for data storage. When your relocatable routine utilizes data in these areas, all is well because they never move.

Subroutines in relocatable code.

Using subroutines in relocatable code is a particularly messy problem and one for which I’ve never seen a good solution. For now, try to write any relocatable routines with the subroutine code in-line. This is usually acceptable for short subroutines.

Making code relocatable.

As we have seen, the code in Figure 1 is far from being relocatable. However, we can make it relocatable with a few small changes.

First, let’s get rid of the subroutine. It’s a short one, so there is no real problem with putting it in-line. Figure 3 shows the code modified to eliminate the subroutine.

        LDA #$01
        STA TEMP
        RTS
   TEMP *=*+1

     Figure 3.

Okay, that takes care of the subroutine problem, but there’s still the matter of the TEMP storage location.

No problem, we’ll simply place it in a free location on page zero, as shown in Figure 4.

   TEMP = $CB
        LDA #$01
        STA TEMP
        RTS

      Figure 4.

As you can see, we have merely told the assembler that TEMP is at location $00CB. This shows the use of the EQUATE directive. Your assembler may use the directive EQU instead of the “equal” sign. Check your assembler manual.

That was simple enough, right? Let’s do another one.

      ADC #1
      STA BYTE2
      BNE PART3
      JMP PART2
BYTE1 *=*+1
BYTE2 *=*+1

        Figure 5.

Figure 5 shows a slightly larger program that is not relocatable. It has two data items and three JMP instructions that must be altered in order to make the program relocatable.

      BNE PART3
      JMP PART2
BYTE1 =   $600
BYTE2 =   $601
      ADC #1
      STA BYTE2
      JMP PART3

    Figure 6.

Let’s change the data items first. They’re easiest, since the only action needed is to place them in fixed memory somewhere. We’ll put them on page 6, the area of memory set aside for our use. Figure 6 shows the program after we make the data item change.

Now let’s tackle the JMP instructions. The first JMP jumps to PART3. If you examine the code at PART3, you’ll see that it expects the accumulator to contain the result of the add in the START section. Therefore, we cannot alter the accumulator. In this case, let’s replace JMP PART3 with the code:

        CLC
        BCC PART3

This code clears the carry flag, forcing the BCC PART3 to branch. It’s simple and it works just like the JMP did.

The next JMP, the one in the PART2 section, will JMP to START. We need to replace the JMP with a branch, and this case is particularly easy.

If you look at the instruction preceding the JMP, you’ll see that it’s a BNE (Branch Not Equal) instruction. This means that the JMP START instruction will only execute if the accumulator is equal to four.

We can take advantage of this fact when we replace the JMP. In this situation, the JMP START can be replaced with:

        BEQ START

The last JMP, in the PART3 section, is right after an LSR instruction. We don’t want to disturb the accumulator, so we can replace the JMP with the code:

        CLC
        BCC PART2

The final, relocatable code for the program is shown in Figure 7.

      ADC #1
      STA BYTE2
      BCC PART3
      BNE PART3
      BCC PART2
BYTE1 =   $600
BYTE2 =   $601

   Figure 7.

The important thing to remember when making a program relocatable is to avoid disturbing any registers the program is using. Don’t make any assumptions about what the program is doing—check it out.

Review the instructions.

It’s a good idea, at this point, for you to go back and review all the operation codes we’ve discussed so far, noting all those which use the absolute addressing mode. It’s important that you get to know all of the assembly instructions as well as you know the BASIC commands. This will avoid wasting a lot of time looking instructions up in a book when you start programming.

Next issue, we’ll talk more about relocatable code, when we start examining BASIC USR calls. Until then, review!

Send all letters to:

Boot Camp
 c/o ANALOG Computing
 P.O. Box 23
 Worcester, MA 01603
A.N.A.L.O.G. ISSUE 25 / DECEMBER 1984 / PAGE 57

Boot Camp

by Tom Hudson

Well, Boot Camp fans, here I am again, after a month’s absence. I hope all Boot Camp readers typed in the BOFFO program from last issue—it’ll be a big help to you in the future when building BASIC USR call structures.

Speaking of BASIC USR calls, that just happens to be the subject of this issue’s column. We’re going to take a couple of problems and solve them using assembly language, via the USR function.

Using USR.

As we all know, interpreted BASIC is a very slow way to get things done in a computer. For an explanation of what happens in interpreted BASIC, take a look at issue 13’s Boot Camp. We’re going to look at a way to overcome this inherent sluggishness, using the USR (call user subroutine) function. The format of this function is:

USR( aexp1 [,aexp2] [,aexp3...] )

…where aexp1 is the address of the assembly code to be executed, and aexp2, aexp3, etc. are optional numeric arguments which are passed to the assembly routine.

The USR function simply tells BASIC to perform a JSR to the assembly code located at the address indicated by aexp1. Take a look at the following USR call:

A=USR(1536)

This USR call will execute the code at location 1536 ($0600) and return to BASIC, where the program will continue execution with the next statement. If the subroutine is to return a value to BASIC, it will be returned in the variable A. This USR call doesn’t use any arguments.

Many times, you’ll want your USR call to process some type of data or accept parameters of some sort. This is done by using the optional arguments. Look at this example:

A=USR(30926,102,Q*3)

This USR function calls the assembly code located at address 30926 ($78CE) and passes two values, 102 and the value of Q*3, to the subroutine. The arguments must be integers in the range 0–65535, and the assembly code must be equipped to handle the two parameters properly.

What happens to the parameters we send? BASIC places them on the 6502 stack for easy retrieval by the assembly code. For example, look at the following USR call:

A=USR(1536,509,200)

This code calls the routine at 1536 ($0600) and passes two arguments, 509 and 200, to the subroutine. After the USR function is processed, and control is passed to the assembly code, the stack looks like Figure 1.

         │   nn   │ ┐
         ├────────┤ ├ RETURN ADDR (nnnn)
         │   nn   │ ┘
         │   200  │ ┐
         ├────────┤ ├ ARGUMENT 2
         │     0  │ ┘
         │   253  │ ┐
         ├────────┤ ├ ARGUMENT 1
         │     1  │ ┘
POINTER─>│     2  │ ─ # OF ARGUMENTS

                 Figure 1.

Remember that the stack grows downward in memory, so the first number on the stack is the number of two-byte arguments placed on the stack by BASIC, or 2. This value is important for assembly code which may require a variable number of parameters, since the subroutine can tell how many parameters are available by this number.

The next item on the stack is the first parameter, 509. As you can see, the high byte of this value will be pulled off the stack first, followed by the low byte. All arguments are placed on the stack in high-byte, low-byte order.

Next in the stack is the second parameter, 200. Like the first argument, it is placed in high-byte (0), low-byte (200) order.

Finally, the stack contains the return address for the assembly subroutine. In this stack illustration, the address is shown as nnnn, since we don’t know where BASIC will go when the assembly routine returns. This is simply an RTS return address, and—after processing the parameters, if any—the subroutine merely executes an RTS instruction to return to BASIC.

Remember how I said the assembly code could return a value to BASIC? If this is necessary, the assembly code should place the low byte of the value in location 212 ($D4) and the high byte in location 213 ($D5). BASIC automatically puts the value into the proper variable. The programmer must take care, however, that the result is in the range 0–65535.
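
To check your understanding of the stack layout, it can help to work out by hand what each PLA will deliver. The Python sketch below does that bookkeeping. It is just a model: usr_pull_sequence and usr_result are names invented for this illustration, not anything BASIC provides.

```python
def usr_pull_sequence(*args):
    """Bytes a USR routine receives from successive PLAs, in pull order."""
    pulled = [len(args)]                 # first PLA: number of arguments
    for value in args:
        if not 0 <= value <= 65535:
            raise ValueError("arguments must be in the range 0-65535")
        pulled.append(value >> 8)        # high byte is pulled first
        pulled.append(value & 0xFF)      # then the low byte
    return pulled


def usr_result(low, high):
    """Value BASIC builds from locations 212 (low byte) and 213 (high byte)."""
    return high * 256 + low


print(usr_pull_sequence(509, 200))   # [2, 1, 253, 0, 200]
print(usr_result(253, 1))            # 509
```

The first list matches Figure 1 read from the pointer upward: the argument count, then 509 as 1 and 253, then 200 as 0 and 200.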

Useful example #1.

I’ve always said that there’s no better way to learn something than hands-on experience, so let’s write a simple program to add two arguments passed by a BASIC program.

We’ll place the subroutine on page 6 (you could put it anywhere in free RAM) and call it with the following code:

A=USR(1536,VAL1,VAL2)

Now let’s write the assembly code necessary for the routine. First, we must set up the program for binary arithmetic with the CLD instruction. I can’t overemphasize how important it is to know the status of the decimal mode flag, especially in programs which perform mathematical operations. Our program looks like this:

0160        *=  $0600  ;PUT ON PAGE 6
0170        CLD        ;BINARY MATH!

Next, we use the PLA instruction to pull the first byte from the stack. As explained earlier, the first byte tells the routine how many arguments were passed to the routine. For simplicity, we’ll assume the USR call was properly set up with two arguments, and ignore this value. Now our program looks like this:

0160        *=  $0600 ;PUT ON PAGE 6
0170        CLD       ;BINARY MATH
0180        PLA       ;PULL # OF ARGS

The next step is to pull the first argument from the stack, placing it in a temporary storage location. We have several bytes available on page zero, from $CB–$D1, so we’ll use $CB (for the low byte) and $CC (for the high byte) to hold parameter 1. Remember, parameters are stored in high-byte, low-byte order. We first pull the high byte from the stack and store it, then the low byte. Our program now looks like this:

0100 ARG1L  =   $CB
0110 ARG1H  =   $CC
0160        *=  $0600 ;PUT ON PAGE 6
0170        CLD       ;BINARY MATH!
0180        PLA       ;PULL # OF ARGS
0190        PLA       ;PULL ARG1 HI
0200        STA ARG1H ;SAVE IT
0210        PLA       ;PULL ARG1 LOW
0220        STA ARG1L ;SAVE IT

Now we must pull the second argument from the stack and place it in temporary storage. We’ll use locations $CD (for the low byte) and $CE (for the high byte) to store parameter 2. Once again, we must pull the parameter from the stack, high byte first, then low byte, storing each in the proper location. Our program so far:

0100 ARG1L  =   $CB
0110 ARG1H  =   $CC
0120 ARG2L  =   $CD
0130 ARG2H  =   $CE
0160        *=  $0600 ;PUT ON PAGE 6
0170        CLD       ;BINARY MATH!
0180        PLA       ;PULL # OF ARGS
0190        PLA       ;PULL ARG1 HI
0200        STA ARG1H ;SAVE IT
0210        PLA       ;PULL ARG1 LOW
0220        STA ARG1L ;SAVE IT
0230        PLA       ;PULL ARG2 HI
0240        STA ARG2H ;SAVE IT
0250        PLA       ;PULL ARG2 LOW
0260        STA ARG2L ;SAVE IT

Okay, we’ve pulled all the arguments from the stack, and we’re now ready to add them together and put the result in locations $D4 and $D5, which we’ll label RESLO and RESHI. The addition is a simple, two-byte add, like many we’ve covered before. After the addition, we place an RTS instruction to take us back to BASIC, and the assembly code is complete. The final program looks like this:

0100 ARG1L  =   $CB
0110 ARG1H  =   $CC
0120 ARG2L  =   $CD
0130 ARG2H  =   $CE
0140 RESLO  =   $D4
0150 RESHI  =   $D5
0160        *=  $0600 ;PUT ON PAGE 6
0170        CLD       ;BINARY MATH!
0180        PLA       ;PULL # OF ARGS
0190        PLA       ;PULL ARG1 HI
0200        STA ARG1H ;SAVE IT
0210        PLA       ;PULL ARG1 LOW
0220        STA ARG1L ;SAVE IT
0230        PLA       ;PULL ARG2 HI
0240        STA ARG2H ;SAVE IT
0250        PLA       ;PULL ARG2 LOW
0260        STA ARG2L ;SAVE IT
0270        LDA ARG1L ;GET ARG1 LOW
0280        CLC       ;CLC FOR ADD,
0290        ADC ARG2L ;ADD TO ARG2 LOW
0300        STA RESLO ;SAVE RESULT LOW
0310        LDA ARG1H ;GET ARG1 HI
0320        ADC ARG2H ;ADD TO ARG2 HI
0330        STA RESHI ;SAVE RESULT HI
0340        RTS       ;ALL DONE.

Now we must assemble the program and place the object file on disk. Use BOFFO (see issue 24 of ANALOG Computing) to convert the object file to BASIC DATA statements. If you don’t have BOFFO, you can do this by hand. Using BASIC, you must POKE this data into memory, starting at location 1536, then call the subroutine with the USR call shown earlier. Figure 2 shows a BASIC program which does this.

10 FOR X=1536 TO 1563:READ N:POKE X,N:NEXT X
20 INPUT VAL1,VAL2
30 A=USR(1536,VAL1,VAL2)
40 ? A
50 GOTO 20
60 DATA 216,104,104,133,204,104,133,203,104,133,206,104,133,205,165,203,24,101,205,133,212,165,204,101,206
70 DATA 133,213,96

                Figure 2.

Line 10 READs the DATA statements and POKEs each byte into page 6. This sets up the assembly subroutine so we can use it through BASIC.

Line 20 accepts the two values to be added and places them in the variables VAL1 and VAL2. Be careful that the values you enter will not add up to more than 65535, or you’ll get an incorrect result.

Line 30 calls the assembly subroutine with the USR function and places the result (VAL1 + VAL2) in the variable A.

Line 40 prints the result of the addition.

Line 50 loops back to accept another set of values to add.

Lines 60–70 are the DATA statements which contain the numeric values for the assembly code. The first value, 216, is the decimal value for the CLD instruction, the first instruction in the subroutine. The next number, 104, is the value of the PLA instruction, and so on.

When the program is executed, enter two numbers and press RETURN. BASIC will send the values to the assembly routine, which will add them and return. BASIC will then print the result. Nifty, huh?
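
If you’re curious what “an incorrect result” looks like when the total passes 65535, the routine’s byte-at-a-time arithmetic can be followed in Python. This is a sketch of the same steps, not the routine itself: usr_add is an invented name, and the comments show which assembly instructions each line stands in for.

```python
def usr_add(val1, val2):
    """Follow the page 6 routine: a 16-bit add done one byte at a time."""
    arg1l, arg1h = val1 & 0xFF, (val1 >> 8) & 0xFF
    arg2l, arg2h = val2 & 0xFF, (val2 >> 8) & 0xFF

    total = arg1l + arg2l              # LDA ARG1L / CLC / ADC ARG2L
    reslo = total & 0xFF               # STA RESLO
    carry = total >> 8                 # carry flag left by the low-byte add

    total = arg1h + arg2h + carry      # LDA ARG1H / ADC ARG2H
    reshi = total & 0xFF               # STA RESHI (carry out is not kept)

    return reshi * 256 + reslo         # what BASIC reads from $D4/$D5


print(usr_add(509, 200))       # 709
print(usr_add(40000, 30000))   # 4464
```

The second call should be 70000, but only 16 bits come back, so BASIC sees 70000 - 65536 = 4464.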

Useful example #2.

Our first example of using the USR function showed how to call a routine stored in a fixed memory location, such as page 6. Sometimes we can’t use page 6 for some reason, so we must find another place to store our routines. Luckily for us, BASIC has a built-in way to reserve RAM: Strings!

Strings are usually used in BASIC to hold alphanumeric information, such as names, messages or other text. As you will soon see, strings are not limited to these uses. Each position in a string can hold a single byte, with a value from 0–255, just like any other memory location. What we’re going to do is load the bytes of an assembly subroutine into a string and call it with a USR function.

There’s one small snag with this technique, though: strings can move around in memory! Yes, that’s right. They aren’t always in the same place. When you execute a BASIC program, the BASIC interpreter puts the strings in the first available space it can find. If you add or delete code in your program, BASIC must move the string to the appropriate location. We can always find the string’s location with the ADR function, but the code placed in the string must be made relocatable, or address-independent. This simply means that the code will execute no matter where BASIC places it in memory. We covered this subject in ANALOG Computing issue 23’s Boot Camp, so, if you haven’t read that, do so now.

Our second USR call example will return a random number from 0–65535. The assembly subroutine will be placed in the string RAND$. To call the subroutine, we’ll use the code:

A=USR(ADR(RAND$))

As you can see, we are using the ADR function to find the address of RAND$, so that the USR call will know where the routine is located in memory. You should also note that there are no arguments being passed to the subroutine, so we won’t have to worry about storing any parameters. We will, however, have to pull the number of parameters, which will be zero.

This time we don’t care how the decimal mode is set, because we aren’t going to perform any arithmetic. Therefore, our first action in the program is to pull the number of arguments from the stack. Remember to PLA this value; even if there are no arguments, BASIC puts the number of arguments on the stack. After pulling this value, we will forget about it. So far, our code looks like this:

0130        *=  $0600  ;PUT ANYWHERE
0140        PLA        ;PULL # OF ARGS

Note that I have set the program counter to $0600, even though we will be placing this code in a string. Most assemblers require a starting address, so we’re providing a dummy address. If you like, you can always place this code at $0600, since it’ll execute anywhere in memory.

Next, we need to get a random number. Fortunately for us, the designers of the Atari computer systems gave us a handy memory location, RANDOM. RANDOM is located at $D20A (53770) and gives a random number from 0–255 when it’s read. In order to get a random number from 0–65535, we just have to read RANDOM twice, each time placing the byte read into the BASIC return value locations $D4 and $D5. For example, assume the first RANDOM byte is 194, and that it’s placed in the low byte of the result. Further, assume the second RANDOM byte is 49, and that it’s placed in the high byte of the result. When we return to BASIC, the random number will be 12738 ((49 * 256) + 194). When the random number is stored, we can return to BASIC with the RTS instruction. After the addition of this code and the necessary equates, our program looks like this:

0100 RESLO  =   $D4
0110 RESHI  =   $D5
0120 RANDOM =   $D20A
0130        *=  $0600  ;PUT ANYWHERE
0140        PLA        ;PULL # OF ARGS
0150        LDA RANDOM ;GET RANDOM #
0160        STA RESLO  ;PUT IN LOW
0170        LDA RANDOM ;GET RANDOM #
0180        STA RESHI  ;PUT IN HIGH
0190        RTS        ;ALL DONE!

Now you can assemble the code and convert it into BASIC DATA with the BOFFO program. Figure 3 shows the BASIC code needed to install and use the random number routine.

10 DIM RAND$(12)
20 FOR X=1 TO 12:READ N:RAND$(X,X)=CHR$(N):NEXT X
30 A=USR(ADR(RAND$))
40 ? A
50 GOTO 30
60 DATA 104,173,10,210,133,212,173,10,210,133,213,96

                Figure 3.

Line 10 dimensions the RAND$ string to 12 bytes. That’s how many bytes the routine will occupy in memory.

Line 20 READs the DATA statements in Line 60 and places them into the RAND$ string, using the CHR$ function. This function simply places the numeric values of the assembly code into the string. If you print the string, you’ll see the ATASCII character equivalents of your code.

Line 30 calls the assembly subroutine and puts the random number into the variable A.

Lines 40–50 print the random number and loop back to Line 30 to get another.

Line 60 contains the decimal DATA for the random number routine.
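
As an illustration of how the two RANDOM bytes become the single number BASIC receives, here is a minimal Python sketch of the arithmetic. Python is used here only for illustration, and `combine_bytes` is a hypothetical helper name; nothing like it exists on the Atari itself.

```python
def combine_bytes(low, high):
    """Build a 16-bit value from a low byte and a high byte, the way
    BASIC reads a USR result from $D4 (low) and $D5 (high)."""
    if not (0 <= low <= 255 and 0 <= high <= 255):
        raise ValueError("each byte must be in the range 0-255")
    return high * 256 + low


# The worked example from the text: low byte 194, high byte 49.
print(combine_bytes(194, 49))  # 12738
```

The largest possible result, with both bytes at 255, is 65535.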

A new challenge.

Now that you’ve seen a couple of simple examples of BASIC USR calls, I’ve got a challenge for you to try at home. Write a USR call which will accept two arguments. Add them and divide the total by two, returning this value to BASIC. This is a simple, average routine, and you should be able to handle it easily, since we’ve covered all the techniques you need to solve the problem. The answer will be an integer, so don’t worry about fractional results.

When you get a solution, let me know. Send your solutions to:

Boot Camp
c/o ANALOG Computing
P.O. Box 23
Worcester, MA 01603

Next issue, we’ll go deeper into USR calls, including variable-argument calls. Stay tuned!

A.N.A.L.O.G. ISSUE 26 / JANUARY 1985 / PAGE 85

Boot Camp

by Tom Hudson

This month, Boot Camp continues its coverage of BASIC USR calls, with a couple of new twists, including variable-argument calls. So, if you haven’t read the last Boot Camp, I urge you to check it out. The information it contains is vital to this installment.

An “average” challenge.

I hope all Boot Camp readers tried to solve last issue’s problem, which was to write a USR call that would accept two arguments and find their average. This is done by adding the two arguments and dividing the total by two. Figure 1 shows one possible solution.

10 RESLO   =   $D4
20 RESHI   =   $D5
30         *=  $0600
0100       CLD       ;CLEAR DECIMAL MODE
0110       PLA       ;DISCARD # ARGS
0120       PLA       ;PULL ARG1 HI
0130       STA ARG1H ;AND SAVE IT
0140       PLA       ;PULL ARG1 LO
0150       STA ARG1L ;AND SAVE IT
0160       PLA       ;PULL ARG2 HI
0170       STA ARG2H ;AND SAVE IT
0180       PLA       ;PULL ARG2 LO
0190       STA ARG2L ;AND SAVE IT
0200       LDA ARG1L ;ADD...
0210       CLC       ;ARGUMENT1...
0220       ADC ARG2L ;TO...
0230       STA TEMPL ;ARGUMENT2...
0240       LDA ARG1H ;AND...
0250       ADC ARG2H ;PLACE IT...
0260       STA TEMPM ;IN TEMP!
0270       LDA #0    ;CATCH THE...
0280       ADC #0    ;FINAL CARRY...
0290       STA TEMPH ;IN TEMP HI
0300       LSR TEMPH ;DIVIDE...
0310       ROR TEMPM ;ARG1 + ARG2...
0320       ROR TEMPL ;BY 2
0330       LDA TEMPL ;AND PUT...
0340       STA RESLO ;THE AVERAGE...
0350       LDA TEMPM ;IN BASIC'S...
0360       STA RESHI ;RESULT AREA
0370       RTS       ;ALL DONE!
0380 ARG1L *=*+1     ;ARGUMENT 1
0390 ARG1H *=*+1
0400 ARG2L *=*+1     ;ARGUMENT 2
0410 ARG2H *=*+1
0420 TEMPL *=*+1     ;TEMPORARY HOLD
0430 TEMPM *=*+1
0440 TEMPH *=*+1
0450       .END

              Figure 1.

This solution is probably very close to the one most beginning assembly programmers would come up with and is, therefore, less efficient than it could be. We’ll look at a better solution in a moment, but first let’s analyze this one.

Line 100 clears the decimal mode. I know I repeat myself a lot on this point, but it’s important: always be sure of the decimal mode setting!

Line 110 pulls the first byte from the stack. This is the number of parameters passed to the subroutine by BASIC, and we’ll simply discard it, assuming the BASIC program sent two parameters.

Lines 120–150 pull the first argument from the stack and store it in the locations ARG1L and ARG1H for later processing.

Lines 160–190 pull and store the second argument, placing it in the locations ARG2L and ARG2H.

All right, so far we’ve placed the two numbers we’re going to process in the proper memory locations, and we’re ready to average them.

One important thing to remember in assembly language, and one we must consider at this point, is data overflow. In this averaging subroutine, we must add the two arguments, then divide the sum by two in order to get our answer. We must make sure that the place we use to store the result of the addition is large enough to hold it! For example, if we add the two-byte values 45960 and 37902, we get the result 83862, which requires a three-byte storage area. In this “beginner’s” average routine, we have set up a three-byte area (TEMPL, TEMPM and TEMPH) to hold the result of the addition.

Lines 200–290 add the two arguments and place the sum in the temporary hold area.

Lines 300–320 divide the value found in the temporary storage area by two. Remember that performing a logical shift right (LSR) on a byte will divide its contents by two. The rotate right (ROR) operation is used on subsequent bytes of a multiple-byte shift. If you are unclear about this process, re-read issue 19’s Boot Camp. After this divide is complete, the average of the two values is present in the TEMPL and TEMPM bytes.

Lines 330–360 move the final result to RESLO and RESHI, the two-byte location which returns the USR call’s result to BASIC.

Finally, Line 370 executes an RTS instruction, returning to BASIC.
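
To make the byte-by-byte arithmetic concrete, here is a rough Python model of the steps just described. It is only a sketch for checking the numbers, not a translation of the assembler source; the function name `average_three_byte` is made up for this example, and the variable names simply echo the labels in the listing.

```python
def average_three_byte(arg1, arg2):
    """Average two 16-bit values using a three-byte temporary area."""
    # Low-byte add (CLC, then ADC): keep 8 bits and remember the carry.
    total = (arg1 & 0xFF) + (arg2 & 0xFF)
    templ, carry = total & 0xFF, total >> 8
    # High-byte add, including the carry out of the low byte.
    total = (arg1 >> 8) + (arg2 >> 8) + carry
    tempm, carry = total & 0xFF, total >> 8
    # LDA #0 / ADC #0: the last carry becomes the third byte.
    temph = carry
    # LSR TEMPH, ROR TEMPM, ROR TEMPL: shift all 24 bits right by one.
    value = ((temph << 16) | (tempm << 8) | templ) >> 1
    return value & 0xFFFF


# The overflow example from the text: 45960 + 37902 = 83862.
print(average_three_byte(45960, 37902))  # 41931
```

Without the third byte, the sum 83862 would be cut down to 16 bits before the divide, and the average would come out wrong.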

Figure 2 shows the BASIC code needed to execute this USR call. When two values are entered, the subroutine returns the average of the two numbers to BASIC, which prints it.

10 FOR X=1536 TO 1599:READ N:POKE X,N:NEXT X
30 A=USR(1536,VAL1,VAL2)
50 GOTO 20
60 DATA 216,104,104,141,65,6,104,141,64,6,104,141,67,6,104,141,66,6,173,64,6,24,109,66,6
70 DATA 141,68,6,173,65,6,109,67,6,141,69,6,169,0,105,0,141,70,6,78,70,6,110,69,6
80 DATA 110,68,6,173,68,6,133,212,173,69,6,133,213,96

              Figure 2.

As I mentioned earlier, the solution shown above is a typical beginner’s answer to the problem. Let’s examine a more efficient example, shown in Figure 3.

10 RESLO   =   $D4
20 RESHI   =   $D5
30         *=  $0600
0100       CLD       ;CLEAR DECIMAL MODE
0110       PLA       ;DISCARD # ARGS
0120       PLA       ;PULL ARG 1 HI
0130       STA ARG1H ;AND SAVE IT
0140       PLA       ;PULL ARG 1 LO
0150       STA ARG1L ;AND SAVE IT
0160       PLA       ;PULL ARG 2 HI
0170       STA RESHI ;SAVE IT
0180       PLA       ;PULL ARG 2 LO
0190       CLC       ;ADD IT TO
0200       ADC ARG1L ;ARG 1 LO
0210       STA RESLO ;SAVE LOW SUM
0220       LDA RESHI ;GET ARG2 HI
0230       ADC ARG1H ;ADD TO ARG 1
0240       STA RESHI ;SAVE HIGH SUM
0250       ROR RESHI ;DIVIDE...
0260       ROR RESLO ;RESULT BY 2
0270       RTS       ;AND EXIT!
0280 ARG1L *=*+1
0290 ARG1H *=*+1
0300       .END

            Figure 3.

Looking at this new example, you will notice that it is much shorter than Figure 1 and has only two bytes reserved for data storage. Let’s see why.

Line 100 clears the decimal mode, as usual.

Line 110 pulls the number of arguments off the stack.

Lines 120–150 perform the same function as in Figure 1, pulling the first argument off the stack and placing it in the ARG1L and ARG1H locations.

From this point on, the program’s operation is quite different from that of Figure 1. We no longer use the TEMP area to hold the results of the add. Instead, we use the two-byte BASIC result area, RESLO and RESHI.

“Wait a minute,” you say, “what about the overflow problem?” You’re right; we’ll have to consider that. However, because of the way this problem is structured, we can use a simple trick involving the carry flag to eliminate the danger of an overflow occurring.

In our problem, the largest argument values possible (using our two-byte format) are 65535 ($FFFF) and 65535 ($FFFF), which, when added, give a result of 131070 ($1FFFE). If we’re going to store this value (as we did in Figure 1), we must have a three-byte field in which to do it. Figure 4 shows the three-byte field before and after the shift operation. Note that, after the shift, the two lower-order bytes contain $FFFF, which is, of course, the average of the values $FFFF and $FFFF.


   TEMPH      TEMPM      TEMPL
│ 00000001 │ 11111111 │ 11111110 │ = $1FFFE


   TEMPH      TEMPM      TEMPL
│ 00000000 │ 11111111 │ 11111111 │ = $FFFF

                  Figure 4.

The thing that makes this particular problem easier to solve is the fact that, as soon as we add the values, we perform the right-shift operations to divide the answer by two. As you recall, when multi-byte values are added, the carry flag holds the bits that are to be carried to the next byte. In this case, only one bit ever gets carried to the third byte of TEMP, and, therefore, we can simply leave that bit in the carry flag rather than storing it in a third byte. Since that bit stays in the carry flag, we can use the rotate instruction to shift the bit out of the carry flag and into the high-order byte of our result. Figure 5 shows the use of this technique with the same problem as Figure 4. Note, once again, that the result is $FFFF, the correct average.


  CARRY      RESHI        RESLO
  ┌───┐   ┌──────────┐ ┌──────────┐
  │ 1 │   │ 11111111 │ │ 11111110 │
  └───┘   └──────────┘ └──────────┘


  CARRY      RESHI        RESLO
  ┌───┐   ┌──────────┐ ┌──────────┐
  │ 0 │   │ 11111111 │ │ 11111111 │
  └───┘   └──────────┘ └──────────┘

           Figure 5.

Lines 160–170 pull the high-order byte of argument 2 from the stack and place it in the location labeled RESHI. Remember, values stored in RESHI and RESLO will be returned to BASIC automatically.

Line 180 pulls the low-order byte of argument 2 from the stack, leaving it in the accumulator.

Lines 190–200, instead of storing the low byte of argument 2, add it to the low byte of argument 1, which was stored earlier. The result of this add (the first part of a two-byte add) is stored in RESLO at Line 210.

Lines 220–240 complete the addition process by adding ARG1H (the high byte of argument 1) to RESHI (the high byte of argument 2, which we stored earlier), and storing the result in RESHI. At this point, if we were averaging $FFFF and $FFFF, RESLO, RESHI and the carry flag would look exactly like the first part of Figure 5. We’re now ready to divide this answer by two for our average.

Line 250 performs a rotate right (ROR) on RESHI. This is the same as a shift operation, except that the carry flag is shifted into the leftmost bit, and the rightmost bit is shifted into the carry flag. Internally, the operation looks like this:

  CARRY      RESHI       CARRY
  ┌───┐   ┌──────────┐   ┌───┐
  │ 1 │──>│ 11111111 │──>│ 1 │
  └───┘   └──────────┘   └───┘

Line 260 performs another rotate right, but this time the byte RESLO is rotated. The carry from the previous rotate (Line 250) is shifted into the leftmost bit, and the rightmost bit is placed in the carry flag, like so:

  CARRY      RESLO       CARRY
  ┌───┐   ┌──────────┐   ┌───┐
  │ 1 │──>│ 11111110 │──>│ 0 │
  └───┘   └──────────┘   └───┘

After this instruction, RESLO, RESHI and the carry flag will look like the second part of Figure 5, and all the average calculations are complete!

At this point, all the program needs to do is return to BASIC, and it does so in Line 270 with an RTS instruction.
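
The carry-flag trick can be checked with a small Python model. This is a sketch under the assumption that ROR behaves as described above (carry shifts into bit 7, bit 0 shifts out into carry); the helper names `ror` and `average_with_carry` are invented for the example.

```python
def ror(byte, carry_in):
    """Rotate right through carry. Returns (new_byte, carry_out)."""
    return (carry_in << 7) | (byte >> 1), byte & 1


def average_with_carry(arg1, arg2):
    """Average two 16-bit values, keeping the 17th bit in the carry."""
    # Two-byte add: low bytes first, then high bytes plus carry.
    total = (arg1 & 0xFF) + (arg2 & 0xFF)
    reslo, carry = total & 0xFF, total >> 8
    total = (arg1 >> 8) + (arg2 >> 8) + carry
    reshi, carry = total & 0xFF, total >> 8  # overflow bit stays in carry
    # ROR RESHI, then ROR RESLO: divide the 17-bit sum by two.
    reshi, carry = ror(reshi, carry)
    reslo, carry = ror(reslo, carry)
    return reshi * 256 + reslo


print(hex(average_with_carry(0xFFFF, 0xFFFF)))  # 0xffff
```

The first rotate must follow the high-byte add directly, since any instruction that changes the carry flag in between would lose the seventeenth bit.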

Once again, this routine is written to reside on page 6 of computer memory, and the BASIC code needed is shown in Figure 6.

10 FOR X=1536 TO 1567:READ N:POKE X,N:NEXT X
30 A=USR(1536,VAL1,VAL2)
50 GOTO 20
60 DATA 216,104,104,141,33,6,104,141,32,6,104,133,213,104,24,109,32,6,133,212,165,213,109,33,6
70 DATA 133,213,102,213,102,212,96

            Figure 6.

Run the program and try some different input. The program will print the average just like Figure 2 did, but with far less memory needed to perform the operation. The moral: don’t accept your first solution to any problem. Think it through and try to find the most efficient way to get the job done.

Square off.

Here’s a routine that’s a little different: an algorithm to calculate integer square roots! Figure 7 shows a flowchart of the algorithm, which is very simple.












Figure 7.

Let’s convert this flowchart into BASIC first, so we can get a “feel” for what’s going on. Figure 8 shows the BASIC code corresponding to the flowchart.


            Figure 8.

Type in this program and RUN it. When you are prompted for a value, type 9 and press RETURN. The computer will print out an answer of three, which is, of course, the square root of nine. Try typing 12 and pressing RETURN. Once again, the computer will give an answer of three. What happened? Well, this is an integer square root routine, and it will return only the integer portion of the answer. Since the actual square root of 12 is approximately 3.464, the integer result will be 3.
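
For readers who would like to experiment with the idea outside BASIC, here is a short Python sketch of an integer square root built on the fact that the perfect squares are the running sums of the odd numbers (1, 1+3=4, 1+3+5=9, and so on). The variable names `test` and `sqadd` are borrowed from the assembly version that follows; the function name `integer_sqrt` is an assumption made for this example.

```python
def integer_sqrt(number):
    """Integer square root by adding successive odd numbers."""
    result = 1  # candidate root
    test = 1    # square of the candidate root
    sqadd = 1   # the odd number most recently added to test
    while test <= number:
        sqadd += 2      # next odd number
        test += sqadd   # next perfect square
        result += 1
    # The loop stops one step too far, so back up by one.
    return result - 1


print(integer_sqrt(9))   # 3
print(integer_sqrt(12))  # 3
```

The only operations needed are addition, comparison and incrementing, which is what makes the method attractive on a processor with no multiply or divide instructions.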

Now let’s write the same routine in assembly language. The subroutine will only require one argument, the number we wish to find the square root of. Let’s keep the routine simple and limit the arguments to a range of 0–255. If we do this, all the program variables can be one-byte areas, except for TEST. This is the variable compared to the argument to test for the end of the routine. If TEST is greater than the argument, the square root has been found. Since we are allowing arguments up to 255, TEST must be able to exceed 255 and, so, must be a two-byte value. Figure 9 shows the assembly language version of the square root routine.

10 RESLO  = $D4
20 RESHI  = $D5
30        *=  $0600
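0100      CLD        ;CLEAR DECIMAL MODE
0110      PLA        ;DISCARD # ARGS
0120      PLA        ;DISCARD ARG HI
0130      PLA        ;PULL ARG LO
0140      STA NUMBER ;AND SAVE IT
0150      LDA #1     ;PUT 1 IN...
0160      STA RESLO  ;RESULT,
0170      STA TESTL  ;TEST LO,
0180      STA SQADD  ;AND SQADD
0190      LDA #0     ;ZERO OUT...
0200      STA RESHI  ;RESULT HI
0210      STA TESTH  ;AND TEST HI
0220 SQLP LDA TESTH  ;TEST HI > 0?
0230      BNE END    ;WE'RE DONE!
0240      LDA NUMBER ;NUMBER >=
0250      CMP TESTL  ;TEST?
0260      BCS NOSQ   ;YES, KEEP GOING
0270 END  DEC RESLO  ;RESULT - 1
0280      RTS        ;AND EXIT!
0290 NOSQ INC SQADD  ;SQADD + 1
0300      INC SQADD  ;SQADD + 2
0310      LDA TESTL  ;GET TEST LO
0320      CLC        ;ADD THE
0330      ADC SQADD  ;SQADD VALUE
0340      STA TESTL  ;AND SAVE IT
0350      LDA TESTH  ;ADD CARRY TO
0360      ADC #0     ;TEST HI
0370      STA TESTH  ;AND SAVE IT
0380      INC RESLO  ;RESULT + 1
0390      JMP SQLP   ;AND LOOP BACK!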
0400 NUMBER *=*+1
0410 TESTL  *=*+1
0420 TESTH  *=*+1
0430 SQADD  *=*+1
0440      .END

              Figure 9.

Line 100 clears the decimal mode.

Line 110 pulls the number of arguments from the stack. Since this routine will always be a single-argument one, this value is discarded.

Lines 120–140 pull the argument off the stack. Since we are only allowing values from 0–255, the high byte of the argument is discarded. The low byte is placed in the location labeled NUMBER.

Lines 150–210 initialize the subroutine variables. Note that Lines 190–210 set the high bytes of the result and the variable TEST to zero. Always be sure your multi-byte variables are properly initialized.

Lines 220–260 test for the end of the routine. In Lines 220–230, if the high byte of TEST is greater than zero, then we know TEST is greater than NUMBER (which is only a one-byte variable), and the program branches to END. If the high byte of TEST is zero, Lines 240–260 will compare NUMBER to the low byte of TEST. If NUMBER is greater than or equal to TESTL, the branch to NOSQ is taken and processing continues.

Lines 270–280 are used to exit the subroutine. Line 270 decrements the result by one, just like the BASIC program did in Line 30 of Figure 8. Line 280 executes an RTS instruction.

Lines 290–300 add two to SQADD by incrementing it twice.

Lines 310–370 add SQADD to TEST. Note that, since SQADD is a single-byte value and TEST is a two-byte value, a dummy value of zero is added to the high byte of TEST.

Line 380 adds one to the answer, stored in RESLO. This is done using the increment memory (INC) instruction.

Line 390 jumps back to SQLP to test the new square root value.

Figure 10 shows the BASIC code used to execute the machine language version of the subroutine. Try values from 0–255 to verify that the program works properly.

10 FOR X=1536 TO 1604:READ N:POKE X,N:NEXT X
30 A=USR(1536,VALUE)
50 GOTO 20
60 DATA 216,104,104,104,141,69,6,169,1,133,212,141,70,6,141,72,6,169,0,133,213,141,71,6,173
70 DATA 71,6,208,8,173,69,6,205,70,6,176,3,198,212,96,238,72,6,238,72,6,173,70,6,24
80 DATA 109,72,6,141,70,6,173,71,6,105,0,141,71,6,230,212,76,24,6

                Figure 10.

All of the programs we’ve written so far are fixed-argument subroutines. That is, the square root routine never uses more than one argument, and the average routine never uses less than two. Now let’s take a look at a short program which will take as many arguments as you send it!

Fun with arguments.

Let’s say you want to add a list of numbers, using an assembly language subroutine. There’s no telling how many numbers there will be, because it’s different every time. Fortunately, the BASIC USR call mechanism will tell our subroutine exactly how many arguments it needs to process. Up until now, we’ve ignored this value, which is always the first number pulled off the stack. Figure 11 shows a program which uses this information to add a variable number of values, returning the total to BASIC.

10 RESLO   =   $D4
20 RESHI   =   $D5
30         *=  $0600
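0100       CLD       ;CLEAR DECIMAL MODE
0110       PLA       ;GET # ARGS
0120       STA ARGCT ;AND SAVE IT
0130       LDA #0    ;ZERO OUT...
0140       STA RESLO ;THE TOTAL
0150       STA RESHI
0160 ARGLP PLA       ;GET ARG HI
0170       STA ARGHI ;SAVE IT
0180       PLA       ;GET ARG LO
0190       CLC       ;AND ADD...
0200       ADC RESLO ;TO TOTAL,
0210       STA RESLO ;SAVE IT
0220       LDA ARGHI ;GET HI BYTE,
0230       ADC RESHI ;ADD TO TOTAL,
0240       STA RESHI ;SAVE IT
0250       DEC ARGCT ;ONE LESS ARG
0260       BNE ARGLP ;MORE? LOOP BACK
0270       RTS       ;ALL DONE!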
0290 ARGHI *=*+1
0300 ARGCT *=*+1
0310       .END

              Figure 11.

Let’s look at the program and follow its logic.

Line 100 clears the decimal mode, as usual.

Lines 110–120 pull the number of arguments from the stack and store this value in the variable ARGCT.

Lines 130–150 zero out the total area, RESLO and RESHI. This ensures that the adding routine will start with zero.

Lines 160–170 pull the high byte of the argument from the stack and place it in the location labeled ARGHI.

Lines 180–210 pull the low byte of the argument from the stack, add it to the low byte of the total, and store the result back in RESLO.

Lines 220–240 add the high byte of the argument to the high byte of the total, storing the result back in RESHI.

Line 250 decrements the argument counter by one.

Line 260 will branch back to ARGLP if the result of the decrement was not zero (more arguments to process).

If there are no more arguments (ARGCT = 0), the program exits with the RTS instruction at Line 270.
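
The loop above can be modeled in a few lines of Python. This is only a sketch: the list named `stack` stands in for the values the routine pulls (argument count first, then each argument as high byte followed by low byte, as described in the text), and `usr_sum` is a name invented for this example.

```python
def usr_sum(*args):
    """Add any number of 16-bit arguments the way the USR routine does."""
    if not args:
        raise ValueError("pass at least one argument")
    # What the routine pulls, in order: count, then hi/lo for each value.
    stack = [len(args)]
    for value in args:
        stack += [(value >> 8) & 0xFF, value & 0xFF]
    pulls = iter(stack)

    argct = next(pulls)   # PLA / STA ARGCT
    reslo = reshi = 0     # zero out the total
    while True:
        arghi = next(pulls)             # PLA / STA ARGHI
        total = next(pulls) + reslo     # PLA / CLC / ADC RESLO
        reslo, carry = total & 0xFF, total >> 8
        reshi = (arghi + reshi + carry) & 0xFF  # LDA ARGHI / ADC RESHI
        argct -= 1                      # DEC ARGCT
        if argct == 0:                  # BNE ARGLP falls through at zero
            break
    return reshi * 256 + reslo


print(usr_sum(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))  # 55
```

Because the total is only two bytes wide, a sum above 65535 wraps around; in this model, `usr_sum(65535, 1)` returns 0.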

That’s all there is to it! The BASIC code to use this USR call is shown in Figure 12. As it is, the routine will add the numbers from 1 to 10, returning the result 55. Try using different values. Just be sure you have at least one value to add and that the total will not exceed 65535.

10 FOR X=1536 TO 1569:READ N:POKE X,N:NEXT X
20 A=USR(1536,1,2,3,4,5,6,7,8,9,10)
40 END
50 DATA 216,104,141,35,6,169,0,133,212,133,213,104,141,34,6,104,24,101,212,133,212,173,34,6,101
60 DATA 213,133,213,206,35,6,208,234,96

              Figure 12.

The use of multiple-argument USR calls is almost unrestricted. The main thing to remember, as far as system limitations go, is that you can’t have more than about 27 arguments. Otherwise, BASIC will give you an ERROR 10 (argument stack overflow). Actually, 27 arguments is more than you’ll probably ever try, but I just thought I’d warn you!

Until next time…

Between now and next issue’s Boot Camp, try using and modifying these simple USR calls. Next time, we’ll go even further into this interesting area of assembly language.

A.N.A.L.O.G. ISSUE 27 / FEBRUARY 1985 / PAGE 61

Boot Camp

by Tom Hudson

In this installment of Boot Camp, we continue our work with BASIC USR calls, in order to become more familiar and comfortable with the 6502 instruction set.

It’s about time.

The first USR call we’ll look at this time is a simple timer. Timer programs are easy to write on Atari computers, because each one has a real-time clock built in. It doesn’t have any hands, but you can write a program to read it.

The Atari’s real-time clock is found in three memory locations: 18, 19 and 20 ($12, $13 and $14). The clock itself is updated by the system’s vertical blank interrupt (VBI) code, which is executed sixty times per second. Each 1/60th of a second is known as a jiffy. Each time the VBI code is executed, the byte at location $14 is incremented. When this value passes 255, it rolls over to zero, and location $13 is incremented. When location $13 passes 255, it rolls over to zero and location $12 is incremented.

In order for you to see exactly how this timer operates, type in the BASIC program shown in Figure 1 and RUN it.

As you can easily see, this program simply prints the contents of memory locations 18, 19 and 20 to the screen. You can actually watch each location being modified by the VBI routines. Note that location 20 takes roughly 4.25 seconds to go from 0–255 (256 * 1/60th of a second). Locations 19 and 20, which together make up a 2-byte counter ranging from 0–65535, take roughly 1092 seconds, or 18.2 minutes, to go from 0–65535. All three locations, making up a 3-byte counter ranging from 0–16777215, take about 77.6 hours to go from 0–16777215. I don’t recommend leaving your computer on long enough to test this principle; just take my word for it.

10 POKE 752,1:POSITION 2,0:PRINT PEEK(18);" ";PEEK(19);" ";PEEK(20);" ":GOTO 10

                Figure 1.
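
The rollover times quoted above are easy to verify. Here is a small Python sketch of the arithmetic, assuming the sixty-jiffies-per-second rate given in the text; `rollover_seconds` is a hypothetical helper name.

```python
JIFFIES_PER_SECOND = 60  # rate stated in the text


def rollover_seconds(num_bytes):
    """Seconds for a counter of num_bytes bytes to count through
    every value once, at one increment per jiffy."""
    return 256 ** num_bytes / JIFFIES_PER_SECOND


print(round(rollover_seconds(1), 2))         # 4.27 seconds
print(round(rollover_seconds(2) / 60, 1))    # 18.2 minutes
print(round(rollover_seconds(3) / 3600, 1))  # 77.7 hours
```

The three-byte figure works out to about 77.67 hours, which is the “about 77.6 hours” mentioned above before rounding.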

Now that we know how the internal real-time clock works, let’s write a USR call that will take advantage of it. This program will allow us to pass a value in jiffies from BASIC, ranging from 0–65535, that will make the computer wait that exact period of time.

This is actually a very simple routine. All we need to do is set the two low-order bytes of RTCLOK (real-time clock) to zero and wait for them to reach the jiffy count that BASIC asked for. The flowchart for this program is shown in Figure 2.

One thing important to note about the real-time clock bytes is that they are not ordered in memory from low- to high-order. Instead of RTCLOK containing the low-order value, RTCLOK+2 has it. In the same manner, RTCLOK contains the highest-order byte, not RTCLOK+2. This is one of the few cases where the low-order, high-order custom is broken, so keep this in mind when working with RTCLOK.

All right, now that we know what must be done, let’s write the 6502 code to do the job. Figure 3 shows one way to handle the timer.








                Figure 2.

0100 WAITL  = $CB
0110 WAITH  = $CC
0120 RTCLOK = $12
0130        *= $0600
0140        CLD        ;CLEAR DECIMAL
0150        PLA        ;DISCARD #ARGS
0160        PLA        ;PULL WAIT HI
0170        STA WAITH  ;AND SAVE IT
0180        PLA        ;PULL WAIT LO
0190        STA WAITL  ;AND SAVE IT
0200        LDA #0     ;ZERO OUT...
0210        STA RTCLOK+1 ;CLOCK BYTE 2
0220        STA RTCLOK+2 ;CLOCK BYTE 3
0230 WAITLP LDA RTCLOK+1 ;GET CLOCK MID
0240        CMP WAITH  ;= WAIT HI?
0250        BNE WAITLP ;NO, LOOP BACK!
0260 WAITL2 LDA RTCLOK+2 ;GET CLOCK LO
0270        CMP WAITL  ;= WAIT LO?
0280        BNE WAITL2 ;NO, LOOP BACK!
0290        RTS        ;WAIT'S OVER!

                Figure 3.

Let’s walk through the timer code and see what’s going on.

Line 140 clears the decimal mode. This isn’t necessary in this program, since we’re not doing any addition or subtraction, but let’s get into the habit of using this instruction.

Line 150 pulls the number of arguments from the 6502 stack. This number is assumed to be 1, and we’re going to simply discard it.

Lines 160–170 pull the high byte of the jiffy count off the stack and store the value in the location WAITH. We’ll use this location to test for the end of the timer period.

Lines 180–190 pull the low byte of the jiffy count off the stack and store it in WAITL. This location will also be used to test for the end of the wait period.

Lines 200–220 zero out the two low-order bytes of RTCLOK. Remember, the lowest-order byte is RTCLOK+2, and the middle-order byte is RTCLOK+1. This operation starts the timer at zero, and we can now wait for the timer to reach the jiffy count specified by BASIC. We will compare each byte of the jiffy count with the corresponding byte of the real-time clock. When these bytes match, the wait is over, and we can return to BASIC.

Line 230, labeled WAITLP (wait loop), loads the middle-order byte of RTCLOK into the accumulator. We can now compare it to WAITH.

Line 240 compares the accumulator to the value in WAITH.

Line 250, a BNE instruction, will branch back to WAITLP if the accumulator is not equal to WAITH. If these bytes are equal, we need to compare the low-order bytes, and the program continues at the next instruction.

Line 260, labeled “WAITL2” (wait loop 2) loads the low-order byte of RTCLOK into the accumulator, and we’re ready to compare the low-order bytes.

Line 270 compares the accumulator to the value in WAITL.

Line 280, another BNE instruction, branches back to WAITL2 if the accumulator is not equal to WAITL. If the branch is taken, the program will continue at WAITL2, waiting for the low-order bytes to match. If the bytes are equal, then the wait is over, since the high-order and low-order bytes are the same.

Line 290 is executed when all the timer values match. This RTS statement simply returns control to BASIC.
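
The two wait loops can be imitated in Python to confirm that they stop at the right jiffy. This is a simplified model, not a timing-accurate one: the variable `clock` stands for the two low-order clock bytes taken together, each pass of a loop advances it by one jiffy (the real routine polls many times per jiffy), and `simulate_wait` is a name made up for this sketch.

```python
def simulate_wait(jiffies):
    """Return how many jiffies pass before the routine would return."""
    if not 0 <= jiffies <= 65535:
        raise ValueError("jiffy count must fit in two bytes")
    waith, waitl = jiffies >> 8, jiffies & 0xFF
    clock = 0  # RTCLOK+1 and RTCLOK+2 have just been zeroed

    # WAITLP: wait for the middle-order clock byte to equal WAITH.
    while (clock >> 8) & 0xFF != waith:
        clock = (clock + 1) & 0xFFFF
    # WAITL2: wait for the low-order clock byte to equal WAITL.
    while clock & 0xFF != waitl:
        clock = (clock + 1) & 0xFFFF
    return clock


print(simulate_wait(60))    # 60
print(simulate_wait(3600))  # 3600
```

Checking the middle byte first works because the low byte has just rolled over to zero at the moment the middle byte changes, so the second loop always starts counting from the beginning of the final 256-jiffy block.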

You can try the timer routine for yourself. Figure 4 shows the BASIC code necessary to set up and call the USR subroutine.

10 FOR X=1536 TO 1562:READ N:POKE X,N:NEXT X:TIMER=1536
30 ? "WAITING..."
50 ? "}TIME'S UP!":?
60 GOTO 20
100 DATA 216,104,104,133,204,104,133,203,169,0,133,19,133,20,165,19,197,204,208,250,165,20,197,203,208
110 DATA 250,96

                Figure 4.

Line 10 READs the assembly-language data in Lines 100–110 and POKEs them into memory, starting at location 1536 ($0600). Since the timer code is relocatable, you may place it in a BASIC string and call it that way, if you like.

Line 20 accepts the number of jiffies to wait from the keyboard, placing the value in the variable WAIT. You should limit this value to the range 1–65535, for a wait of from 1/60th of a second to 18.2 minutes. To wait exactly one minute, you should type 3600 (60 seconds times 60 jiffies per second).

Line 30 prints a message to let you know when the time period starts.

Line 40 calls the USR routine with the statement:


Note that, instead of using 1536 as the USR code address in the USR call, we have used the variable TIMER, which was set to 1536 in Line 10. This is a good practice, since it helps document what the USR call is doing. This can be very helpful later, when you need to change the program for some reason.

Lines 50–60 cause the console speaker to beep, print a “time’s up” message, and return to Line 20 to accept another time period.

Lines 100–110 contain the numeric data values which, when POKEd into memory, make up the timer USR subroutine.

After you have typed in the program, RUN it. The program will ask:


Type 60 and press RETURN. The computer should wait one second, beep, and print:


See? When you typed 60, BASIC told the USR subroutine to zero out the real-time clock and wait until it counted 60 jiffies. If you type 65535, the computer will wait 18.2 minutes before it beeps.

This routine can be very handy in almost any program which requires several time delays. You probably won’t use any time periods over a couple of minutes, but the program can handle it if the need arises.

PEEKing Tom?

How many times have you wanted to know the value stored in a 2-byte data item? For example, if you want to know where the display list begins, you must type:

PRINT PEEK(560)+PEEK(561)*256

If you have to do this a dozen times in a single program, each time with a different address, it can be a real pain—as well as use up memory.

Well, why not write a USR call that will do this tedious work for you? It’s simple and only takes 20 bytes of memory space.

We’ll call the USR function “DPEEK,” for double-byte PEEK. It will be set up so that, when the user furnishes the address of the first byte of the 2-byte value, the USR call will return the value contained in the 2 bytes.

This will be the first time we’ve used post-indexed indirect addressing, but don’t get nervous. It’s actually not as bad as it sounds, and is a very handy function of the 6502.

As you will recall, post-indexed indirect addressing uses 2 bytes on page 0 (the first 256 bytes of memory) to form an address. It then uses the Y register to get an offset from this address. Let’s look at an example.

Let’s say the computer wants to execute the instruction: LDA (ADDR),Y. The location ADDR must be on page 0 (this is a restriction of the 6502). Assume that location ADDR contains $4F, and ADDR+1 contains $60. The computer will form the address $604F from these 2 bytes, then add the Y register to this address. Assuming the Y register contains $06, the final address will be $6055, the total of $604F + $06. Therefore, the accumulator will be loaded from location $6055. Get it?
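
The address calculation in this example can be written out as a short Python sketch. The dictionary named `memory` is a stand-in for the computer’s memory, ADDR is arbitrarily placed at $CB for the demonstration, and `indirect_indexed_address` is a name invented here.

```python
def indirect_indexed_address(memory, zp_addr, y):
    """Effective address of an instruction such as LDA (zp_addr),Y."""
    low = memory[zp_addr]
    high = memory[(zp_addr + 1) & 0xFF]  # pointer bytes stay on page 0
    base = low | (high << 8)             # low byte first, then high byte
    return (base + y) & 0xFFFF


# ADDR holds $4F and ADDR+1 holds $60; the Y register holds $06.
memory = {0xCB: 0x4F, 0xCC: 0x60}
print(hex(indirect_indexed_address(memory, 0xCB, 0x06)))  # 0x6055
```

With Y set to zero, the same call simply returns the address stored in the two page 0 bytes, $604F.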

What we’ll do in this USR call is pass an address to the subroutine. The subroutine will store the address on page 0 and indirectly load the byte at the address (the low byte of a 2-byte value) and the byte at the address + 1 (the high byte of a 2-byte value). The decimal equivalent of this number will be returned to BASIC. The flowchart of this procedure is shown in Figure 5.









Figure 5.

Now let’s look at the 6502 assembly code corresponding to the flowchart. It’s relatively short and easy to follow. This code is shown in Figure 6.

0100 PEEKL  =   $CB
0110 PEEKH  =   $CC
0120 RESLO  =   $D4
0130 RESHI  =   $D5
0140        *=  $0600
0150        CLD        ;CLEAR DECIMAL
0160        PLA        ;DISCARD #ARGS
0170        PLA        ;PULL PEEK HI
0180        STA PEEKH  ;AND SAVE IT
0190        PLA        ;PULL PEEK LO
0200        STA PEEKL  ;AND SAVE IT
0210        LDY #0     ;Y REG = 0
0220        LDA (PEEKL),Y ;GET LO BYTE
0230        STA RESLO  ;AND SAVE IT
0240        INY        ;Y REG NOW = 1
0250        LDA (PEEKL),Y ;GET HI BYTE
0260        STA RESHI  ;AND SAVE IT
0270        RTS        ;ALL DONE!

                Figure 6.

Lines 100–110 set up equates for 2 bytes on page 0. Remember that BASIC only allows us to use page 0 locations $CB–$D1; using other addresses could prevent the subroutine from working properly, or even lock up the system. Note that these bytes are stored in low-byte, high-byte order. This is a must for indirect addressing.

Lines 120–130 set equates for RESLO and RESHI, the storage locations which will return the subroutine’s result to BASIC. For further information on these bytes, see issue 25’s Boot Camp.

Line 140 sets the program counter to $0600, placing this program on page 6. This subroutine will be relocatable, though, so the address really doesn’t matter.

Line 150 clears the decimal mode, placing us in binary mode. This program doesn’t do any math, but let’s get into the CLD habit, okay?

Line 160 starts the subroutine’s operation by pulling the number of arguments off of the stack. Assume the programmer has only sent one argument, the address the subroutine is to DPEEK at. After being pulled off the stack, this value is discarded.

Lines 170–180 pull the high byte of the address to be DPEEKed off the stack and store it in its page 0 location, PEEKH.

Lines 190–200 pull the low byte of the address to be DPEEKed off the stack and store it in its page 0 location, PEEKL. At this point, the program has set up its indirect memory pointer and is ready to perform the DPEEK operation.

Line 210 places a zero in the Y register. All post-indexed indirect instructions use the Y register to calculate an offset from the address used, and, since we want to load the first byte from the address in PEEKL and PEEKH with no offset, the Y register must be zero (no offset).

Line 220 loads the accumulator indirectly from the address in PEEKL and PEEKH. Since we are loading the first byte of the 2-byte value, this is the low order byte of the DPEEK value.

Line 230 stores the value just loaded into RESLO, the low-order byte of the result to be returned to BASIC.

Line 240 increments the Y register, changing it from 0 to 1. We’re now ready to retrieve the second byte of the DPEEK value, because a 1 in the Y register will cause the indirect load to get the byte from the address held in PEEKL and PEEKH, plus 1.

Line 250 loads the accumulator indirectly from the address held in PEEKL and PEEKH, plus 1. This is the high-order byte of the DPEEK value.

Line 260 stores the high order byte of the DPEEK value in RESHI, so that it can be returned to BASIC.

Line 270 executes an RTS instruction to return control to BASIC. At this point, RESLO and RESHI contain the value that was DPEEKed out of the address passed to the subroutine by BASIC.
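
For comparison, the whole DPEEK operation fits in a few lines of Python. As before, this is only an illustrative model: the dictionary `memory` plays the part of the computer’s memory, and the sample contents ($12 at $6000 and $9B at $6001) are example values chosen for the demonstration.

```python
def dpeek(memory, address):
    """Return the 2-byte value stored low byte first at address."""
    low = memory[address]                   # LDA (PEEKL),Y with Y = 0
    high = memory[(address + 1) & 0xFFFF]   # LDA (PEEKL),Y with Y = 1
    return low + high * 256


# Example contents: $12 at $6000 and $9B at $6001.
memory = {0x6000: 0x12, 0x6001: 0x9B}
print(dpeek(memory, 24576))  # 39698
```

The result, 39698, is $9B12 in hexadecimal: the byte at the higher address supplies the high-order half of the value.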

The BASIC code for your DPEEK subroutine is shown in Figure 7.

10 FOR X=1536 TO 1555: READ N:POKE X,N:NEXT X:DPEEK=1536
50 GOTO 20
100 DATA 216,104,104,133,204,104,133,203,160,0,177,203,133,212,200,177,203,133,213,96

                Figure 7.

Type in this short BASIC program and RUN it. When asked for a DPEEK address, type 88 and press RETURN. The program will print a number and ask for another DPEEK address.

The number printed by the subroutine is the value PEEK(88) + PEEK(89) * 256. To confirm this, stop the program by pressing BREAK and type:

PRINT PEEK(88)+PEEK(89)*256

The number printed by this statement when you press RETURN should match the one printed by the DPEEK function. If not, you probably mistyped one or more DATA values in Line 100.

What did we DPEEK? The addresses 88 and 89 are known as “SAVMSC.” These bytes point to the first byte of screen memory. To prove this, POKE a 1 into the address printed by the DPEEK subroutine. For example, if the DPEEK routine printed 40000, you would enter:

POKE 40000,1

You should see an exclamation point (!) at the upper left corner of your screen. The exclamation point is represented in screen memory by the number 1, so that’s what shows up. See how handy the DPEEK function is?

You can easily find out where the display list is by DPEEKing SDLSTL (560). To find where the DOS vector is pointing, DPEEK DOSVEC (10). You can use the DPEEK function to find the contents of any 2-byte pointer that is in standard low-byte, high-byte format.

To summarize, the DPEEK subroutine uses a value passed to it by BASIC to point to a location in memory. The contents of this location and the location + 1 are used to build a 2-byte value which is passed back to BASIC. Figure 8 shows a pictorial representation of this function.

FROM      PAGE ZERO     $6000       LO HI       TO
BASIC      ┌──┬──┐     ┌──┬──┐     ┌──┬──┐     BASIC
24576 -->  │00│60│ --> │12│9B│ --> │12│9B│ --> 39698 
($6000)    └──┴──┘     └──┴──┘     └──┴──┘    ($9B12)

                          Figure 8.

I’m sure most of you programmers out there will appreciate the DPEEK function. It makes the operation of checking pointers a lot easier.

Homework time.

Now that I’ve shown you an example of indirect addressing, it’s time for you to try one of your own. The assignment: write a companion subroutine for DPEEK that will perform a DPOKE function. That is, the subroutine accepts two arguments: the address to DPOKE and the value to be DPOKEd into the address. The function is very similar to the DPEEK operation, except that the program stores the second argument’s bytes into the address and address + 1, instead of reading them and returning them to BASIC.

After you code the program, verify that it is operating correctly by using the DPEEK function. For example, if you DPOKE address 1776 with a value of 65245, the DPEEK of address 1776 should return 65245.

Until next time, try coding this problem. Use the DPEEK routine as a guide, since its operation is very similar. If you have any problems, remember that you can contact me via the Atari SIG on CompuServe. My user ID is 70775,424. If you don’t have a modem, you can write to me at the address below.

Boot Camp
c/o ANALOG Computing
P.O. Box 23
Worcester, MA 01603
A.N.A.L.O.G. ISSUE 28 / MARCH 1985 / PAGE 68

Boot Camp

by Tom Hudson

All right, Boot Camp trainees, here we are again in the wonderful world of assembly language programming. This issue, we continue our work with BASIC USR calls, the mechanism which allows us to use assembly language routines in conjunction with BASIC.

DPOKE solution.

Last issue, we wrote a routine that allowed us to examine the contents of 2-byte data items in memory, and called it DPEEK (double PEEK). Your homework was to write a companion routine, DPOKE, which will POKE a 2-byte value into memory. We will write the USR call so that it can be called with the BASIC statement:


The DPOKE routine can be written very easily. In fact, the DPEEK routine from last issue can be used as a starting point. Figure 1 shows the assembly language source code for the DPOKE routine.

0100 POKEL  =   $CB
0110 POKEH  =   $CC
0120        *=  $0600
0130        CLD        ;CLEAR DECIMAL
0140        PLA        ;DISCARD #ARGS
0150        PLA        ;PULL POKE HI
0160        STA POKEH  ;AND SAVE IT
0170        PLA        ;PULL POKE LO
0180        STA POKEL  ;AND SAVE IT
0190        LDY #1     ;POINT Y TO HI
0200        PLA        ;PULL VALUE HI
0210        STA (POKEL),Y ;POKE HI VAL
0220        DEY        ;POINT Y TO LO
0230        PLA        ;PULL VALUE LO
0240        STA (POKEL),Y ;POKE LO VAL
0250        RTS        ;ALL DONE!

                Figure 1.

Let’s look at this code and see what makes it tick. For purposes of demonstration, we’ll assume that we’re DPOKEing the value 16479 ($405F) into location 560 ($0230).

Line 130 clears the decimal mode, placing us in binary math mode. This program doesn’t do any add or subtract operations, but let’s do this anyway, just to get into the habit.

Line 140 pulls the number of arguments off the stack. We will assume that the programmer has sent two arguments, and discard this value.

Line 150 pulls the high byte of the DPOKE address off the stack, placing it in the accumulator. At this point, the accumulator contains $02, the high-order portion of $0230.

Line 160 stores the high byte of the address in the location POKEH, at address $00CC. We use a page 0 address for this value, since we’ll want to use the address as an indirect pointer.

Line 170 pulls the low byte of the DPOKE address off the stack, leaving it in the accumulator. The accumulator now contains the low-order portion of $0230, or $30.

Line 180 stores the low byte of the DPOKE address in the location labeled POKEL, which, like POKEH, is located on page 0. At this point, the 2 bytes POKEL and POKEH make up a 2-byte pointer for the specific location in memory corresponding to the first argument sent by BASIC. Using the address in our demonstration, POKEL contains $30, and POKEH contains $02, making a 2-byte pointer which points to $0230. We’re now ready to perform the DPOKE operation using the next two values on the stack.

Line 190 places a 1 in the Y register, readying it for the storage of the high byte of the DPOKE value.

Line 200 pulls the high byte of the value to be DPOKEd off the stack and places it in the accumulator. Once again, using our demonstration values, you can see that the accumulator will, at this point, contain $40, the high byte portion of $405F.

Line 210 stores the high byte of the value we want to DPOKE. As you see, the program uses the post-indexed indirect form of addressing to perform this function. POKEH and POKEL contain the values $02 and $30, and form a pointer to location $0230. The accumulator at this point contains $40, and the Y register contains the number 1. When we execute the instruction STA (POKEL),Y the computer will store the accumulator at location $0231, the address which is the sum of the pointer at POKEL ($0230) and the Y register ($01).

Line 220 decrements the Y register by 1, making it 0. This will enable the low byte of the 2-byte DPOKE value to be stored in Line 240.

Line 230 pulls the low byte of the value to be DPOKEd from the stack, leaving it in the accumulator. At this point, using our example data, the accumulator would contain $5F, the low byte of the value to be POKEd, $405F.

Line 240 stores the low byte of the DPOKE data in the low-order byte of the DPOKE address. The address used to store the accumulator is calculated as in Line 210. Using our example values, the address contained in POKEL and POKEH ($0230) plus the value in the Y register (0), gives a storage location of $0230. After this instruction has been executed, both bytes of the 2-byte DPOKE value have been properly stored, and we’re finished.

Line 250 executes an RTS instruction, which returns program control to BASIC.
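
To check the walkthrough against its example values, here is a small Python model of the routine. It is a sketch only: `memory`, `dpoke`, and the list standing in for the 6502 stack are assumptions for illustration, and the list holds the bytes in the order the routine's PLA instructions pull them.

```python
# Model of the DPOKE routine; all names here are illustrative, not Atari code.
memory = [0] * 65536

def dpoke(stack):
    """`stack` lists the bytes in the order the routine's PLAs pull them."""
    pulls = iter(stack)
    next(pulls)                       # argument count, discarded
    poke_h = next(pulls)              # POKEH: high byte of the address
    poke_l = next(pulls)              # POKEL: low byte of the address
    pointer = poke_l + poke_h * 256
    memory[(pointer + 1) & 0xFFFF] = next(pulls)   # Y=1: value high byte
    memory[pointer] = next(pulls)                  # Y=0: value low byte

# The walkthrough's example: DPOKE 16479 ($405F) into 560 ($0230).
dpoke([2, 0x02, 0x30, 0x40, 0x5F])
print(memory[0x0230], memory[0x0231])   # 95 64, i.e. $5F and $40
```

Because the value's high byte comes off the stack first, the routine stores to address + 1 before it stores to the address itself.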

10 FOR X=1536 TO 1553:READ N:POKE X,N:NEXT X:DPOKE=1536
50 GOTO 20
100 DATA 216,104,104,133,204,104,133,203,160,1,104,145,203,136,104,145,203,96

                Figure 2.

The BASIC program in Figure 2 allows you to test the DPOKE subroutine yourself. After typing in the program, type the following line and press RETURN.

? PEEK(560)+PEEK(561)*256

The number that BASIC prints is the address of the Atari computer’s display list. This is a set of specialized instructions used to generate the computer’s display. Add 1 to this number and write down the new value. Now RUN the BASIC DPOKE program. The computer will ask:


Type 560 and press RETURN. Memory locations 560 and 561 are a 2-byte pointer which tells the computer where the display list is in memory. We will change this pointer, using the DPOKE function. After you enter the DPOKE address, the computer will ask:


Now type the number you wrote down earlier (the display list address + 1) and press RETURN. You should see your computer’s display move up by one line.

What happened? Because we changed the display list pointer so that it points 1 byte higher than it originally did, the display processor starts eight scan lines farther into the display, and the display is shifted up. If you change the pointer back to its original value, the display returns to normal. This is just one example of how the DPOKE subroutine can help. You can write a program with two display lists in memory, then switch between them with one simple USR call.

The DPOKE subroutine can be a very handy addition to your utility subroutine library—and add convenience to programs which must alter the system pointers repeatedly.

One word of caution, though. Be sure you know what locations you’re changing! The DPOKE subroutine will allow you to change any 2-byte memory group without restrictions, and careless use of this freedom could destroy vital system data or your program…or it could crash the system.

You’ll flip.

Our next USR call example will give us some more experience with the post-indexed indirect addressing mode, this time in conjunction with BASIC strings.

Many times, you’ll want to manipulate the data in a BASIC string, or use the string as a method of storing miscellaneous data. When you do this, you must tell the USR subroutine where the string is and how long it is. This is actually quite simple.

This subroutine accepts two parameters, a string’s address and its length. It then flips the state of the 128 bit of each character in the string. Now, the 128 bit of a character byte has a special significance to the Atari display processor: this is the bit which tells whether or not a character is to be displayed in inverse video.

If the 128 bit is off (0), the character is displayed normally, white character on blue background. If the bit is on (1), the character will be displayed in inverse, a blue character on a white background.

How will we manage to flip the 128 bit? Remember the exclusive-or function? We discussed it in issue 18. Briefly, the exclusive-or operation will flip the state of any bit in the accumulator if the corresponding bit in the operand byte is on. We’ll use this principle to flip the high-order bit of each byte of the string. The USR statement that will be used to call this subroutine is of the form:


Now let’s look at the assembly code needed to perform this function. Figure 3 shows one possible solution.

0100 STRADL =  $CB
0110 STRADH =  $CC
0120 STRLEL =  $CD
0130 STRLEH =  $CE
0140        *= $0600
0150        CLD
0160        PLA        ;DISCARD # ARGS
0170        PLA        ;PULL ADDR HI
0180        STA STRADH ;AND SAVE IT
0190        PLA        ;PULL ADDR LO
0200        STA STRADL ;AND SAVE IT
0210        PLA        ;PULL LENGTH HI
0220        STA STRLEH ;AND SAVE IT
0230        PLA        ;PULL LENGTH LO
0240        STA STRLEL ;AND SAVE IT
0250 INVLP  LDA STRLEL ;LENGTH = 0?
0260        ORA STRLEH
0270        BNE FLIPIT ;IT'S NON-ZERO!
0280        RTS        ;ALL DONE!
0290 FLIPIT LDA STRLEL ;DECREMENT
0300        SEC        ;LENGTH COUNTER
0310        SBC #1     ;BY 1
0320        STA STRLEL ;AND PUT IT
0330        LDA STRLEH ;BACK!
0340        SBC #0
0350        STA STRLEH
0360        LDY #0     ;READY Y REG.
0370        LDA (STRADL),Y ;GET BYTE
0380        EOR #$80   ;FLIP HI BIT
0390        STA (STRADL),Y ;PUT BACK!
0400        LDA STRADL ;NOW ADD 1...
0410        CLC        ;TO STRING...
0420        ADC #1     ;ADDRESS...
0430        STA STRADL
0440        LDA STRADH
0450        ADC #0
0460        STA STRADH
0470        JMP INVLP  ;AND LOOP BACK!

            Figure 3.

Lines 100–110 reserve 2 bytes to store the address of the string that the subroutine will alter. Once again, since these bytes will be used as an indirect pointer in the post-indexed indirect instruction format, they must be stored on page 0.

Lines 120–130 reserve 2 more bytes to hold the string length value. This area will be used as a counter to determine when the flip process is complete.

Line 150 clears the decimal mode. This program uses the arithmetic instructions ADC and SBC, and works with binary math. Therefore, we must be sure that the 6502 processor is ready to work with binary values.

Line 160 pulls the number of arguments from the stack. We’ll assume the programmer has sent the proper number of arguments, and discard this value.

Lines 170–200 pull the 2 bytes that make up the string’s address from the stack and store them in the string address hold area (STRADL and STRADH) on page 0. Remember, it’s necessary for this value to be located on page 0, because we’re going to use it as an indirect pointer to the string. All indirect pointers used in pre- and post-indexed operations must be stored on page 0. This is a limitation of the 6502 processor.

Lines 210–240 pull the 2 bytes which make up the string’s length from the stack and place them in the string length hold area (STRLEL and STRLEH). At this point, we’re ready to begin processing the string and flipping bits.

Lines 250–260 first load the accumulator with the value in STRLEL (the low byte of the string length), then OR this value with the number in STRLEH (the string length high byte). By using the ORA instruction, we combine the bits in STRLEL with those in STRLEH, allowing us to check very quickly to see if they are both 0. If either STRLEL or STRLEH has bits on, they will show up in the accumulator, and we’ll know there are more characters left to process in the string. On the other hand, if the string length has reached 0, both STRLEL and STRLEH will be 0, and the ORA operation will result in a 0 value in the accumulator.

Line 270 tests the result of the previous ORA instruction. If there are more characters to process in the string, the accumulator will not be 0, and the computer will BNE (branch if not equal/zero) to the location labeled FLIPIT, to process the next character. If the accumulator is 0, all the characters have been processed, and the program continues at the next instruction.

Line 280 is executed after all the characters have been processed. This is simply an RTS instruction, and the computer resumes processing in BASIC.

Lines 290–350, labeled FLIPIT, begin the actual bit-flipping operation. These lines subtract 1 from the string length counter, STRLEL and STRLEH. As each character in the string is processed, this counter is decremented by 1. When this counter reaches 0, the ORA instruction at INVLP detects the condition and terminates the subroutine.

Line 360 places a 0 in the Y register, getting it ready for the post-indexed indirect operation that we’ll use to flip the string’s bits. By placing a 0 in the Y register, the indirect operation will have a 0 offset from the address in the pointer, STRADL and STRADH.

Line 370 loads the accumulator from the address contained in the pointer STRADL and STRADH, which contains one of the characters in the string. As mentioned above, the Y register is set to 0, so that the byte is loaded from the address in the pointer, with no offset added by the Y register. For example, if STRADL/H is pointing to $457F, the accumulator will be loaded from address $457F ($457F+0).

Line 380 exclusive-ORs the accumulator with the value $80 (128 decimal, 10000000 binary). As you can see from the binary representation, this will flip the highest bit of the value in the accumulator. If the bit was on before the operation, it will be turned off, and vice-versa. Since the value in the accumulator is one of the characters in the string, this will change normal characters to inverse, and inverse characters to normal.

Line 390 uses the post-indexed indirect addressing mode to store the character in the accumulator back into memory, after the flip operation is complete. One thing to note here is that you must pay close attention to what happens to the registers when programming in 6502 assembly language. For example, this STA instruction uses the Y register as an offset, and you should be sure that it’s not altered between the time you load the character value and store it. In this case, there’s nothing to worry about, but in larger programs, you could run into trouble if many registers are being used, and the Y register had been changed.

Lines 400–460 add 1 to the string address pointer, STRADL and STRADH. This advances the pointer to the next character in the string.

Line 470 executes a JMP instruction, looping the program back to the label INVLP, where it will test for more characters to process.
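
The whole loop boils down to "while the length is not 0, flip one byte and move on." The Python sketch below models that behavior. It is an illustration only: `memory`, `flip`, and the starting address are assumptions, and the 2-byte counter and pointer arithmetic are collapsed into ordinary integers.

```python
# Model of the flip loop; `flip`, `memory`, and the addresses are illustrative.
memory = [0] * 65536

def flip(addr, length):
    """Toggle bit 7 of `length` bytes starting at `addr`."""
    while length != 0:                # INVLP: LDA STRLEL / ORA STRLEH / BNE
        length -= 1                   # FLIPIT: 2-byte decrement
        memory[addr] ^= 0x80          # LDA (STRADL),Y / EOR #$80 / STA
        addr = (addr + 1) & 0xFFFF    # 2-byte increment of the pointer

# Put "TEST" at an arbitrary address and flip it twice.
start = 0x457F
for i, ch in enumerate(b"TEST"):
    memory[start + i] = ch
flip(start, 4)
inverse = bytes(memory[start:start + 4])   # every byte now has bit 7 set
flip(start, 4)
normal = bytes(memory[start:start + 4])    # back to b"TEST"
print(inverse, normal)
```

Flipping twice restores the original bytes, which is why calling the routine repeatedly makes text flash between normal and inverse video.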

The BASIC program for the character flip program is shown in Figure 4. Type in the program and RUN it.

10 FOR X=1536 TO 1593:READ N:POKE X,N:NEXT X:FLIP=1536
20 DIM A$(20):A$="THIS IS A TEST"
100 DATA 216,104,104,133,204,104,133,203,104,133,206,104,133,205,165,205,5,206,208,1,96,165,205,56,233
110 DATA 1,133,205,165,206,233,0,133,206,160,0,177,203,73,128,145,203,165,203,24,105,1,133,203,165
120 DATA 204,105,0,133,204,76,14,6

                Figure 4.

As you can see, each time the program executes Line 30, the string A$ is changed from normal video to inverse and vice-versa. The program changes all the characters, because we told it to start at the address of A$, and to flip as many characters as A$ contains.

Let’s try something a little different. Change Line 30 to look like this and RUN the program again:

30 A=USR(FLIP,ADR(A$(11)),4)

Now you’ll see an interesting variation on the original function. As you’ll note when the program runs, only the word TEST is changing! We told it to change the eleventh character of A$ (ADR(A$(11))), and we told it to flip four characters. You can flip any portion of a string you like, and any number of characters.

Here’s another example of what this program can do. Change Lines 30 and 40 to read:

30 A=USR(FLIP,PEEK(88)+PEEK(89)*256,40)

After you’ve made the changes, RUN the program. You’ll see the top line on the graphics screen flash. How is this being done? Locations 88 and 89 are a 2-byte pointer to the start of screen memory. By sending the address they hold to the subroutine instead of a string address, along with a length of 40 bytes, the subroutine will flip the actual screen memory’s inverse bits, and we have a flashing display line!

Stay tuned.

As you’ve seen from the examples I’ve used so far, you can perform a large variety of useful functions, very quickly, with USR subroutines. Next issue, we’ll wrap up our USR call series so that we can proceed to bigger and better things. We’ll still cover USR calls from time to time, but I’m sure there are a lot more areas that you’ll enjoy exploring.

Until then, play around with the 6502 and try writing your own USR calls. And, should you find yourself stuck, remember that you can get in touch with me on CompuServe via the Atari SIG (my user ID is 70775,424), or by writing.

Boot Camp
 c/o ANALOG Computing
 P.O. Box 23
 Worcester, MA 01603
A.N.A.L.O.G. ISSUE 29 / APRIL 1985 / PAGE 71

Boot Camp

by Tom Hudson

This issue, we conclude our coverage of the BASIC USR function, a handy statement that puts the speed and power of machine language to use in BASIC programs. We’ve looked at single- and multiple-argument operation, modifying strings, examining and changing system memory, and setting precision timers.

This issue, we’re going to look at a USR function that will generate random numbers within specified ranges. This can be done in several different ways, with varying degrees of speed. We’ll also see that you shouldn’t always accept the first solution you come up with, since there may be one which is more efficient.

Random ramblings.

At one time or another, we’ve all used random numbers. Whether in games or statistical analysis, random numbers have an important function in computing.

Would you like it if your computer chess program made the same moves every game? I wouldn’t—the games would get too predictable, and the chess disk would be quickly relegated to the “outdated program” pile. BASIC’s random number function, RND(n), produces random numbers between zero and one, and usually works fine for most applications.

Just for fun, assume that we’re simply not happy with BASIC’s RND function, and want one that’s more versatile. We want a function that will return a random integer value between two given numbers, or if only one parameter is given, between zero and that value. We could write the function as a BASIC subroutine, but top speed is essential. We need to write a USR subroutine.

Hats off!

The first method most people would come up with is what I call “pulling numbers out of a hat.” Simply stated, you get a random number, and if it’s in the range you want, you use it. If not, you reach into the hat and try another. This method works fine, but there’s one big drawback: speed.

30 RAND=PEEK(53770)+PEEK(53770)*256
50 ? RAND:GOTO 30

                Figure 1.

Figure 1 is the BASIC version of pulling numbers out of a hat. Type in the program and RUN it. You will be asked for a random number range. Type in:


and press RETURN. You will see the program happily print out random numbers ranging from 0 to 65535, at BASIC’s top speed. All’s well, right? Wrong! Press BREAK and RUN the program again. This time, when prompted for the random number range, type:


and press RETURN. If you see anything print out within three or four minutes, consider yourself lucky. What happened? Let’s look at the program and find out.

Lines 10–20 accept the random number range and store the low and high ranges in LO and HI, respectively. Any random values less than LO are rejected, as are any values greater than HI.

Line 30 generates a random number between 0 and 65535, using the Atari’s random number generator, RANDOM. RANDOM is located at $D20A (53770 decimal) and gives a random value of 0 to 255 when PEEKed. This line reads RANDOM twice and builds a large random number (ranging from 0 to 65535) by setting RAND to PEEK(53770) + PEEK(53770) * 256.

Line 40 checks to see if the random number just generated falls between the values in LO and HI. If not, the program loops back to Line 30 to try pulling another random number out of the hat.

Line 50 prints any random numbers that are within the range specified by LO and HI.

Now can you see why this program works so slowly? When a large range (such as 0–65535) is specified, there is a better chance of the random number falling into that range. When a smaller range is given, the odds of picking a random number in that range can drop drastically, making the program take virtually forever.
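
A little arithmetic shows how bad the odds get. Assuming each of the 65536 possible values is equally likely (an idealization of the hardware generator), the chance that one draw lands in the range is (HI - LO + 1) / 65536, so the average number of draws per accepted number is the reciprocal of that. The Python sketch below is only a model of this reasoning; `expected_draws` is a hypothetical helper name, and the ranges are arbitrary examples.

```python
# Back-of-the-envelope model of the "hat" method's cost.
# Assumes every value 0-65535 is equally likely; `expected_draws` is a
# hypothetical helper, not part of any Atari program.

def expected_draws(lo, hi):
    """Average number of random values drawn before one falls in lo..hi."""
    in_range = hi - lo + 1
    return 65536 / in_range

print(expected_draws(0, 65535))   # 1.0: every draw is accepted
print(expected_draws(0, 319))     # 204.8 draws per accepted number
print(expected_draws(100, 101))   # 32768.0 draws per accepted number
```

In interpreted BASIC, tens of thousands of trips around the loop for a single number add up to a very long wait.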

“Aha,” you say, “I’ll just write this routine in assembly language and speed it up. Assembly language fixes everything!” Let’s see what happens.

0100 LOWL =  $CB     ;LOW LIMIT
0110 LOWH =  $CC
0120 HIGHL = $CE     ;HIGH LIMIT
0130 HIGHH = $CF
0140 RESLO = $D4     ;BASIC'S RESULT
0150 RESHI = $D5
0160 RANDOM = $D20A  ;RANDOM # 0-255
0170 ;
0180     *=  $0600   ;ROUTINE START
0190 ;
0200     CLD         ;CLEAR DECIMAL
0210     LDA #0      ;SET DEFAULT
0220     STA LOWL    ;LOW RANGE
0230     STA LOWH    ;VALUE
0240     PLA         ;GET #ARGS
0250     CMP #1      ;ONE ARGUMENT?
0260     BEQ PULLHI  ;YES!
0270     PLA         ;PULL LOW HI
0280     STA LOWH    ;AND SAVE IT
0290     PLA         ;PULL LOW LO
0300     STA LOWL    ;AND SAVE IT
0310 PULLHI PLA      ;PULL HIGH HI
0320     STA HIGHH   ;AND SAVE IT
0330     PLA         ;PULL HIGH LO
0340     STA HIGHL   ;AND SAVE IT
0350 GETRND LDA RANDOM ;GET RANDOM #
0360     STA RESLO   ;SAVE AS LOW
0370     LDA RANDOM  ;GET ANOTHER
0380     STA RESHI   ;SAVE AS HIGH
0390     CMP HIGHH   ;TOO BIG?
0400     BCC CHEKLO  ;NO, CHECK LOW
0410     BNE GETRND  ;TOO BIG!
0420     LDA RESLO   ;CHECK LOW BYTE
0430     CMP HIGHL   ;TOO BIG?
0440     BCC CHEKLO  ;NO, CHECK LOW
0450     BNE GETRND  ;TOO BIG!
0460 CHEKLO LDA RESHI ;IS HIGH BYTE...
0470     CMP LOWH    ;TOO SMALL?
0480     BCC GETRND  ;YES!
0490     BNE RANDOK  ;IT'S OK!
0500     LDA RESLO   ;IS LOW BYTE...
0510     CMP LOWL    ;TOO SMALL?
0520     BCC GETRND  ;YES!
0530 RANDOK RTS      ;IT'S OK, EXIT!

                Figure 2.

Figure 2 shows the assembly code equivalent of Figure 1, which can be called as a USR subroutine. It can be called by the following two USR statements:


The first USR statement will generate a random number between 0 and the value of arg1. The second USR format will generate a random number between the value in arg1 and the value in arg2. Obviously, the USR subroutine must be able to determine how many arguments are supplied, and act accordingly. Let’s see how this subroutine works.

Line 200 clears the decimal mode, placing us in binary math mode.

Lines 210–230 will set the 2-byte work area LOWL-LOWH to 0. This ensures that, if there is only one argument, the low range will default to 0.

Line 240 pulls the number of arguments off the stack.

Line 250 compares the number of arguments to 1. If there is only one argument, we will want to go get the high range value.

Line 260 branches to PULLHI if the number of arguments is equal (BEQ) to 1. This will cause the computer to pull just one argument from the stack.

Lines 270–300 pull and store the low limit for the random number. If there are two arguments, this is the first.

Lines 310–340 pull and store the high limit for the random number. Of course, if there’s only one argument, specifying a range from 0 to arg1, this is the one that will be pulled, and the low limit (set in Lines 210–230) will be 0.

Lines 350–380, labeled GETRND, generate a random number between 0 and 65535, placing it in the locations RESLO and RESHI. As you should know by now, RESLO and RESHI ($D4 and $D5) are the locations used to send values to BASIC from the USR subroutine. The random number is built by simply loading the accumulator from RANDOM twice, placing each random byte into the RESLO and RESHI locations.

At this point, I would like to discuss an important function in assembly language: comparisons. We’ve already seen how single-byte values can be compared easily, using the CMP instruction. Since we’re using 2-byte values here, we must learn how to perform multiple-byte comparisons.

A 2-byte comparison is not very different from a single-byte comparison. The obvious difference is that there are now 2 bytes to be compared instead of 1. What may not be obvious is that we must compare the high-order bytes first, then the low-order bytes. Figure 3 is a flowchart of possible comparison outcomes.

IS V1(H) < V2(H)?  --YES-->  V1 < V2
        | NO
IS V1(H) > V2(H)?  --YES-->  V1 > V2
        | NO
IS V1(L) < V2(L)?  --YES-->  V1 < V2
        | NO
IS V1(L) > V2(L)?  --YES-->  V1 > V2
        | NO
     V1 = V2

                Figure 3.

In Figure 3, we are comparing the values V1 and V2. V1 and V2 are both 2-byte values, and their high-order and low-order portions are designated H and L, respectively.

As you can see, there are three possible outcomes in any comparison: greater than, less than, and equal to. The flowchart is fairly straightforward, showing the step-by-step procedure for comparing any two 2-byte values. Note that, since the high-order bytes are the most significant bytes, they are compared first. After all, if the high byte of V1 is greater than that of V2, V1 is greater than V2, no matter what the low bytes of the values contain.

Note, however, that if the high-order bytes of V1 and V2 are equal, we must compare the low bytes to complete the comparison properly. Figure 4 shows the assembly code equivalent of Figure 3.

10   LDA V1H
20   CMP V2H
30   BCC V1LTV2
40   BNE V1GTV2
50   LDA V1L
60   CMP V2L
70   BCC V1LTV2
80   BNE V1GTV2
90   BEQ V1EQV2

                Figure 4.

The first operation in Figure 4 is the actual CMP operation on the high-order bytes of V1 and V2, in Lines 10–20. At this point, the CARRY and ZERO flags are set according to the comparison results. If V1H is greater than V2H, the carry flag is set to 1 and the zero flag is cleared. If V1H is less than V2H, the carry flag is cleared. If V1H and V2H are equal, both the zero flag and the carry flag are set.

Next, the computer branches to V1LTV2 (V1 Less Than V2) if the carry flag is cleared (BCC V1LTV2).

The next operation is somewhat tricky. Since an equal condition sets the carry flag as well as the zero flag, we BNE (Branch Not Equal) to V1GTV2 (V1 Greater Than V2). This ensures that we will only branch to V1GTV2 when V1 is greater than V2. The program will fall through to the next instruction if V1H is equal to V2H.

At this point, we know that the high bytes of V1 and V2 are equal, and we have to compare the low-order bytes. This happens if V1 = $4F00 and V2 = $4F9B, V1 = $007F and V2 = $0020, etc.

Lines 50–60 compare the low bytes of V1 and V2, just as the high bytes were compared. Now we’re ready to finish the 2-byte comparison.

Line 70 branches if the carry flag is clear (BCC) to V1LTV2. Remember that if the carry is clear after a compare, the accumulator value (V1L, in this case) is less than the byte it was compared to (V2L).

Line 80 branches if the compare was not equal (BNE) to V1GTV2. Once again, this branch operation is used instead of BCS, because an equal condition also sets the carry flag. In this case, since the BNE is used after a BCC instruction, the BNE can be considered a kind of “branch if greater than” instruction.

Line 90 branches to V1EQV2 using the BEQ instruction. At this point, we know V1 equals V2, since the high bytes are equal, and the low bytes are equal.
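
If the branch order still seems slippery, the Python sketch below models it. This is an illustration under stated assumptions: `cmp8` imitates only the carry and zero results of CMP, and `cmp8` and `compare16` are made-up names, not 6502 instructions.

```python
# Model of the 2-byte comparison in Figure 4; `cmp8` and `compare16`
# are illustrative names, not 6502 instructions.

def cmp8(a, m):
    """Mimic CMP: return (carry, zero) for accumulator a against byte m."""
    return a >= m, a == m

def compare16(v1, v2):
    """Return '<', '>' or '=' using the same branch order as Figure 4."""
    for shift in (8, 0):                       # high bytes first, then low
        carry, zero = cmp8((v1 >> shift) & 0xFF, (v2 >> shift) & 0xFF)
        if not carry:                          # BCC V1LTV2
            return "<"
        if not zero:                           # BNE V1GTV2
            return ">"
    return "="                                 # BEQ V1EQV2

print(compare16(0x4F00, 0x4F9B))   # <
print(compare16(0x007F, 0x0020))   # >
print(compare16(0x1234, 0x1234))   # =
```

Testing for "less than" before "not equal" is what lets the not-equal branch stand in for "greater than."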

Multiple-byte comparisons can be somewhat confusing at first, but we’ll be using them often in Boot Camp programs, and you’ll soon feel comfortable with them. Now, let’s return to our “walk-through” of the first random number program.

Line 390 compares the accumulator (which contains the high byte of the random number) to HIGHH, the high byte of the upper random number limit. This is the start of a 2-byte comparison to see if the random number we just built is greater than the upper random number limit.

Line 400 branches if the carry is clear (BCC) to CHEKLO. If the carry is clear, we know that the high byte of the random number is less than the high byte of the upper limit, and we can go on to check the random number to see if it is less than the lower limit.

Line 410 branches if not equal to GETRND, since a not-equal condition (the same as “branch if greater than,” when used after a BCC instruction) means that the random number is greater than the upper random number limit, and we have to reach into the hat for another random number.

Lines 420–430 compare the low byte of the random number to the low byte of the upper limit. At this point, we know that the high-order byte of the random number is the same as that of the upper limit, so we need to compare the low-order bytes to complete the comparison operation.

Line 440 branches if the carry is clear (random < limit) to CHEKLO. We now know that the random number is less than the upper limit, and must check to see if it is above the lower limit.

Line 450 branches if not equal (random > limit) to GETRND, since this shows that the random number is greater than the upper limit.

Lines 460–470, labeled CHEKLO, begin the process of comparing the random number to the lower limit. The high value of the random number (RESHI) is loaded into the accumulator and compared to LOWH, the high byte of the lower random number limit.

Line 480 branches if the carry flag is clear (random < lower limit) to GETRND, because the random number is less than the lower limit.

Line 490 branches if not equal (random > lower limit) to RANDOK, since this indicates that the random number is greater than the lower limit.

Lines 500–510 compare the low byte of the random number to the low byte of the lower random number limit. This is done only when the high bytes of the random number and low limit are equal.

Line 520 branches if the carry flag is clear (random < lower limit) to GETRND to try another random number. If this branch is not taken, we know that the random number is greater than or equal to the lower limit, and is acceptable.

Line 530 returns to BASIC when the random number is greater than or equal to the lower limit, and less than or equal to the upper limit. The random number is in BASIC’s return area (RESLO and RESHI), ready to be used by the BASIC program.

Now that we’ve completed the random number subroutine (“hat” version), let’s use it in a BASIC program. Figure 5 shows the subroutine installed in a BASIC program.

10 DIM D(319):FOR X=0 TO 319 :D(X)=192:NEXT X
30 FOR X=1536 TO 1598: READ N:POKE X,N:NEXT X:RAND=1536
40 A=USR(RAND,0,319)
50 D(A)=D(A)-1:PLOT A,D(A)
60 GOTO 40
100 DATA 216,169,0,133,203,133,204,104,201,1,240,6,104,133,204,104,133,203,104,133,207,104,133,206,173
110 DATA 10,210,133,212,173,10,210,133,213,197,207,144,10,208,240,165,212,197,206,144,2,208,232,165,213
120 DATA 197,204,144,226,208,6,165,212,197,203,144,218,96

                    Figure 5.

After typing in Figure 5, RUN it. In a few seconds (required to initialize the program), you will see a graphic representation of the random numbers being generated by the subroutine. The program is generating random numbers between 0 and 319, and plotting them on a graphics 8 screen, with each value plotted in the appropriate X column. Like our first BASIC program, this looks fine, doesn’t it?

Stop the program by pressing BREAK and change Line 40 to read:

40 A=USR(RAND,100,101)

This will change the random number range from 0–319 to 100–101, a much smaller range. After changing the program, RUN it. See how much more slowly the columns grow? Even in ultra-fast machine code, the “hat” method has speed problems. What can we do to fix this problem? Our next program will show a technique which works just fine.
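To get a feel for the slowdown without a stopwatch, here is a rough Python model of the “hat” method. It is only a sketch: Python’s random module stands in for the hardware RANDOM location, and the names hat_random and average_tries are made up for this illustration. It counts how many raw 2-byte numbers must be drawn before one lands in the requested range.

```python
import random

def hat_random(low, high, rng):
    """Draw raw 16-bit numbers until one falls in low..high.

    Returns (value, tries). rng stands in for two reads of the
    hardware RANDOM location (an assumption of this model).
    """
    tries = 0
    while True:
        tries += 1
        candidate = rng.randrange(65536)   # raw 2-byte random number
        if low <= candidate <= high:       # inside both limits?
            return candidate, tries

def average_tries(low, high, samples=200, seed=1):
    """Average number of draws needed per accepted number."""
    rng = random.Random(seed)
    total = sum(hat_random(low, high, rng)[1] for _ in range(samples))
    return total / samples

print("0-319   :", average_tries(0, 319))
print("100-101 :", average_tries(100, 101, samples=10))
```

Only 320 of the 65536 possible values are acceptable for the 0–319 range, so on average about 205 draws are needed per number. For 100–101 only 2 values are acceptable, which works out to about 32768 draws per number.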

Who was that masked program?

One of the many nice things about assembly language is the degree of control you have over the computer. You can rewrite I/O routines, alter the display with control structures known as “interrupts,” and manipulate data in many useful ways. We’re going to use this latter feature to help us write a better, faster random number generator.

The reason our first random number subroutine didn’t work as fast as we wanted was that it was taking every number that came along and checking to see if it was in the specified range. Sooner or later, a number comes along that fits, but we don’t want to wait that long. If you’re interviewing people for a nuclear physicist’s job, you don’t want to talk to everyone in the state of New York, so you place a classified ad listing the qualifications—to limit the number of people you have to interview. That’s just what we’re going to do, only we’ll do it with numbers.

Inside the computer, all numbers are stored in binary format, a series of on or off bits. Using a technique called “masking,” we’ll preprocess the random numbers, making a match in the range we want more likely.

Here’s how it works. First, we get and store the limits of random number values, say, from 200 to 1580. Next, we find the difference between these “endpoint” values, which, in this case, is 1580 - 200, or 1380. Knowing this range makes the random number generation much easier, since we only have to generate a number from 0 to 1380, then add the low limit of 200 to it.

The real “meat” of this technique lies in masking the “raw” random number, so that it will be more likely to fall into the specified range. We take the binary representation of 1380 and make a mask that stops at the highest bit, like this:

1380: 00000101 01100100
MASK: 00000111 11111111

Next, we build a 2-byte random number from the RANDOM location, then AND it with the mask, like so:

RANDOM: 11001011 01101001 = 52073
  MASK: 00000111 11111111
RESULT: 00000011 01101001 = 873

As you can see, the original random number, 52073, has been masked down to 873, which is within our range of 1380. We then add 200 (the low limit of our random number) to the previous result, giving a final random number of 1073.

It is possible for the masked random number to exceed our range, but if that happens, we merely try the operation again. In any case, it’s much faster than the “hat” method.
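The arithmetic above can be checked with a few lines of Python. This is a sketch of the idea rather than a translation of the assembly code: the function names make_mask and masked_random are invented here, Python’s random module stands in for the RANDOM location, and the mask is built for the whole 16-bit range at once instead of one byte at a time.

```python
import random

def make_mask(span):
    """All-ones mask that stops at the highest 'on' bit of span."""
    return (1 << span.bit_length()) - 1

def masked_random(low, high, rng):
    """Masking method: mask a raw number, retry if it is still too big."""
    span = high - low                  # 1580 - 200 = 1380 in the example
    mask = make_mask(span)
    while True:
        candidate = rng.randrange(65536) & mask   # raw 2-byte number, masked
        if candidate <= span:                     # within the range?
            return low + candidate                # add the low limit back

mask = make_mask(1380)
print(format(mask, "016b"))    # 0000011111111111
print(52073 & mask)            # 873
print((52073 & mask) + 200)    # 1073
```

Because the mask throws away every bit above the highest bit of the range, more than half of the masked values are always acceptable, so the retry loop rarely runs more than a couple of times.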

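0100 LOWL =  $CB     ;LOW LIMIT
0110 LOWH =  $CC
0120 HIGHH = $CE     ;HIGH LIMIT (HI)
0130 RANGEL = $CF    ;RANGE
0140 RANGEH = $D0
0150 RESLO = $D4     ;BASIC'S RESULT
0160 RESHI = $D5
0170 RANDOM = $D20A  ;RANDOM NUMBER
0180 ;
0190     *=  $0600   ;SUBROUTINE START
0200 ;
0210     CLD         ;BINARY MATH!
0220     LDA #0      ;INITIALIZE...
0230     STA LOWL    ;LOW RANGE...
0240     STA LOWH    ;DEFAULT (0)
0250     PLA         ;GET # OF ARGS
0260     CMP #1      ;1 ARGUMENT?
0270     BEQ ARG2    ;YES!
0280     PLA         ;PULL AND STORE
0290     STA LOWH    ;LOW RANGE
0300     PLA
0310     STA LOWL
0320 ARG2 PLA        ;PULL AND STORE
0330     STA HIGHH   ;HIGH LIMIT HI
0340     PLA         ;PULL HIGH LO
0350     SEC         ;SUBTRACT...
0360     SBC LOWL    ;LOW LIMIT...
0370     STA RANGEL  ;FROM...
0380     LDA HIGHH   ;HI LIMIT...
0390     SBC LOWH    ;AND GET THE...
0400     STA RANGEH  ;RANGE!
0410     LDA #$FF    ;INIT LOW MASK
0420     STA LOMASK
0430     LDX #0      ;START W/HI BIT
0440 HLOOP LDA BITS,X ;GET A BIT
0450     AND RANGEH  ;IS IT ON?
0460     BNE GOTHLM  ;YES!
0470     INX         ;NEXT BIT
0480     CPX #8      ;DONE ALL 8?
0490     BNE HLOOP   ;NO, LOOP BACK
0500     STA HIMASK  ;HI MASK = 0
0510     TAX         ;ZERO X REGISTER
0520 LLOOP LDA BITS,X ;GET A BIT
0530     AND RANGEL  ;IS IT ON?
0540     BNE GOTLLM  ;YES!
0550     INX         ;NEXT BIT
0560     CPX #8      ;DONE ALL 8?
0570     BNE LLOOP   ;NO, LOOP BACK
0580     STA LOMASK  ;LOW MASK = 0
0590     BEQ RNDIT   ;NOW GET RAND#!
0600 GOTHLM LDA MASKS,X ;GET HI MASK
0610     STA HIMASK  ;SAVE IT,
0620     JMP RNDIT   ;AND GET RAND#!
0630 GOTLLM LDA MASKS,X ;GET LOW MASK
0640     STA LOMASK  ;SAVE IT
0650 RNDIT LDA RANDOM ;GET RANDOM #
0660     AND HIMASK  ;MASK IT
0670     STA RESHI   ;AND SAVE IT
0680     LDA RANDOM  ;GET RANDOM #
0690     AND LOMASK  ;MASK IT
0700     STA RESLO   ;AND SAVE IT.
0710     LDA RESHI   ;IS HI BYTE
0720     CMP RANGEH  ;>LIMIT?
0730     BCC LOWOK   ;NO, BOTH OK!
0740     BNE RNDIT   ;YES, TRY AGAIN
0750     LDA RESLO   ;IS LOW BYTE
0760     CMP RANGEL  ;>LIMIT?
0770     BEQ LOWOK   ;NO, IT'S =, OK!
0780     BCS RNDIT   ;YES, TRY AGAIN
0790 LOWOK LDA RESLO ;FINISH UP...
0800     CLC         ;BY ADDING...
0810     ADC LOWL    ;BASE VALUE...
0820     STA RESLO   ;TO RANDOM...
0830     LDA RESHI   ;NUMBER...
0840     ADC LOWH    ;AND RETURN...
0850     STA RESHI   ;TO BASIC!
0860     RTS
0870 ;
0890 ;
0900 BITS .BYTE $80,$40,$20,$10
0910     .BYTE $08,$04,$02,$01
0920 MASKS .BYTE $FF,$7F,$3F,$1F
0930     .BYTE $0F,$07,$03,$01
0940 LOMASK .BYTE 0
0950 HIMASK .BYTE 0
0960     .END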

                    Figure 6.

Figure 6 shows the assembly code for the random number masking method. Let’s walk through it together, finding out how it works.

Line 210 clears the decimal mode, to ensure that we’re working with binary arithmetic. This is absolutely essential in this program, since we’ll be doing addition.

Lines 220–310 retrieve the low random number limit, just as in Figure 2. Once again, if only one argument is sent by BASIC, the low limit will default to 0.

Lines 320–330 pull and store the high byte of the upper range limit temporarily.

Lines 340–400 pull the low byte of the upper limit, then subtract the low limit from the upper limit, giving the range of values. This number is stored in the locations RANGEL and RANGEH.

Lines 410–420 initialize the low byte mask to $FF (11111111 binary).

Lines 430–500 make up a loop which scans the high byte of the range to find the first “on” bit. This is done by using the BITS table at Lines 900–910. The X register is used to index each byte in the bits table, which is, in turn, ANDed with RANGEH. If the result of the AND operation is nonzero, the bit is on, and the program branches to GOTHLM to select the proper mask for the high byte. If no bits are on in the high byte of the range, the HIMASK mask is set to 0. Three typical bytes and their associated masks are shown in Figure 7.

HI BYTE: 10110001
   MASK: 11111111

HI BYTE: 00110100
   MASK: 00111111

HI BYTE: 00000000
   MASK: 00000000

        Figure 7.
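The table lookup can be modeled in Python as a quick check of Figure 7. This is only a sketch: the two tables are copied from the BITS and MASKS tables in Figure 6, while the function names byte_mask and range_masks are invented for the illustration.

```python
BITS  = [0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01]
MASKS = [0xFF, 0x7F, 0x3F, 0x1F, 0x0F, 0x07, 0x03, 0x01]

def byte_mask(value):
    """Scan from the high bit down; return the mask for the first 'on' bit."""
    for x in range(8):            # x plays the part of the X register
        if BITS[x] & value:       # is this bit on?
            return MASKS[x]       # same index into the masks table
    return 0x00                   # no bits on: the mask is 0

def range_masks(range_hi, range_lo):
    """Return (high mask, low mask) for a 2-byte range."""
    hi = byte_mask(range_hi)
    # If the high byte has a bit on, every low-byte bit must be kept.
    lo = 0xFF if hi else byte_mask(range_lo)
    return hi, lo

for sample in (0b10110001, 0b00110100, 0b00000000):
    print(format(sample, "08b"), format(byte_mask(sample), "08b"))
```

The three sample bytes are the ones shown in Figure 7, and the masks printed should match the figure.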

Lines 510–590 perform the same function as Lines 430–500, except that they find the highest bit in the low byte of the range. This code is only performed if no bits were found in the high byte of the range. If no bits are on in the low byte, the mask is set to 0, and the program will branch to RNDIT, where a random number will be generated.

Lines 600–620 load the appropriate high-byte bit mask from the MASKS table, placing it in the location HIMASK, then jump to RNDIT, to generate a random number.

Lines 630–640 load the mask for the low byte of the random number from the masks table. This byte is placed in LOMASK.

Lines 650–700 load random bytes from the location RANDOM, mask them with the LOMASK and HIMASK masks, and place them in the RESLO and RESHI bytes. Remember, we must still compare this number to the random number range to be sure it’s not too big, before returning to BASIC.

Lines 710–780 perform a 2-byte comparison between the masked random number (RESLO and RESHI) and the range (RANGEL and RANGEH). If the random number generated is greater than the range, the program loops back to RNDIT to try again.
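The logic of a 2-byte comparison is easy to lose among the branches, so here it is as a short Python sketch. The function name within_range is invented for the illustration; the point is only that the high bytes are compared first, and the low bytes matter only when the high bytes are equal.

```python
def within_range(res_hi, res_lo, range_hi, range_lo):
    """True if the 2-byte result is less than or equal to the 2-byte range."""
    if res_hi < range_hi:          # high byte smaller: the low byte can't matter
        return True
    if res_hi > range_hi:          # high byte larger: too big, try again
        return False
    return res_lo <= range_lo      # high bytes equal: low bytes decide

# 873 is $0369 and 1380 is $0564
print(within_range(0x03, 0x69, 0x05, 0x64))
```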

Lines 790–850 are executed when the random number generated is acceptable. They add the random value to the low range limit, placing it back into RESLO and RESHI. At this point, the subroutine is finished, and we have a random number between the specified upper and lower limits.

Line 860 returns to BASIC with the RTS instruction.

Lines 900–930 are .BYTE directives which set up the bits and masks tables. These are used in Lines 440–640 to set up the appropriate data mask values. Note that each table is made up of 8 bytes, and that each byte of the masks table is the mask for the corresponding byte of the bits table.

Lines 940–950 are the storage locations for the high and low byte masks.

Figure 8 is a BASIC program with the “masking” random number subroutine. Type it in and RUN it.

10 DIM D(319):FOR X=0 TO 319:D(X)=192:NEXT X
30 FOR X=1536 TO 1687:READ N:POKE X,N:NEXT X:RAND=1536
40 A=USR(RAND,0,315)
50 D(A)=D(A)-1:PLOT A,D(A)
60 GOTO 40
100 DATA 216,169,0,133,203,133,204,104,201,1,240,6,104,133,204,104,133,203,104,133,206,104,56,229,203
110 DATA 133,207,165,206,229,204,133,208,169,255,141,150,6,162,0,189,134,6,37,208,208,26,232,224,8
120 DATA 208,244,141,151,6,170,189,134,6,37,207,208,19,232,224,8,208,244,141,150,6,240,15,189,142
130 DATA 6,141,151,6,76,88,6,189,142,6,141,150,6,173,10,210,45,151,6,133,213,173,10,210,45
140 DATA 150,6,133,212,165,213,197,208,144,10,208,232,165,212,197,207,240,2,176,224,165,212,24,101,203
150 DATA 133,212,165,213,101,204,133,213,96,128,64,32,16,8,4,2,1,255,127,63,31,15,7,3,1
160 DATA 0,0

                Figure 8.

Once again, you will see the random numbers selected graphically represented by columns on your screen. As you can see, the subroutine returns random values quickly. Now stop the program with the BREAK key and change Line 40 to read:

40 A=USR(RAND,100,101)

RUN the program again. See how fast columns 100 and 101 grow? Seeing is believing: the masking method of generating random numbers gives much faster results than the “hat” method, even when the random number range is small.
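If you would like numbers to go with what you see on the screen, the average number of tries per result can be worked out directly. The Python sketch below assumes every raw value from RANDOM is equally likely (an idealization), and the two function names are invented for the illustration.

```python
def hat_expected_tries(low, high):
    """'Hat' method: any of 65536 raw values, only (high-low+1) are accepted."""
    return 65536 / (high - low + 1)

def mask_expected_tries(low, high):
    """Masking method: (mask+1) masked values, (span+1) are accepted."""
    span = high - low
    mask = (1 << span.bit_length()) - 1
    return (mask + 1) / (span + 1)

for low, high in ((0, 319), (100, 101)):
    print(low, high, hat_expected_tries(low, high), mask_expected_tries(low, high))
```

For 0–319, the hat method averages 204.8 tries per number against 1.6 for masking. For 100–101, the hat method averages 32768 tries, while masking accepts every masked value on the first try.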

Don’t change that dial!

Next issue, we’ll delve into new areas of assembly language programming on the Atari personal computers. Until then, study these program examples to increase your understanding. Remember, if you get stuck, you can contact Charles Bachand or me on CompuServe, or by writing.

Boot Camp
c/o ANALOG Computing
P.O. Box 23
Worcester, MA
A.N.A.L.O.G. ISSUE 33 / AUGUST 1985 / PAGE 33

Boot Camp

by Tom Hudson

Yes, loyal Boot Camp fans, it’s finally back! Boot Camp has been absent from the pages of ANALOG Computing for the last few issues because of the time required to get the ANALOG Computing Telecommunications System running, as well as the hours spent working with the new Atari ST computer. But enough excuses! Let’s get started.

We’ve covered 6502 assembly language well enough that you should now be able to use some of the built-in features of your Atari computer. This issue, we’ll begin working with the Central Input/Output system, or CIO.

Now the fun really begins.

You’ve probably been aching to try input and output operations on your Atari computer in machine language, but, up to this point, haven’t had the information necessary to do it. Well, get ready to start learning how, because, beginning with this issue, we’ll find out what CIO is and how we can put it to use in machine language programs.

Knowing how to use CIO opens doors to literally any application you want to write: games, utilities, business programs, and so on. With CIO, you can access the built-in screen editor, printers, disk drives, modems, and even create your own device handlers (such as issue 31’s V: or my Unicheck U: device). The possibilities are virtually endless.

This month, we’ll begin covering the CIO fundamentals, data storage and the CIO subroutine itself, and demonstrate the use of CIO with simple program segments. Future issues will cover more complex uses of CIO, including multi-file disk operations and modem usage.

In order to have a good reference for the CIO system, I suggest that you invest in the following books: Mapping the Atari, by Ian Chadwick from COMPUTE! Books, P.O. Box 5406, Greensboro, NC 27403, $14.95; and Atari Technical Reference Notes, from Atari Customer Relations, P.O. Box 61657, Sunnyvale, CA 94088, $29.95.

These books, especially the Atari Technical Reference Notes, will prove invaluable when you start working with the CIO system.

What is CIO?

When the Atari personal computers were designed, the creators of the operating system (OS) wanted a way for programs to access the various devices present in a computer system, such as disk drives, cassette drives, printers, and so on. In addition, they wanted the devices to be accessed in such a manner that the program using the device didn’t necessarily have to know what the device was in order to use it.

For example, in the Atari DOS Disk Utility Program (DUP), the user may copy a file from a disk file to a printer or the screen, merely by specifying the device names P: or S:, respectively. The DUP program doesn’t care which device is specified, because both the screen and the printer are accessed through CIO in the same manner. The only thing the DUP program needs to access these devices is the device name, P: or S:. This is known as “device independence.”

What does device independence mean to us? It means that we can call just about any device in the Atari computer system with the same set of commands and parameters. They all conform to a set of governing rules, and to access them, all we do is provide the required information.

Some devices, such as disk drives, have additional capabilities, such as random access (the ability to write or read anywhere in a data file, and not just sequentially). In order to accommodate these additional commands, CIO treats all command numbers above a certain point specially. We’ll talk more about this later, when we start working with graphics and disks.

Not only is CIO device-independent when you call it, but it provides a consistent way to report errors. This is known as “unified error handling.”

Whenever a CIO device handler encounters an error, it returns a value of 128 or greater in the 6502 Y register. If the end of a file is reached, for example, CIO returns a 136 (end-of-file error) in the Y register. A serial bus error is indicated by the number 140.

The error-handling is called “unified” because each device handler uses the same error numbers to report common errors. That is, both the end of file on a cassette file and the end of file on a disk file are reported as an error 136. In this way, your program can be written to handle common errors automatically for any device used.

As with commands, some devices can generate errors that are unique to that type of device. For example, an error 171 (POINT invalid) can only occur on a disk drive. A cassette I/O operation would never return this code, nor would a keyboard operation. CIO sets up a standard set of error codes, but it allows the flexibility to create your own error numbers if you’re writing a special device handler.

Device handlers.

Each device you want to access (screen, disk, cassette, etc.) must have a set of machine language subroutines in memory. These subroutines are known as “device handlers.” When the Atari is powered on, it installs a default set of device handlers: the screen editor (E:), screen (S:), cassette (C:), printer (P:) and keyboard (K:). These were considered the essential devices at the time the OS was developed.

Additional device handlers are loaded into memory as needed. The disk handler (D:) is loaded whenever you boot a DOS disk. The RS-232 handler (R:) is loaded from the 850 interface when requested by a special booter program. Other special handlers, such as Unicheck’s U: and the V: memory storage device, may be loaded by a user program via an AUTORUN.SYS file.

The Atari OS will allow twelve device handlers to be in memory at once. With the five automatically loaded at power-up time, that leaves seven for you to work with if you ever want to add your own handlers (more, I suspect, than you’ll ever need). If seven handlers aren’t enough (!), you can even replace the default handlers with your own! We’ll cover this advanced topic in the future—right now, let’s find out how to access these device handlers.

The IOCBs.

From a CIO user’s point of view, the most important (and most visible) feature of CIO is the Input/Output Control Block (IOCB). You’ve used the IOCBs in any program that did I/O with a file. For example, in the following BASIC statement:

OPEN #1,8,0,"P:"

the printer (P:) is opened by using IOCB number 1 (OPEN #1). There are eight IOCBs available for use by programs, numbered 0–7. IOCB number 0 is used by default for the screen editor (E:), which allows you to communicate with your computer via the keyboard and the screen. It’s a powerful device, and one which we’ll be using extensively.

All the IOCBs are independent of each other and, therefore, can all be used at the same time with different devices. This gives an incredible amount of flexibility in programming complex applications.

Whenever you open an IOCB to use a device, CIO “looks up” the specified device in the device handler table, then sets special system information in the data fields of the IOCB you specified for the I/O operation.

Each I/O operation is accomplished by placing I/O command information in the selected IOCB and then performing a JSR to the Central I/O Vector, CIOV, which is located at $E456. CIOV then takes the command information from the IOCB and attempts to perform the requested operation. Due to the device-independent nature of CIO, all I/O for all devices is done through the CIOV subroutine. This makes I/O programming on the Atari computers very easy.

┌─────────┐ $0340
│ IOCB #0 │
├─────────┤ $0350
│ IOCB #1 │
├─────────┤ $0360
│ IOCB #2 │
├─────────┤ $0370
│ IOCB #3 │
├─────────┤ $0380
│ IOCB #4 │
├─────────┤ $0390
│ IOCB #5 │
├─────────┤ $03A0
│ IOCB #6 │
├─────────┤ $03B0
│ IOCB #7 │
└─────────┘

    Figure 1.

The eight IOCBs are arranged in memory starting at $0340, as shown in Figure 1. Each IOCB is 16 bytes long. Each byte has a particular function, listed below. The address given for each byte is the address of that field in IOCB #0. We’ll see why we use the IOCB #0 addresses in a moment.

┌───────┐ $0340
│ ICHID │
├───────┤ $0341
│ ICDNO │
├───────┤ $0342
│ ICCMD │
├───────┤ $0343
│ ICSTA │
├───────┤ $0344
│ ICBAL │
├───────┤ $0345
│ ICBAH │
├───────┤ $0346
│ ICPTL │
├───────┤ $0347
│ ICPTH │
├───────┤ $0348
│ ICBLL │
├───────┤ $0349
│ ICBLH │
├───────┤ $034A
│ ICAX1 │
├───────┤ $034B
│ ICAX2 │
├───────┤ $034C
│ ICAX3 │
├───────┤ $034D
│ ICAX4 │
├───────┤ $034E
│ ICAX5 │
├───────┤ $034F
│ ICAX6 │
└───────┘

    Figure 2.

For the following field descriptions, refer to Figure 2.

ICHID ($0340) is a field used by CIO indicating the I/O handler identification. It is an index into the system device table, set when the IOCB is opened. If the IOCB is not open, this byte will contain $FF. Do not change this byte.

ICDNO ($0341) is the device’s number. When there can be more than one I/O device of a particular type, such as disk drives, the OPEN command may include a device number after the device’s letter code. For example, disk drive number 2 is indicated by D2:, and disk drive number 4 is indicated by D4:. CIO determines the device number when the IOCB is opened and places the binary value of the device number in this location. For D2: then, ICDNO would contain $02. This byte is set by the system; do not change it.

ICCMD ($0342) is a numeric value provided by the programmer which indicates the function CIO is to perform. This value must be set before each call to CIO. The standard commands are:

OPEN FILE ($03): Ready file for I/O
CLOSE FILE ($0C): Close file (I/O is finished)
GET CHARACTERS ($07): Get n bytes from device
PUT CHARACTERS ($0B): Write n characters to device
GET RECORD ($05): Read a line of text
PUT RECORD ($09): Write a line of text
GET STATUS ($0D): Determine device status
SPECIAL (>$0D): Device-specific command

These commands will be covered in detail in a few minutes.

ICSTA ($0343) is a 1-byte status code returned by CIO after each CIO call. The high-order (sign) bit is set to indicate error conditions (remember, any status codes greater than 127 indicate that an error has occurred). A status code of $01 indicates that the I/O operation was successfully completed.

ICBAL ($0344) and ICBAH ($0345) make up a 2-byte pointer which contains the address of the I/O data buffer. The data buffer is the place where: (1) data is to be read into or written from, or (2) the device:filename is stored for an OPEN. You set ICBAL to the low-order byte of the data buffer and ICBAH to the high-order byte. You may change this pointer at any time.

ICPTL ($0346) and ICPTH ($0347) make up a 2-byte pointer which is set to the address of the device handler’s PUT CHARACTER routine, minus 1, when the IOCB is opened. This pointer was used by the BASIC cartridge and is not normally used by other programs.

ICBLL ($0348) and ICBLH ($0349) make up a 2-byte binary value indicating the buffer size for read and write operations. It is not required for OPEN. After the read or write, CIO will set this value to the number of bytes actually transferred into or out of the data buffer. In this way, you can test for certain errors.

ICAX1 ($034A) and ICAX2 ($034B) are auxiliary information normally used by the OPEN command. ICAX1 controls the input and output functions. The 4 bit is set for input, and the 8 bit is set for output. That is, to open a file for input, place $04 in ICAX1 before OPENing the IOCB. For output, place $08 in ICAX1. For both input and output, place $0C (12 decimal) into ICAX1. The remaining bits in ICAX1 and all of ICAX2 are device-dependent and, usually, not required. You’re probably familiar with these bytes from the BASIC OPEN statement:

OPEN #1,8,0,"P:"

In this statement, the 8, corresponding to ICAX1, indicates that the file is to be opened for output. The 0, corresponding to ICAX2, indicates that ICAX2 is set to 0 for the printer. ICAX2 is most commonly used for setting the short inter-record gap mode on the cassette, where it has the value 128. These bytes must be set by the user before an OPEN command is issued.
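Since the direction values are really individual bits, they can be combined and tested with the same AND and OR operations we have been using all along. The Python sketch below illustrates this; the constant and function names are invented here, and only the two direction bits described above are modeled.

```python
OPEN_INPUT  = 0x04   # the "4 bit" of ICAX1
OPEN_OUTPUT = 0x08   # the "8 bit" of ICAX1

def direction(icax1):
    """Describe the direction bits of an ICAX1 value."""
    reading = bool(icax1 & OPEN_INPUT)
    writing = bool(icax1 & OPEN_OUTPUT)
    if reading and writing:
        return "input and output"
    if reading:
        return "input"
    if writing:
        return "output"
    return "neither"

print(direction(8))                          # as in OPEN #1,8,0,"P:"
print(direction(OPEN_INPUT | OPEN_OUTPUT))   # both bits set: $0C, or 12
```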

ICAX3 ($034C), ICAX4 ($034D), ICAX5 ($034E) and ICAX6 ($034F) are all auxiliary bytes not normally used by CIO. They’re primarily used by certain disk I/O operations, and we aren’t concerned about them right now. Just be aware that they’re available for use by the I/O device handler and may be set by the user to utilize special functions.

Now that we’ve defined the IOCB data fields, let’s take a look at the various CIO commands and what fields we need to set in order to use them.

Command performance.

The Atari OS has a set of eight basic functions that must be supported by all device handlers. Obviously, some functions cannot be used with certain devices (such as a READ from a printer); in such cases, the device handler would return an error 146, indicating that the requested function is not implemented in the handler.

In order to make CIO execute a desired I/O function, you must set up an IOCB with a numeric value indicating the function you want executed. This is the “command” byte. Each command must have various parameters set, so that CIO knows where the data is to be transferred to or from. Now, let’s examine each of the eight basic commands and their use.

Opening files.

Before you use any I/O device, an IOCB pointing to that device must be OPENed by CIO. This is done with (what else?) the OPEN command.

In order to issue the OPEN command, the following fields must be set up in the desired IOCB:

ICCMD         Set to $03
ICBAL/ICBAH   2-byte pointer to device/filename specification
ICAX1         Device direction and device-dependent information
ICAX2         Device-dependent information

Let’s see how this command relates to its BASIC counterpart, the OPEN statement. In BASIC, the command to open the keyboard looks like this:

OPEN #1,4,0,"K:"

This command uses IOCB number 1 to open the keyboard. The direction value, 4, indicates that the file has been opened for input only.

In assembly language, the same open command would look like this:

0100  LDX #$10       ;IOCB #1
0110  LDA #$03       ;OPEN COMMAND VALUE
0120  STA ICCMD,X    ;PUT IN IOCB #1
0130  LDA #KEYBD/256 ;HI ADDR OF "K:"
0140  STA ICBAH,X    ;PUT IN IOCB #1
0150  LDA #KEYBD&255 ;LO ADDR OF "K:"
0160  STA ICBAL,X    ;PUT IN IOCB #1
0170  LDA #$04       ;INPUT ONLY
0180  STA ICAX1,X    ;PUT IN IOCB #1
0190  LDA #$00       ;ZERO AUX BYTE 2
0200  STA ICAX2,X    ;PUT IN IOCB #1
0210  JSR CIOV       ;OPEN IT!
0220 KEYBD .BYTE "K:",$9B

Line 100 of this code sets the X register to $10. This is an offset which is used to point to IOCB number 1. This is extremely important: ALWAYS set the X register to the IOCB number you’re using times 16 before calling CIO! The X register is used by the CIO subroutine so that it knows which IOCB it’s supposed to work with. This is a simple process—if you want to use IOCB #0, set the X register to $00. If using IOCB #1, set the X register to $10. If using IOCB #7, set the X register to $70.

Now that we’ve set the X register to point to IOCB number 1, we need to set the rest of the CIO parameters used for the OPEN command. Lines 110–120 place the value $03 in the ICCMD location of IOCB number 1. This sets the command byte in IOCB number 1 (ICCMD,X) to $03, the numeric value of the OPEN function.

Using the X register as an offset on the ICCMD field stores the command in ICCMD + 16 (remember, the X register contains $10, or 16 decimal). Since ICCMD is defined at its location in IOCB number 0 ($0342), adding $10 to its address points to the ICCMD byte in IOCB number 1, at location $0352 (see Figure 1). By using the X register in this way, we can store commands in any IOCB by using the same X register setting that’s used to tell CIO which IOCB we’re using! It’s a convenient feature of the CIO system.
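The address arithmetic is simple enough to check by hand, but a short Python sketch makes the pattern plain. The field order and the $0340 starting address come from Figures 1 and 2; the function names x_offset and field_address are invented for the illustration.

```python
IOCB0 = 0x0340       # address of IOCB #0
IOCB_SIZE = 16       # each IOCB is 16 bytes long

FIELDS = ["ICHID", "ICDNO", "ICCMD", "ICSTA", "ICBAL", "ICBAH",
          "ICPTL", "ICPTH", "ICBLL", "ICBLH", "ICAX1", "ICAX2",
          "ICAX3", "ICAX4", "ICAX5", "ICAX6"]

def x_offset(iocb):
    """Value to load into the X register for a given IOCB number."""
    if not 0 <= iocb <= 7:
        raise ValueError("IOCB number must be 0-7")
    return iocb * IOCB_SIZE

def field_address(field, iocb):
    """Address reached by 'STA field,X' with X set for that IOCB."""
    return IOCB0 + FIELDS.index(field) + x_offset(iocb)

print(hex(x_offset(1)))                  # 0x10
print(hex(field_address("ICCMD", 1)))    # 0x352
```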

Lines 130–160 in the code set the 2-byte pointer, ICBAL and ICBAH, to point to the address of our keyboard device string, which is stored at the label KEYBD. This is a simple text string, terminated with the byte $9B, which is the ATASCII end-of-line (EOL) character. CIO will use this string to find out which device it is to open.

Lines 170–180 set the ICAX1 byte in IOCB number 1 to $04, indicating that the keyboard is to be opened for input operations. This is the same number four that was used in the BASIC OPEN statement above.

Lines 190–200 set the ICAX2 byte in IOCB number 1 to $00. Once again, this corresponds to the 0 in the BASIC OPEN statement.

Line 210 finishes the OPEN process by calling the main CIO entry point, CIOV, located at $E456. It is imperative that you be sure the X register contains the value of the IOCB number times 16 before executing the JSR CIOV statement.

When you perform a JSR to the CIOV routine, the system sets up the requested IOCB to be used for I/O to the specified device. After the CIO operation is finished, control resumes at the statement after the JSR.

At this point, both the Y register and the ICSTA byte will contain information on the status of the CIO call. Remember, a status value of $01 indicates that the I/O operation was successful; a value greater than or equal to $80 indicates an error has occurred. You can use the 6502 CPY instruction right after the CIO call to test for specific error codes.
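The status test boils down to checking the high-order bit. The Python sketch below shows one way a program might sort out the result. The table holds only the status codes mentioned in this article, not the complete CIO list, and the names is_error and describe are invented for the illustration.

```python
# Only the status codes mentioned in this article.
KNOWN_STATUS = {
    1:   "operation successful",
    136: "end of file",
    140: "serial bus error",
    146: "function not implemented in handler",
    171: "POINT invalid",
}

def is_error(status):
    """Errors have the high-order (sign) bit set: $80 or greater."""
    return status >= 0x80

def describe(status):
    """Return a short description of a CIO status byte."""
    if status in KNOWN_STATUS:
        return KNOWN_STATUS[status]
    return "unlisted error" if is_error(status) else "unlisted status"

for status in (1, 136, 146):
    print(status, is_error(status), describe(status))
```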

Now that the file is opened, you may perform other I/O operations on it, as listed below.

Close file.

The CLOSE command (numeric code $0C) is used when you’re through using the I/O device. This is an especially important command when you’re writing a file to disk or cassette, since it ensures that all the data sent to the file is actually written to the device. Remember: always CLOSE your files.

The only IOCB parameter you need to set up for the CLOSE command is the command byte, ICCMD. Simply set it to $0C, load the X register with the IOCB number times 16, and JSR CIOV. The file will be closed, any data in the output buffer (if the file is opened for output) will be written, and the IOCB will be released for other uses. The following code shows how to close the file we opened in the previous OPEN example:

LDX #$10       ;POINT TO IOCB #1
LDA #$0C       ;CLOSE COMMAND VALUE
STA ICCMD,X    ;PUT IN IOCB #1
JSR CIOV       ;CLOSE IT!

Closing files is an easy operation, but its importance cannot be overestimated. Always double-check your code to be sure your files are closed properly. In addition, remember to check the status of every file CLOSE operation. Errors are possible on CLOSE commands, so be sure you test for them by checking either the Y register or the ICSTA location.

Getting characters.

Once your file is opened for input, you can read data from it in two ways. You can get characters regardless of what they are, or you can read lines of text which are terminated with the ATASCII end-of-line (EOL) character.

The first method we’ll examine is GETting characters. In this mode, CIO will read the specified number of characters into the data buffer you define, no matter what the characters are. The IOCB fields used with the GET CHARACTERS command are:

ICCMD         Set to $07
ICBAL/ICBAH   Pointer to input data buffer
ICBLL/ICBLH   2-byte value indicating the data buffer length

On a GET CHARACTERS command, the ICCMD location is set to $07, the proper numeric code for the GET operation.

The 2-byte pointer made up of ICBAL and ICBAH should be set to point to the beginning address of your data buffer. The data buffer is where CIO will place the characters read from the device. It should be set up so that it’s long enough to hold all the characters you tell CIO to read. Set ICBAL to the low-order byte of the buffer’s address and ICBAH to the high-order byte.

The 2-byte value made up of ICBLL and ICBLH tells CIO how many bytes you want to GET from the device. The low-order byte of the count should be placed in ICBLL, and the high-order byte should be placed in ICBLH. Be sure the byte count is not larger than the size of the buffer pointed to by ICBAL and ICBAH. If you GET more characters than your buffer can hold, CIO will clobber whatever data or program instructions follow the buffer.

The following example will GET 548 characters from the file indicated by IOCB number 6, placing them in a buffer called MYBUF. MYBUF is set up to be 1000 characters long, which is a safe size for our read of 548 characters. (IOCB number 6 is assumed to be open already.)

0100  LDX #$60       ;IOCB #6
0110  LDA #$07       ;GET CHARACTERS
0120  STA ICCMD,X    ;PUT IN IOCB #6
0130  LDA #MYBUF/256 ;HI ADDR OF BUFFER
0140  STA ICBAH,X    ;PUT IN IOCB #6
0150  LDA #MYBUF&255 ;LO ADDR OF BUFFER
0160  STA ICBAL,X    ;PUT IN IOCB #6
0170  LDA #548/256   ;HI BYTE COUNT
0180  STA ICBLH,X    ;PUT IN IOCB #6
0190  LDA #548&255   ;LO BYTE COUNT
0200  STA ICBLL,X    ;PUT IN IOCB #6
0210  JSR CIOV       ;GET THE BYTES!
0220 MYBUF *= *+1000 ;DEFINE BUFFER

Line 100 of this code points to IOCB number 6, as explained earlier.

Lines 110–120 set the ICCMD location of IOCB number 6 to the numeric value of the GET CHARACTERS command, $07.

Lines 130–160 set ICBAL and ICBAH to point to the input buffer, MYBUF. When we call CIO, it will GET the characters from the device and place them in memory starting at the first byte of MYBUF.

Lines 170–200 set up ICBLL and ICBLH in order to tell CIO to get 548 characters from the device indicated by IOCB number 6.

Line 210 calls the CIO subroutine, which will attempt to read 548 bytes into MYBUF.

After CIO GETs the characters from the device, the Y register and ICSTA location will contain the status of the GET operation. It is important to check the status of a GET, because the end of the file may have been reached.

Also, after the GET is complete, the ICBLL and ICBLH will have been changed by CIO to tell you how many bytes were actually read into the buffer. If all 548 bytes were read, ICBLL and ICBLH will contain the value 548. If, on the other hand, the end of the data file was reached or another error occurred, ICBLL and ICBLH will indicate how many bytes were actually read into the buffer. In this way, you can properly handle end-of-file or other error conditions.
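
As a rough sketch of how such a check might look after the call to CIO, the code below saves the byte count that CIO reports and then sorts out end-of-file from other errors. COUNT, ATEOF and GETERR are hypothetical names introduced only for this illustration (COUNT is assumed to be a 2-byte variable), and the line numbers are arbitrary.

0300  JSR CIOV       ;GET THE BYTES
0310  LDX #$60       ;IOCB #6 AGAIN
0320  LDA ICBLL,X    ;LO BYTE ACTUALLY READ
0330  STA COUNT      ;SAVE IT
0340  LDA ICBLH,X    ;HI BYTE ACTUALLY READ
0350  STA COUNT+1    ;SAVE IT
0360  CPY #136       ;END-OF-FILE?
0370  BEQ ATEOF      ;YES, COUNT IS PARTIAL
0380  TYA            ;NO, TEST STATUS SIGN
0390  BMI GETERR     ;$80 OR MORE: ERROR

Because the LDX and LDA instructions change the sign flag, the status in the Y register is tested explicitly with CPY and TYA instead of relying on the flags left behind by CIO.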

A special case of the GET CHARACTER command exists if you only want to get one character from the input device. This special option is indicated if you set the buffer length value to 0. If this is done, CIO will get one character from the input device as it becomes available, then put it in the accumulator.

Obviously, this variation on the GET CHARACTERS command does not require that the buffer address (ICBAL and ICBAH) be set before calling CIO. All you need to set is the command byte ($07) and the buffer length (0).
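
A minimal sketch of this 1-byte GET, assuming IOCB number 6 is already open for input. GETERR and ONEBYT are hypothetical labels used only for illustration.

0100  LDX #$60       ;IOCB #6
0110  LDA #$07       ;GET CHARACTERS
0120  STA ICCMD,X    ;PUT IN IOCB #6
0130  LDA #$00       ;BUFFER LENGTH OF 0...
0140  STA ICBLL,X    ;ASKS FOR ONE BYTE...
0150  STA ICBLH,X    ;IN THE ACCUMULATOR
0160  JSR CIOV       ;A=BYTE, Y=STATUS
0170  BMI GETERR     ;STATUS $80 OR MORE
0180  STA ONEBYT     ;SAVE THE BYTE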

Putting characters.

The output equivalent of the GET CHARACTERS command is the PUT CHARACTERS command. As you might suspect, this function will write the specified number of bytes to the output device. It works just like the GET CHARACTERS command, but in reverse. You need to set the following IOCB variables to use this command:

ICCMD         Set to $0B
ICBAL, ICBAH  Pointer to start of data buffer
ICBLL, ICBLH  Number of bytes to write to device

Like GET CHARACTERS, PUT CHARACTERS has a special 1-byte PUT option. If you set ICBLL and ICBLH to 0, CIO will write the byte in the accumulator to the specified device. Once again, for this special case, ICBAL and ICBAH are not used, and you don’t need to set them up.
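
The 1-byte PUT might be sketched as follows, assuming IOCB number 3 is already open for output. PUTERR is a hypothetical label. Note that the byte to be written is loaded into the accumulator last, since the accumulator is also needed to set up the IOCB.

0100  LDX #$30       ;IOCB #3
0110  LDA #$0B       ;PUT CHARACTERS
0120  STA ICCMD,X    ;PUT IN IOCB #3
0130  LDA #$00       ;BUFFER LENGTH OF 0...
0140  STA ICBLL,X    ;MEANS WRITE THE...
0150  STA ICBLH,X    ;BYTE IN A
0160  LDA #$41       ;ATASCII "A"
0170  JSR CIOV       ;WRITE IT!
0180  BMI PUTERR     ;STATUS $80 OR MORE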

More to come.

Next issue, we continue our look at CIO and will complete the summary of CIO commands. We’ll also begin writing programs which use CIO to accomplish input and output operations. Until then, it’s a good idea to try and pick up the two reference manuals mentioned earlier in this article and read the sections on CIO.

A.N.A.L.O.G. ISSUE 34 / AUGUST 1985 / PAGE 33

Boot Camp

by Tom Hudson

Last issue’s Boot Camp introduced you to the exciting world of the Atari computers’ Central Input-Output system, CIO. This issue, we’ll conclude our examination of the CIO functions and write a short program demonstrating the various CIO routines.

If you haven’t read last issue’s Boot Camp, I suggest you do so before tackling this installment. There is a lot of background information you’ll need to understand before trying to use these functions.

Reading records.

We’ve already seen how the Atari CIO routine can read and write individual characters. What happens when we want to work with whole sentences? Fortunately, CIO has the ability to work with data in sentence form as well, by using the GET and PUT RECORD commands.

The Atari CIO routines consider a “record” to be any group of characters which ends with the ATASCII end-of-line (EOL) character, $9B (155 decimal). When typing, the EOL character is generated each time you press the RETURN key on your keyboard. Files such as LISTed BASIC and assembly language programs are lines of text terminated with EOLs. To read them, you would use the CIO GET RECORD command; to write them, you would use the CIO PUT RECORD command.

To use the GET RECORD command, you need to set the following IOCB parameters:

ICCMD         Set to $05
ICBAL, ICBAH  Pointer to start of input data buffer
ICBLL, ICBLH  Maximum number of bytes to read, including EOL

This command is rather straightforward. When you call CIO, it will read characters from the specified file until an EOL character is read. The 2-byte pointer made up of ICBAL and ICBAH tells CIO the address where you want it to place the record read from the file. You must be sure that the 2-byte value ICBLL and ICBLH is set to the size of the buffer you’ve reserved, otherwise CIO may read in extra characters and wipe out important data that resides in the memory after the buffer.

You must also be sure that your input buffer is large enough to hold the biggest record that will be read from the file. If CIO reads a record too long to fit in the buffer, it will fill the buffer with as many characters as it can, then continue reading until an EOL character is read or the end of the file is reached.

All the characters read after the buffer is full will be discarded, and your program will never see them. After the EOL is read from the file, CIO places an EOL at the end of the buffer and returns an ERROR #137, to indicate that a “truncated record” was read. It’s not a good practice to have records truncated in your programs, so just be sure you’ve made your buffer large enough to hold the largest record that’s possible in the file.

Typical text files created by the Atari screen editor will have records of less than 128 characters. I usually allocate a 256-byte buffer for text I/O operations, to provide a wide safety margin.

After issuing a GET RECORD command, CIO will set the ICSTA (status) byte and the Y register to indicate the result of the input operation. It will also change the ICBLL and ICBLH bytes to inform you of the number of bytes that were actually read from the file. With a 256-byte buffer, the number of bytes read could range from 1 to 256. This number can come in handy in some types of I/O processing.

The example below shows a typical GET RECORD operation from a file. The IOCB used is number four, and is assumed to be open already.

0100    LDX #$40        ;IOCB #4
0110    LDA #$05        ;GET RECORD
0120    STA ICCMD,X     ;PUT IN IOCB 4
0130    LDA #BUFFER/256 ;PUT BUFFER...
0140    STA ICBAH,X     ;ADDRESS...
0150    LDA #BUFFER&255 ;IN IOCB...
0160    STA ICBAL,X     ;NUMBER 4
0170    LDA #256/256    ;PUT NUMBER...
0180    STA ICBLH,X     ;OF BYTES...
0190    LDA #256&255    ;IN IOCB...
0200    STA ICBLL,X     ;NUMBER 4
0210    JSR CIOV        ;GET RECORD!
0220    BPL RECOK       ;NO ERROR!
0230    CPY #137        ;TRUNCATED?
0240    BEQ TRUNREC     ;YES!
0250    CPY #136        ;END-OF-FILE?
0260    BEQ EOFILE      ;YES!
0270 BUFFER *=*+256     ;RESERVE BUFFER

Line 100 of this code tells CIO that it is to use IOCB number four for this I/O operation. Remember, the IOCB number is passed to CIO through the X register, and is the value of the IOCB number times 16. Fortunately for us humans, who don’t enjoy working in base 16, this number works out nicely: IOCB number 0 is indicated by $00, IOCB number 1 by $10, IOCB number 2 by $20, and so on.

Lines 110–120 set the CIO command byte (ICCMD) in IOCB number four (indexed by the X register) to $05, the command byte for the GET RECORD command.

Lines 130–160 set the ICBAL and ICBAH bytes to point to the input buffer we’ve set up at Line 270. All characters read from the file will be placed in memory starting at this address.

Lines 170–200 set ICBLL and ICBLH to the length of our data buffer. It’s been set up to hold 256 bytes, so we place the value 256 into the ICBLL and ICBLH bytes. It’s important that you always set these bytes to the buffer length—if this value is smaller than the buffer size, you may get a truncated record error; if it’s larger than the buffer, the GET RECORD operation may wipe out important data that follows the buffer.

Line 210 jumps to the CIO subroutine vector, CIOV, to perform the GET RECORD function. After CIO has completed the I/O operation, it will return to this point in the code.

Line 220 branches to the label RECOK, if the GET RECORD operation was completed successfully. Remember, if CIO returns with a negative value in the Y register, some sort of error has occurred, and your program must take the appropriate action. The BPL RECOK instruction is only executed if the Y register is POSITIVE (after a JSR CIOV call, the sign flag contains the status of the Y register), indicating a successful operation.

Lines 230–240 compare the Y register to the value 137, the error code for a truncated record. As indicated above, the truncated record error lets you know that the I/O buffer was too small for the record you tried to read. In the case of this example, a truncated record error would occur if the code tried to get a record longer than 256 bytes. If the error is equal to 137, the program branches to the label TRUNREC, where appropriate code would handle the error and report it to the user.

Lines 250–260 check to see if the error that occurred was a 136, or an end-of-file condition. If so, the program branches to the label EOFILE, where the file would be closed, and processing would continue.

Testing for other errors could also be added, if necessary. I’ve only shown a couple of the possible tests, so you can see how they would be handled.
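
The times-16 rule for the X register, described for Line 100, can also be applied at run time when the IOCB number isn't known until the program executes. In this sketch, IOCBNO is a hypothetical variable assumed to hold an IOCB number from 0 through 7.

0100    LDA IOCBNO      ;IOCB NUMBER, 0-7
0110    ASL A           ;TIMES 2
0120    ASL A           ;TIMES 4
0130    ASL A           ;TIMES 8
0140    ASL A           ;TIMES 16
0150    TAX             ;X = $00 TO $70

Four left shifts multiply by 16, so IOCB number 4 becomes $40, the same value loaded directly in Line 100 of the example.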

Putting records.

As with the GET CHARACTERS command discussed last issue, the GET RECORD command has an output counterpart, known as the PUT RECORD command. It works just like the GET RECORD command, but, instead of reading records from an I/O device, it writes them. Let’s see how it works.

The following fields must be set up in the IOCB before executing the PUT RECORD command:

ICCMD         Set to $09
ICBAL, ICBAH  Two-byte pointer to output data buffer
ICBLL, ICBLH  Two-byte value indicating maximum data length

Like the GET RECORD function, the PUT RECORD function operates on data that is terminated with the EOL character (ATASCII 155 decimal, $9B hex). The file that you’re writing to must be opened for either output or input/output operations.

The PUT RECORD function is also similar to the GET RECORD function in that the ICBAL and ICBAH values must be set to the starting address of the data buffer.

The ICBLL and ICBLH values are similar to those used by the GET RECORD command, in that they tell CIO the maximum number of bytes contained in the buffer. When the PUT RECORD command is issued, CIO will write bytes to the output device until it sends an EOL character or reaches the end of the data buffer.

If the end of the buffer is reached before an EOL character is encountered, CIO will automatically send the EOL character to the file for you. It’s better programming practice, however, to make sure your output buffer contains an EOL character. It is always better to rely on your own programming than to assume that the system will do the operation for you.

The following example shows how to write a text message to a file. The program assumes that IOCB number three is already opened as output to some I/O device. It could be a printer, disk drive, screen or any other I/O device, but it doesn’t matter as far as our program is concerned—CIO’s “device independence” lets us perform most I/O operations without regard to what type of device we’re using.

0100    LDX #$30        ;IOCB #3
0110    LDA #$09        ;PUT RECORD
0120    STA ICCMD,X     ;PUT IN IOCB 3
0130    LDA #RECORD/256 ;PUT BUFFER...
0140    STA ICBAH,X     ;ADDRESS...
0150    LDA #RECORD&255 ;IN IOCB...
0160    STA ICBAL,X     ;NUMBER 3
0170    LDA #92/256     ;PUT NUMBER...
0180    STA ICBLH,X     ;OF BYTES...
0190    LDA #92&255     ;IN IOCB...
0200    STA ICBLL,X     ;NUMBER 3
0210    JSR CIOV        ;PUT RECORD!
0220 RECORD .BYTE "THIS IS A TEST",$9B

Line 100 of this example sets the X register to $30, indicating that we’re using IOCB number three for our output operation.

Lines 110–120 set the CIO command byte to $09, telling CIO that the command we’re issuing is a PUT RECORD command.

Lines 130–160 point the ICBAL and ICBAH pointer to our I/O record, which, in this case, is a simple string in Line 220 stating, THIS IS A TEST. Note that the text is terminated with the ATASCII EOL character, $9B. More on this in a moment.

Lines 170–200 set the CIO buffer length to 92. This is obviously longer than our short message, but that’s all right—CIO will stop as soon as it reaches the EOL character at the end of the message. What if we had not included the EOL? CIO would simply keep sending characters until all 92 were sent or an EOL had been encountered in the memory following the message. The moral: Always specify an EOL character at the end of each text string!

Line 210 calls CIO to perform the PUT RECORD operation.

Upon return from the CIOV subroutine, the Y register and ICSTA location will contain the result of the output operation. A successful operation will result in a status value of $01, while an error will give a result of $80 or greater.

Finding the status.

There are times when you’ll want to get the current status of an I/O device without actually performing a function that will transfer data. CIO provides a special STATUS function to let you do this quickly and easily.

In order to execute the STATUS function, you must set the following IOCB parameters:

ICCMD         Set to $0D
ICBAL, ICBAH  Pointer to device/filename specification if the file isn’t already open

As with the other commands used by CIO, the command byte tells the CIO subroutine what function it’s to perform. For the STATUS command, the command byte is $0D. The second setting for the STATUS command, ICBAL and ICBAH, is optional. If you want to check the status of the device and the IOCB is already opened to that device, you don’t need to set this parameter.

If the IOCB isn’t already opened, you must set the ICBAL and ICBAH bytes to point to a string which indicates the device (and the filename, if applicable) you want to check. Let’s look at two examples:

0100    LDX #$10       ;IOCB #1
0110    LDA #$0D       ;STATUS...
0120    STA ICCMD,X    ;IN IOCB #1
0130    JSR CIOV       ;GET STATUS!

The above code shows how to check the status of a device if the IOCB is already opened to it. All you need to do is specify the IOCB number in the X register (Line 100), set the command number to $0D (Lines 110–120), and call the CIO subroutine (Line 130). CIO will return the device status in the ICSTA location and the Y register, as well as 4 bytes from the device controller in the DVSTAT locations, from $2EA to $2ED in memory.

Location $2EA in DVSTAT contains device error status and command status information as follows:

Bit 0  Invalid command frame received.
Bit 1  Invalid data frame received.
Bit 2  Output operation error.
Bit 3  Write-protected disk.
Bit 4  System inactive (on standby).
Bit 7  Intelligent peripheral controller.

Location $2EB in DVSTAT holds device status information. For the disk drive, it contains information from the drive’s controller chip status register.

Location $2EC in DVSTAT contains the maximum device timeout value in seconds.

Location $2ED in DVSTAT contains the number of bytes in the output buffer.
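
As an illustration of how these bytes might be examined, the sketch below tests bit 3 of the first DVSTAT byte (write-protected disk) after a STATUS call. It assumes the device handler actually fills in DVSTAT as described above. STAERR and WRPROT are hypothetical labels, and the DVSTAT equate is added here for the example.

0090 DVSTAT = $02EA
0130    JSR CIOV       ;GET STATUS!
0140    BMI STAERR     ;STATUS CALL FAILED
0150    LDA DVSTAT     ;ERROR/COMMAND BYTE
0160    AND #$08       ;ISOLATE BIT 3
0170    BNE WRPROT     ;SET: WRITE-PROTECTED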

When the device whose status you wish to check isn’t open, you must use the “implied open” option of the STATUS command. Here’s an example:

0100    LDX #$40        ;IOCB #4
0110    LDA #$0D        ;STATUS...
0120    STA ICCMD,X     ;IN IOCB #4
0130    LDA #DEVICE/256 ;PUT DEVICE.
0140    STA ICBAH,X     ;ADDRESS...
0150    LDA #DEVICE&255 ;IN...
0160    STA ICBAL,X     ;IOCB #4
0170    JSR CIOV        ;GET STATUS!
0180 DEVICE .BYTE "R:",$9B

Line 100 of this code sets the X register to $40, indicating that the command is to use IOCB number four.

Lines 110–120 set the CIO command byte to $0D (STATUS).

Lines 130–160 set the ICBAL and ICBAH pointer to point to the string located at the label DEVICE. This string indicates the device we want to check the status of, and must be terminated by an EOL character. In this particular example, we’re checking the status of R:, the number one RS-232 port of the 850 interface module. You could have specified any device or device:filename specification.

Line 170 executes the usual CIO subroutine, CIOV. CIO will perform an implied OPEN operation, check the status of the device, CLOSE the device, and return to the calling program. Upon return, you can examine the ICSTA location, Y register, and the contents of the four DVSTAT bytes.

Line 180 contains the device which is to be examined for status information. Always be sure you terminate device name strings with the EOL character ($9B).

Special functions.

One of the nice things about CIO is that you can write your own custom device handlers with functions that are unique to that device. For example, the disk file system used by the Atari computers has several functions that other devices, such as cassette drives, cannot use. In these cases, the device drivers can be written to use command numbers greater than $0D. These commands are known as “special” commands.

Special command values are specified by the device they’re to be used with, and we’ll cover them as required in future installments. Right now, let’s write a program that uses some CIO calls!

A keyboard test.

Listing 1 shows a simple program which uses CIO to get characters from the computer keyboard. If the key is one of the numbers 0 through 9, the program selects the corresponding color and changes the color of the screen to that value. Any other key will have no effect. If any errors are encountered during the program’s execution, it will change the screen’s border color to red. The program must be stopped with SYSTEM RESET or, if you’re using the Atari Assembler Editor cartridge or ANALOG Computing’s H:BUG, with the BREAK key.

Let’s walk through the program in Listing 1 and see what it does.

Lines 140–200 define equates for the CIO commands. Instead of performing an LDA #$07, you can specify LDA #GETCHR. You’ll probably agree that this makes your assembly language source code much easier to follow, since the GETCHR label instantly tips you off that the operation is getting ready for a GET CHARACTER CIO call. Remember that, when you use these equates for CIO commands, you must place the # symbol in front of the equate label, indicating that it’s an immediate instruction. LDA #GETCHR loads the accumulator with the value $07; LDA GETCHR loads the accumulator with whatever is stored in memory location $0007.

Lines 240–250 set up equates for the screen color registers 2 and 4. COLOR2 is the register which controls the screen background color in graphics mode 0, and COLOR4 controls the screen’s border color.

Lines 290–360 set up equates for the CIO IOCB fields. Again, using descriptive labels like ICCMD (COMMAND) makes your program easier to follow when reading the source code. Remember that there are actually eight IOCBs, and these equates merely point to the first IOCB. We will use the X register to “index” into the IOCB we want to use.

Line 370 sets up the CIO subroutine equate, CIOV. After setting up all the appropriate CIO information in the IOCB, perform a JSR CIOV to execute the I/O operation.

Line 410 sets the starting address for our keyboard input test program. The program will load at $3000 (you can change this value to any address in free RAM that you like—I arbitrarily picked $3000).

Line 420 sets the X register to $10, indicating that we will be using IOCB number one. This is a “safe” IOCB to use, since the only IOCB opened by the system is IOCB number zero. More on this later.

Lines 430–440 set the command byte in IOCB number one to the value for the OPEN command, $03. The X register is used to index 16 ($10) bytes past IOCB number zero, which is IOCB number one.

Lines 450–480 point the ICBAL and ICBAH to the filename we want to open. In this case, we’re opening the keyboard, which is specified by the string K: in Line 930. Note that the string is terminated with the ATASCII EOL character, $9B.

Lines 490–500 place the number 4 in the ICAX1 byte of IOCB number one. This tells CIO that the keyboard is to be opened as input.

Lines 510–520 set the ICAX2 byte of IOCB number one to 0. The keyboard does not use ICAX2, but it’s a good idea to zero this byte, anyway. Other functions use this byte, and getting into the habit of setting all the IOCB parameters affecting a function to a known value is a good idea.

Line 530 performs a JSR to the CIOV subroutine, opening the keyboard for us.

Line 540 branches to the ERROR routine at Line 870 if the 6502 SIGN flag is set. This happens if CIO encounters an error condition when it tries to open the keyboard. The Y register will contain the error number. If the open was successful, the Y register and the status byte of IOCB number one will contain $01 (operation successful), the SIGN flag will be cleared, and the program will continue operating at Line 610.

Line 610, labeled GETKEY, is where we’ll try to GET a character from the keyboard. This line sets the X register to $10, once again indicating that we’re going to use IOCB number one for a CIO operation.

Lines 620–630 place the GET CHARACTER command in IOCB number one’s command byte. This will instruct CIO that it is to GET a character from the keyboard.

Now we must tell CIO how many characters we want to get from the keyboard. Lines 640–660 do this by placing 0s in both the BUFFER LENGTH LOW (ICBLL) and BUFFER LENGTH HIGH (ICBLH) bytes. “A buffer length of zero?” you ask. Remember the special case of the GET CHARACTERS and PUT CHARACTERS commands mentioned last issue? If you set the buffer length to 0, CIO will get 1 byte and place it in the 6502 accumulator! This is a handy option when you only want to GET one byte from a device. No buffer address is needed for this option.

Line 670 performs a JSR to CIOV to GET the character from the keyboard. When we JSR to CIOV, it will wait until a key is pressed, get the key’s ATASCII code, and return it to the calling program in the accumulator.

If any error occurred during the GET CHARACTER operation, Line 680 will detect it and branch to the ERROR location, where the screen’s border color will be changed to red, to let you know an error occurred. If the GET CHARACTER function operated properly, the character typed on the keyboard is in the accumulator, ready to be used, and execution continues at Line 730.

Lines 730–740 subtract 48 from the ATASCII value of the key, which is in the accumulator. If the key was the zero key, whose ATASCII value is 48, the accumulator will contain $00 after the subtraction. If the key was the one key, the ATASCII value of the number 1 (49) will be reduced to $01. The same applies to the other keys from two through nine. After this operation, if a key from zero through nine has been pressed, the accumulator will contain a value from $00 through $09. Other keys on the keyboard will have their ATASCII values adjusted in the same manner, but the final value in the accumulator will be something other than $00–$09.

Line 750 compares the accumulator to the number 10, to see if the key pressed was from zero to nine.

If the key value was greater than nine, Line 760 branches to the GETKEY label to try getting another key from the keyboard.

If the key pressed was in the range zero through nine, Lines 770–820 shift the numeric value of the key left 4 bits (a multiply-by-16 operation), OR this result with $04, and place it in the COLOR2 color register. This sequence of instructions has the same effect as the BASIC instruction:

POKE 710,N*16+4

where N is the number of the key pressed, from zero through nine. The color registers of the Atari computers contain a color value from 0 through 15 in the upper 4 bits, and a luminance value from 0 through 15 in the lower 4 bits. Shifting the key number (zero-nine) left 4 bits places the key number in the color bits, and ORing the byte with $04 sets the luminance to 4.

Line 830 loops back to the GETKEY label to get another key. This process continues until you stop the program or an error occurs.

Lines 870–880, labeled ERROR, are executed when a CIO error occurs. These lines change color register 4 (the Atari’s screen border color) to $32 ($30 for red + $02 for luminance), making the border a dark red.

Line 890, labeled FOREVER, is an infinite loop which merely JMPs to itself until you stop the program. When an error occurs, the screen border will change to red, and the program will loop here forever.

What kind of errors?

What kind of errors could occur with this program? Type it in, and let’s find out.

After typing and assembling the program into memory, execute it. Press each of the keys from zero through nine and note that the screen color changes each time a different key is pressed. Press the alphabetic keys on the keyboard and note that nothing happens. This is because Lines 730–760 reject any keys other than zero through nine.

No errors yet? Good! This indicates that you’ve probably typed the program in properly. Now, if you’re not using a debug program, press the BREAK key. The screen border should turn red, indicating an error has occurred.

What happened? Simple—pressing the BREAK key during any type of input (to the screen editor or keyboard) generates an ERROR 128, or BREAK KEY ABORT. This is a way for your program to detect when the user presses the BREAK key during program operation.

Now let’s see what other errors can occur. Change Line 420 in the program’s source to read:

0420    LDX #$00   ;POINT TO IOCB #0

and re-assemble the program. Using your debugging utility, execute the code as before. The screen border should instantly change to red, indicating an error. Using your debugging utility, intercept the program and examine the Y register. It should contain $81, or 129 decimal.

This is the IOCB error number, which indicates that you tried an operation on an IOCB that was already open. By changing Line 420 as above, you told CIO to OPEN IOCB zero for the keyboard. Unfortunately, IOCB zero is used by the operating system for the screen editor device, E:, and we can’t use it. All the other IOCBs, from one to seven, are usually available for our use.

More fun coming up.

Next issue’s Boot Camp will concentrate on more interesting applications of CIO, with several useful programming examples. Until then, please review this and last month’s Boot Camp articles, so that you’ll be familiar with the CIO terminology. The best is yet to come!

Send letters to:

Boot Camp
c/o ANALOG Computing
P.O. Box 23
Worcester, MA 01603
Listing 1.
Assembly listing.

0100     .OPT NO LIST
0110 ;
0120 ;KEYBOARD INPUT TEST PROGRAM
0130 ;
0140 OPEN = $03
0150 CLOSE = $0C
0160 GETCHR = $07
0170 PUTCHR = $0B
0180 GETREC = $05
0190 PUTREC = $09
0200 STATUS = $0D
0210 ;
0220 ;SCREEN COLOR REGISTERS
0230 ;
0240 COLOR2 = $02C6
0250 COLOR4 = $02C8
0260 ;
0270 ;CIO EQUATES
0280 ;
0290 ICCMD = $0342
0300 ICSTA = $0343
0310 ICBAL = $0344
0320 ICBAH = $0345
0330 ICBLL = $0348
0340 ICBLH = $0349
0350 ICAX1 = $034A
0360 ICAX2 = $034B
0370 CIOV = $E456
0380 ;
0390 ;NOW, HERE'S THE PROGRAM!
0400 ;
0410     *= $3000        ;START AT $3000
0420     LDX #$10        ;POINT TO IOCB #1
0430     LDA #OPEN       ;OPEN COMMAND
0440     STA ICCMD,X     ;PUT IN IOCB #1
0450     LDA #KEYBD/256  ;HI ADDR OF "K:"
0460     STA ICBAH,X     ;PUT IN IOCB #1
0470     LDA #KEYBD&255  ;LO ADDR OF "K:"
0480     STA ICBAL,X     ;PUT IN IOCB #1
0490     LDA #4          ;INPUT
0500     STA ICAX1,X     ;PUT IN IOCB #1
0510     LDA #0          ;NO AUX 2 USED
0520     STA ICAX2,X     ;PUT IN IOCB #1
0530     JSR CIOV        ;NOW OPEN IT!
0540     BMI ERROR       ;IF Y<0, BAD OPEN!
0550 ;
0560 ;NOW THAT THE KEYBOARD IS OPEN,
0570 ;WE WILL GET CHARACTERS FROM IT
0580 ;AND CHANGE THE SCREEN COLOR
0590 ;ACCORDING TO THE CHARACTER!
0600 ;
0610 GETKEY LDX #$10     ;IOCB #1
0620     LDA #GETCHR     ;GET CHAR COMMAND
0630     STA ICCMD,X     ;STORE COMMAND
0640     LDA #0          ;ZERO OUT BUFFER...
0650     STA ICBLL,X     ;LENGTH (PUTS BYTE...
0660     STA ICBLH,X     ;IN ACCUMULATOR)
0670     JSR CIOV        ;GET A BYTE!
0680     BMI ERROR       ;IF Y<0, BAD GET!
0690 ;
0700 ;NOW TURN BYTE INTO A NUMBER
0710 ;FROM 0-9 FOR SCREEN COLOR!
0720 ;
0730     SEC             ;GET READY FOR SUB.
0740     SBC #48         ;SUBTRACT 48 FROM IT
0750     CMP #10         ;>9?
0760     BCS GETKEY      ;YES, TRY AGAIN!
0770     ASL A           ;SHIFT BYTE...
0780     ASL A           ;LEFT 4 TIMES...
0790     ASL A           ;FOR THE...
0800     ASL A           ;COLOR,
0810     ORA #$04        ;ADD BRIGHTNESS
0820     STA COLOR2      ;STORE IT
0830     JMP GETKEY      ;AND LOOP BACK!
0840 ;
0850 ;CHANGE BORDER TO RED IF ERROR
0860 ;
0870 ERROR LDA #$32      ;GET RED COLOR
0880     STA COLOR4      ;CHANGE BORDER!
0890 FOREVER JMP FOREVER
0900 ;
0910 ;OTHER DATA
0920 ;
0930 KEYBD .BYTE "K:",$9B
0940 ;
0950 ;THAT'S ALL, FOLKS!
0960 ;
0970     .END

A.N.A.L.O.G. ISSUE 37 / DECEMBER 1985 / PAGE 43

Boot Camp

by Tom Hudson

In our last Boot Camp, we looked at a simple example of using CIO to examine the keyboard and return the ATASCII value of the key that was pressed. This issue, we’ll begin looking at the finer points of keyboard data entry, including error-trapping and the printing of error messages to the screen. As you can imagine, knowledge of this process is essential to most advanced machine language applications.

Record or character?

As we’ve seen in previous Boot Camp installments, the Central Input/Output (CIO) system of the 8-bit Atari computers is designed to receive text input in two different ways: characters and records. With character I/O, the system gets or puts one character at a time. We used this type of input in the last Boot Camp (issue 34), when we accepted characters from the keyboard and changed the screen color accordingly.

This issue’s Boot Camp will show how to accept data from the keyboard in records. These are strings of characters terminated with the ATASCII End-Of-Line (EOL) character, which has a value of $9B. We’ll also see how to output records to the screen. This is perhaps the most important I/O operation, since without it, the computer wouldn’t be able to communicate with the user. All the I/O in this installment of Boot Camp will be in record format, using CIO’s two record I/O operations, GET RECORD and PUT RECORD.

GET RECORD review.

In order to input records, the computer must have three pieces of information in addition to the Input/Output Control Block (IOCB) number. These are shown below.

The most important piece of information for the GET RECORD command is the command byte, which is placed in ICCMD. For GET RECORD, this byte is $05.

Since CIO is going to be reading data into memory, we’d better tell it where it’s supposed to put the data record. This address is supplied via the IOCB’s ICBAL and ICBAH variables, which hold the low and high bytes of the input data buffer’s address, respectively. It is absolutely essential to set this address before you call CIO, or CIO will read the data into whatever address is in these bytes, merrily wiping out screen memory, the system variable area, the stack, or (gasp!) your program! ’Nuff said.

People sometimes (all too often, actually) make mistakes, so the third critical parameter is supplied in order to avoid the problem of reading too many characters into memory on the GET RECORD command. Obviously, the GET RECORD command is intended to read a group of characters, terminated with an EOL, into memory, but the program has no way of knowing beforehand how many bytes will be contained in the string.

The string could be 40 characters, 0 characters (just an EOL), or all the way up to 65536 characters. If you were expecting a 40-character string, and the string you received was 65536 bytes long, it could clobber huge pieces of your program, data, screen, and so forth—thus making you look like a pretty pathetic example of a machine language programmer!

Fortunately, the third parameter of the GET RECORD command allows you to tell CIO the maximum length of a record read by the GET RECORD operation, potentially saving your reputation. To use it, simply place a 2-byte character count into the IOCB parameter bytes ICBLL and ICBLH (Buffer Length Low and Buffer Length High) before calling CIO.

Potential problems.

A wise man by the name of Murphy once stated that “if anything can go wrong, it will.” Mr. Murphy must have been a computer programmer, because this statement has been proven an untold number of times in the computer industry. Don’t get caught assuming the person entering the data won’t make mistakes. You and I both know we never make mistakes, but those “other” people out there can’t be trusted as far as you can throw them. Examples coming right up!

Example #1: Martha, the data entry person, fatigued after typing for eight hours straight, was in a hurry to finish the Jones account report on a custom program written by Fred, the careless programmer. Her fingers flying and flapping over the keys with blinding speed, her right pinkie made a one-centimeter error, striking the BREAK key instead of the RETURN key. Fred’s careless programming failed to handle the BREAK key properly, and eight hours of nonstop typing instantly made its way to ATASCII heaven.

Example #2: Freddy Fruegle, 8-year-old boy wonder, had just finished his first machine-language word processing program on his Atari 130XE, and proceeded to type like mad on his 80-page nuclear physics term paper, due the next day. At 6 a.m., his 320,000-character essay (average sentence length: 287 characters), outlining obscure physical properties of energy plasma, was complete.

He proceeded to print out the masterpiece, only to find that each sentence had been chopped off at five characters! Freddy may have been just eight years old, but his mastery of verbal obscenity was matched only by his 200+ IQ. As you can see in the above examples, untrapped errors in machine language programs using CIO can result in some heartbreaking experiences. This doesn’t have to happen, though, as we’ll soon see.

BREAKing away.

In the case of Martha, Fred, the careless programmer, obviously failed to handle the BREAK key properly.

When CIO is accepting data from the keyboard, it reads characters until the RETURN key is pressed (generating an EOL) or the BREAK key is pressed. If CIO encounters the BREAK key, a special error condition (error number 128, BREAK key abort) is generated and returned to the program in the Y-register. Whatever Fred was doing, he handled the BREAK incorrectly and blew away hours of effort. Whenever you’re handling keyboard input, it’s absolutely essential to test for the BREAK key abort error after the CIO call, and handle it properly.

If the BREAK key abort does happen, you should detect it by testing the Y-register for 128 ($80) and report the problem to the user. The input buffer will contain text entered to that point, but will not have a terminating EOL character. After telling the user that the line was lost, return to the input routine to try again.
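
Here is a minimal sketch of that test. It assumes the IOCB has already been set up for GET RECORD and that the X-register holds its offset. The label GETLIN and the return codes in the Accumulator are hypothetical, chosen only for illustration.

```asm
CIOV   = $E456        ;CIO entry point
;
; Returns A=0 for a good line, A=1 for a BREAK
; key abort, A=2 for any other error.
;
GETLIN JSR CIOV       ;read one record
       BMI GLERR      ;status 128 or more: error
       LDA #0         ;line read without error
       RTS
GLERR  CPY #$80       ;error 128, BREAK abort?
       BNE GLOTH      ;no, something else
       LDA #1         ;BREAK: warn user and retry
       RTS
GLOTH  LDA #2         ;other error, code still in Y
       RTS
```

The caller is expected to set up the IOCB again before retrying, since CIO replaces the buffer length bytes with the number of bytes it actually transferred.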

If the BREAK key isn’t used to stop a scrolling listing or to perform some other function in your program, you can “mask” the BREAK key interrupt by performing the following set of instructions at program initialization time and each time the graphics mode is changed:

POKMSK = $10
    AND #$7F

This code changes the IRQ enable control register, so that the BREAK key is completely ignored. This will prevent the BREAK key abort error from occurring. The high-order bit of the IRQEN register controls the BREAK key interrupt, and when this bit is turned off (with the AND #$7F instruction), any presses of the BREAK key are not detected by the system.
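
The fragment above shows only the equate and the masking instruction. A fuller sketch of the sequence follows, assuming the usual Atari arrangement in which POKMSK ($10) is the operating system's shadow copy of the POKEY register IRQEN ($D20E). The IRQEN equate and the label NOBRK are additions for illustration.

```asm
POKMSK = $10          ;OS shadow of IRQEN
IRQEN  = $D20E        ;POKEY IRQ enable (assumed equate)
;
NOBRK  LDA POKMSK     ;current IRQ enable bits
       AND #$7F       ;clear bit 7 (BREAK key)
       STA POKMSK     ;update the shadow copy
       STA IRQEN      ;and the hardware register
       RTS
```

Storing to both locations matters: the hardware register does the masking, while the shadow is what the system uses when it rewrites IRQEN.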

Truncated lines.

Freddy Fruegle’s despair could have been prevented if he had taken two simple steps when programming his word processor.

First, he accidentally forgot to set the input buffer length (ICBLL and ICBLH) to the maximum line length his program was to accept. Apparently, buffer length had previously been set to 5 bytes, and CIO, assuming that the input buffer was only five characters long, diligently ignored all characters after the fifth one entered!

Second, Freddy’s program ignored the errors returned by CIO each time a line longer than five characters was entered. Every time this happened, CIO returned an ERROR 137 ($89) in the ICSTA variable and the 6502 Y-register. Had Freddy been thinking properly, he would have had his program examine the Y-register upon return from CIO and print a message warning him about the truncated input. See what careless programming can do? If anything can go wrong, it will, but CIO gives you the chance to recover without undue effort.
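
As a sketch of the check Freddy left out, the routine below looks at the status byte CIO leaves in the IOCB. It assumes the GET RECORD was done on IOCB #1; the label CHKTRN and the equate TRUNC are hypothetical names.

```asm
ICSTA  = $0343        ;IOCB status byte
TRUNC  = 137          ;truncated record ($89)
;
; Call right after CIO returns. Comes back with the
; Z flag set (BEQ taken) if the record was truncated.
;
CHKTRN LDX #$10       ;offset of IOCB #1
       LDA ICSTA,X    ;status left by CIO
       CMP #TRUNC     ;was it error 137?
       RTS
```

A caller would follow JSR CHKTRN with a BEQ to whatever code prints the warning.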

This issue’s program.

The example program in this issue illustrates the principles I’ve been talking about and demonstrates how to set up and print prompts and error messages. Briefly, when executed, the program opens the keyboard for input, accepts text records from it, and exits with a BRK instruction if an End-Of-File (EOF) is detected.

An EOF is generated on the keyboard by pressing CTRL-3 (CTRL and 3 keys at the same time). After each line of text is entered, the computer prints the line back to the user. If the BREAK key is pressed, the program will alert the user. If a line longer than 40 characters is entered, the user is notified. Let’s walk through the program and see how it works.

Lines 110–200 set up the equates for the system variables we’ll be using. COLOR4 is included so that, in the event that we can’t print text, we can change the color of the screen to indicate the error. Failure to print text will usually only result from the screen editor (device E:) not being opened properly. Assuming you are using a debugger such as ANALOG Computing’s H:BUG or the Atari Assembler Editor cartridge, IOCB #0 will always be open as the screen editor, ready for your use.

Line 240 sets the start of our program at $6000, since this program is too large to fit on page 6 of memory ($0600–$06FF). If you have less than 32K of memory, you’ll have to change this line to point to a safe area of memory (the program needs about 512 bytes).

Line 280 clears the decimal mode (never, never forget this instruction if your program is going to do any math operations).

Line 290 begins the set of instructions that opens the keyboard (K:) for input. This line loads the X-register with $10, indicating that IOCB #1 is to be used for the keyboard.

Lines 300–310 set the CIO command byte to $03, the OPEN command, once again getting ready to open the keyboard.

Lines 320–350 point the IOCB buffer address to the keyboard device string, KEYBD, defined at Line 1760. This string indicates that the device we want to open IOCB #1 for is the keyboard (K:).

Lines 360–390 set the IOCB auxiliary bytes for the open. ICAX1 is set to $04, telling CIO that the keyboard is to be opened for input, and ICAX2 is set to $00 (it has no function for the keyboard device handler).

Line 400 calls CIO to execute the OPEN function.

Line 410 branches to OPNERR if the keyboard OPEN resulted in an error. We’ll look at the error handler in a few moments.

If the keyboard was opened successfully, Lines 450–560 print a prompt to the screen, as follows:

Line 450 sets the X-register to $00, indicating that we’re going to work with IOCB #0, the screen editor.

Lines 460–470 place $09 in the IOCB command byte, ICCMD, which is the command number for PUT RECORD. The record we’re going to output is the initial prompt for the program.

Lines 480–510 point the buffer address (ICBAL and ICBAH) to the text string labeled PROMPT. This is defined at Line 1680. Note that, since PROMPT is considered a text record, it must be terminated with an ATASCII EOL character, $9B.

Lines 520–540 set the text buffer length value to $FFFF, telling CIO that the longest string we want to write is 65,535 bytes long. Obviously, the PROMPT string at Line 1680 isn’t anywhere near 65,535 bytes long, but as long as you place an EOL character at the end of the string you’re printing, CIO will stop when it reaches the string’s end. Setting the length to $FFFF is simply an easy way to ensure that the whole string gets printed without actually counting the characters in it.

Line 550 performs a JSR to CIOV to actually print the string on the screen.

Line 560 branches to the PRTERR error routine if the print operation encountered an error.

The next section of the program is the main loop. It accepts a text record from the keyboard and prints it back to the user.

Lines 600–620 point to IOCB #1 and set the command byte to $05, for a GET RECORD operation. Lines 630–660 point to our text input buffer, INBUF, which is defined at Line 1770. When CIO accepts text, it will be placed in this area of memory.

Lines 670–700 tell CIO that the buffer length is 40 bytes. No matter how many keys the user types before pressing RETURN, CIO won’t try to place more than 40 bytes in the INBUF buffer. If more than forty characters are typed, CIO will place thirty-nine of the characters in the buffer, plus an EOL as the fortieth character, then return with the ICSTA byte and the Y-register indicating a truncated record error.

Line 710 performs a JSR to CIO to perform the GET RECORD operation.

Lines 760–880 work just as Lines 450–560 do, except that this time, the record being printed is the text input buffer, INBUF. We also set the text length to the maximum buffer size, 40 bytes.

After the text is printed back to the user, Line 890 loops back to GETTXT, Line 600, to get another line of text.

Lines 960–1080, labeled OPNERR, print an error message if the keyboard couldn’t be opened successfully. This operation is just like the opening prompt print operation in Lines 450–560, except that the text to be printed is labeled OEMSG (Open Error Message). After the text is printed, a BRK operation is executed to return control to the debugging program.

Lines 1120–1140 are executed anytime a text print operation fails. They change the screen border color to red, then JMP to FINISH to exit the program.

Lines 1180–1640 are a very important part of this program. They’re executed when an error is encountered during a GET RECORD operation from the keyboard, and determine which error was encountered. In our example, the three important errors are ERROR #128 (BREAK key abort), ERROR #137 (truncated record), and ERROR #136 (end-of-file). Other errors are reported as an unknown error.

Line 1180 checks the Y-register to see if it contains an ERROR #136 (EOF).

If the error is not an EOF error, Line 1190 branches to NOTEOF to test for the next error.

Lines 1200–1240 are executed if the EOF has been detected. They close the keyboard (IOCB #1) and exit the program with a BRK instruction.

Line 1250, or NOTEOF, checks the Y-register to see if it contains the ERROR #128 (BREAK key abort). If the error isn’t a BREAK key abort, Line 1260 branches to NOTBRK to continue testing. If the error was a BREAK key abort, Lines 1270–1380 print the BREAK key error message (BRKMSG), as was done with the main prompt at Lines 450–560, then loop back to GETTXT to get the next text record.

Line 1390 tests the Y-register to see if an ERROR #137 (truncated record) was encountered.

If the error was not #137, Line 1400 branches to NOTTRN (Not Truncated), to report that an unknown error was encountered.

If the record was truncated, Lines 1410–1520 print the TRNMSG text and loop back to get the next text record.

Lines 1530–1640 print the OTHER message text, to let the user know that an error occurred, but the error is not one of the three normal errors. After the message is printed, control is passed back to GETTXT.

Lines 1680–1720 are the text messages used by the program. Note that all are terminated with EOL characters ($9B).

Line 1760 is the keyboard device string, “K:”. It, too, must be terminated with an EOL.

Line 1770 is the program’s text input buffer. For now, it’s been set to 40 bytes. You can change this if you like, but be sure to change the text length settings in Lines 670–700 and 830–860 to match.
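
One way to keep those settings in step is to define the size once and use it everywhere. The sketch below is not how the listing is written; BUFLEN and SETMAX are hypothetical names, and the routine assumes the X-register already holds the IOCB offset.

```asm
ICBLL  = $0348        ;buffer length, low byte
ICBLH  = $0349        ;buffer length, high byte
BUFLEN = 40           ;the one value to change
;
SETMAX LDA #BUFLEN&255
       STA ICBLL,X    ;low byte of the limit
       LDA #BUFLEN/256
       STA ICBLH,X    ;high byte of the limit
       RTS
;
INBUF  *=*+BUFLEN     ;buffer reserved from same value
```

With this arrangement, changing BUFLEN resizes the buffer and both length settings together.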

Testing the program.

When you execute the program, you’ll be told to enter text and press CTRL-3 when you want to exit. Type HELLO and press RETURN. The computer will print the word HELLO after you press RETURN. As you’re typing, the characters do not appear on the screen. This shows one important thing—the keyboard is an input-only device, and won’t echo your characters to the screen as you type. More on this in a moment.

Try pressing the BREAK key once. The computer should scold you for pressing it. Some debugging programs may use the BREAK key themselves, but the Assembler Editor cartridge will let our program react properly as long as the BREAK key is not pressed repeatedly in quick succession.

Now enter more than forty characters and press RETURN. Once again, an error message will be printed. You can see that we are catching the errors properly, avoiding nasty problems.

When you’re done testing, press CTRL-3, and the program will return control to your debugger.

You noticed how the keyboard didn’t echo your keystrokes to the screen—try changing the “K:” in Line 1760 to “E:”. This will set up IOCB #1 as a screen editor for input only, and when you run the program, you’ll see your text as it’s entered.
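
The change amounts to a single line. A sketch of the edited device string, keeping the listing's KEYBD label so that the rest of the program needs no changes:

```asm
KEYBD  .BYTE "E:",$9B ;was "K:", now the screen editor
```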

The K: device should be used to get keystrokes when you don’t want them echoed to the screen, and the screen editor should be used at all other times.

Last words.

You can use the principles in this program to create your own text entry routines and error message subroutines. You can modify the text input buffer to accept more characters. Just be sure to change the buffer size in Line 1770. Next month, we’ll expand this idea, and get into disk I/O, so stay tuned!

Boot Camp
c/o ANALOG Computing
P.O. Box 23
Worcester, MA 01603
Listing 1.
Assembly listing.
0100        .OPT NOLIST
0110 COLOR4 = $02C8
0120 ICCMD = $0342
0130 ICSTA = $0343
0140 ICBAL = $0344
0150 ICBAH = $0345
0160 ICBLL = $0348
0170 ICBLH = $0349
0180 ICAX1 = $034A
0190 ICAX2 = $034B
0200 CIOV = $E456
0210 ;
0220 ;SET STARTING ADDRESS
0230 ;
0240        *= $6000
0250 ;
0260 ;NOW OPEN KEYBOARD FOR INPUT
0270 ;
0280        CLD             ;BINARY MODE!
0290        LDX #$10        ;IOCB #1
0300        LDA #$03        ;SET FOR...
0310        STA ICCMD,X     ;OPEN COMMAND
0320        LDA #KEYBD/256  ;POINT TO...
0330        STA ICBAH,X     ;K: TEXT...
0340        LDA #KEYBD&255  ;FOR OPEN...
0350        STA ICBAL,X     ;OPERATION
0360        LDA #$04        ;SET FILE...
0370        STA ICAX1,X     ;FOR INPUT
0380        LDA #$00        ;AND CLEAR...
0390        STA ICAX2,X     ;ICAX2!
0400        JSR CIOV        ;OPEN THE KEYBD!
0410        BMI OPNERR      ;BRANCH IF ERR!
0420 ;
0430 ;KEYBOARD'S OPEN, PRINT PROMPT!
0440 ;
0450        LDX #$00        ;IOCB #0 (SCREEN)
0460        LDA #$09        ;SET COMMAND...
0470        STA ICCMD,X     ;FOR PUT RECORD
0480        LDA #PROMPT/256 ;POINT TO...
0490        STA ICBAH,X     ;STARTING...
0500        LDA #PROMPT&255 ;PROMPT...
0510        STA ICBAL,X     ;MESSAGE
0520        LDA #$FF        ;SET FOR...
0530        STA ICBLL,X     ;MAXIMUM TEXT...
0540        STA ICBLH,X     ;LENGTH
0550        JSR CIOV        ;PRINT IT!
0560        BMI PRTERR      ;BRANCH IF ERROR
0570 ;
0580 ;NOW ACCEPT A STRING FROM KEYBD
0590 ;
0600 GETTXT LDX #$10        ;IOCB #1 (KEYBD)
0610        LDA #$05        ;SET UP...
0620        STA ICCMD,X     ;GET RECORD CMD
0630        LDA #INBUF/256  ;POINT TO...
0640        STA ICBAH,X     ;THE TEXT...
0650        LDA #INBUF&255  ;INPUT...
0660        STA ICBAL,X     ;BUFFER
0670        LDA #40         ;ALLOW MAXIMUM...
0680        STA ICBLL,X     ;OF 40 BYTES...
0690        LDA #0          ;ON THE...
0700        STA ICBLH,X     ;INPUT OPERATION
0710        JSR CIOV        ;GET TEXT!
0720        BMI GETERR      ;OOPS!
0730 ;
0740 ;NOW REPEAT IT BACK TO USER!
0750 ;
0760        LDX #$00        ;IOCB #0 (SCREEN)
0770        LDA #$09        ;SET UP FOR...
0780        STA ICCMD,X     ;PUT RECORD
0790        LDA #INBUF/256  ;POINT TO THE...
0800        STA ICBAH,X     ;TEXT THE...
0810        LDA #INBUF&255  ;USER JUST...
0820        STA ICBAL,X     ;TYPED IN
0830        LDA #40         ;WE KNOW THERE...
0840        STA ICBLL,X     ;WON'T BE MORE...
0850        LDA #0          ;THAN 40 BYTES!
0860        STA ICBLH,X
0870        JSR CIOV        ;REPEAT TEXT!
0880        BMI PRTERR      ;ERROR!
0890        JMP GETTXT      ;LOOP FOR MORE
0900 ;
0910 ;HERE ARE THE ERROR HANDLERS
0920 ;---------------------------
0930 ;
0940 ;KEYBOARD OPEN ERROR
0950 ;
0960 OPNERR LDX #$00        ;IOCB #0 (SCREEN)
0970        LDA #$09        ;SET FOR...
0980        STA ICCMD,X     ;PUT RECORD
0990        LDA #OEMSG/256  ;POINT TO...
1000        STA ICBAH,X     ;KEYBOARD OPEN...
1010        LDA #OEMSG&255  ;ERROR MESSAGE
1020        STA ICBAL,X
1030        LDA #$FF        ;SET LENGTH...
1040        STA ICBLL,X     ;TO MAXIMUM
1050        STA ICBLH,X
1060        JSR CIOV        ;PRINT MESSAGE!
1070        BMI PRTERR      ;BRANCH IF ERROR
1080        BRK             ;AND EXIT!
1090 ;
1100 ;TEXT PRINT ERROR
1110 ;
1120 PRTERR LDA #$34        ;PUT RED...
1130        STA COLOR4      ;IN BACKGND COLOR
1140        JMP FINISH      ;AND EXIT!
1150 ;
1160 ;INPUT ERROR
1170 ;
1180 GETERR CPY #136        ;ERROR #136?
1190        BNE NOTEOF      ;NO, NOT EOF.
1200 FINISH LDX #$10        ;GOT EOF...
1210        LDA #$0C        ;CLOSE THE...
1220        STA ICCMD,X     ;KEYBOARD...
1230        JSR CIOV
1240        BRK             ;AND EXIT!
1250 NOTEOF CPY #128        ;ERROR #128?
1260        BNE NOTBRK      ;NO, NOT BREAK
1270        LDX #$00        ;IOCB #0 (SCREEN)
1280        LDA #$09        ;PUT RECORD
1290        STA ICCMD,X
1300        LDA #BRKMSG/256 ;POINT TO...
1310        STA ICBAH,X     ;BREAK KEY...
1320        LDA #BRKMSG&255 ;ERROR MESSAGE
1330        STA ICBAL,X
1340        LDA #$FF        ;SET FOR...
1350        STA ICBLL,X     ;MAXIMUM...
1360        STA ICBLH,X     ;TEXT LENGTH
1370        JSR CIOV        ;PRINT IT,
1380        JMP GETTXT      ;GO GET TEXT.
1390 NOTBRK CPY #137        ;TRUNCATED?
1400        BNE NOTTRN      ;NO, NOT TRUNCATED
1410        LDX #$00        ;IOCB #0 (SCREEN)
1420        LDA #$09        ;PUT RECORD
1430        STA ICCMD,X
1440        LDA #TRNMSG/256 ;POINT TO...
1450        STA ICBAH,X     ;TRUNCATION...
1460        LDA #TRNMSG&255 ;ERROR MESSAGE
1470        STA ICBAL,X
1480        LDA #$FF        ;SET FOR...
1490        STA ICBLL,X     ;MAXIMUM...
1500        STA ICBLH,X     ;TEXT LENGTH
1510        JSR CIOV        ;PRINT IT,
1520        JMP GETTXT      ;GO GET TEXT.
1530 NOTTRN LDX #$00        ;IT'S ANOTHER...
1540        LDA #$09        ;ERROR, SO...
1550        STA ICCMD,X     ;LET'S PRINT...
1560        LDA #OTHER/256  ;A MESSAGE...
1570        STA ICBAH,X     ;INFORMING...
1580        LDA #OTHER&255  ;THE USER.
1590        STA ICBAL,X
1600        LDA #$FF
1610        STA ICBLL,X
1620        STA ICBLH,X
1630        JSR CIOV        ;PRINT MESSAGE
1640        JMP GETTXT      ;GET MORE TEXT!
1650 ;
1660 ;HERE ARE THE TEXT MESSAGES
1670 ;
1680 PROMPT .BYTE "ENTER TEXT, CTRL-3 TO EXIT",$9B
1690 OEMSG  .BYTE "*** KEYBOARD OPEN ERROR ***",$9B
1700 BRKMSG .BYTE "*** DON'T PRESS THE BREAK KEY! ***",$9B
1710 TRNMSG .BYTE "*** TEXT TOO LONG! ***",$9B
1720 OTHER  .BYTE "*** UNKNOWN ERROR!!! ***",$9B
1730 ;
1740 ;MISCELLANEOUS DATA
1750 ;
1760 KEYBD  .BYTE "K:",$9B
1770 INBUF  *=*+40
1780        .END
A.N.A.L.O.G. ISSUE 38 / JANUARY 1986 / PAGE 83

Boot Camp

by Tom Hudson

We’ve been examining the use of the Atari central I/O routines for the last few installments of Boot Camp. This time, we’ll write a file utility program which will copy any text file to the computer screen. It will also demonstrate the use of a simple subroutine which can save computer memory (and typing time).

Using subroutines.

We discussed the concept of subroutines some time ago in Boot Camp, but so far haven’t really written any. A subroutine is a set of instructions capable of being executed by other parts of a program. When a section of the main program calls the subroutine with the JSR instruction, the 6502 processor jumps to the subroutine, but remembers where it left the main program.

The subroutine code then executes. When finished, the subroutine executes an RTS instruction, and the 6502 returns to the place in the main program where it left off.
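
A tiny sketch shows the round trip. The subroutine BUMP and the counter COUNT are hypothetical; $CB is assumed to be a free zero-page byte, and the code is placed on page 6.

```asm
COUNT  = $CB          ;zero-page byte, assumed free
       *= $0600
       LDA #0
       STA COUNT      ;start the counter at zero
       JSR BUMP       ;first call
       JSR BUMP       ;second call
       BRK            ;COUNT now holds 2
;
BUMP   INC COUNT      ;the subroutine body
       RTS            ;back to the caller
```

Each JSR lands on BUMP, and each RTS resumes at the instruction after the JSR that made the call.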

Subroutines are complex structures, but, fortunately for us lazy programmers, the microprocessor does all the work—isn’t that what computers are for?

You’ve probably been using subroutines for years in your BASIC programs, utilizing the GOSUB and RETURN statements. The JSR and RTS instructions perform the same functions, but in assembly language.

For the last few issues, in our discussions of the CIO system, we’ve been using the JSR instruction to call the central I/O routine. CIO performs the requested task, and control returns to our program. So, as you can see, you’ve been using subroutines all along, and there’s nothing scary about them. They’re just another tool for the assembly language programmer to master.

In last issue’s program, we had to print a number of error messages to the user. To do this, we coded each print operation separately; it took eleven instructions each time we did a print. Those eleven instructions took 30 bytes of memory each time they were used, as well as a lot of typing. In a situation where you want to save memory—and do a lot of printing—a subroutine can save a bunch of RAM and hunt-and-peck typing!

The heart of a subroutine is its ability to perform a certain operation for many different parts of the main program. In many cases, a subroutine accepts various parameters which are used in calculations.

For example, you may have a BASIC subroutine which calculates the sum of two numbers. To be sure that the subroutine gets the values it needs, you set up a fixed set of parameters that are used as input to the subroutine. In the BASIC sum subroutine, we could set up the variables A and B as input to the subroutine, with the subroutine placing the result of the addition in the variable C. In BASIC, the code necessary to set up, call and print the result of the subroutine would look like this:

10 A=10
20 B=7
30 GOSUB 1000
40 PRINT C

In assembly language, we have several options for passing parameters to subroutines. We can place them in specific locations in memory (as is done with CIO via the Input/Output Control Blocks, or IOCBs), or we can pass them by placing the values in the 6502 A-, X- or Y-registers before performing the JSR instruction. The registers can be used if there are just a few parameters to be passed, while the fixed-memory parameter passing must be used if there are many parameters.
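
Here is a sketch of the register approach, using the same sum example. The subroutine name ADDAX and the scratch byte TEMP are hypothetical, and the result is assumed to fit in 8 bits.

```asm
       *= $0600
       CLD            ;binary mode for the ADC
       LDA #10        ;first parameter in A
       LDX #7         ;second parameter in X
       JSR ADDAX
       BRK            ;A now holds 17
;
; ADDAX: returns A+X in the Accumulator
ADDAX  STX TEMP       ;ADC cannot add X directly
       CLC
       ADC TEMP
       RTS
TEMP   *=*+1          ;scratch byte
```

The caller never touches TEMP; as far as the main program is concerned, the whole interface is the two registers going in and the Accumulator coming out.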

The subroutine we’ll use in this program is a simple print routine, which accepts the address of a string in memory as the only parameter. This value is a 16-bit address, which can be easily split into two 8-bit values. We’ll use the 6502 Accumulator and Y-register to pass the high and low portions of the address to the subroutine, since the X-register will be used by the subroutine itself, to index into the IOCB tables used by CIO.

One word of warning here: be sure that you preserve any registers which you don’t want destroyed by the subroutine. When subroutines are called, they usually alter one or more 6502 registers, including the status register, so don’t count on your register data being there after the subroutine returns. This is one of the most common errors made by the new assembly language programmer, and it can be very frustrating. Remember: if in doubt about whether or not a subroutine preserves register contents, save the registers before calling the subroutine and restore them after the JSR.
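
One common way to do the saving is with the stack. In the sketch below, SAFECL and WORKER are hypothetical names; WORKER simply stands in for any subroutine that overwrites the registers.

```asm
SAFECL PHA            ;save A
       TXA
       PHA            ;save X
       TYA
       PHA            ;save Y
       JSR WORKER     ;may destroy A, X and Y
       PLA
       TAY            ;restore Y
       PLA
       TAX            ;restore X
       PLA            ;restore A
       RTS
;
WORKER LDA #0         ;stand-in subroutine that
       TAX            ;overwrites all three
       TAY            ;registers
       RTS
```

Note that the registers come off the stack in the reverse of the order they went on. The status register is not saved here; PHP and PLP could be added if the flags matter, and anything the subroutine returns in a register is lost when the saved values are restored.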

In this subroutine, the Accumulator will be used to pass the high portion of the address to the subroutine, and the Y-register will be used to pass the low portion of the address. The subroutine takes these values and places them in the buffer address of IOCB #0, for the screen editor, and executes a PUT RECORD command to print the string. The address you place in these registers must point to a string that is terminated with the ATASCII End-Of-Line (EOL) character.

Each subroutine call looks like this:
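
A minimal sketch of the call sequence, consistent with the description that follows. MSG is a hypothetical label for a string defined elsewhere in the program and terminated with an EOL ($9B); PRINT is the subroutine from this issue's listing.

```asm
       LDA #MSG/256   ;high byte of string address
       LDY #MSG&255   ;low byte of string address
       JSR PRINT      ;print the string
```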


The LDA instruction loads the Accumulator with the high-order 8 bits of the string’s address (don’t forget the # symbol), and the LDY instruction loads the Y-register with the low-order 8 bits of the string’s address. The JSR calls the PRINT subroutine, which prints the specified string on the screen. This set of instructions uses only 7 bytes (2 each for the LDA and LDY, 3 for the JSR). Compare this with the 30 bytes used by the individual PUT RECORD operations, and you can see that we’ll save quite a bit of memory by using the subroutine!

Of course, the subroutine still takes around 30 bytes, but it’s only coded one time. If a program does ten print operations, using individual PUT RECORD operations will take 10*30, or 300 bytes. The same ten print operations with the subroutine approach take only 30 + (7*10), or 100 bytes. Not bad, huh?

Some assembly language “speed freaks” will point out that the subroutine approach is slower than using separate operations, and they’re correct. If you’re writing a real-time application that needs all the speed it can get, it may help to use in-line code instead of subroutines.

With today’s 128K-plus computers, lack of memory is rarely an obstacle, so if you feel you need the speed, by all means, use the in-line code method. Subroutines, however, do have the advantage that, if a change needs to be made, it only has to be made in one place, instead of in every piece of code that performs the operation.

The program.

The program in Listing 1 uses principles we covered in earlier installments of Boot Camp, to read and print the contents of a file. The file can be a cassette or disk file, and can even be the screen editor (E:) itself, thanks to the device-independence of CIO.

We’ve covered CIO to the point where I’ll no longer explain every line of code in detail. Instead, groups of code will be summarized by their function, and the comments in the program listing provide the details.

As was mentioned last issue, our programs are now getting so large that they won’t fit in page 6 of system RAM any more, so we must set the initial program counter to a point higher in memory. In this listing, the program starts at $6000 (Line 240). Depending on the memory in your system, you may have to change this value to a lower memory location.

Lines 300–320 set up the parameters as described above, and print the PROMPT string by calling the PRINT subroutine (Lines 1280–1400). This string, defined in Lines 1450–1470, instructs the user to enter the name of the file they want to display.

Lines 360–470 use the GET RECORD function of CIO to accept the name of the file to be displayed. You must include the device specifier (D:, C:, etc.), so that CIO can determine the device to be used.

Line 480 loops back to re-try the filename entry if any errors occurred.

Lines 520–630 attempt to open the file just entered for input.
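
The open follows the same pattern used for the keyboard last issue. A sketch, written here as a subroutine; OPENIN is a hypothetical label, and FNAME is the 20-byte filename buffer from the listing.

```asm
ICCMD  = $0342
ICBAL  = $0344
ICBAH  = $0345
ICAX1  = $034A
ICAX2  = $034B
CIOV   = $E456
;
; On return, the N flag is set (BMI taken)
; if CIO reported an error.
OPENIN LDX #$10       ;IOCB #1
       LDA #$03       ;OPEN command
       STA ICCMD,X
       LDA #FNAME/256 ;point at the filename
       STA ICBAH,X
       LDA #FNAME&255
       STA ICBAL,X
       LDA #$04       ;open for input
       STA ICAX1,X
       LDA #$00       ;AUX2 not needed here
       STA ICAX2,X
       JSR CIOV       ;open it
       RTS
FNAME  *=*+20         ;filename typed by the user
```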

Line 640 branches to READIT to process the file if the file was opened successfully. If there was an error in opening the file, Lines 650–670 print the error message, using the PRINT subroutine as described earlier, then Lines 680–710 close IOCB #1. If the IOCB is not closed after such an error, it remains in use and cannot be opened later. After the file is closed, the program loops back, so the user can re-enter the filename.
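
The close itself takes only a few instructions. A sketch, with CLOSE1 as a hypothetical label:

```asm
ICCMD  = $0342
CIOV   = $E456
;
CLOSE1 LDX #$10       ;IOCB #1
       LDA #$0C       ;CLOSE command
       STA ICCMD,X
       JSR CIOV       ;close it
       RTS
```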

Now that the file is open, we can read all the records in the file and print them to the screen.

Lines 770–880 use the GET RECORD command to read records from the file. The input buffer area, BUFFER, holds 1000 characters, which is usually long enough for most text files. If an error occurs during the GET RECORD operation, Line 890 branches to BADREC to handle it.

If the record was read successfully, Lines 930–960 print the record that was just read (contained in BUFFER) and loop back to READIT to get the next record from the file.

Lines 1000–1110 handle errors when reading the file. If the error is an end-of-file (EOF) error, a value of 136 in the Y-register, an appropriate message is printed. If the error was another error, such as a truncated record, a general error message is printed. Both routines then go through the QUIT routine, to complete processing and exit.

Lines 1150–1200, labeled QUIT, close the input file (IOCB #1) and terminate the program with the BRK (break) instruction.

Lines 1280–1400 are the PRINT subroutine, used whenever a string is to be printed to the screen.

Line 1290 sets the X-register to point to IOCB #0, indicating that the operation will use the screen editor.

Line 1300 moves the Accumulator, which contains the high portion of the string’s address, to the high buffer address for the operation.

Lines 1310–1320 move the contents of the Y-register (the low portion of the string’s address) to the low buffer address for the CIO operation. Note that the 6502 won’t allow a STY ICBAL,X operation, so we must first transfer the Y-register to the Accumulator and store it from there.

Lines 1330–1380 perform the other setup operations necessary for a PUT RECORD operation and call CIO to print the string. After printing, if there was an error, the subroutine branches to the FATAL routine, to change the screen color to indicate the error.

Line 1400, an RTS instruction, returns to the part of the program which called the subroutine.

Lines 1410–1440 are used if it’s impossible to print to the screen. They change the screen’s border to red and terminate the program with the BRK instruction.
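
Putting the pieces together, here is a self-contained sketch of the PRINT and FATAL routines as described above, with a short test call in front. The string label HELLO and its text are hypothetical; the rest follows the walk-through, without the listing's line numbers.

```asm
COLOR4 = $02C8        ;background color shadow
ICCMD  = $0342
ICBAL  = $0344
ICBAH  = $0345
ICBLL  = $0348
ICBLH  = $0349
CIOV   = $E456
;
       *= $6000
       CLD
       LDA #HELLO/256 ;high byte of address in A
       LDY #HELLO&255 ;low byte of address in Y
       JSR PRINT
       BRK            ;back to the debugger
;
PRINT  LDX #$00       ;IOCB #0, the screen editor
       STA ICBAH,X    ;A holds the high byte
       TYA            ;no STY absolute,X on the 6502
       STA ICBAL,X    ;so store the low byte via A
       LDA #$09       ;PUT RECORD command
       STA ICCMD,X
       LDA #$FF       ;length $FFFF: stop at the EOL
       STA ICBLL,X
       STA ICBLH,X
       JSR CIOV       ;print it
       BMI FATAL      ;could not print
       RTS
FATAL  LDA #$34       ;red
       STA COLOR4
       BRK
;
HELLO  .BYTE "HELLO",$9B
```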

Lines 1450–1570 are the various data items for the program, including prompts and data buffers. Note that the text prompts don’t have to be defined on a single line—multiple lines may be used, as long as the EOL character ($9B) is used as the last character.

When you RUN the program, try entering various types of filenames—disk, cassette, even the screen editor (E:). With the editor, the computer will echo every line you type back to you. The End-Of-File (EOF) for the screen editor is generated when you press CTRL-3 (CTRL and 3 keys pressed simultaneously). The great thing is, we didn’t have to write any special code to allow the program to read from all these devices. CIO’s device independence takes that worry away from us!

Next month…

Next issue, we’ll play around with creating disk files and copying data, using CIO. Until then, experiment with this program. Try adding descriptive error messages to all the errors you could get when reading a file. Working with programs is an excellent way to get comfortable with assembly language.

Listing 1.
Assembly listing.

0100     .OPT NO LIST
0110 COLOR4 = $02C8
0120 ICCMD = $0342
0130 ICSTA = $0343
0140 ICBAL = $0344
0150 ICBAH = $0345
0160 ICBLL = $0348
0170 ICBLH = $0349
0180 ICAX1 = $034A
0190 ICAX2 = $034B
0200 CIOV = $E456
0210 ;
0230 ;
0240  *= $6000
0250 ;
0270 ;
0280  CLD            ;BINARY MODE!
0290 GETFN
0330 ;
0350 ;
0360  LDX #$00       ;EDITOR: IOCB #0
0370  LDA #$05       ;GET RECORD...
0390  LDA #FNAME/256 ;POINT...
0400  STA ICBAH,X    ;TO...
0410  LDA #FNAME&255 ;FILENAME...
0430  LDA #20        ;MAXIMUM NAME...
0440  STA ICBLL,X    ;= 20 CHARS
0450  LDA #0         ;(20 IN LO,
0460  STA ICBLH,X    ;0 IN HI)
0470  JSR CIOV       ;GET RECORD!
0490 ;
0510 ;
0520  LDX #$10      ;USE IOCB #1
0530  LDA #$03      ;SET UP...
0550  LDA #FNAME/256 ;POINT...
0560  STA ICBAH,X   ;TO...
0570  LDA #FNAME&255 ;USER'S...
0590  LDA #$04      ;OPEN FILE...
0610  LDA #$00      ;AUX2...
0630  JSR CIOV      ;OPEN IT!
0650  LDA #OPNERR/256 ;UH-OH, PRINT...
0680  LDX #$10      ;BETTER...
0690  LDA #$0C      ;CLOSE...
0700  STA ICCMD,X   ;IOCB #1...
0730 ;
0750 ;
0770  LDX #$10       ;IOCB #1
0780  LDA #$05       ;SET TO...
0800  LDA #BUFFER/256 ;POINT...
0810  STA ICBAH,X     ;TO...
0820  LDA #BUFFER&255 ;INPUT...
0840  LDA #1000/256  ;MAXIMUM...
0850  STA ICBLH,X    ;READ...
0860  LDA #1000&255 ;1000...
0880  JSR CIOV       ;READ IT!
0890  BMI BADREC     ;ERROR!
0900 ;
0920 ;
0930  LDA #BUFFER/256 ;POINT TO...
0950  JSR PRINT       ;PRINT IT!
0970 ;
0990 ;
1010  CPY #136        ;EOF?
1030  LDA #EOFMSG/256 ;PRINT...
1040  LDY #EOFMSG&255 ;EOF...
1050  JSR PRINT       ;MESSAGE
1060  JMP QUIT        ;AND QUIT!
1100  JSR PRINT       ;MESSAGE
1110 ;
1130 ;FILE (IOCB #1) AND EXIT!
1140 ;
1150 QUIT
1160  LDX #$10       ;IOCB #1
1170  LDA #$0C       ;SET CLOSE...
1190  JSR CIOV       ;CLOSE IT!
1200  BRK            ;AND EXIT
1210 ;
1230 ;
1240 ;INPUT:
1270 ;
1280 PRINT
1290  LDX #$00       ;USE EDITOR
1310  TYA            ;PUT Y IN A REG.
1330  LDA #$09       ;SET UP...
1350  LDA #$FF       ;SET BUFFER...
1360  STA ICBLL,X    ;LENGTH...
1380  JSR CIOV       ;PRINT IT!
1390  BMI FATAL      ;ERROR!
1400  RTS            ;OK, RETURN
1410 FATAL
1420  LDA #$34       ;CHANGE...
1440  BRK            ;AND EXIT
1470 .BYTE "(INCLUDE D:)",$9B
1500 .BYTE "-- TRY AGAIN",$9B
1530 .BYTE "FILE! ***",$9B
1550 .BYTE "*** END-OF-FILE ***",$9B
1560 FNAME *=*+20
1570 BUFFER *=*+1000
1580 .END
A.N.A.L.O.G. ISSUE 40 / MARCH 1986 / PAGE 27

Boot Camp

by Tom Hudson

Are you ready to dive further into the world of 8-bit assembly language? I hope so, because this issue we’ll look at two programs that further illustrate the use of the Atari Central Input-Output (CIO) system.

Program number one.

The first program we’ll look at, shown in Listing 1, is a modification of the last Boot Camp program. As you recall, we wrote a program which copied any input file (keyboard, disk, etc.) to the screen. This time, we’ll have one that does just the opposite. It will copy whatever you type on your computer screen to a disk file, the printer, or any other device.

If you take a look at issue 38’s Boot Camp, you’ll notice that this issue’s Listing 1 is a relatively minor modification of that program. The filename input section (Lines 290–480 in both programs) is identical, since the entry of a filename through the keyboard remains the same, regardless of the program’s function.

Lines 520–720 make an attempt to open the file indicated as output. IOCB number 1 is used for the output file (Line 520).

Lines 520–630 perform the open operation. If the file was opened successfully, Line 640 branches to FILEOK, where actual processing begins.
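
Compared with an open for input, the main difference is the value placed in ICAX1. A sketch of just that step, assuming the standard CIO direction codes ($04 for input, $08 for output); SETOUT is a hypothetical label.

```asm
ICAX1  = $034A        ;IOCB auxiliary byte 1
ICAX2  = $034B        ;IOCB auxiliary byte 2
;
; Fills in the auxiliary bytes of IOCB #1
; for an output OPEN.
SETOUT LDX #$10       ;IOCB #1
       LDA #$08       ;output ($04 would be input)
       STA ICAX1,X
       LDA #$00
       STA ICAX2,X
       RTS
```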

If the file couldn’t be opened due to an invalid filename or device specifier, Lines 650–670 print an error message (OPNERR) to the user, informing them that the file can’t be opened and they must try again. After printing the message, IOCB number 1 is closed (Lines 680–710), and the program jumps back to the GETFN routine, to accept another filename.

If the file was successfully opened, the FILEOK routine prints a message to the user that it’s ready to accept text. The Atari screen editor (opened by the system as IOCB number 0) will accept lines of text (text records) up to three screen lines long (120 characters). You can type lines of text as long as you like, up to the 120-character limit, terminating each with the RETURN key. As you enter each line, the program will write the text to the specified device.

If you indicated D:TEXTFILE as your output filename, the text will be written to a disk file named TEXTFILE. Entering P: as your filename will cause the text to be written to the printer. If you don’t have the device specified in the filename connected to your computer, the open operation will fail; you’ll be asked to enter another filename.

To tell the computer you’re done entering text, press CTRL-3 (the CTRL and 3 keys simultaneously). The output file will be closed. As mentioned in earlier installments of Boot Camp, the CTRL-3 sequence on the keyboard indicates an end-of-file, the code that tells the system there’s no more input available.

Lines 860–980 handle text entry. The screen editor (IOCB number 0) is used to receive text, which is placed into a block of memory labeled BUFFER (defined in Line 1970 as 128 bytes in length). Lines 930–960 tell the system to limit the input text length to 128 bytes. In this case, the system automatically limits text to 120 characters, because of the screen editor’s built-in limitation. It’s a good idea to set up the buffer length values anyway, to prevent possible trouble.
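The read described above can be sketched as follows. This is an illustration under assumptions, not the published code: the error branch simply stops with a BRK, and line numbers are omitted.

```asm
; Sketch: read one line from the screen editor into BUFFER.
ICCMD = $0342
ICBAL = $0344
ICBAH = $0345
ICBLL = $0348
ICBLH = $0349
CIOV = $E456
;
    *= $6000
READIT
    LDX #$00        ;IOCB #0, SCREEN EDITOR
    LDA #$05        ;GET RECORD COMMAND
    STA ICCMD,X
    LDA #BUFFER/256 ;BUFFER ADDRESS, HI
    STA ICBAH,X
    LDA #BUFFER&255 ;BUFFER ADDRESS, LO
    STA ICBAL,X
    LDA #128/256    ;LENGTH HI = 0
    STA ICBLH,X
    LDA #128&255    ;LENGTH LO = 128
    STA ICBLL,X
    JSR CIOV        ;READ IT!
    BMI BADRD       ;NEGATIVE STATUS: ERROR
    RTS
BADRD
    BRK             ;(ERROR HANDLING OMITTED)
BUFFER *= *+128
    .END
```

Writing the length as 128/256 and 128&255 looks fussy for a value that fits in one byte, but it means only the number has to change if the buffer ever grows past 255 bytes.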

If there’s a problem getting a text record, Line 980 branches to the BADRD (BAD READ) routine, which checks the type of error that occurred.

If the line of text was read successfully, Lines 1020–1130 write the text to the output device. Lines 1030–1060 point to the text buffer; Lines 1090–1110 tell the maximum number of bytes to write ($FFFF, or 65535); and Line 1120 writes the text.

If the write operation fails for some reason (printer not ready or disk full, for example), Line 1130 branches to BADWRT (BAD WRITE) to report the error. If the write was successful, the program branches back to the READIT routine, where another line of text will be accepted from the screen editor.

Lines 1180–1260 check the error condition when an error occurs on text input. If the error code (found in the 6502 Y-register) is a 136, the end-of-file indicator has been read, and the program jumps to the QUIT routine to finish processing.

Any other error on input is handled by Lines 1230–1260. These print an error message and branch back to get another line of text at READIT.
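The error test can be sketched like this. The label RDFAIL and the name EOFERR are hypothetical (I've made them up for the illustration), and READIT, QUIT and PRINTE are one-instruction stubs standing in for the real routines. Line numbers are omitted.

```asm
; Sketch: sort out a failed read. CIO leaves its status in Y.
EOFERR = 136       ;END-OF-FILE STATUS
;
    *= $6000
BADRD
    CPY #EOFERR    ;EOF?
    BNE RDFAIL     ;NO, A REAL ERROR
    JMP QUIT       ;YES, FINISH UP
RDFAIL
    JSR PRINTE     ;REPORT THE ERROR...
    JMP READIT     ;AND GET ANOTHER LINE
;
READIT
    RTS            ;STUB: READ LOOP
QUIT
    RTS            ;STUB: CLOSE AND EXIT
PRINTE
    RTS            ;STUB: PRINT MESSAGE
    .END
```

The test works only because nothing between the JSR CIOV and the CPY touches the Y-register. Slip a LDY in there and the status is gone.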

Lines 1270–1290, which are labeled BADWRT, are executed when an error occurs when writing to a file. The program performs a jump to subroutine (JSR) to WEPRNT (WRITE ERROR PRINT), which is a short subroutine to print the write error message. This message may be printed elsewhere in the program, so, in order to save memory, it’s made into a short subroutine that may be called from anywhere in the program.

After printing the message at the WEPRNT subroutine, the program jumps to the QUIT routine. The JMP instruction isn’t necessary in this case, because the QUIT code starts with the next instruction. It’s good programming practice, however, to put the JMP in—you (or someone else) may later add a routine at this point and forget that the BADWRT routine falls through to the QUIT code. A JMP only takes a few microseconds longer to execute and doesn’t noticeably affect the program’s speed.

Lines 1330–1450, labeled QUIT, perform the final processing necessary to finish the program. First, in Lines 1340–1380, the program closes IOCB number 1, the output file. Because of the way the disk operates, final data may not actually be written to the disk until this point in the program. If the disk is full, a write error may be encountered. For this reason, we check whether the close operation reported an error and branch to BADEND if something went wrong. If we didn’t check at this point, an error could occur on the close operation, and the user would never notice it.
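A stripped-down sketch of the close-and-check step follows. It assumes IOCB number 1 is already open, and both exits are bare BRK instructions where the real program prints a message first. Line numbers are omitted.

```asm
; Sketch: close IOCB #1 and test the result.
ICCMD = $0342
CIOV = $E456
;
    *= $6000
QUIT
    LDX #$10       ;IOCB #1
    LDA #$0C       ;CLOSE COMMAND
    STA ICCMD,X
    JSR CIOV       ;CLOSE IT! (LAST DATA GOES OUT)
    BMI BADEND     ;CLOSE FAILED
    BRK            ;CLOSED OK (PRINT DONE HERE)
BADEND
    BRK            ;(PRINT WRITE ERROR HERE)
    .END
```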

If the output file was closed properly, the program prints the DONE message and exits with the BRK instruction.

The BADEND routine, as mentioned above, is executed if an error occurs on the close operation. It calls the WEPRNT subroutine in order to print an error message, then executes a BRK to exit.

The WEPRNT (WRITE ERROR PRINT) subroutine, Lines 1490–1530, is a simple subroutine which prints the WRTERR (WRITE ERROR) message on-screen, using the PRINT subroutine. It then returns to the calling code via the RTS instruction.

This is a good example of the use of short subroutines to shorten your programs—rather than duplicate this print code twice in the program (in the BADWRT and BADEND routines), we make it a subroutine. This saves space and also makes it easier to change the program if we need to alter the write error message (the change only has to be made in one place).

Lines 1540–1770 make up the PRINT subroutine, which we wrote in issue 38’s installment of Boot Camp. This routine simply prints the specified string to the screen.
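For reference, here's a sketch of a PRINT-style routine with a short caller. The calling convention (high byte of the address in A, low byte in Y) follows the listing. The store to COLOR4 in FATAL is my assumption, based on the COLOR4 equate and the surviving comment. MSG and its text are invented for the demonstration, and line numbers are omitted.

```asm
; Sketch: print an EOL-terminated string through IOCB #0.
COLOR4 = $02C8
ICCMD = $0342
ICBAL = $0344
ICBAH = $0345
ICBLL = $0348
ICBLH = $0349
CIOV = $E456
;
    *= $6000
    LDA #MSG/256   ;HI PART IN A
    LDY #MSG&255   ;LO PART IN Y
    JSR PRINT
    BRK
;
;INPUT: A = HI, Y = LO OF STRING ADDRESS
;
PRINT
    LDX #$00       ;USE EDITOR
    STA ICBAH,X    ;ADDRESS, HI
    TYA            ;PUT Y IN A REG.
    STA ICBAL,X    ;ADDRESS, LO
    LDA #$09       ;PUT RECORD COMMAND
    STA ICCMD,X
    LDA #$FF       ;LENGTH $FFFF: CIO STOPS...
    STA ICBLL,X    ;AT THE EOL ($9B)
    STA ICBLH,X
    JSR CIOV       ;PRINT IT!
    BMI FATAL      ;ERROR!
    RTS            ;OK, RETURN
FATAL
    LDA #$34       ;ASSUMED: CHANGE...
    STA COLOR4     ;BORDER COLOR
    BRK            ;AND EXIT
MSG .BYTE "HELLO FROM CIO",$9B
    .END
```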

Lines 1780–1950 are the various text messages used by the program. All of these must be terminated by an ATASCII End-Of-Line (EOL) character ($9B) to print properly.

Line 1960 is the block of memory (20 bytes) reserved for the output filename. Line 1970 is a 128-byte block of memory reserved for the text input buffer.

Comparing this program to the one in issue 38, you can see that functions are easily changed by modifying a few lines and using prewritten “building blocks,” like the PRINT subroutine. You may want to save useful subroutines to disk for later—you can merge them with your code and write programs more quickly. Be sure to document the operation of the subroutine in comments (as in Lines 1540–1600), so that you can remember how the subroutine works!

Program number two.

The second program this time is a further modification of the first, so it can use any input file (disk, screen editor, etc.) and copy the output to any other file. You may, for instance, copy from one disk file to another, from the disk to the screen, from the screen editor to the printer, and so forth. The device independence of CIO allows you almost unlimited flexibility. Let’s take a look at the program.

Listing 2 is the universal copy program. Just looking at it, you can see its similarity to Listing 1—it has roughly the same filename entry (two filenames this time, though, instead of just one), a copy loop reading the first file and writing to the second, error message handling, and the PRINT subroutine. It’s just an extension of the principles we’ve used up to this point, scaled up slightly.

In this program, we’ll accept two filenames, one specifying the input file, the other the output file. Because we don’t know beforehand what devices will be used, we must open both files to separate IOCBs.

In the earlier examples, one file was usually the screen editor, opened by the operating system as IOCB number 0. We have eight IOCBs available (actually seven, since IOCB number 0 is used for the screen editor), so we’ll pick IOCB number 1 for the input file and IOCB number 2 for the output file (we could use any of the seven available IOCBs for the program; I just picked 1 and 2 arbitrarily). Simple enough.
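Each IOCB occupies 16 bytes, which is why the listings load the X-register with $10 for IOCB number 1 and $20 for IOCB number 2. The sketch below shows one way to turn an IOCB number into that offset. IOCBX is a hypothetical helper of my own, not part of either listing, and line numbers are omitted.

```asm
; Sketch: turn an IOCB number into the X offset CIO expects.
ICCMD = $0342
;
    *= $6000
    LDA #2         ;IOCB NUMBER 2...
    JSR IOCBX      ;GIVES X = $20
    LDA #$0C       ;CLOSE COMMAND...
    STA ICCMD,X    ;LANDS AT $0342+$20
    RTS
;
;INPUT:  A = IOCB NUMBER (0-7)
;OUTPUT: X = IOCB NUMBER * 16
;
IOCBX
    ASL A          ;TIMES 2
    ASL A          ;TIMES 4
    ASL A          ;TIMES 8
    ASL A          ;TIMES 16
    TAX
    RTS
    .END
```

For a program that always uses the same two IOCBs, loading X with a constant, as the listings do, is shorter and just as clear.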

Lines 290–480 are quite similar to their counterparts in Listing 1, but you’ll notice that the labels have changed slightly. We must be able to differentiate between the two file input sections and the two filenames, so the files are called FNAME1 (the input file) and FNAME2 (the output file). The labels for getting the files are correspondingly labeled GETFN1 and GETFN2.

After accepting the first filename and placing it in FNAME1, we must try opening the file for input—to be sure it exists—before moving on and accepting the output filename. If, for instance, the user types P: (the printer) as the input device, the system won’t want to open it as input. The printer is an output-only device, so CIO will return an error. If the input file is a disk file, opening it for input makes sure it’s actually on the disk before proceeding. If there’s any error in opening the file, the program asks for another filename.

Once we’re sure the input file is ready to go, we move on to the GETFN2 routine, Lines 760–1160. This routine is virtually identical to the GETFN1 section, except that the specified file is opened as output, using IOCB number 2 (Line 960).

If an error occurs opening for output (this could happen if the printer wasn’t ready or the disk drive power was off), the program prints a message and asks for another filename. You’ve seen all this code before, so I won’t go into a painful, line-by-line analysis.

Now that both files are open, the input file using IOCB number 1 and the output file using IOCB number 2, we’re ready to start copying. The easiest copy, which we’ll employ, is a byte-by-byte copy. That is, we’ll read 1 byte from the input file and immediately write it to the second file, repeating this process until the end-of-file on the input file is reached. There are other, faster ways to get the job done, but, for the time being, let’s keep it simple.

Lines 1210–1230 print a message which informs the user that the program is beginning to copy the files. If you’ve specified a file other than the screen editor as input, sit back and relax as the computer takes care of the “dirty work.” If you’re using the screen editor as input, type the text you want, ending each line with the RETURN key. When finished, press CTRL-3 to tell the system you’re done.

Lines 1270–1350, labeled COPYIT, are the section of the program which reads 1 byte from the input file. The main point of interest here is that Lines 1310–1330, which set the number of bytes to read, set the byte count to 0.

As explained in an earlier Boot Camp, this tells CIO to read 1 byte only, and to place it into the 6502 accumulator. We’ll have to take certain precautions to be sure the contents of the accumulator aren’t altered, because this is the data we’re copying! If the read operation resulted in an error condition, Line 1350 branches to BADRD, to determine what kind of error occurred and handle it properly.

If the read was successful, we have a byte from the input file sitting in the accumulator and are ready to write it to the output file. The only problem is that we need to use the accumulator to set up the command bytes in IOCB number 2 for the write operation. If we alter the accumulator, we’ll clobber the data.

No problem: we’ll simply save the data from the accumulator somewhere else, putting it back in the accumulator when we’re ready to use it. The most convenient place to stick the byte in this case is the Y-register. We aren’t using that for anything and we won’t be changing it to set up IOCB number 2, so a simple TAY (Transfer Accumulator to Y-register) in Line 1390 moves the byte to a safe place.

We now set up IOCB number 2 in the normal fashion for a 1-byte write from the accumulator (Lines 1400–1450), put the byte back in the accumulator from the Y-register with the TYA (Transfer Y-register to Accumulator) instruction, and write it to the file in Line 1470.

If an error occurred on the write, Line 1480 will branch to the error routine, BADWRT. Otherwise, the program branches back to COPYIT to continue copying the next byte. This process continues until the EOF on the input file is reached.
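Put together, the copy loop looks roughly like the sketch below. It assumes IOCB number 1 is already open for input and IOCB number 2 for output, and it collapses both error routines into a single stub. Line numbers are omitted.

```asm
; Sketch: copy one byte at a time from IOCB #1 to IOCB #2.
ICCMD = $0342
ICBLL = $0348
ICBLH = $0349
CIOV = $E456
;
    *= $6000
COPYIT
    LDX #$10       ;IOCB #1, INPUT
    LDA #$07       ;GET CHARACTERS
    STA ICCMD,X
    LDA #0         ;LENGTH 0: ONE BYTE,
    STA ICBLL,X    ;RETURNED IN...
    STA ICBLH,X    ;THE ACCUMULATOR
    JSR CIOV       ;READ IT!
    BMI BADRD
    TAY            ;SAVE BYTE IN Y
    LDX #$20       ;IOCB #2, OUTPUT
    LDA #$0B       ;PUT CHARACTERS
    STA ICCMD,X
    LDA #0         ;LENGTH 0: ONE BYTE,
    STA ICBLL,X    ;TAKEN FROM...
    STA ICBLH,X    ;THE ACCUMULATOR
    TYA            ;GET BYTE BACK IN A
    JSR CIOV       ;WRITE IT
    BMI BADWRT
    JMP COPYIT     ;NEXT BYTE
BADRD
BADWRT
    RTS            ;STUB: ERROR HANDLING OMITTED
    .END
```

Note that the TYA has to come after the last LDA used to set up IOCB number 2. Move it any earlier and the setup code overwrites the byte we're trying to copy.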

This operation illustrates one of the most difficult tasks facing the assembly-language programmer: you must be aware of what’s in each register and keep track of what happens to each of them through the execution of your program. With one mistake in coding, you can blow away important data.

The moral: always be aware of the 6502 register contents—never assume that a piece of data is safe. If in doubt, save it somewhere. The few microseconds you waste saving a value will never be noticed, and they could save your reputation, as well as a great deal of time!

Lines 1530–1640 are the error-handling routines for the copy program. They’re similar to those of Listing 1, but, instead of going back for another line of input if an error occurs, this program prints an error message and quits. Unless the input is the screen editor, lost data of any kind means the copied file is unusable.

Lines 1680–1840 close the two files, printing an error if the output file close produces an error.

The remaining lines of the program are explained in the discussion of Listing 1, except for the two filenames, FNAME1 and FNAME2. These are the 20-byte areas of memory reserved for the two filenames. Note that there’s no buffer used in this program for data, because we’re using the accumulator to hold each byte as it’s read and written.

Play around with this program and modify it so that the two filename input operations (Lines 360–480 and 830–950) are reduced to a single subroutine. It’s not too hard—I’ll show a way to do it next time. We’ll also start looking at how to work with the computer’s graphics through CIO.

Listing 1.
0100     .OPT NO LIST
0110 COLOR4 = $02C8
0120 ICCMD = $0342
0130 ICSTA = $0343
0140 ICBAL = $0344
0150 ICBAH = $0345
0160 ICBLL = $0348
0170 ICBLH = $0349
0180 ICAX1 = $034A
0190 ICAX2 = $034B
0200 CIOV = $E456
0210 ;
0230 ;
0240     *= $6000
0250 ;
0270 ;
0280     CLD         ;BINARY MODE!
0290 GETFN
0300     LDA #PROMPT/256 ;HI PART IN A
0310     LDY #PROMPT&255 ;LO PART IN Y
0330 ;
0350 ;
0360     LDX #$00    ;EDITOR: IOCB #0
0370     LDA #$05    ;GET RECORD...
0390     LDA #FNAME/256 ;POINT...
0400     STA ICBAH,X ;TO...
0410     LDA #FNAME&255 ;FILENAME...
0430     LDA #20     ;MAXIMUM NAME...
0440     STA ICBLL,X ;= 20 CHARS
0450     LDA #0      ;(20 IN LO,
0460     STA ICBLH,X ;0 IN HI)
0470     JSR CIOV    ;GET RECORD!
0490 ;
0510 ;
0520     LDX #$10    ;USE IOCB #1
0530     LDA #$03    ;SET UP...
0550     LDA #FNAME/256 ;POINT...
0560     STA ICBAH,X ;TO...
0570     LDA #FNAME&255 ;USER'S...
0590     LDA #$08    ;OPEN FILE...
0610     LDA #$00    ;AUX2...
0620     STA ICAX2,X ;NOT USED
0630     JSR CIOV    ;OPEN IT!
0650     LDA #OPNERR/256 ;UH-OH, PRINT
0680     LDX #$10    ;BETTER...
0690     LDA #$0C    ;CLOSE...
0700     STA ICCMD,X ;IOCB #1...
0720     JMP GETFN   ;TRY AGAIN!
0730 ;
0750 ;
0770     LDA #PROMP2/256 ;POINT TO...
0780     LDY #PROMP2&255 ;TEXT PROMPT
0790     JSR PRINT   ;PRINT IT!
0800 ;
0840 ;
0860     LDX #$00    ;IOCB #0
0870     LDA #$05    ;SET TO...
0890     LDA #BUFFER/256 ;POINT...
0900     STA ICBAH,X ;TO...
0910     LDA #BUFFER&255 ;INPUT...
0930     LDA #128/256 ;MAXIMUM...
0940     STA ICBLH,X ;READ...
0950     LDA #128&255 ;128...
0970     JSR CIOV    ;READ IT!
0990 ;
1010 ;
1020     LDX #$10    ;IOCB #1
1030     LDA #BUFFER/256 ;POINT TO...
1070     LDA #$09    ;PUT RECORD...
1090     LDA #$FF    ;SET UP...
1100     STA ICBLL,X ;MAXIMUM...
1120     JSR CIOV    ;WRITE IT!
1150 ;
1170 ;
1180 BADRD
1190     CPY #136    ;EOF?
1210     JMP QUIT    ;AND QUIT!
1230     LDA #RDERR/256 ;GOT AN ERROR,
1240     LDY #RDERR&255 ;PRINT ERROR..
1290     JMP QUIT    ;AND QUIT!
1300 ;
1320 ;
1330 QUIT
1340     LDX #$10    ;IOCB #1
1350     LDA #$0C    ;SET CLOSE...
1370     JSR CIOV    ;CLOSE IT!
1390     LDA #DONE/256 ;EOF, WE'RE...
1400     LDY #DONE&255 ;ALL DONE...
1420     BRK         ;AND EXIT
1450     BRK         ;AND EXIT!
1460 ;
1480 ;
1500     LDA #WRTERR/256 ;POINT TO...
1520     JSR PRINT   ;PRINT IT
1530     RTS
1540 ;
1560 ;
1570 ;INPUT:
1600 ;
1610 PRINT
1620     LDX #$00    ;USE EDITOR
1640     TYA         ;PUT Y IN A REG.
1660     LDA #$09    ;SET UP...
1680     LDA #$FF    ;SET BUFFER...
1690     STA ICBLL,X ;LENGTH...
1710     JSR CIOV    ;PRINT IT!
1720     BMI FATAL   ;ERROR!
1730     RTS         ;OK, RETURN
1740 FATAL
1750     LDA #$34    ;CHANGE...
1770     BRK         ;AND EXIT
1790     .BYTE "ENTER OUTPUT "
1795     .BYTE "FILENAME "
1800     .BYTE "(INCLUDE D:)",$9B
1810 PROMP2
1820     .BYTE "ENTER TEXT, "
1825     .BYTE "TYPE CTRL-3 "
1830     .BYTE "TO QUIT.",$9B
1850     .BYTE "CAN'T OPEN FILE! "
1860     .BYTE "-- TRY AGAIN",$9B
1870 RDERR
1880     .BYTE "*** ERROR GETTING "
1890     .BYTE "RECORD! ***",$9B
1910     .BYTE "*** ERROR WRITING "
1920     .BYTE "FILE! ***",$9B
1930 DONE
1940     .BYTE "*** FILE WRITE "
1950     .BYTE "COMPLETE! ***",$9B
1960 FNAME *= *+20
1970 BUFFER *= *+128
1980     .END
Listing 2.
0100     .OPT NO LIST
0110 COLOR4 = $02C8
0120 ICCMD = $0342
0130 ICSTA = $0343
0140 ICBAL = $0344
0150 ICBAH = $0345
0160 ICBLL = $0348
0170 ICBLH = $0349
0180 ICAX1 = $034A
0190 ICAX2 = $034B
0200 CIOV = $E456
0210 ;
0230 ;
0240     *= $6000
0250 ;
0270 ;
0280     CLD         ;BINARY MODE!
0290 GETFN1
0300     LDA #PFILE1/256 ;HI PART IN A
0310     LDY #PFILE1&255 ;LO PART IN Y
0330 ;
0350 ;
0360     LDX #$00    ;EDITOR: IOCB #0
0370     LDA #$05    ;GET RECORD...
0390     LDA #FNAME1/256 ;POINT...
0400     STA ICBAH,X ;TO...
0410     LDA #FNAME1&255 ;FILENAME...
0430     LDA #20     ;MAXIMUM NAME...
0440     STA ICBLL,X ;= 20 CHARS
0450     LDA #0      ;(20 IN LO,
0460     STA ICBLH,X ;0 IN HI)
0470     JSR CIOV    ;GET RECORD!
0490 ;
0510 ;
0520     LDX #$10    ;USE IOCB #1
0530     LDA #$03    ;SET UP...
0550     LDA #FNAME1/256 ;POINT...
0560     STA ICBAH,X ;TO...
0570     LDA #FNAME1&255 ;USER'S...
0590     LDA #$04    ;OPEN FILE...
0610     LDA #$00    ;AUX2...
0620     STA ICAX2,X ;NOT USED
0630     JSR CIOV    ;OPEN IT!
0640     BPL GETFN2  ;OPENED OK!
0650     LDA #BADIN/256 ;UH-OH, PRINT
0680     LDX #$10    ;BETTER...
0690     LDA #$0C    ;CLOSE...
0700     STA ICCMD,X ;IOCB #1...
0720     JMP GETFN1  ;TRY AGAIN!
0730 ;
0750 ;
0760 GETFN2
0770     LDA #PFILE2/256 ;HI PART IN A
0780     LDY #PFILE2&255 ;LO PART IN Y
0800 ;
0820 ;
0830     LDX #$00    ;EDITOR: IOCB #0
0840     LDA #$05    ;GET RECORD...
0860     LDA #FNAME2/256 ;POINT...
0870     STA ICBAH,X ;TO...
0880     LDA #FNAME2&255 ;FILENAME...
0900     LDA #20     ;MAXIMUM NAME...
0910     STA ICBLL,X ;= 20 CHARS
0920     LDA #0      ;(20 IN LO,
0930     STA ICBLH,X ;0 IN HI)
0940     JSR CIOV    ;GET RECORD!
0960     LDX #$20    ;USE IOCB #2
0970     LDA #$03    ;SET UP...
0990     LDA #FNAME2/256 ;POINT...
1000     STA ICBAH,X ;TO...
1010     LDA #FNAME2&255 ;USER'S...
1030     LDA #$08    ;OPEN FILE...
1050     LDA #$00    ;AUX2...
1060     STA ICAX2,X ;NOT USED
1070     JSR CIOV    ;OPEN IT!
1090     LDA #BADOUT/256 ;UH-OH, PRINT
1120     LDX #$20    ;BETTER...
1130     LDA #$0C    ;CLOSE...
1140     STA ICCMD,X ;IOCB #2...
1160     JMP GETFN2  ;TRY AGAIN!
1170 ;
1210     LDA #CPYMSG/256 ;POINT TO...
1230     JSR PRINT   ;PRINT IT!
1240 ;
1260 ;
1280     LDX #$10    ;IOCB #1
1290     LDA #$07    ;SET TO...
1310     LDA #0      ;READ 1 CHAR...
1320     STA ICBLL,X ;AND PUT IN...
1340     JSR CIOV    ;READ IT!
1360 ;
1380 ;
1390     TAY         ;PUT CHAR IN Y
1400     LDX #$20    ;IOCB #2
1410     LDA #$0B    ;PUT CHARS...
1430     LDA #0      ;TELL CIO TO...
1440     STA ICBLL,X ;WRITE 1 BYTE...
1460     TYA         ;GET CHAR IN ACC.
1470     JSR CIOV    ;WRITE IT
1500 ;
1520 ;
1530 BADRD
1540     CPY #136    ;EOF?
1560     JMP QUIT    ;AND QUIT!
1580     LDA #RDERR/256 ;GOT AN ERROR,
1590     LDY #RDERR&255 ;PRINT ERROR
1610     JMP QUIT    ;AND ABORT!
1640     JMP QUIT    ;AND QUIT!
1650 ;
1670 ;
1680 QUIT
1690     LDX #$10    ;IOCB #1
1700     LDA #$0C    ;SET CLOSE...
1720     JSR CIOV    ;CLOSE IT
1730     LDX #$20    ;IOCB #2
1740     LDA #$0C    ;SET CLOSE...
1760     JSR CIOV    ;CLOSE IT!
1780     LDA #DONE/256 ;EOF, WE'RE...
1790     LDY #DONE&255 ;ALL DONE...
1810     BRK         ;AND EXIT
1840     BRK         ;AND EXIT!
1850 ;
1870 ;
1890     LDA #WRTERR/256 ;POINT TO...
1910     JSR PRINT   ;PRINT IT
1920     RTS
1930 ;
1950 ;
1960 ;INPUT:
1990 ;
2000 PRINT
2010     LDX #$00    ;USE EDITOR
2030     TYA         ;PUT Y IN A REG.
2050     LDA #$09    ;SET UP...
2070     LDA #$FF    ;SET BUFFER...
2080     STA ICBLL,X ;LENGTH...
2100     JSR CIOV    ;PRINT IT!
2110     BMI FATAL   ;ERROR!
2120     RTS         ;OK, RETURN
2130 FATAL
2140     LDA #$34    ;CHANGE...
2160     BRK         ;AND EXIT
2170 PFILE1
2180     .BYTE "ENTER INPUT "
2190     .BYTE "(INCLUDE D:)",$9B
2200 PFILE2
2210     .BYTE "ENTER OUTPUT "
2215     .BYTE "FILENAME "
2220     .BYTE "(INCLUDE D:)",$9B
2230 BADIN
2240     .BYTE "CAN'T OPEN FILE! "
2250     .BYTE "-- TRY AGAIN",$9B
2270     .BYTE "*** BAD INPUT "
2280     .BYTE "FILENAME ***",$9B
2300     .BYTE "COPYING FILE...",$9B
2310 RDERR
2320     .BYTE "*** ERROR READING "
2330     .BYTE "FILE! ***",$9B
2350     .BYTE "*** ERROR WRITING "
2360     .BYTE "FILE! ***",$9B
2370 DONE
2380     .BYTE "*** FILE WRITE "
2390     .BYTE "COMPLETE! ***",$9B
2400 FNAME1 *= *+20
2410 FNAME2 *= *+20
2420     .END