The Apple II, the original personal computer, was superseded by the IBM PC Model 5150 in 1983. The IBM PC was not much faster, and certainly not more ingenious, but its 8088 processor could address much more memory, and the video system was more flexible. The 5150 even copied the Apple II in many respects, even to open supply of full information and documentation. It used DOS instead of Applesoft for control, but BASIC was still present, and DEBUG did the job of the Monitor command line. The flexibility of the DOS system and the increase in memory could not be duplicated by the Apple, so it was the IBM PC that evolved into today's PC, not the Apple II.
Nevertheless, the final Apple IIe was a well-made and durable machine, with a complete keyboard and optional 80-column display. It is excellent for learning machine-language programming, and for control applications. Most high-level programming languages were available, as well as an assembler, and a complete range of I/O peripherals. I have recently reviewed the operation of my Apple IIe, which has come back to life with its full powers. I have a monochrome monitor and use 40-column display, with two disk drives. There is an IEEE-488 interface card, serial card, parallel card, and a Motorola 68000 development card.
This article discusses getting the Apple II into operation, the use of the Monitor, and machine language programming. If you are interested, dust off your Apple II, find the manuals and the diskettes, and have some fun.
You will need the system unit with keyboard, a monitor, a bridge, disk drive card, disk drives with flat cables, a shielded cable with RCA connectors on each end, and a power cord for the Apple, and perhaps another for the monitor. The disk card should be in slot #6, with the diskette drive cables plugged in. The top connector is for Drive 1, the other for Drive 2. If there is only one drive, it should be Drive 1. The bridge goes over the Apple IIe, and the monitor rests on the bridge. The disk drives can be put anywhere convenient. The shielded cable goes from the RCA receptacle on the monitor marked IN to the RCA receptacle on the back of the Apple, on the left as you face the back. Plug in the Apple and the monitor power, and you are ready to go.
Put an INITialized diskette in Drive 1. This can be the System Disk, if you have no other initialized diskettes. DOS must be booted from Drive 1. It is booted on CTRL-RESET or on power-on. Turn on the monitor and then the Apple II, with the switch on the back, on your left. The Apple will beep and show a welcome display on the screen, if it is healthy. Remember that you can always reboot with CTRL-RESET if anything goes awry. The Apple IIe wakes up as a machine that is running Applesoft, and you can dig in immediately.
The Applesoft prompt is ], which should be shown. If you type INT and press RETURN, the prompt will change to the > of Integer BASIC. To return to Applesoft, type FP and press RETURN. In what follows, press RETURN after each command without being told to. To get to the Monitor, execute CALL -151. The prompt is a *. To return to Applesoft or Integer BASIC, whichever was running before, execute a CTRL-C or CTRL-B. CTRL-C will preserve a BASIC program that you may have in memory, while CTRL-B will erase it. Programs are also erased when you execute INT or FP. Integer BASIC is like Applesoft BASIC, but has only integer numbers and runs much faster. One feature is that it includes a Mini-Assembler at F666G (execute this GO command from either ] or >). Its prompt is !. To leave it, execute $FF69G. The $ says it is a Monitor command, and the rest is another GO command. Practice getting any of the prompts ], >, * and !, and returning safely to Applesoft. CTRL-RESET is always there in an emergency.
To INIT a disk, execute NEW (which clears program memory) and enter a short HELLO program with PRINT statements that state the system, date and perhaps your name. Insert a disk in Drive 1, then execute INIT HELLO. Execute CATALOG to see what is on the disk (HELLO). Try booting the disk with CTRL-RESET. This so-called "slave diskette" must be used only on systems with the same memory size. Apple II's came in 16K and 48K sizes. A "master diskette" can be used on systems of any size. Master diskettes are created with the CREATE MASTER program. If you use your diskettes only on this system, you have no need of Masters. This is not the same distinction as system or data disks in MS-DOS. A slave diskette is a system disk. Make a slave diskette for your casual use that is not write-protected.
The Monitor contains machine-language routines that are used to operate the computer services. It also includes a command processor, with the prompt *, as mentioned above. CALL -151 at either the ] or > prompt will take you there. Its facilities are similar to those of DEBUG, in the IBM PC. All numbers are taken as hexadecimal. They may have a leading $, but this is not necessary. If you type a number at the * prompt and press RETURN, you will open that address, and its contents will be displayed. Further RETURNs will step to the next address, and print its contents. If you type two addresses separated by a ".", the contents of all the locations in that range will be printed. In this way, you can examine the contents of any memory location in the Apple.
Entering a target address, a <, and a range, followed by the letter M, will move (copy) the contents of the range to the locations beginning at the target address. For example, 0320<0300.030FM copies the 16 bytes at 0300-030F to 0320-032F. To change the contents of the currently open address, enter ":" and then a hexadecimal byte. Subsequent addresses will be changed if you enter a series of hexadecimal bytes separated by spaces. Of course, only RAM contents can be changed; ROM and similar addresses will not be changed.
If you enter an address followed by L, a screenful of disassembly of the bytes, considered as machine instructions, will be displayed. You must start at the first byte of an instruction, or the results will be garbage. Any byte following a 00 may be assumed to be the first byte of an instruction (though it may just be data). When you use L, you usually know the proper place to begin.
An address followed by a G begins execution at the byte at the address. You should be careful to start only at a proper starting point, not in the middle of an instruction. When execution begins, the registers of the processor are filled with the values that are displayed when you execute CTRL-E. You are then given an opportunity to change these values. If the values matter, you should change them before using G.
The routine you jump to with G should end with a RTS, hex 60, to send execution back to the Monitor. The Monitor has no single-step facility, but you can make up for the deficiency easily, by storing the processor registers when your routine (which can be a single instruction) returns. They should be stored where the Monitor keeps initial values, in $45-$49, where A, X, Y, P and S are stored, in that order. The routine to do this is STA $45, STX $46, STY $47, PHP, PLA, STA $48, TSX, INX, INX, STX $49, RTS, or 85 45 86 46 84 47 08 68 85 48 BA E8 E8 86 49 60. The purpose of the two INX instructions is to put the saved value where S will be when the subroutine returns, so the final value of S will be the same as the starting value. I stored this routine beginning at 03A0. To save it for future use, go to the ] prompt and execute BSAVE TAIL ,A$03A0, L$10. The A parameter is the starting address, and the L parameter is the length. Be careful with the spaces and $'s. Without the $'s, the numbers are taken as decimal. When you need this routine, you can load it anywhere convenient with, say, BLOAD TAIL ,A$0320 which would put it at 0320. Without the address, BLOAD uses the address from which the routine was saved originally. Now your routines should end with a JMP to the start of this routine (4C 20 03 in this case). In this way you can view the contents of the processor registers before and after an instruction is executed. The jump to the register save routine can be inserted anywhere in a program as a debugging tool.
To study single instructions, put a jump to TAIL in 303 (4C A0 03) and load your instruction at 300, followed by NOP's (EA) if it is less than three bytes long. This way, it is easy to change the instruction. Execute CTRL-E to see the starting register values, and change them if necessary. Then execute 300G, and CTRL-E again to see the final register values. You'll have to decode the status byte, which has the flags in the order NV-BDIZC. When the flags are clear, this reads as 30 (- reads as 1, and B is usually set).
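Decoding the status byte by eye takes some practice, so a small helper is handy for checking your reading. The following is a minimal Python sketch, not anything that runs on the Apple; the names FLAG_LETTERS and set_flags are made up for this illustration. It lists the letters whose bits are 1, using the order NV-BDIZC given above.

```python
# Flag letters from bit 7 (left) down to bit 0 (right); "-" is the unused bit.
FLAG_LETTERS = "NV-BDIZC"


def set_flags(p):
    """List the flag letters whose bits are 1 in the status byte p."""
    return [letter
            for bit, letter in zip(range(7, -1, -1), FLAG_LETTERS)
            if (p >> bit) & 1]


if __name__ == "__main__":
    for value in (0x30, 0xB1, 0x33):
        print(f"P = ${value:02X}: {' '.join(set_flags(value))}")
```

With P = 30 only the unused bit and B are listed, which is the "all flags clear" reading mentioned above.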
Routines saved with BSAVE are identified by the letter B in the CATALOG listing. Applesoft files saved with SAVE are identified by the letter A. Binary files can be BLOADed and BRUN. Applesoft files are LOADed or RUN. The number of sectors used is also shown; 2 is the minimum for A and B files. There are 31 x 16 = 496 sectors available. A sector holds 256 bytes. DELETE removes a file, RENAME a,b renames it from a to b.
The Mini-Assembler, mentioned above, allows you to type in assembly language at the ! prompt, and it will be assembled to memory. Start by entering the beginning address followed by a ":". Then enter the first assembly language instruction and press RETURN. For the second instruction, type a space, then the instruction, and press RETURN. Repeat as often as necessary. If you make an error, nothing will be assembled, and the line may be entered again. The assembly language instructions are like those produced by the L command, and must be typed exactly right. Execute any Monitor command in the Mini-Assembler by preceding it with a $. In fact, $FF69G (-151 decimal, GO) gets you back in the Monitor. It takes some practice to use the Mini-Assembler. I find it easier to do the assembly myself in most cases, for short routines. This is not difficult with the 6502 processor.
The 6502 processor can address 64KB (65,536 bytes) of memory, addresses 0000 to FFFF. Apple RAM extends from 0000 to BFFF (49151), which is 48KB. Locations C000 to CFFF are used for I/O, and locations D000 to FFFF are ROM locations, where the Monitor and other things reside. A "page" of memory is 256 bytes. Page 0, from 0000 to 00FF, has a special function. The system stack resides in Page 1, building down from 01FF. Page 2 is the keyboard input buffer. Only the first 237 bytes are used, so you can use the rest for temporary storage. Pages 4 through 7 are the primary text and graphics display buffer. The alternate display buffer is pages 8 through 11, and high-resolution graphics display buffers are in pages 32 to 63 and 64 to 95. These are available for use in most cases.
Page 3 has high locations used by the Monitor and DOS. The Monitor uses 3F0 to 3FF, and DOS uses 3D0-3EF, 48 bytes in all. The remaining 208 bytes are available for machine language routines. This is the best place to experiment. Zero-page locations that are free to be used are 06-09, 1A-1C, EB-EF and F9-FD, 17 locations in all.
Programs use the memory between the locations LOMEM and HIMEM. LOMEM is usually set at 0800, just above the primary display buffer. When DOS is booted on a 48KB system, HIMEM is placed at 9600 (-27136 decimal), and DOS uses the space above. Applesoft program lines push LOMEM up, while variables are stored starting at LOMEM and building up. Integer BASIC program lines start at HIMEM and build down, as do Applesoft strings. Memory may be reserved by moving HIMEM lower. HIMEM: -28160 will reserve 1024 bytes = 1KB, from 9200 to 95FF. This gets clobbered on a RESET; only Page 3 is really safe. In Applesoft, HIMEM is at $73-$74. In Integer BASIC, $4C-$4D.
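The negative decimal numbers used with CALL and HIMEM: are just 16-bit addresses above 32767 written as signed integers. The conversion is easy to get wrong by hand, so here is a small Python sketch of it; the function names to_address and to_signed are invented for this illustration.

```python
def to_address(n):
    """Convert a signed decimal value such as -151 to a 16-bit address."""
    return n & 0xFFFF


def to_signed(address):
    """Convert a 16-bit address to the signed decimal form BASIC accepts."""
    return address - 0x10000 if address >= 0x8000 else address


if __name__ == "__main__":
    for n in (-151, -27136, -28160):
        print(f"{n:7d} -> ${to_address(n):04X}")
    # Size of the block between the lowered HIMEM and the DOS HIMEM
    print(to_address(-27136) - to_address(-28160), "bytes reserved")
```

Addresses below 8000 hex are the same in both forms, so only the upper half of memory needs the conversion.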
The reader should refer to a 6502 programming manual, like those in the References, for a complete listing of 6502 instructions. Machine language programming is actually an easy and straightforward subject, but can be confusing until some familiarity is gained by practice. The elements will be outlined here, and the Apple is an excellent tool for practice.
The processor puts certain logic levels on its 16 address lines. This address, one of 65,536 possible addresses, selects a device that can put logic levels on the 8 data lines, or else can accept the logic levels put on these lines by the processor. The data consists of one of 256 possible bytes. Control signals issued by the processor then cause the logic levels to be presented by the external device (memory) and loaded (read) by the processor, or cause logic levels presented by the processor to be stored by (or written to) the external device. The processor can also combine data held internally with data from the memory, and perhaps write the result back to the memory. These actions are carried out by executing instructions that the processor must load from the memory. Incoming instructions and incoming and outgoing data share the same data lines and the same memory in this computer structure.
Every instruction begins with an opcode byte, which may be followed by one or more operand bytes, which refer to memory addresses or to immediate data (data that is included in the instruction, rather than loaded from memory). The opcode tells how many operand bytes follow. Data are held internally in registers, shown at the right. These are the accumulator A and the index registers X and Y. The stack pointer is in S, while the status register P contains flags (single bits) that are set according to the results of operations, or by the program to control the processor. These 8-bit registers are available to the programmer. S is indirectly accessed via X with TXS and TSX. P can be pushed on the stack (PHP) and popped into A (PLA). The 16-bit instruction pointer, IP, or program counter, PC, cannot be directly loaded, but can be altered by certain instructions (JMP, JSR, branches). It points to the location in memory where the next instruction byte is located. It is automatically incremented as each instruction byte is read.
For example, the instruction to copy the byte at the memory location 0320 into the accumulator is AD 20 03. Note that the address is low byte first. This is the standard order for the bytes of a 16-bit quantity. Since the PC increments, the low byte is read first, and can be processed before the high byte, in the logical order for addition. If the high byte were read first, it would have to be stored internally until the low byte was read and processed, and then used. These three bytes are the machine language. The assembly language for this instruction is LDA $0320. The $ indicates a hexadecimal number. This text can be interpreted by an assembler program into the three bytes of the machine language. It is very easy to perform the assembly "by hand", however. The LDA is an example of a mnemonic. There are 56 different mnemonics, but 151 different opcodes, since the opcode specifies the addressing mode as well.
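Assembly "by hand" for an absolute-mode instruction amounts to looking up the opcode and writing the address low byte first. The Python sketch below does just that for four opcodes that appear in this article. The table ABSOLUTE_OPCODES and the function assemble_abs are names chosen for the illustration, and the table is far from a complete assembler.

```python
# Absolute-mode opcodes quoted in this article (a small subset only).
ABSOLUTE_OPCODES = {"LDA": 0xAD, "STA": 0x8D, "JMP": 0x4C, "JSR": 0x20}


def assemble_abs(mnemonic, address):
    """Return the three machine-language bytes: opcode, low byte, high byte."""
    opcode = ABSOLUTE_OPCODES[mnemonic]
    return bytes([opcode, address & 0xFF, (address >> 8) & 0xFF])


def hex_bytes(data):
    """Format bytes the way they are typed into the Monitor."""
    return " ".join(f"{b:02X}" for b in data)


if __name__ == "__main__":
    print(hex_bytes(assemble_abs("LDA", 0x0320)))
    print(hex_bytes(assemble_abs("STA", 0x0321)))
```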
To store this byte at location 0321, the instruction is 8D 21 03, or STA $0321. To see how this works, use the Monitor to put the following bytes in memory, beginning at 0300: AD 20 03 8D 21 03 60. The final byte, 60, is the instruction to return from subroutine, or RTS. Load FF into 0320, and 00 into 0321. List the program with 300L, and you will see LDA $0320, STA $0321, RTS. Now execute the program with 300G. Look at locations 320 and 321 (320.321) and you will see FF FF. Location 320 has indeed been copied into location 321.
The way the address is used to fetch the data is called the addressing mode. What we have just shown is the absolute mode. If, instead, we wanted to load a constant, say 00, into A the instruction would be A9 00 or LDA #$00. The "00" is the immediate data, signalled by the # in the assembly language. If we wished to load the byte at 003C, we could use AD 3C 00. However, there is a special opcode for data on page zero, so the instruction A5 3C would do the same thing. This zero-page addressing takes fewer bytes and is faster, taking only 3 cycles instead of the 4 required for absolute addressing. Further addressing modes use the index registers X and Y. BD 20 03 loads the byte at the address formed by adding the value of X to 0320. B9 20 03 uses the Y register instead. The assembly language for these instructions is LDA $0320,X and LDA $0320,Y.
The addressing mode determines how an instruction secures its data on the basis of numbers that are part of the instruction. The 6502 has 10 addressing modes (or 13, if you add some special modifications). One, implicit addressing, is for instructions that need no data, like CLC, clear the carry flag. If it operates on A, this is also called "accumulator addressing". Full information is included in the opcode. Another, relative addressing, is used only by the branch instructions. It is a byte added to the current IP to get the IP after the branch. The byte is interpreted as a signed integer. This leaves 8 modes that are used for fetching numerical data. One, immediate, uses the next byte in the instruction stream as numerical data. This byte is always flagged with an initial #. For example, LDA #0 or A9 00 loads zero in the accumulator. If the data is in zero page, it may be identified by a single byte XX (the complete address is $00XX), which is zero-page addressing. In general, however, a 16-bit address is necessary, specified in two bytes following the opcode, low order byte first. This is absolute addressing, probably the most important mode.
The remaining 5 modes are indexed modes, using the X or Y register to modify the address. An absolute address addr can be modified by either the X or Y register, so that the effective address is addr + X or addr + Y. A zero-page address can be modified only by the X register. In assembly language, these modes are expressed as LDA addr,X and LDA addr,Y where addr is two bytes, and as LDA addr,X where addr is one byte. That leaves two modes, in which the data initially obtained is a pointer to the effective address. These are called indirect modes.
In indirect addressing the processor reads the 16 bits stored at the address in the instruction, and then uses this value as an address for reading the actual data. The address in the instruction holds a pointer to the data, and not the data itself. With LDA and STA, indirect addressing always uses an index register as well, and the pointer must be in zero page. If the pointer is located at, say, $003C plus the value in the X register, the instruction is A1 3C. If the pointer is located at $003C, and the data is at the address it holds plus the value in the Y register, the instruction is B1 3C. The first case is written LDA ($3C,X), and X is added before the address is fetched. This is called indexed indirect addressing. The second case is written LDA ($3C),Y and Y is added after the address is fetched. This is indirect indexed addressing. Note that only two bytes are required, so these instructions are very economical in program space. The best way to understand how they work is to try them out.
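Before trying the two indirect modes on the Apple, it may help to see the address arithmetic spelled out. The following Python sketch models memory as a 64KB bytearray and computes the effective address for each mode. The function names are made up for this illustration, and the pointer values placed in zero page are arbitrary test data.

```python
def read_pointer(mem, zp):
    """Read a 16-bit pointer, low byte first, from zero page (wraps in page 0)."""
    return mem[zp & 0xFF] | (mem[(zp + 1) & 0xFF] << 8)


def indexed_indirect(mem, zp, x):
    """Effective address for LDA (zp,X): X is added before the pointer is read."""
    return read_pointer(mem, (zp + x) & 0xFF)


def indirect_indexed(mem, zp, y):
    """Effective address for LDA (zp),Y: Y is added after the pointer is read."""
    return (read_pointer(mem, zp) + y) & 0xFFFF


if __name__ == "__main__":
    mem = bytearray(0x10000)
    mem[0x3C], mem[0x3D] = 0x20, 0x03   # pointer $0320 at $3C
    mem[0x3E], mem[0x3F] = 0x00, 0x04   # pointer $0400 at $3E
    print(f"LDA ($3C,X), X=2 reads ${indexed_indirect(mem, 0x3C, 2):04X}")
    print(f"LDA ($3C),Y, Y=2 reads ${indirect_indexed(mem, 0x3C, 2):04X}")
```

With the same operand byte and the same index value, the first mode selects a different pointer, while the second steps along from a fixed pointer.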
The unconditional jump instruction has both absolute and indirect addressing modes. The indirect mode uses a full two byte pointer, and may be considered a special addressing mode. A jump to location $0320 is 4C 20 03, or JMP $0320. A jump to the address stored at $0320 is 6C 20 03, or JMP ($0320). An unconditional jump to a subroutine has only absolute addressing, 20 20 03 or JSR $0320.
Most load, store, arithmetic and logical instructions have all the addressing modes described above. Other instructions may require no addresses at all. This is called implicit addressing. Examples are: decrement X by 1, CA or DEX; decrement Y by 1, 88 or DEY; increment X by 1, E8 or INX; increment Y by 1, C8 or INY; transfer A to X, AA or TAX; transfer A to Y, A8 or TAY; transfer Y to A, 98 or TYA; transfer X to A, 8A or TXA, transfer stack pointer S to X, BA or TSX; transfer X to S, 9A or TXS; no operation, EA or NOP; force interrupt or break, 00 or BRK. Instructions to set or clear flags, or to push or pull the stack, also use implied addressing.
Instructions that load data, as well as the increment and decrement instructions, affect the N and Z flags. X and Y have increment and decrement instructions, but no other arithmetic or logical instructions. They must be copied into the accumulator, by TXA for example, for arithmetic to be done, and restored with TAX. The accumulator can be incremented and decremented by CLC ADC #1 (18 69 01) and SEC SBC #1 (38 E9 01), respectively. The N,Z flags can be set for the value in the accumulator with the do-nothing instruction AND #$FF (29 FF) or ORA #0 (09 00). The accumulator can be cleared with AND #0 (29 00) as well as with LDA #0 (A9 00).
The eight flags in the status byte may be written NVXBDIZC. X is not a flag, but reserved for expansion. N, Z and C are set as the result of an arithmetic operation. N = 1 means the result was negative, Z = 1 means the result was zero, and C = 1 means that there is a carry. B = 1 means that a BRK operation has occurred. With a normal interrupt, it remains 0. If I = 1, interrupts are disabled. If D = 1, the arithmetic will be decimal. This flag is normally kept at 0. The overflow flag V is set when there has been an overflow in signed arithmetic. It can also be set by an external signal, but this feature is not used in the Apple.
The programmer has control of the I, D, V and C flags. CLC is 18, and SEC is 38 (the mnemonics should make the operation clear). CLI is 58, and SEI is 78. Note that CLI enables interrupts, and SEI disables them. CLD is D8, and SED is F8. CLD means that arithmetic will be binary. The overflow flag is cleared by B8, CLV. V cannot be set by the program. The status byte cannot be read directly. It can be pushed on the stack, and popped into the accumulator from there.
The conditional branch instructions test the flags. They are only two bytes, the opcode and the relative jump, from 127 bytes ahead to 128 bytes back, relative to the IP after the instruction has been read. That is, the second byte is considered a signed byte to be added to the IP. Branch on carry clear is 90 or BCC; BCS is B0; BEQ is F0 (the Z flag is tested); BNE is D0; BPL (the N flag is tested; for BPL it is 0) is 10; BMI is 30; BVC is 50; BVS is 70.
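The relative byte is the one thing in hand assembly that is easy to get wrong. Here is a short Python sketch of the calculation in both directions, assuming the branch opcode sits at branch_addr so that the IP after the instruction is branch_addr + 2. The function names are invented for the illustration.

```python
def branch_offset(branch_addr, target):
    """Return the second byte of a branch located at branch_addr."""
    displacement = target - (branch_addr + 2)   # relative to the IP after the branch
    if not -128 <= displacement <= 127:
        raise ValueError("target is out of range for a relative branch")
    return displacement & 0xFF                  # two's complement byte


def branch_target(branch_addr, offset_byte):
    """Return the address a branch at branch_addr goes to when taken."""
    displacement = offset_byte - 0x100 if offset_byte >= 0x80 else offset_byte
    return (branch_addr + 2 + displacement) & 0xFFFF


if __name__ == "__main__":
    print(f"{branch_offset(0x0305, 0x0304):02X}")   # branch back over a DEX
    print(f"{branch_target(0x030B, 0xF5):04X}")
```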
A number in A can be compared with a number M in memory with the compare instruction, CMP. This is really a subtraction, but a borrow is not included, and the result is not saved, so the number in A is not affected. However, the N, Z and C flags are set as a result of the subtraction. The contents of X and Y can also be compared with memory by the instructions CPX and CPY. CMP supports the full 8 different addressing modes, but CPX and CPY use only the three non-indexed modes. If A - M = 0, the Z and C flags are set, while N is cleared. If A > M, Z is cleared and C is set. If A < M, both Z and C are cleared. In the last two cases N is a copy of bit 7 of the difference, so it gives the sign of the result only when the difference fits in a signed byte. We have a three-way choice, which can be implemented for unsigned bytes with BEQ (branch on Z = 1), BCS (branch on C = 1, A >= M) and BCC (branch on C = 0, A < M). BPL (branch on N = 0) and BMI (branch on N = 1) serve when the numbers are small enough that the difference cannot overflow. The arithmetic IF of FORTRAN with its three choices was based on just such a comparison as this.
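The flag results of a comparison can be checked with a few lines of Python. This is a sketch of the documented behavior of CMP on unsigned bytes, not a full processor model: Z is set on equality, C is set when A is greater than or equal to M (no borrow), and N copies bit 7 of the 8-bit difference. The function name compare is made up for the illustration.

```python
def compare(a, m):
    """Return the N, Z and C flags that CMP leaves for bytes a and m."""
    difference = (a - m) & 0xFF          # result is discarded by the processor
    return {
        "N": difference >> 7,            # bit 7 of the difference
        "Z": int(difference == 0),
        "C": int(a >= m),                # carry set means no borrow was needed
    }


if __name__ == "__main__":
    for a, m in ((0x05, 0x05), (0x40, 0x10), (0x10, 0x40), (0xF0, 0x10)):
        print(f"A=${a:02X} M=${m:02X}: {compare(a, m)}")
```

The last case, F0 against 10, has A greater than M but N = 1, which is why the carry flag is the safer test for unsigned bytes.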
Whenever a register is loaded, using LDA, LDX or LDY, the N and Z flags are set appropriately, so you may immediately test whether the byte loaded is positive, negative, or zero. The instruction BIT, like CMP, is used only to set flags. Instead of a subtraction, it does a logical AND of A and M, and sets the Z flag if the result is zero. Clearly, if A is all 0's except a 1 in a certain bit, BIT can be used to see if that bit is set in M or not. If it is, then Z = 0 and BNE will pick it up. Bit 7 of M goes into the N flag, and bit 6 into the V flag. Both of these can be tested with a branch instruction. The only addressing modes are zero-page and absolute, for which the opcodes are 24 and 2C, respectively.
A peculiarity of the 6502 is that the add instruction always adds the carry bit in. ADC #$02 is 69 02, while ADC $0320 is 6D 20 03. The sum is placed in A, and the N, Z, C and V flags are affected. Before adding the lowest-order bytes, carry should be cleared. If there is a carry, it will be added in when adding the next higher-order bytes. In subtraction, the complement of the carry flag is considered as a borrow. SBC $0320, ED 20 03, would find A - M - /C, where A is the value in the accumulator, M is the data from memory location 0320, and /C is the complement of the carry flag. Before subtracting the lowest-order bytes, carry should be set. If there is a borrow, carry will be cleared by the subtraction, and the borrow is then used in subtracting the higher-order bytes.
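The way carry links one byte to the next is easiest to see in a 16-bit example. The Python sketch below models binary-mode ADC and SBC on single bytes, returning only the result and the carry (the N, Z and V flags are left out), and then chains them as a program would with CLC or SEC before the low bytes. All function names are invented for the illustration.

```python
def adc(a, m, carry):
    """Binary-mode ADC: return (result byte, carry out)."""
    total = a + m + carry
    return total & 0xFF, total >> 8


def sbc(a, m, carry):
    """Binary-mode SBC: computes A - M - (1 - carry); carry out 0 means borrow."""
    total = a - m - (1 - carry)
    return total & 0xFF, int(total >= 0)


def add16(x, y):
    """16-bit addition: CLC, add low bytes, then add high bytes with carry."""
    low, carry = adc(x & 0xFF, y & 0xFF, 0)
    high, carry = adc(x >> 8, y >> 8, carry)
    return (high << 8) | low, carry


def sub16(x, y):
    """16-bit subtraction: SEC, subtract low bytes, then high bytes."""
    low, carry = sbc(x & 0xFF, y & 0xFF, 1)
    high, carry = sbc(x >> 8, y >> 8, carry)
    return (high << 8) | low, carry


if __name__ == "__main__":
    print(tuple(hex(v) for v in add16(0x12FF, 0x0001)))
    print(tuple(hex(v) for v in sub16(0x1300, 0x0001)))
```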
The stack pointer is maintained in the S register. The upper address byte is automatically set at 01, so the address is in page 1. When S = FF, the stack is empty. When one byte is pushed on the stack, S = FE. That is, S points to the next location to be used, and the stack builds downwards (towards smaller addresses). The primary use of the stack is to save return addresses from subroutines, but it is also valuable as temporary data storage. When a JSR is executed, the address of the last byte of the three-byte instruction is copied to the stack, high byte first, then the low byte. Note that this is not the address of the next instruction, but one byte short of it. When RTS is executed at the end of the subroutine, the low byte plus 1 is put in the low byte of the instruction pointer, followed by the high byte. S is incremented by 2, and the IP now contains the address of the instruction following the JSR, from which execution proceeds. With an interrupt, something similar happens, except that the address pushed is that of the next instruction itself, and the status byte is pushed after the address. On RTI, the status is pulled from the stack and restored in the processor, then the IP is pulled, and points to the next instruction. When using the stack in your routines, it is essential to pop everything off the stack when leaving that you pushed during the routine. Otherwise, you will leave the stack unbalanced and may run out of stack space. I do not know how far Applesoft protects its stack from meddling, but I suspect there is no protection.
The operation of writing a byte at the stack pointer and decrementing the pointer is called pushing, while the inverse operation is called pulling (Intel copyrighted the term "pop" so something else had to be used. I hope Intel made a lot of money from this weaselly action). The accumulator can be pushed on the stack with 48, PHA, and the status with 08, PHP. The accumulator is popped with PLA, 68, and the status is popped with PLP, 28. To save the X and Y registers at the start of an interrupt routine, they are transferred to the accumulator and pushed. At the end, they are popped and transferred back to X and Y.
The arithmetic operations are ADC and SBC, which affect the flags NZCV. Multiplication and division must be performed in routines written for the purpose. The logical operations are AND, ORA and EOR. EOR is exclusive-OR (Intel copyrighted XOR). There are also ASL, arithmetic shift left, LSR, logical shift right, ROR, rotate right, and ROL, rotate left. ASL and LSR both shift in 0's and shift out into carry. ROL and ROR shift through carry. ROL shifts into carry at the high end, and out of carry at the low end, while ROR does the opposite. Memory can be incremented or decremented by one with INC and DEC. The shift, rotate and increment/decrement instructions may modify memory directly. The shift and rotate instructions may also operate on the accumulator. ASL A is 0A, LSR A is 4A, ROL A is 2A, and ROR A is 6A. It is instructive to write short routines exercising these instructions to see them in action.
Decimal arithmetic assumes packed BCD, with two digits in each byte, high-order digit first. If you add $05 and $05 with the decimal flag set, you will get $10. If the decimal flag is clear, the result is $0A. Carry and borrow work the same way as in binary arithmetic.
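Decimal addition can be imitated digit by digit. The following Python sketch assumes both operands are valid packed BCD (each nibble 0 to 9) and models only the result byte and the carry; how the 6502 treats invalid digits and the other flags in decimal mode is not modeled here. The function name adc_decimal is made up for the illustration.

```python
def adc_decimal(a, m, carry):
    """Add two packed-BCD bytes plus carry; return (BCD result, carry out)."""
    low = (a & 0x0F) + (m & 0x0F) + carry
    high = (a >> 4) + (m >> 4)
    if low > 9:              # decimal carry from the low digit
        low -= 10
        high += 1
    carry_out = 0
    if high > 9:             # decimal carry out of the byte
        high -= 10
        carry_out = 1
    return (high << 4) | low, carry_out


if __name__ == "__main__":
    result, carry = adc_decimal(0x05, 0x05, 0)
    print(f"${result:02X} carry {carry}")
    result, carry = adc_decimal(0x99, 0x01, 0)
    print(f"${result:02X} carry {carry}")
```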
We have now surveyed most of the 6502's 56 instructions. It is a small instruction set, but capable of doing everything required. The indexed and indirect addressing is well adapted to high-level language constructs, such as C structures. The base address of the structure is put in memory, while the X or Y register contains the offset of the desired structure element. X and Y can also hold array indices. In an instruction reference, you need the mnemonic, an explanation of what the instruction does, the flags that are affected, and the opcode, assembly language, number of bytes and number of cycles.
Some short exercises have already been considered: what the instructions do, how the addressing modes work, and how the flags are used. It is very instructive to try some more extensive problems, and satisfying to see them work. Arithmetic computation routines are good subjects. First, try multiple precision computations like 16-bit addition, subtraction, and shifts. The carry flag plays an important role in these computations. Shifts multiply or divide by 2. They can be used in routines to multiply a number by a constant. For example, a number is multiplied by 5 if it is shifted left two places and added to itself.
Somewhat larger problems are the multiplication of two 8-bit numbers to get a 16-bit number, or the division of a 16-bit number by an 8-bit number to get an 8-bit quotient and remainder. A straightforward routine to multiply two bytes is shown at the right. To test this routine, load $8A in $336 and $45 in $338. The product of these two bytes is $2532, which will be stored at $338 as 32 25, low byte first. The idea is to shift the multiplicand one place left and add it to the product if the multiplier has a 1 in that position. For example, if the multiplicand is 1010 and the multiplier is 0110, we do not add the unshifted multiplicand because the first bit from the right in the multiplier is 0. We shift it left, and add it, because the multiplier bit is 1. This is repeated for the next 1. We shift again, but it is not added because the multiplier bit is 0. The product is then 0000 + 10100 + 101000 + 0000000 = 111100 = $3C = 60 decimal. Indeed, 10 x 6 = 60.
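The shift-and-add idea can be tried out in a high-level language before committing it to machine code. This Python sketch follows the description above and is not a transcription of the routine in the figure; the name multiply8 is invented for the illustration. It examines the multiplier one bit at a time from the right, and adds the shifted multiplicand to a 16-bit product whenever the bit is 1.

```python
def multiply8(multiplicand, multiplier):
    """Multiply two bytes by shift-and-add, giving a 16-bit product."""
    product = 0
    shifted = multiplicand                  # multiplicand, shifted left each step
    for _ in range(8):
        if multiplier & 1:                  # test the current multiplier bit
            product = (product + shifted) & 0xFFFF
        multiplier >>= 1                    # move on to the next bit
        shifted = (shifted << 1) & 0xFFFF
    return product


if __name__ == "__main__":
    print(f"${multiply8(0x8A, 0x45):04X}")
    print(multiply8(0b1010, 0b0110))
```

On the 6502 the test of the multiplier bit would be an LSR followed by BCC, and the 16-bit add would be two ADC instructions linked by carry.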
In division, we start at the left end of the dividend, and try to subtract the divisor. If it "goes" we put a 1 in the quotient, and do the subtraction. Then the divisor is shifted one place right, and we try again to subtract. If it "goes" we put another 1 in the quotient and do the subtraction; if not, we put a zero in the quotient and go on to the next digit. Finally, we wind up with an 8-bit quotient, and an 8-bit remainder of the dividend. This problem is left as an exercise for the reader.
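As a starting point for the exercise, here is one way the procedure might look in Python. It is a sketch under stated assumptions, not a 6502 routine: instead of shifting the divisor right, it shifts the dividend left into a running remainder, which is equivalent and closer to what ROL would do on the processor. It also assumes the high byte of the dividend is smaller than the divisor, so that the quotient fits in 8 bits. The name divide16by8 is made up for the illustration.

```python
def divide16by8(dividend, divisor):
    """Divide a 16-bit number by a byte; return (8-bit quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor is zero")
    if (dividend >> 8) >= divisor:
        raise OverflowError("quotient does not fit in 8 bits")
    remainder = dividend >> 8               # start with the high byte
    low = dividend & 0xFF
    quotient = 0
    for _ in range(8):
        # Bring the next dividend bit into the remainder (like ASL then ROL)
        remainder = (remainder << 1) | (low >> 7)
        low = (low << 1) & 0xFF
        quotient <<= 1
        if remainder >= divisor:            # the divisor "goes"
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


if __name__ == "__main__":
    print(divide16by8(0x2532, 0x8A))
    print(divide16by8(1000, 7))
```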
The clock input to the 6502 is a 1 MHz square wave, called φ0. The 6502 outputs nonoverlapping clocks φ1 and φ2. φ2 is a delayed version of φ0. A complete bus cycle, reading or writing a byte, is accomplished in one period, a microsecond. The R/W output distinguishes read and write cycles, while φ2 is used as a strobe. Every processor cycle is either a read or a write cycle. The precise frequency used in the Apple is 1.022727 MHz, so the cycle is 980 ns. Later versions used 2, 3 and 4 MHz clocks, but these were not used in the Apple. Data appears only in the last half of a cycle, when φ2 is high. The Apple II uses this fact to update the screen display and refresh dynamic memory in the first half of each cycle, one of the ingenious features invented by Steve Wozniak, the designer.
A LDA #Oper (immediate) takes only 2 cycles, since it is done as soon as its two bytes have been read. LDA Oper (absolute) takes 3 cycles to read the instruction, and one additional cycle to read the data, or 4 cycles in all. A BCC takes two cycles to read the instruction. If C = 1, the branch is not taken, that is all, and execution proceeds immediately with the next instruction. If C = 0 and the branch is taken, 1 additional cycle is necessary to add the low byte of the relative value to the IP. If the branch goes to another page, then one more cycle is necessary to find the high byte. A JSR takes 6 cycles, and RTS 6 cycles as well. Setting and clearing flags takes 2 cycles. Pushes take 3 cycles, pops 4 cycles. It is possible to use routines for accurate timing.
A read from location C030 makes a click on the speaker. If the reads are repeated regularly, a tone will be produced. A program to make a beep is: A0 00 A2 C6 CA D0 FD AD 30 C0 88 D0 F5 60. Start the Monitor, and type in 300:A0 00 A2 C6 ... and press RETURN. List the program with 300L, and see what instructions are included. There is an outer loop that counts down Y from an initial value of 00, giving 256 iterations of the inner loop, which counts down X from an initial value of C6, chosen to click the speaker about 1000 times a second. Since each access to C030 only toggles the speaker, two clicks make one full cycle of the tone. Execute the program with 300G and listen to the result. Change the byte in location 303 to change the frequency. The lowest tone is obtained with 00. The program was assembled by hand. The relative jumps were found by counting back from the address just past the jump; FF on the second byte of the jump, FE on the first byte, and so on. The L command will calculate the actual targets, so you can see if you were right. Rewrite the program to DEC a memory location to control the length of the beep, and then to DEC a 16-bit quantity to get a really long beep.
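The hand assembly can be checked away from the Apple as well. The Python sketch below recomputes the two branch offsets and the number of cycles between clicks. The helper names are invented for the purpose, the instruction addresses assume the program is loaded at 0300, and the cycle counts are the standard ones (2 for LDX #, DEX and DEY, 4 for LDA absolute, 3 for a taken branch within a page).

```python
PROGRAM = bytes.fromhex("A0 00 A2 C6 CA D0 FD AD 30 C0 88 D0 F5 60")
ORIGIN = 0x0300       # load address used in the text
CLOCK_HZ = 1_022_727  # Apple II processor clock


def branch_offset(branch_addr, target):
    """Offset byte for a two-byte relative branch located at branch_addr."""
    delta = target - (branch_addr + 2)  # counted from the address just past the branch
    if not -128 <= delta <= 127:
        raise ValueError("target out of range for a relative branch")
    return delta & 0xFF


def branch_target(branch_addr, offset_byte):
    """Inverse of branch_offset: where the branch lands."""
    delta = offset_byte - 256 if offset_byte >= 0x80 else offset_byte
    return branch_addr + 2 + delta


def cycles_per_click(x_count):
    """Cycles for one pass of the outer loop, from LDX # to the taken outer BNE."""
    passes = x_count or 256
    inner = passes * 2 + (passes - 1) * 3 + 2  # DEX each pass; BNE taken except the last
    return 2 + inner + 4 + 2 + 3               # LDX #, inner loop, LDA $C030, DEY, BNE


print(f"inner BNE offset: {branch_offset(0x0305, 0x0304):02X}")
print(f"outer BNE offset: {branch_offset(0x030B, 0x0302):02X}")
cycles = cycles_per_click(PROGRAM[3])
print(f"{cycles} cycles per click, {CLOCK_HZ / cycles:.0f} clicks per second")
```

The same function shows how the byte at 303 sets the pitch: each step in the X count changes the interval between clicks by 5 cycles.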
The wiring for the interrupt request is shown in the figure at the right. When the /IRQ line is pulled low, a maskable interrupt is requested. The interrupt routine should clear the request (so that /IRQ goes high again), do its job, and enable interrupts again (CLI, 58) before returning with RTI (40). The I bit is set when the interrupt is recognized. If J2 requests the interrupt, pins 30 and 28 should be connected in the board plugged into J1, and J2 should ground pin 28 to request the interrupt. This wiring is misdesigned; there should be an inverter at the /IRQ pin to request an interrupt when the chain is interrupted at any jack. The manual says this is the case, which the circuit diagram denies. The vector to the interrupt routine is stored at $3FE-$3FF.
The /NMI pin is pulled up the same way, and is connected to pin 29 of J1. The NMI is requested with a pulse low. Locations $3FB-$3FD contain a jump to the interrupt routine, something like 4C 00 03, to use a routine beginning at 0300 and ending with a RTI.
The CALL command is the easiest way to run a machine language routine from Applesoft BASIC; the routine ends in RTS to return to Applesoft. However, CALL does not provide a way to pass or return arguments. The USR function does. When USR(X) is executed in Applesoft, the real argument X is loaded into the floating-point accumulator (FAC) at $9D.$A2 and execution passes to location $0A. A jump to the start of the machine-language routine must be stored in $0A.$0C, explicitly 4C 00 03 if the routine begins at $0300. Then, calling the routine at $E10C interprets the value in the FAC as an integer, which is stored in $A0 (hi) and $A1 (lo). This integer can then be used by the routine. It could be sent to a peripheral card, for example.
An integer, perhaps read from a peripheral card, can be put into A (hi) and Y (lo), and then loaded into the FAC by calling $E2F2. When the routine returns to Applesoft with RTS this value is available. If the routine is called by A = USR(X), this return value is assigned to the variable A.
A floating-point number is represented in six bytes in the FAC. Suppose the number is expressed as 0.1 ... x 2^n, where 0.1 ... represents the 32-bit mantissa, with the binary point at the left and beginning with a 1-bit, and n is the binary exponent. The byte in $9D is the exponent plus $80. The mantissa is in $9E.$A1. $A2 contains the sign byte, which is 00 for a positive number, and FF for a negative number. This is the unpacked representation, ready for arithmetic operations. In the 5-byte packed representation, the leading 1 in the mantissa is replaced by a sign bit. For example, the integer 8 is 2^3, or 0.1 x 2^4, so it is loaded into the FAC as 84 80 00 00 00 00. -8 would be 84 80 00 00 00 FF. If the first byte of a 5-byte quantity is at the address hi,lo then a JSR to $EB2B with X = lo, Y = hi packs the FAC at this address. Going the other way, $EAF9 with A = lo, Y = hi unpacks the number at this address into the FAC. Applesoft does all its calculations using FAC and floating point. A similar set of 6 locations, $A5.$AA, is called ARG and contains the other operand in binary operations. $EB53 moves ARG to FAC, and $EB63 moves FAC to ARG. A number at hi,lo can be unpacked directly to ARG with $E9E3, if called with A = lo and Y = hi. It is usually easier to perform arithmetic in Applesoft than by calling the routines in machine language.
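To check a byte pattern seen in the FAC with the Monitor, the layout just described can be reproduced in a few lines of Python. This is only a sketch under stated assumptions: the function names are invented, the mantissa is truncated rather than rounded, and zero is represented by a zero exponent byte, which is how I understand Applesoft marks it.

```python
import math


def to_unpacked_fac(value):
    """Six bytes as laid out in $9D.$A2: exponent, four mantissa bytes, sign."""
    if value == 0:
        return bytes(6)  # assumption: a zero exponent byte stands for zero
    # abs(value) = mantissa * 2**exponent with 0.5 <= mantissa < 1,
    # that is, a binary fraction 0.1... as in the text.
    mantissa, exponent = math.frexp(abs(value))
    m = int(mantissa * (1 << 32))  # 32-bit mantissa, top bit set, truncated
    sign = 0xFF if value < 0 else 0x00
    # bytes() raises ValueError if the exponent does not fit in one byte.
    return bytes([exponent + 0x80]) + m.to_bytes(4, "big") + bytes([sign])


def to_packed(value):
    """Five-byte packed form: the sign bit replaces the leading 1 of the mantissa."""
    fac = to_unpacked_fac(value)
    if fac[0] == 0:
        return bytes(5)
    first = (fac[1] & 0x7F) | (fac[5] & 0x80)
    return bytes([fac[0], first]) + fac[2:5]


for x in (8, -8, 1, 0.5):
    print(x, to_unpacked_fac(x).hex(" ").upper(), "|", to_packed(x).hex(" ").upper())
```

For 8 and -8 the unpacked bytes agree with the patterns given above, and the packed forms differ only in the top bit of the second byte.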
As an easy illustration of using the USR function, get into the Monitor with CALL -151, then set the vector with *0A:4C 00 03. Put a RTS at 0300 with 300:60. Then return to Applesoft with ctrl-C return. The command PRINT USR(X) will display X on the screen, while A = USR(X) will load X into A. The machine-language routine can be expanded from 60 to anything you desire.
Single-board computers using the 6502, such as the SYM-1, are excellent for program development, and offer more tools than the Apple does. You can also build your own 6502 system on a breadboard, but it is a lot of work.
A. Watson, Apple II Reference Manual (Cupertino, CA: Apple Computer, Inc. 030-0357-A, 1982).
__________, The DOS Manual (Cupertino, CA: Apple Computer, Inc. #A2L0036, 1981).
L. A. Leventhal, 6502 Assembly Language Programming (Berkeley, CA: Osborne/McGraw-Hill, 1979).
R. Zaks, Programming the 6502, 3rd ed. (Berkeley, CA: Sybex, 1980).
__________, SY6500/MCS6500 Microcomputer Family Programming Manual (Santa Clara, CA: Synertek, 1975). I prefer this reference for its ease of use and complete information with few distractions.
Composed by J. B. Calvert
Created 2 September 2004
Last revised 8 October 2004