Saturday, 3 December 2016

6502 notes

I've been dabbling in 6502 machine code for a long time, but only recently started to get the hang of it. My initial touchpoint was the machine code monitor on the Action Replay VI module on the C64 and a battered copy of the Commodore 64 Programmer's Reference that I found at a flea market in the early 1990s, and which I obviously still have.
These days, I know a bit more, but certain things are a bit difficult to remember. Here are some.

How the Carry (C) affects the ADC and SBC instructions (and the difference)

For the longest time I did not get this. For addition, the Carry is first cleared (with CLC) and then becomes set if the addition crosses the 255 boundary. The C flag is then carried (duh) over to the next ADC instruction as a +1. With subtraction, it's the reverse: the Carry is first set (with SEC) and then becomes cleared if the subtraction goes below 0. When the Carry is clear, SBC subtracts one extra, so an SBC #\$00 behaves like an SBC #\$01, for example. After this, 16-bit addition and subtraction became easier to understand.

;Incrementing a 16-bit value at \$C000-\$C001 with ADC

CLC          ;Clear Carry
LDA \$C000
ADC #\$01     ;if \$FF is crossed Carry becomes set
STA \$C000
LDA \$C001
ADC #\$00     ;is like 1 if Carry is set
STA \$C001

;Decrementing a 16-bit value at \$C000-\$C001 with SBC

SEC          ;Set Carry
LDA \$C000
SBC #\$01     ;if \$00 is crossed Carry becomes clear
STA \$C000
LDA \$C001
SBC #\$00     ;is like 1 if Carry is clear
STA \$C001
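
The same carry logic can be sketched outside the 6502, as a small Python model. This is not 6502 code, and the function names (adc, sbc, inc16, dec16) are made up for illustration. It assumes binary mode (decimal mode off) and models SBC as an ADC with the operand's bits inverted, which is one common way to describe it.

```python
def adc(a, operand, carry):
    """Model of 6502 ADC in binary mode: returns (result, carry_out)."""
    total = a + operand + carry
    return total & 0xFF, 1 if total > 0xFF else 0

def sbc(a, operand, carry):
    """Model of 6502 SBC in binary mode: A - M - (1 - C)."""
    # Subtracting is the same as adding the operand with all bits inverted
    return adc(a, operand ^ 0xFF, carry)

def inc16(lo, hi):
    lo, c = adc(lo, 0x01, 0)   # CLC, then ADC #$01
    hi, c = adc(hi, 0x00, c)   # ADC #$00 adds the carry from the low byte
    return lo, hi

def dec16(lo, hi):
    lo, c = sbc(lo, 0x01, 1)   # SEC, then SBC #$01
    hi, c = sbc(hi, 0x00, c)   # SBC #$00 subtracts the borrow (Carry clear)
    return lo, hi

print(inc16(0xFF, 0x00))  # (0, 1)
print(dec16(0x00, 0x01))  # (255, 0)
```

In the second call the low byte goes from 0 to 255 and leaves the Carry clear, so the high byte loses one.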

When is the Overflow (V) flag set?

This flag is not that useful to me, but I don't want to forget what it does.

As two bytes are added together with ADC, as above, the calculation is evaluated at the same time as unsigned and as "signed". If the result, read as "signed", falls outside the range -128...127, the overflow flag will be set.

I've come across a misconception that says "if the 7th bit of the original value and the resulting calculation are different, then the overflow is set."

However, the following will not trigger the overflow flag, even though the 7th bit clearly changes:

CLC
LDA #\$FF     ; 0b11111111 (-1) is the original value
ADC #\$02     ; add 2
; 0b00000001 (+1) is the result

I won't go around doing the two's complement explanation, which frankly just makes it more difficult to understand for me. Let's just have a look at how the signed numbers work:

Positive values:

00000000 = 0
00000001 = 1
00000010 = 2
00000011 = 3
00000100 = 4
...
00001000 = 8
...
00010000 = 16
...
00100000 = 32
...
01000000 = 64
...
01111111 = 127

The negative numbers have their minus sign bit set:

11111111 = -1
11111110 = -2
11111101 = -3
11111100 = -4
11111011 = -5
...
11110111 = -9
...
11101111 = -17
...
11011111 = -33
...
10111111 = -65
...
10000000 = -128

There's no special "signed mode" for the ADC instruction. Simply, if within the above representation you do a calculation that crosses over 127, or under -128, the V flag will be set. Just like the C flag will be set when crossing over the 0...255 range in the ordinary representation.

So what the above example does is -1+2, which does not cross this boundary any more than 1+2 would set the C flag.

The reformulated rule: "If two values with the same 7th bit are added together, and the 7th bit of the result differs from theirs, the V flag is set."
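
The rule can be tried out with a small Python model of ADC. This is only a sketch: the names adc_flags and signed are made up here, and binary mode is assumed. The V line is the rule written as bit operations: both inputs must differ from the result in bit 7, which can only happen when the inputs share their bit 7.

```python
def adc_flags(a, operand, carry=0):
    """Model of binary-mode ADC: returns (result, C, V)."""
    total = a + operand + carry
    result = total & 0xFF
    c = 1 if total > 0xFF else 0
    # V: bit 7 of the result differs from bit 7 of both inputs
    v = 1 if (a ^ result) & (operand ^ result) & 0x80 else 0
    return result, c, v

def signed(byte):
    """Read a byte as a signed value in -128..127."""
    return byte - 256 if byte & 0x80 else byte

print(adc_flags(0xFF, 0x02))  # -1 + 2:    (1, 1, 0)
print(adc_flags(0x7F, 0x01))  # 127 + 1:   (128, 0, 1)
print(adc_flags(0x80, 0xFF))  # -128 + -1: (127, 1, 1)
```

The first line is the -1+2 case: bit 7 changes and the Carry gets set, but V stays clear because the two inputs had different signs.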

For the most part I don't see the point of the signed-system, as I can usually pretend that 128 or 32768 is the zero point in an unsigned calculation. Here, have a picture of a silly robot that makes it all easier to understand.

How the Indexed Indirect and Indirect Indexed addressing modes work

Most of these things have no place in fast code, hence I had not really looked at them.

Perhaps the funniest personal discovery was to see that it's possible to point to a 16-bit address using the indirect indexed addressing mode. I did not know the 6502 was capable of doing this so directly (relatively speaking). It's a bit like the Z80's ld (de),a but not really. Because there is no 16-bit register as such, incrementing the address is a matter of using ADC on the stored value, as described in the first topic above.

Usually a memory fill is done efficiently with something like this, which fills the C64 default screen area (\$0400-\$07FF) with a "space".

LDA #\$20
LDX #\$00     ;start at 0 so that index 0 gets written too
loop:
STA \$0400,X
STA \$0500,X
STA \$0600,X
STA \$0700,X
DEX
BNE loop
RTS
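
Which bytes a DEX/BNE countdown touches depends on the starting value of X. A small Python model of the loop shape (not 6502 code, with a made-up function name) shows which X values the loop body sees: starting from \$FF never reaches index 0, while starting from \$00 wraps around and covers all 256 indices.

```python
def visited_indices(start):
    """X values seen by the body of a 'body / DEX / BNE loop' loop."""
    x = start
    seen = []
    while True:
        seen.append(x)        # the loop body runs with the current X
        x = (x - 1) & 0xFF    # DEX wraps from $00 to $FF
        if x == 0:            # BNE falls through once X reaches zero
            break
    return seen

print(len(visited_indices(0xFF)), 0 in visited_indices(0xFF))  # 255 False
print(len(visited_indices(0x00)), 0 in visited_indices(0x00))  # 256 True
```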

We can write the start address to a location in the zero page and use the indirect indexed mode to write to that address, at the same time incrementing the stored value with a 16-bit addition.

Some have suggested that the zero-page works as a bunch of additional "registers" for the 6502 and this seems to validate that idea slightly.

(It is extremely slow, though)

LDA #\$04    ;high byte of start address \$0400
STA \$11     ;store at zero page
LDA #\$00    ;low byte of start address \$0400
STA \$10     ;store at zero page
LDY #\$00    ;we won't change the Y in the following
loop:
LDA #\$20    ;fill with ASCII space
STA (\$10),Y ;write to address stored at zp \$10-\$11
CLC         ;clear Carry
LDA \$10     ;low byte of the stored address
ADC #\$01    ;low byte of 16-bit addition
STA \$10
LDA \$11     ;high byte of the stored address
ADC #\$00    ;high byte of 16-bit addition
STA \$11
CMP #\$08    ;are we at \$08xx?
BEQ exit
JMP loop
exit:
RTS
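
To see what the pointer in the zero page is doing, here is the same idea as a rough Python model, with the 64K of memory as a bytearray. The function name and its parameters are invented for this sketch. The line marked (zp),Y is the indirect indexed lookup: read a 16-bit address from two consecutive zero page bytes, low byte first, then add Y.

```python
def fill_via_pointer(memory, zp=0x10, start=0x0400, end_page=0x08, value=0x20):
    memory[zp] = start & 0xFF          # low byte of the start address
    memory[zp + 1] = start >> 8        # high byte of the start address
    y = 0                              # Y stays at 0, as in the listing
    while True:
        target = ((memory[zp] | (memory[zp + 1] << 8)) + y) & 0xFFFF  # (zp),Y
        memory[target] = value
        lo = memory[zp] + 1            # CLC / ADC #$01 on the low byte
        memory[zp] = lo & 0xFF
        # ADC #$00 on the high byte adds the carry out of the low byte
        memory[zp + 1] = (memory[zp + 1] + (lo >> 8)) & 0xFF
        if memory[zp + 1] == end_page: # CMP #$08 / BEQ exit
            break

memory = bytearray(0x10000)
fill_via_pointer(memory)
print(memory[0x0400], memory[0x07FF], memory[0x0800])  # 32 32 0
```

The loop stops as soon as the stored high byte becomes \$08, so the last byte written is \$07FF.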

When to use BPL, BMI, BCS, BCC branching operations

Some of my earliest code tended to work with BEQ and BNE only. Quite a lot can be achieved with them. The classic countdown loop already used above works without any comparison instructions at all, because DEX itself sets the Zero flag when X reaches 0. For the same reason, how the value 0 is used in data and tables can make a big speed difference.

LDX #\$0B
loop:
INC \$d020
DEX
BNE loop

My first instinct has been to use the BPL and BMI opcodes as they sounded nice and simple: Branch on PLus and Branch on MInus.

However, BPL and BMI only test the Negative (N) flag, which is bit 7 of the last result, so they follow the signed interpretation discussed above. Parts of my code might have worked as intended when the numbers happened to stay within the 0...127 range, which probably confused me in the past.

So the BCC (Branch if Carry Clear) and BCS (Branch if Carry Set) are used instead when dealing with unsigned number comparisons, even if they sound less friendly.

LDA #\$20
CMP #\$21
;C=0

LDA #\$20
CMP #\$20
;C=1

LDA #\$20
CMP #\$1F
;C=1

This ought to branch if the A value is less than the one it is compared to:

LDA #\$1F   ; \$00-\$1F will result in C=0
CMP #\$20
BCC branch ; (branches in this case)

The next one branches if the A value is equal to or greater than the one it is compared to:

LDA #\$20   ; \$20-\$FF will result in C=1
CMP #\$20
BCS branch ; (branches in this case)
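
The CMP cases above can be reproduced with a small Python model. CMP behaves like a subtraction that only updates the flags and leaves A alone. The name cmp_flags is made up for this sketch. It also returns the N flag to show why BMI and BPL are a poor fit for unsigned comparisons.

```python
def cmp_flags(a, operand):
    """Model of CMP: flags from A - M, A itself is unchanged. Returns (C, Z, N)."""
    diff = (a - operand) & 0xFF
    c = 1 if a >= operand else 0   # Carry set means A >= M (unsigned)
    z = 1 if a == operand else 0   # Zero set means A == M
    n = 1 if diff & 0x80 else 0    # Negative is bit 7 of the difference
    return c, z, n

print(cmp_flags(0x20, 0x21))  # (0, 0, 1)
print(cmp_flags(0x20, 0x20))  # (1, 1, 0)
print(cmp_flags(0x20, 0x1F))  # (1, 0, 0)
print(cmp_flags(0xF0, 0x10))  # (1, 0, 1)
```

In the last case A is clearly the larger unsigned value and the Carry says so, yet N is set as well, so a BMI would branch as if A were smaller.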

So, yeah.

Strangely enough, many of the web pages and even books about the 6502 are incomplete when it comes to explaining in detail what each of the instructions does. Even the books might just state that a particular opcode "affects" a certain flag, without spelling out how exactly that flag is affected.