Asmj syntax

Last revised: 23-Nov-2003
  • Syntax of 6809 Machine Instructions

    Unless otherwise indicated, all words of all instructions (patterns whose names end in "_instr") must be separated by one or more spaces. We omit those from the syntax expressions for brevity. In other patterns, there are no such omissions; spaces are needed as explicitly specified.

    1. Inherent

      Inherent instructions do not need any information to tell where to find the data to operate on; it is implicitly specified as part of the instruction itself. For instance, the "mul" instruction always multiplies the contents of accumulators A and B; it cannot multiply anything else. Syntactically, these are the simplest possible instructions.

      	inh_op ::= "abx"|"daa"|"mul"|"sex"|"nop"|"rti"|"rts"
      		  |"swi"|"swi2"|"swi3"|"sync"|"cwai"
      		  |"asla"|"asra"|"clra"|"coma"|"deca"|"inca"|"lsla"
      		  |"aslb"|"asrb"|"clrb"|"comb"|"decb"|"incb"|"lslb"
      		  |"lsra"|"nega"|"psha"|"pula"|"rola"|"rora"|"tsta"
      		  |"lsrb"|"negb"|"pshb"|"pulb"|"rolb"|"rorb"|"tstb"
      	inherent_instr ::= [<label>] <inh_op> [<comment>]
      

    2. Operand in memory

      Some instructions find one operand in memory (or at least compute an address as if they were going to do that) and can do so by any of several addressing modes. For brevity, our syntax diagram allows a few combinations that are actually illegal. For instance, store instructions cannot have immediate operands, jump instructions can have neither immediate nor direct operands, load-effective-address (lea_) instructions can have only indexed operands, and the condition-register instructions (andcc and orcc) can have only immediate operands. But syntactically they all follow the same pattern, as shown below.

      	mem_op ::= "adda"|"adca"|"anda"|"bita"|"eora"|"ora"|"suba"|"sbca"
      	          |"addb"|"adcb"|"andb"|"bitb"|"eorb"|"orb"|"subb"|"sbcb"
      	          |"cmpa"|"cmpb"|"cmpd"|"cmpx"|"cmpy"|"cmpu"|"cmps"
      	          |"lda" |"ldb" |"ldd" |"ldx" |"ldy" |"ldu" |"lds"
      	          |"sta" |"stb" |"std" |"stx" |"sty" |"stu" |"sts"
      		  |"asl"|"asr"|"clr"|"com"|"dec"|"inc"|"jmp"|"jsr"
      		  |"lsl"|"lsr"|"neg"|"psh"|"pul"|"rol"|"ror"|"tst"
      	          |"addd"|"subd" | "leax"|"leay"|"leau"|"leas"
      	          |"andcc"|"orcc"
      	mem_arg ::= <immediate_arg>|<extended_arg>|<direct_arg>|<indexed_arg>
      	mem_instr ::= [<label>] <mem_op> <mem_arg> [<comment>]
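
The restrictions listed in the paragraph above can be expressed as a small lookup. The following is only a sketch in Python, not part of Asmj: the function name allowed_modes is hypothetical, "jump instructions" is assumed to mean both jmp and jsr, and only the four restrictions the text names are encoded (the text gives them as examples, so a real assembler likely rejects more combinations).

```python
# Addressing modes named in the mem_arg pattern.
MODES = frozenset({"immediate", "extended", "direct", "indexed"})


def allowed_modes(op):
    """Return the addressing modes the text describes as legal for a mem_op.

    Only the restrictions spelled out in the prose are modelled; every
    other opcode is (permissively) allowed all four modes.
    """
    op = op.lower()
    if op in ("andcc", "orcc"):
        # Condition-register instructions: immediate operands only.
        return frozenset({"immediate"})
    if op.startswith("lea"):
        # Load-effective-address instructions: indexed operands only.
        return frozenset({"indexed"})
    if op in ("jmp", "jsr"):
        # Assumption: "jump instructions" covers jmp and jsr.
        return MODES - {"immediate", "direct"}
    if op.startswith("st"):
        # Store instructions cannot take an immediate operand.
        return MODES - {"immediate"}
    return MODES
```

For example, allowed_modes("sta") leaves out "immediate", while allowed_modes("leax") contains only "indexed".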
      

      The syntax of the operand depends on the addressing mode, as follows.

      1. Immediate

        The operand is the byte (or word) in memory immediately following the opcode. The operand argument begins with a pound-sign "#", followed by a numeric expression.

        	immediate_arg ::= "#" <numeric_expression>
        

      2. Extended

        The operand's address immediately follows the opcode in memory. The operand argument is just a numeric expression.

        	extended_arg ::= <numeric_expression>
        

      3. Direct

        The byte following the opcode is used as the least significant byte of the operand's address. The most significant byte of the address is taken from the "direct page" register. The operand argument is a numeric expression representing the memory byte, followed by ",dp".

        	direct_arg ::= <numeric_expression> ",dp"
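
The address arithmetic described above can be sketched as follows. This is an illustration in Python under stated assumptions, not Asmj code: both function names are hypothetical, and how Asmj learns the current direct-page value is not covered in this section, so it is passed in explicitly here.

```python
def direct_effective_address(dp, low_byte):
    """Combine the direct page register (high byte) with the byte that
    follows the opcode (low byte) into a 16-bit effective address."""
    if not 0 <= dp <= 0xFF or not 0 <= low_byte <= 0xFF:
        raise ValueError("dp and low_byte must each fit in one byte")
    return (dp << 8) | low_byte


def direct_operand_byte(dp, address):
    """Return the byte to place after the opcode so that `address` is
    reached in direct mode, given an assumed direct page value `dp`."""
    if (address >> 8) != dp:
        raise ValueError("address is not on the current direct page")
    return address & 0xFF
```

With dp = 0x20, an operand byte of 0x34 addresses location 0x2034; an address on any other page cannot be reached in this mode.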
        

      4. Indexed

        This mode is itself quite varied, as the 6809 allows for many forms of indexed addressing. For constant-offset addressing, Asmj accepts hints to make the offset length shorter than the default of two bytes. (The default applies when the offset involves symbols that have not yet been defined; since the assembler cannot determine the value of the offset when it needs to reserve space for it, it must leave room for the largest possible one.) A prefix of a single less-than sign indicates that the offset should fit into a single byte, while two less-than signs mean that the offset should fit into the postbyte itself. This latter form is only legal with a constant offset from a proper index register; it is not available for use with PC-relative addressing.

        	index_register ::= "x" | "y" | "u" | "s"
        	accumulator ::= "a" | "b" | "d"
        	indexed_indirect ::= "[" <indexed_direct> "]"
        	                   | "[" <numeric_expression> "]"
        	indexed_direct ::= <constant_offset>
        	                 | <accumulator_offset>
        	                 | <auto_increment>
        	constant_offset ::= [<numeric_expression>] "," <index_register>
        	                  |  <numeric_expression> ",pcr" 
        	accumulator_offset ::= <accumulator> "," <index_register>
        	auto_increment ::= "," ("-"|"--") <index_register>
        	                |  "," <index_register> ("+"|"++")
        
        	indexed_arg ::= [ "<<" | "<" ]
        	                ( <indexed_direct> | <indexed_indirect> )
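
The effect of the "<" and "<<" hints on the offset length can be sketched like this. It is a Python illustration under assumptions, not Asmj's implementation: the function name offset_bytes is hypothetical, the unhinted case is simplified to always use two bytes, and the ranges are the usual 6809 ones (a signed 5-bit offset inside the postbyte, or a signed 8-bit offset in one extra byte).

```python
def offset_bytes(offset, hint="", pc_relative=False):
    """Return how many offset bytes follow the postbyte for a
    constant-offset operand, or raise ValueError if the hint cannot
    be honoured."""
    if hint == "<<":
        # Offset stored inside the postbyte: index registers only.
        if pc_relative:
            raise ValueError('"<<" is not available with PC-relative addressing')
        if not -16 <= offset <= 15:
            raise ValueError("offset does not fit in 5 bits")
        return 0
    if hint == "<":
        if not -128 <= offset <= 127:
            raise ValueError("offset does not fit in one byte")
        return 1
    if hint == "":
        # Simplification: without a hint, always reserve the two-byte default.
        if not -32768 <= offset <= 65535:
            raise ValueError("offset does not fit in two bytes")
        return 2
    raise ValueError("unknown hint: " + repr(hint))
```

Under these assumptions an operand such as "<<5,x" needs no bytes beyond the postbyte, "<100,x" needs one, and an unhinted offset reserves two.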
        

    3. Two-register

      The only two-register instructions are tfr and exg. Each needs a list of exactly two registers to act on, with the only limitation being that both registers must be the same size; it is illegal to transfer an 8-bit register to or from a 16-bit register.

      	register_8 ::= "a" | "b" | "cc" | "dp"
      	register_16 ::= "x" | "y" | "u" | "s" | "d" | "pc"
      	two_reg_arg ::= <register_8> "," <register_8>
      	              | <register_16> "," <register_16>
      	two_reg_op ::= "tfr"|"exg"
      	two_reg_instr ::= [<label>] <two_reg_op> <two_reg_arg> [<comment>]
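
The same-size rule can be checked mechanically. The sketch below is Python, not Asmj, and the name valid_two_reg_arg is hypothetical; it follows the grammar literally, so it assumes lowercase register names and no spaces around the comma.

```python
REGISTER_8 = {"a", "b", "cc", "dp"}
REGISTER_16 = {"x", "y", "u", "s", "d", "pc"}


def valid_two_reg_arg(arg):
    """Return True if `arg` matches two_reg_arg: exactly two registers of
    the same size, separated by a single comma."""
    parts = arg.split(",")
    if len(parts) != 2:
        return False
    pair = set(parts)
    # Both names must come from the same size class.
    return pair <= REGISTER_8 or pair <= REGISTER_16
```

So "a,b" and "x,d" are accepted, while "a,x" is rejected because it mixes an 8-bit and a 16-bit register.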
      

    4. Stack

      The only stack instructions are pshs/pshu and puls/pulu. Each needs a list of registers to push or pull, with the only limitation being that pshs/puls cannot push/pull the S stack pointer, while pshu/pulu cannot push/pull the U stack pointer. The operand argument is the list of register names, separated by commas, with no spaces between them.

      	register ::= <register_8> | <register_16>
      	stack_arg ::= <register> [ "," <register> ]...
      	stack_op ::= "pshs"|"pshu"|"puls"|"pulu"
      	stack_instr ::= [<label>] <stack_op> <stack_arg> [<comment>]
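
A register list can be validated and turned into a postbyte mask as sketched below. This is Python for illustration only; the name stack_postbyte is hypothetical, and the bit assignments are the standard 6809 push/pull postbyte layout, which this section does not itself state.

```python
# Standard 6809 push/pull postbyte bits (an assumption brought in from the
# processor documentation). "d" is the a/b pair; "u" and "s" share one bit
# because each stack can only hold the *other* stack pointer.
STACK_BITS = {
    "cc": 0x01, "a": 0x02, "b": 0x04, "dp": 0x08,
    "x": 0x10, "y": 0x20, "u": 0x40, "s": 0x40, "pc": 0x80,
    "d": 0x06,
}


def stack_postbyte(op, arg):
    """Return the postbyte mask for a stack_instr, or raise ValueError if
    the register list is malformed or names the stack's own pointer."""
    if op not in ("pshs", "pshu", "puls", "pulu"):
        raise ValueError("not a stack instruction: " + op)
    own_pointer = op[-1]  # "s" for pshs/puls, "u" for pshu/pulu
    mask = 0
    for reg in arg.split(","):
        if reg not in STACK_BITS:
            raise ValueError("unknown register: " + repr(reg))
        if reg == own_pointer:
            raise ValueError(op + " cannot push or pull " + reg)
        mask |= STACK_BITS[reg]
    return mask
```

For instance, the list "a,b,x" for pshs yields the mask 0x16, while "s" for pshs is rejected.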
      

    5. Branch

      Branch instructions always need a single argument: the address to branch to. This can be any numeric expression, although in practice it will usually be just a symbol that was defined as the label of an instruction.

      Ordinary branch instructions (those that begin with "b") can only reach targets whose distance from the following instruction is between -128 and +127 bytes. Long branches (those that begin with "l"), on the other hand, can branch to labels at any distance.

      	branch_op ::= "bra"|"brn"|"bcc"|"bcs"|"beq"|"bge"|"bgt"
      	             |"bhi"|"bhs"|"ble"|"blo"|"bls"|"blt"
      	             |"bmi"|"bne"|"bvc"|"bvs"|"bpl"|"bsr"
      	             |"lbra"|"lbrn"|"lbcc"|"lbcs"|"lbeq"|"lbge"|"lbgt"
      	             |"lbhi"|"lbhs"|"lble"|"lblo"|"lbls"|"lblt"
      	             |"lbmi"|"lbne"|"lbvc"|"lbvs"|"lbpl"|"lbsr"
      	branch_instr ::= [<label>] <branch_op> <numeric_expression> [<comment>]
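
The reach of an ordinary branch can be worked out as in the sketch below. It is Python for illustration, not Asmj code; the name short_branch_offset is hypothetical, and it assumes the usual 6809 encoding of a short branch as one opcode byte plus one signed offset byte, so the following instruction starts two bytes after the branch.

```python
def short_branch_offset(instr_address, target):
    """Return the offset byte (0..255, two's complement) for a short branch
    located at `instr_address` that jumps to `target`.

    Raises ValueError when the target is out of reach, in which case a
    long branch would be needed instead.
    """
    next_instr = instr_address + 2  # assumed size of a short branch
    offset = target - next_instr
    if not -128 <= offset <= 127:
        raise ValueError("target out of range for a short branch")
    return offset & 0xFF
```

A branch at 0x1000 that targets itself gets the offset byte 0xFE (that is, -2), and the farthest forward target it can reach is 0x1081.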