Commodore 64
I'm writing my own experimental cross-assembler for the 6502 microprocessor, under the working title "MyASM". The assembler itself is written in Perl, which means I can use the processing power of a modern computer to accomplish pretty much anything I want :-)
This is the first time I've written a real parser, but so far I have the following features working properly:
- ; Comments
- .define with recursive symbols, integer arithmetic and bitwise operations
Command line options
|-a|Dump address table|
|-g|Dump gap table|
|-s|Dump symbol table|
|-D name|Define symbol "name"|
|-r 200|Set maximum recursive depth to 200 (default is 128)|
; Comments
A semicolon ";" marks a comment, either on a separate line or following other statements.

    ; Comments are ignored by the assembler
    lda #"A"        ; Load accumulator with the ASCII code for "A"
    jsr CHROUT      ; Print the character "A"
Labels:
A label is entered into the assembler symbol table as a WORD, and all references to the label are replaced by its value at compile time.
A label may contain any of the following characters: a-zA-Z0-9 as well as underscore "_" and period ".", but it MUST NOT begin with a period ".".
A label declaration must be immediately followed by a colon ":".
A label may be referenced like any other symbol and used in compile-time arithmetic.
There is no length limitation on the label name, but labels longer than 32 characters will make the symbol and address dumps harder to read.

    loop:               ; Define label at compile time
        dey
        sta $0400, y
        bne loop        ; Branch to the label 'loop' if y != 0
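
The naming rule for labels can be written as a single pattern. This is a minimal Python sketch of the rule as stated, not MyASM's actual Perl code; `LABEL_RE` and `is_valid_label` are hypothetical names. Note that the rule as worded does not forbid a leading digit, so neither does the sketch.

```python
import re

# Hypothetical pattern for the rule described above: letters, digits,
# underscore and period are allowed, but the first character must not
# be a period.
LABEL_RE = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.]*")

def is_valid_label(name: str) -> bool:
    """Return True if `name` satisfies the label naming rule."""
    return LABEL_RE.fullmatch(name) is not None
```

For example, `is_valid_label("loop")` and `is_valid_label("VIC.bordercolor")` are accepted, while `is_valid_label(".org")` is rejected.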
.define
This directive may be used in two different ways, as shown below:

    .define VIC.bordercolor = $d020     ; a) Set a constant expression
    .define use_floating_point          ; b) Set a boolean flag

a) A named expression is entered into the assembler symbol table as a BYTE, WORD or DWORD depending on the expression value, and all references to the symbol are replaced by its value at compile time.
b) A named symbol is entered into the assembler symbol table as a BYTE with a value of 1. This shorthand variant is most commonly used with .ifdef/.ifndef.
A named symbol may contain any of the following characters: a-zA-Z0-9 as well as underscore "_" and period ".", but it MUST NOT begin with a period ".".
A named symbol may be referenced like any other symbol and used in compile-time arithmetic.
There is no length limitation on the symbol name, but names longer than 32 characters will make the symbol and address dumps harder to read.
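
How a value might map to BYTE, WORD or DWORD can be sketched as follows. This is only an illustration in Python: it assumes the usual sizes of 8, 16 and 32 bits and ignores negative values, and the function name `symbol_size` is hypothetical rather than part of MyASM.

```python
def symbol_size(value: int) -> str:
    """Pick the smallest size that holds a non-negative value.

    Assumption: BYTE = 8 bits, WORD = 16 bits, DWORD = 32 bits.
    """
    if value < 0:
        raise ValueError("negative values are outside the scope of this sketch")
    if value <= 0xFF:
        return "BYTE"
    if value <= 0xFFFF:
        return "WORD"
    if value <= 0xFFFFFFFF:
        return "DWORD"
    raise ValueError("value does not fit in a DWORD")
```

Under these assumptions `symbol_size(0xD020)` gives "WORD", and a flag defined without a value (stored as 1) gives "BYTE".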
Expressions
Here are some examples of valid expressions:

    .define VIC = $d000                                ; Hexadecimal WORD
    .define IP_address = $555f2942                     ; Hexadecimal DWORD
    .define color_white = 1                            ; Decimal BYTE
    .define VIC.background_color = VIC + $20           ; Named symbol
    .define bitpattern = %00000111                     ; Binary BYTE
    .define bitmask1 = bitpattern | %11000000          ; Bitwise OR
    .define bitmask2 = bitpattern & %11000000          ; Bitwise AND
    .define arithmetic1 = 2 + 3 * 4 - 5 / 6 * 7        ; = 14
    .define arithmetic2 = 2 + (3 * 4) - 5 / (6 * 7)    ; = 14
    .define arithmetic3 = (2 + 3) * 4 - (5 / 6) * 7    ; = 20

Arithmetic is integer-only, so the divisions in the last three examples (5 / 6 and 5 / 42) evaluate to 0.
Expressions are evaluated recursively and will currently fail if nested more than 128 levels. (This check is in place to catch looping references.)
You can manually adjust this limit using the command line option "-r".
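
The depth limit can be illustrated with a small resolver. This Python sketch is deliberately simplified and is not MyASM's implementation: it assumes each symbol maps either to an integer or to the name of another symbol, and the names `resolve` and `RecursionLimitError` are hypothetical.

```python
class RecursionLimitError(Exception):
    """Raised when symbol references nest deeper than the allowed limit."""

def resolve(name, symbols, max_depth=128, depth=0):
    """Resolve `name` to an integer, following symbol-to-symbol references.

    `symbols` maps a name to an int or to another symbol name (a
    simplification; real expressions would also contain operators).
    """
    if depth > max_depth:
        raise RecursionLimitError(
            f"'{name}' is nested deeper than {max_depth} levels"
        )
    value = symbols[name]
    if isinstance(value, int):
        return value
    # The value names another symbol: follow it one level deeper.
    return resolve(value, symbols, max_depth, depth + 1)
```

A chain such as `{"a": "b", "b": "c", "c": 5}` resolves `a` to 5, while a loop such as `{"x": "y", "y": "x"}` hits the limit and raises the error instead of recursing forever.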
.include
Include another file, for example shared libraries or other commonly used code. The same file MAY be included twice, but this will cause errors if the included file tries to re-define a label or a named symbol. You can test if a file has already been included by using .ifdef/.ifndef with the filename:

    .include Bootstrap_C64.asm
    .ifndef VIC.asm             ; Check if VIC.asm has already been included
        .include VIC.asm        ; No? Include it now
    .endif
.binary
While .include is used to include another SOURCE file, the .binary directive lets you include a raw BINARY file. The most obvious use is for bitmaps and music, but any ripped, pre-compiled or generated data can be loaded this way.
As with .include, you can test if a binary file has already been included by using .ifdef/.ifndef with the filename.

    .at $1000
    .binary soundtrack.sid      ; Most SID tunes expect to be loaded at $1000
.ifdef / .ifndef
As shown in previous examples, .ifdef and .ifndef can be used at compile time to test if symbols have been defined yet. Note that this check only takes place during the first pass, so a symbol that is undefined when tested may still be defined later. Example:

    .ifdef problem
        .error There is a problem   ; No symbol named "problem" is defined yet
    .endif
    .define problem
    .ifdef problem
        .error There is a problem   ; Oops, "problem" is defined now
    .endif

.ifdef and .ifndef clauses may be nested, but inner clauses will only be tested if the outer ones are all true:

    .ifdef true
        foo                         ; compiled
        .ifdef also_true
            bar                     ; compiled
            .ifdef false
                baz                 ; skipped
                .ifdef also_true    ; true but parent test is false!
                    boo             ; skipped
                .endif
            .endif
        .endif
    .endif

Notice that indentation is used for clarity here; it is ignored by the assembler.
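
The nesting rule can be modelled with a stack of booleans, one per open clause. The Python sketch below is an illustration only, not how MyASM is implemented: `active_lines` is a hypothetical helper, it treats the set of defined symbols as fixed (it does not process .define), and it strips comments naively at the first semicolon.

```python
def active_lines(lines, defined):
    """Return the statements that would be assembled.

    `defined` is the set of symbol names considered defined.
    """
    stack = []  # one entry per open .ifdef/.ifndef: is that clause active?
    kept = []
    for raw in lines:
        # Drop the comment and the (ignored) indentation.
        line = raw.split(";", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        word = parts[0]
        parent_active = all(stack)  # True when no clause is open
        if word in (".ifdef", ".ifndef"):
            is_defined = parts[1] in defined
            test = is_defined if word == ".ifdef" else not is_defined
            # An inner clause is only active if every outer one is.
            stack.append(parent_active and test)
        elif word == ".endif":
            stack.pop()
        elif parent_active:
            kept.append(line)
    return kept
```

Fed the nested example above with `true` and `also_true` defined, this keeps only `foo` and `bar`.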
.org
The .org directive is optional. It may appear only ONCE in the code, and it must appear before any data or mnemonics. The default code origin is $0800, which is where the C64 usually loads program code. The file "Bootstrap_C64.asm" shows how to insert a small BASIC program to launch your code.

    .org $0800
    .byte $00
    .byte $0c $08
    .byte $01 $00                       ; Line number 1
    .byte $9e $20 $32 $30 $36 $32       ; SYS 2062
    .byte $00 $00 $00
    jmp main
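
To see why the BASIC line says SYS 2062, it helps to count the bytes. The following Python sketch rebuilds the same 14 bytes and checks that the instruction following them lands at address 2062 ($080E). The variable names are made up for this illustration, and the per-byte comments reflect my reading of the standard C64 BASIC line layout.

```python
ORG = 0x0800

stub = bytes([
    0x00,                                # $0800: zero byte before the BASIC program
    0x0C, 0x08,                          # pointer to the next BASIC line ($080C)
    0x01, 0x00,                          # line number 1
    0x9E, 0x20, 0x32, 0x30, 0x36, 0x32,  # SYS token, space, "2062" as text
    0x00, 0x00, 0x00,                    # end of line, then end-of-program marker
])

# Address of the first byte after the stub, i.e. where "jmp main" is placed.
jmp_address = ORG + len(stub)

# The decimal address embedded in the BASIC line.
sys_target = int(stub[7:11].decode("ascii"))

# The next-line pointer, stored low byte first.
next_line = stub[1] | (stub[2] << 8)
```

Here `jmp_address` and `sys_target` are both 2062, so the BASIC stub jumps straight to the `jmp main` instruction.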
.align
There are times when you want to make sure a piece of code or data is placed exactly on a page boundary. Also, sprite data must be placed at an address which is a multiple of 64 so the VIC chip can properly display the sprite.
Both can be accomplished using the .align directive:

    .align 64
    .include sprite.asm
    ; Each sprite pointer is an offset within the 16K VIC memory bank, divided by 64
    lda #(sprite.asm - VIC.bank)/64
    sta VIC.sprite_pointer0
    .align 256
    .include lookup_table.asm

When the .align keyword is encountered, the necessary number of bytes are filled with zeroes. This means that in a worst case scenario up to N-1 bytes may be wasted, where N is the alignment value. To minimize this problem, inspect the assembler address table and consider re-arranging data and code in memory to use it more efficiently. MyASM will NEVER move chunks around for you, because this might break self-modifying code, computed lookup tables, stack manipulation etc.
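
The amount of padding and the sprite pointer value are both simple arithmetic. Here is a short Python sketch of the calculation; the function names `align_padding` and `sprite_pointer` are hypothetical and the addresses in the usage note are example values, not taken from a real program.

```python
def align_padding(address: int, boundary: int) -> int:
    """Number of filler bytes needed to reach the next multiple of `boundary`.

    Returns 0 if `address` is already aligned, and at most boundary - 1.
    """
    return (-address) % boundary

def sprite_pointer(sprite_address: int, bank_address: int) -> int:
    """Sprite pointer value: offset within the VIC bank, divided by 64."""
    offset = sprite_address - bank_address
    if offset % 64 != 0:
        raise ValueError("sprite data is not aligned to 64 bytes")
    return offset // 64
```

For instance, `align_padding(0x1001, 64)` is 63 (the worst case for a 64-byte boundary), `align_padding(0x1000, 64)` is 0, and a sprite placed at $0940 in a bank starting at $0000 gets pointer value 37.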
.at
Unlike .org, which places the beginning of the program, and .align, which places code or data at certain boundaries, the .at directive is used to place code or data at an absolute address. As with .align, this can waste a lot of memory if used incorrectly, so judicious use of the address and gap tables is recommended.

    .at $1000
    .binary soundtrack.sid      ; Most SID tunes expect to be loaded at $1000

Let's examine an example output from the compiler using the -g flag (show gap table):

    > myasm test.asm -g
    Source file './test.asm' open
    Source file 'include/Bootstrap_C64.asm' open
    Source file 'include/Bootstrap_C64.asm' closed, read 10 lines
    Binary file 'data/commando.sid' open
    Binary file 'data/commando.sid' closed, read 4222 bytes
    Source file './test.asm' closed, read 57 lines
    Gap table:
    test.asm(line 55, col 8) From $0919 to $1000 = 1767 bytes
    Wrote 2 + 6270 bytes to test.asm.prg ($0800-$207E)

As you can see, this program wastes 1767 of 6270 bytes because of the gap between the program code and the binary data located .at $1000.
Got more code or data that you want to squeeze in? Use the gap table!
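
The numbers in the gap table can be reproduced by comparing where one chunk ends and the next begins. This Python sketch is an illustration, not MyASM's code: `find_gaps` is a hypothetical helper, and the two chunk ranges in the usage note are inferred from the output above (code from $0800 up to $0919, SID data from $1000 up to $207E, end addresses exclusive).

```python
def find_gaps(chunks):
    """Return (gap_start, gap_end, size) for each hole between chunks.

    `chunks` is a list of (start, end) address pairs, end exclusive.
    """
    gaps = []
    ordered = sorted(chunks)
    # Walk over neighbouring pairs in address order.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end:
            gaps.append((prev_end, next_start, next_start - prev_end))
    return gaps
```

Calling `find_gaps([(0x0800, 0x0919), (0x1000, 0x207E)])` yields a single gap from $0919 to $1000 of 1767 bytes, matching the table.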
.error / .warning
Used with .ifdef/.ifndef, these directives allow a programmer to enhance libraries etc. by implementing checks such as these:

    .ifndef VIC.bank
        .warning VIC.bank is undefined, using default
        .define VIC.bank = $0000
    .endif
    .ifdef method_one
        .ifdef method_two
            .error Can't use both methods one and two
        .endif
    .endif