NAME
xa - 6502/R65C02/65816 cross-assembler
SYNOPSIS
xa [OPTION]... FILE
DESCRIPTION
xa is a multi-pass cross-assembler for the 8-bit processors
in the 6502 series (such as the 6502, 65C02, 6504, 6507,
6510, 7501, 8500, 8501 and 8502), the Rockwell R65C02, and
the 16-bit 65816 processor. For a description of syntax, see
ASSEMBLER SYNTAX further in this manual page.
OPTIONS
-E Do not stop after 20 errors, but show all errors.
-v Verbose output.
-C No CMOS opcodes (default is to allow R65C02 opcodes).
-W No 65816 opcodes (default).
-w Allow 65816 opcodes.
-B Show lines with block open/close (see PSEUDO-OPS).
-c Produce o65 object files instead of executable files
(no linking performed); files may contain undefined
references.
-o filename
Set output filename. The default is a.o65; use the
special filename - to output to standard output.
-P filename
Set listing filename. The default is none; use the
special filename - to print the listing to standard
output.
-F format
Set listing format; default is plain. The only other
currently supported format is html.
-e filename
Set errorlog filename; default is none.
-l filename
Set labellist filename; default is none. This is the
symbol table and can be used by disassemblers such as
dxa(1) to reconstruct source.
-r Add cross-reference list to labellist (requires -l).
-Xcompatset
Enables compatibility settings to become more (not
fully!) compatible with other 6502 assemblers and
codebases. Currently supported are compatibility sets
MASM, CA65 and C, with XA23 available as a deprecated
option for codebases relying on compatibility with the
previous version of xa. Multiple compatibility sets
may be specified and combined, e.g., -XMASM -XXA23.
-XMASM allows colons to appear in comments for MASM
compatibility. This does not affect colon
interpretation elsewhere and may become the default in
a future version.
-XCA65 adds syntactic features more compatible with
ca65(1). It permits := for defining labels (instead of
plain =), and adds support for unnamed labels and
"cheap" local labels using the @ character, but
disables its other meaning for 24-bit mode (see
ASSEMBLER SYNTAX).
-XC enables the usage of 0xHEX and 0OCTAL C-style
number encodings.
-XXA23 restores partial compatibility with xa 2.3.x. In
particular, it uses ^ for generating control
characters, disables escaped characters with \, allows
nested multi-line comments, and disables all predefined
xa preprocessor macros. Although some portions of this
option remain supported syntax, the option itself is
inherently deprecated and may be removed in the next
2.x or 3.x release.
-a Support ca65(1)-style unnamed labels using colons, but
not the remainder of the other supported ca65(1)
features. This allows their use with 65816 mode, for
example. Implies -XMASM.
-M This option is deprecated and will be removed in a
future version; use -XMASM instead. Allows colons to
appear in comments for MASM compatibility. This does
not affect colon interpretation elsewhere, and may
become the default in a future version.
-k Allow the caret (^) to mask a character with $1f/31.
This can be used as a shorthand for control characters,
such as ^m^j becoming a carriage return followed by a
linefeed.
-R Start assembler in relocating mode, i.e. use segments.
-U Do not allow undefined labels in relocating mode.
-Llabel
Defines label as an absolute (but undefined) label even
when linking.
-b? addr
Set segment base for segment ? to address addr. ?
should be t, d, b or z for text, data, bss or zero
segments, respectively.
-A addr
Make text segment start at an address such that when
the file starts at address addr, relocation is not
necessary. Overrides -bt; other segments still have to
be taken care of with -b.
-G Suppress list of exported globals.
-DDEF=TEXT
Define a preprocessor macro on the command line (see
PREPROCESSOR).
-I dir
Add directory dir to the include path (before XAINPUT;
see ENVIRONMENT).
-O charset
Define the output charset for character strings.
Currently supported are ASCII (default), PETSCII
(Commodore ASCII), PETSCREEN (Commodore screen codes)
and HIGH (set high bit on all characters).
-p? Set the alternative preprocessor character to ?. This
is useful when you wish to use cpp(1) and the built-in
preprocessor at the same time (see PREPROCESSOR).
Characters may need to be quoted for your shell
(example: -p'~').
--help
Show summary of options (-? is a synonym).
--version
Show version of program.
ASSEMBLER SYNTAX
An introduction to 6502 assembly language programming and
mnemonics is beyond the scope of this manual page. We invite
you to investigate any number of the excellent books on the
subject; one useful title is "Machine Language For
Beginners" by Richard Mansfield (COMPUTE!), which covers the
Atari, Commodore and Apple 8-bit systems and is widely
available on the used market.
xa supports both the standard NMOS 6502 opcodes and the
Rockwell CMOS opcodes used in the 65C02 (R65C02). With
the -w option, xa will also accept opcodes for the 65816.
NMOS 6502 undocumented opcodes are intentionally not
supported, and should be entered manually using the .byt
pseudo-op (see PSEUDO-OPS). Because these undocumented
instructions conflict with the R65C02 and 65816 instruction
sets, their use is discouraged.
In general, xa accepts the more-or-less standard 6502
assembler format as popularised by MASM and TurboAssembler.
Values and addresses can be expressed either as literals, or
as expressions; to wit,
123 decimal value
$234 hexadecimal value (0x234 accepted with -XC)
&123 octal (0123 accepted with -XC)
%010110 binary
* current value of the program counter
The ASCII value of any quoted character is inserted directly
into the program text (example: "A" inserts the byte "A"
into the output stream); see also the PSEUDO-OPS section.
This is affected by the currently selected character set, if
any.
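As a brief illustration (a sketch written for this page, not taken from the xa distribution), the first four lines below load the same value, 123, using each literal notation, and each should assemble to $a9 $7b. The last line assumes the default ASCII character set.

```asm
        lda #123        ; decimal
        lda #$7b        ; hexadecimal
        lda #&173       ; octal
        lda #%01111011  ; binary
        lda #"A"        ; quoted character, $41 in ASCII
```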
Labels define locations within the program text, just as in
other multi-pass assemblers. A label is defined by anything
that is not an opcode; for example, a line such as
label1 lda #0
defines label1 to be the current location of the program
counter (thus the address of the LDA opcode). A label can be
explicitly defined by assigning it the value of an
expression, such as
label2 = $d000
which defines label2 to be the address $d000, namely, the
start of the VIC-II register block on Commodore 64
computers. The program counter * is considered to be a
special kind of label, and can be assigned to with
statements such as
* = $c000
which sets the program counter to decimal location 49152. If
-XCA65 is specified, you can also use := as well as =.
With the exception of the program counter, labels cannot be
assigned multiple times. To explicitly declare redefinition
of a label, place a - (dash) before it, e.g.,
-label2 = $d020
which sets label2 to the Commodore 64 border colour
register. The scope of a label is affected by the block it
resides within (see PSEUDO-OPS for block instructions). A
label may also be hard-specified with the -L command line
option.
Redefining a label does not change previously assembled code
that used the earlier value. Therefore, because the program
counter is a special type of label, changing the program
counter to a lower value does not reorder code assembled
previously and changing it to a higher value does not issue
padding to put subsequent code at the new location. This is
intentional behaviour to facilitate generating relocatable
and position-independent code, but can differ from other
assemblers which use this behaviour for linking. However, it
is possible to use pseudo-ops to simulate other assemblers'
behaviour and use xa as a linker; see PSEUDO-OPS and
LINKING.
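A short sketch of the first point, using a hypothetical label named border and Commodore 64 addresses purely as an illustration: the first sta keeps the address it was assembled with, and only the second sees the redefined value.

```asm
border  = $d020
        lda #0
        sta border      ; assembled as $8d $20 $d0
-border = $d021
        sta border      ; assembled as $8d $21 $d0
```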
If -XCA65 or -a is specified, "unnamed" labels may be
specified with : (i.e., no label, just a colon); branches
may then reference these unnamed labels with a colon and
plus signs for forward branching or minus signs for backward
branching. For example (from the ca65(1) documentation),
: lda (ptr1),y ; #1
cmp (ptr2),y
bne :+ ; -> #2
tax
beq :+++ ; -> #4
iny
bne :- ; -> #1
inc ptr1+1
inc ptr2+1
bne :- ; -> #1
: bcs :+ ; #2 -> #3
ldx #$FF
rts
: ldx #$01 ; #3
: rts ; #4
Additionally, in -XCA65 mode, "cheap" local labels may be
used, marked by the @ prefix. These temporary labels exist
only between two regular labels and automatically go out of
scope with the next regular label. This allows, with
reasonable care, reuse of common label names like "loop."
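The following sketch assumes -XCA65 is specified; the routine names and addresses are hypothetical. Each @loop is expected to be visible only between the regular labels that surround it, so the name can be reused in the second routine.

```asm
clrscr: ldx #0
@loop:  sta $0400,x     ; this @loop belongs to clrscr
        inx
        bne @loop
        rts
clrcol: ldx #0
@loop:  sta $d800,x     ; a new @loop, scoped to clrcol
        inx
        bne @loop
        rts
```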
For those instructions where the accumulator is the implied
argument (such as asl and lsr; inc and dec on R65C02; etc.),
the idiom of explicitly specifying the accumulator with a is
unnecessary as the proper form will be selected if there is
no explicit argument. In fact, for consistency with label
handling, if there is a label named a, this will actually
generate code referencing that label as a memory location
and not the accumulator. If no such label exists, the
assembler will complain.
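In practice this means the accumulator forms are written bare, as in this small sketch (the last two lines assume R65C02 opcodes are enabled, which is the default):

```asm
        asl             ; shifts the accumulator left
        lsr             ; shifts the accumulator right
        inc             ; R65C02 accumulator increment
        dec             ; R65C02 accumulator decrement
```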
Labels and opcodes may take expressions as their arguments
to allow computed values, and may themselves reference other
labels and/or the program counter. An expression such as
lab1+1 (which operates on the current value of label lab1
and increments it by one) may use the following operators,
given from highest to lowest priority:
*        multiplication (priority 10)
/        integer division (priority 10)
+        addition (priority 9)
-        subtraction (priority 9)
<<       shift left (priority 8)
>>       shift right (priority 8)
>= =>    greater than or equal to (priority 7)
>        greater than (priority 7)
<= =<    less than or equal to (priority 7)
<        less than (priority 7)
=        equal to (priority 6); == also accepted
<> ><    does not equal (priority 6); != also accepted
&        bitwise AND (priority 5)
^        bitwise XOR (priority 4)
|        bitwise OR (priority 3)
&&       logical AND (priority 2)
||       logical OR (priority 1)
Parentheses are valid. When redefining a label, combining
arithmetic or bitwise operators with the = (equals) operator,
such as += and so on, is valid, e.g.,
-redeflabel += (label12/4)
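A small sketch of how the priorities interact; all label names here are hypothetical, and the values in the comments follow from the priority table rather than from an actual xa run.

```asm
base    = $c000
entries = 4
recsize = 3
tblend  = base+entries*recsize  ; * before +, giving $c00c
flags   = 1<<3|1<<0             ; << before |, giving 9
-entries += 2                   ; entries is now 6
```

Because tblend was computed before the redefinition, it should keep the value $c00c.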
Normally, xa attempts to ascertain the value of the operand
and (when referring to a memory location) use zero page,
16-bit or (for 65816) 24-bit addressing where appropriate
and where supported by the particular opcode. This generates
smaller and faster code, and is almost always preferable.
Nevertheless, you can use these prefix operators to force a
particular rendering of the operand. Those that generate an
8-bit result can also be used in 8-bit addressing modes,
such as immediate and zero page.
< low byte of expression, e.g., lda #<vector
> high byte of expression
! in situations where the expression could be understood
as either an absolute or zero page value, do not
attempt to optimize to a zero page argument for those
opcodes that support it (i.e., keep as 16 bit word)
@ render as 24-bit quantity for 65816, even if smaller
than 24 bits (must specify -w command-line option, must
not specify -XCA65)
` force further optimization, even if the length of the
instruction cannot be reliably determined (see
NOTES'N'BUGS)
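For illustration, the sketch below applies the first three prefixes to two hypothetical labels, vector and zploc. The byte sequences in the comments are what the standard 6502 encodings would give.

```asm
vector  = $c123
zploc   = $10
        lda #<vector    ; low byte, $23
        ldx #>vector    ; high byte, $c1
        lda zploc       ; zero page form, $a5 $10
        lda !zploc      ; forced absolute form, $ad $10 $00
```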
Expressions can occur as arguments to opcodes or within the
preprocessor (see PREPROCESSOR for syntax). For example,
lda label2+1
takes the value at label2+1 (using our previous label's
value, this would be $d021), and will be assembled as $ad
$21 $d0 to disk. Similarly,
lda #<label2
will take the lowest 8 bits of label2 (i.e., $20), and
assign them to the accumulator (assembling the instruction
as $a9 $20 to disk).
Comments are specified with a semicolon (;), such as
;this is a comment
They can also be specified in the C language style, using /*
*/ and // which are understood at the PREPROCESSOR level
(q.v.).
Normally, the colon (:) separates statements, such as
label4 lda #0:sta $d020
or
label2: lda #2
(note the use of a colon for specifying a label, similar to
some other assemblers, which xa also understands with or
without the colon). This also applies to semicolon comments,
such that
; a comment:lda #0
is understood as a comment followed by an opcode. To defeat
this, use the -XMASM compatibility mode to allow colons
within comments; this may become the default in a future
version. Colon statement separation does not apply to /* */
and // comments, which are dealt with at the preprocessor
level (q.v.).
PSEUDO-OPS
Pseudo-ops are false opcodes used by the assembler to denote
meta- or inlined commands. Like most assemblers, xa has a
rich set.
.byt value1,value2,value3,...
Specifies a string of bytes to be directly placed into
the assembled object. The arguments may be
expressions. Any number of bytes can be specified.
.asc "text1","text2",...
Specifies a character string which will be inserted
into the assembled object. Strings are understood
according to the currently specified character set; for
example, if ASCII is specified, they will be rendered
as ASCII, and if PETSCII is specified, they will be
translated into the Commodore ASCII equivalent. Other
non-standard ASCIIs such as ATASCII
for Atari computers should use the ASCII equivalent
characters; graphic and control characters should be
specified explicitly using .byt for the precise
character you want. Note that when specifying the
argument of an opcode, .asc is not necessary; the
quoted character can simply be inserted (e.g., lda #"A"),
and is also affected by the current character set.
Any number of character strings can be specified.
.byt and .asc are synonymous, so you can mix things such as
.byt $43, 22, "a character string" and get the expected
result. The string is subject to the current character set,
but the remaining bytes are inserted without modification.
.aasc "text1","text2",...
Specifies a character string that is always rendered in
true ASCII regardless of the current character set.
Like .asc, it is synonymous with .byt.
.word value1,value2,value3,...
Specifies a string of 16-bit words to be placed into
the assembled object in 6502 little-endian format (that
is, low-byte/high-byte). The arguments may be
expressions. Any number of words can be specified.
.dsb length,fillbyte
Specifies a data block; a total of length repetitions
of fillbyte will be inserted into the assembled object.
For example, .dsb 5,$10 will insert five bytes, each
being 16 decimal, into the object. The arguments may be
expressions. If only a single argument is provided,
then the argument is treated as a number of null bytes
to insert. See LINKING for how to use this pseudo-op to
link multiple objects.
.bin offset,length,"filename"
Inlines the binary file specified by filename, without
further interpretation, starting at offset offset (relative
to the beginning of the file) and continuing for length
bytes. This
allows you to insert data such as a previously
assembled object file or an image or other binary data
structure, inlined directly into this file's object. If
length is zero, then the length of filename, minus the
offset, is used instead. The arguments may be
expressions. See LINKING for how to use this pseudo-op
to link multiple objects.
.( Opens a new block for scoping. Within a block, all
labels defined are local to that block and any sub-
blocks, and go out of scope as soon as the enclosing
block is closed (i.e., lexically scoped). All labels
defined outside of the block are still visible within
it. To explicitly declare a global label within a
block, precede the label with + or precede it with & to
declare it within the previous level only (or globally
if you are only one level deep). Sixteen levels of
scoping are permitted.
.block is accepted as a synonym for .(, as well as
.proc (but you cannot specify an explicit scope name as
in ca65; only anonymous blocks are supported).
.) Closes a block. .bend or .endproc are accepted as
synonyms.
.as .al .xs .xl
Only relevant in 65816 mode (with the -w option
specified). These pseudo-ops set what size accumulator
and X/Y-register should be used for future
instructions; .as and .xs set 8-bit operands for the
accumulator and index registers, respectively, and .al
and .xl set 16-bit operands. These pseudo-ops deliberately
do not issue sep and rep
instructions to set the specified width in the CPU; set
the processor bits as you need, or consider
constructing a macro. .al and .xl generate errors if
-w is not specified.
.assert expression,"message"
Evaluates expression and if it is false (i.e.,
evaluates to zero), prints message as a fatal error,
terminating assembly immediately. For example, a block
of assembly code that creates high ROM might have
.assert *<$fffa, "hit vectors"
to ensure that assembled code does not leak into the
6502 high vectors. If the preceding code is too long,
the assertion will be false, and the condition will be
detected in a controlled fashion. Any operation may be
used as part of the expression, including logical
comparisons such as =, ==, <, <=, >, >=, != and <>.
.include filename
Includes another file in place of the pseudo-op, as if
the preprocessor had done so with an #include directive
(see PREPROCESSOR), but at the assembler phase after
preprocessing has already occurred.
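Several of the pseudo-ops above can be combined as in the following sketch. All labels, addresses and the assertion message are hypothetical; the example only illustrates block scoping, the + prefix, .word, .dsb and .assert as described in this section.

```asm
        * = $c000
        .(
loop    dex             ; this loop is local to the first block
        bne loop
+done   rts             ; the + prefix makes done global
        .)
        .(
loop    dey             ; a separate loop, no clash with the first
        bne loop
        jmp done        ; done is visible here
        .)
table   .word done, $1234   ; each word is emitted low byte first
        .dsb 4,$ff          ; four bytes of $ff
        .assert *<$c100, "routines overflow their page"
```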
The following pseudo-op applies to listing mode.
.listbytes number
In the listing output, sets the maximum number of hex
bytes to be printed in the listing for pseudo-ops like
.byt, by default 8. The special argument unlimited sets
no upper limit. If listing mode is disabled, this
pseudo-op has no observable effect.
The following pseudo-ops apply primarily to relocatable .o65
objects. A full discussion of the relocatable format is
beyond the scope of this manpage; see
http://www.6502.org/users/andre/o65/ for the most current
specification.
.text .data .bss .zero
These pseudo-ops switch between the different segments,
.text being the actual code section, .data being the
data segment, .bss being uninitialized label space for
allocation and .zero being uninitialized zero page
space for allocation. In .bss and .zero, only labels
are evaluated. These pseudo-ops are valid in relocating
and absolute modes.
.code
For ca65 compatibility, this is currently mapped to
.text.
.zeropage
For ca65 compatibility, this is currently mapped to
.zero.
.align value
Aligns the current segment to a byte boundary (2, 4 or
256) as specified by value (and places it in the header
when relocating mode is enabled). Other values generate
an error.
.fopt type, value1, value2, value3, ...
Acts like .byt/.asc except that the values are embedded
into the object file as file options. The argument
type is used to specify the file option being
referenced. A table of these options is in the
relocatable o65 file format description. The remainder
of the options are interpreted as values to insert. Any
number of values may be specified, and may also be
strings.
.import label1, label2, label3, ...
Defines the given labels as global labels which are
imported and resolved during the link stage, like the
-L command line parameter.
.importzp label1, label2, label3, ...
Analogous to .import, except that it only imports
zeropage labels (i.e., byte values).
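A sketch of a small source file using all four segments, intended for relocating mode (-R). The labels are hypothetical, and the example assumes that .dsb is an acceptable way to reserve space in .zero and .bss, where only the labels matter and no bytes are emitted.

```asm
        .zero
ptr     .dsb 2          ; two bytes of zero page, reserved but not emitted
        .bss
buffer  .dsb 256        ; uninitialized work area
        .data
count   .byt 0          ; initialized data
        .text
start   lda #<buffer
        sta ptr
        lda #>buffer
        sta ptr+1
        rts
```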
PREPROCESSOR
xa implements a preprocessor very similar to the C-language
preprocessor cpp(1), and many oddiments apply to
both. For example, as in C, the use of /* */ for comment
delimiters is also supported in xa, and so are comments
using the double slash //. The preprocessor also supports
continuation lines, i.e., lines ending with a backslash (\);
the following line is then appended to it as if there were
no dividing newline. This too is handled at the preprocessor
level.
For reasons of memory and complexity, the full breadth of
the cpp(1) syntax is not supported. In particular, macro
definitions may not be forward-defined: a macro definition
can only reference a previously defined macro (e.g., to
#define WW AA, AA must have already been defined). The
exception is macro functions, where recursive evaluation is
supported. Certain other directives are not supported, nor
are most standard pre-defined macros, and there are other
limits on evaluation and line length.
Because the maintainers of xa recognize that some files will
require more complicated preparsing than the built-in
preprocessor can supply, the preprocessor will accept
cpp(1)-style line/filename/flags output. When these lines
are seen in the input file, xa will treat them as cc would,
except that flags are ignored. xa does not accept files on
standard input for parsing reasons, so you should dump your
cpp(1) output to an intermediate temporary file, such as
cc -E test.s > test.xa
xa test.xa
No special arguments need to be passed to xa; the presence
of cpp(1) output is detected automatically.
Note that passing your file through cpp(1) may interfere
with xa's own preprocessor directives. In this case, to mask
directives from cpp(1), use the -p option to specify an
alternative character instead of #, such as the tilde (e.g.,
-p'~'). With this option specified, you can write ~include,
for example, in addition to #include (which will still be
accepted by the xa preprocessor, assuming any survive
cpp(1)). Any
character can be used, although frankly pathologic choices
may lead to amusing and frustrating glitches during parsing.
You can also use this option to defer preprocessor
directives that cpp(1) may interpret too early until the
file actually gets to xa itself for processing.
The following predefined macros are supported, except if
-XXA23 is specified:
XA_MAJOR
The current major version of xa.
XA_MINOR
The current minor version of xa.
The following preprocessor directives are supported:
#include "filename"
Inserts the contents of file filename at this position.
If the file is not found, it is searched using paths
specified by the -I command line option or the
environment variable XAINPUT (q.v.). When inserted, the
file will also be parsed for preprocessor directives.
#echo comment
Inserts comment comment into the errorlog file,
specified with the -e command line option.
#print expression
Computes the value of expression expression and prints
it into the errorlog file.
#error message
Displays the message as an error and terminates
assembly.
#define DEFINE text
Equates macro DEFINE with text text such that wherever
DEFINE appears in the assembly source, text is
substituted in its place (just like cpp(1) would do).
In addition, #define can specify macro functions like
cpp(1) such that a directive like #define mult(a,b)
((a)*(b)) would generate the expected result wherever
an expression of the form mult(a,b) appears in the
source. This can also be specified on the command line
with the -D option. The arguments of a macro function
may be recursively evaluated, unlike other #defines;
the preprocessor will attempt to re-evaluate any
argument referencing another preprocessor definition up
to ten times before complaining.
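As a minimal sketch of #define in use (the macro name BORDER is hypothetical; the function macro is the one from the description above), the preprocessor should reduce the two instructions below to lda #((3)*(4)) and sta $d020 before assembly.

```asm
#define BORDER $d020
#define mult(a,b) ((a)*(b))
        lda #mult(3,4)  ; immediate value 12
        sta BORDER
```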
The following directives are conditionals. If the
conditional is not satisfied, then the source code between
the directive and its terminating #endif is expunged and
not assembled. Up to fifteen levels of nesting are
supported.
#ifdef DEFINE
True only if macro DEFINE is defined.
#ifndef DEFINE
The opposite; true only if macro DEFINE has not been
previously defined.
#if expression
True if expression expression evaluates to non-zero.
expression may reference other macros.
#iflused label
True if label label has been used (but not necessarily
instantiated with a value). This works on labels, not
macros!
#ifldef label
True if label label is defined and assigned with a
value. This works on labels, not macros!
#else
Implements alternate path for a conditional block.
#endif
Closes a conditional block.
Unclosed conditional blocks at the end of included files
generate warnings; unclosed conditional blocks at the end of
assembly generate an error.
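A simple sketch of a conditional driven from the command line, assuming the source is assembled either with or without a definition such as -DDEBUG=1. The macro name and the colour values (Commodore 64 red and black) are only illustrative.

```asm
#ifdef DEBUG
        lda #2          ; diagnostic builds turn the border red
#else
        lda #0          ; other builds leave it black
#endif
        sta $d020
```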
#iflused and #ifldef are useful for building up a library
based on labels. For example, you might use something like
this in your library's code:
#iflused label
#ifldef label
#echo label already defined, library function
#else
label /* your code */
#endif
#endif
LINKING
xa is oriented towards generating sequential binaries. Code
is strictly emitted in order even if the program counter is
set to a lower location than previously assembled code, and
padding is not automatically emitted if the program counter
is set to a higher location. Changing the program location
only changes new labels for code that is subsequently
emitted; previous emitted code remains unchanged.
Fortunately, for many object files these conventions have no
effect on their generation.
However, some applications may require generating an object
file built from several previously generated components,
and/or submodules which may need to be present at specific
memory locations. With a minor amount of additional
specification, it is possible to use xa for this purpose as
well.
The first means of doing so uses the o65 format to make
relocatable objects that in turn can be linked by ldo65(1)
(q.v.).
The second means involves either assembled code, or
insertion of previously built object or data files with
.bin, using .dsb pseudo-ops with computed expression
arguments to insert any necessary padding between them, in
the sequential order they are to reside in memory. Consider
this example:
.word $1000
* = $1000
; this is your code at $1000
part1 rts
; this label marks the end of code
endofpart1
; DON'T PUT A NEW .word HERE!
* = $2000
.dsb (*-endofpart1), 0
; yes, set it again
* = $2000
; this is your code at $2000
part2 rts
This example, written for Commodore microcomputers using a
16-bit starting address, has two "modules" in it: one block
of code at $1000 (4096), indicated by the code between
labels part1 and endofpart1, and a second block at $2000
(8192) starting at label part2.
The padding is computed by the .dsb pseudo-op between the
two modules. Note that the program counter is set to the new
address and then a computed expression inserts the proper
number of fill bytes from the end of the assembled code in
part 1 up to the new program counter address. Since this
itself advances the program counter, the program counter is
reset again, and assembly continues.
When the object this source file generates is loaded, there
will be an rts instruction at address 4096 and another at
address 8192, with null bytes between them.
Should one of these areas need to contain a pre-built file,
instead of assembly code, simply use a .bin pseudo-op to
load whatever portions of the file are required into the
output. The computation of addresses and number of necessary
fill bytes is done in the same fashion.
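A sketch of that variation, in which the second module comes from a hypothetical pre-built file part2.bin rather than from assembly source. A length of zero asks .bin for the whole file from the given offset.

```asm
        .word $1000
        * = $1000
part1   rts
endofpart1
        * = $2000
        .dsb (*-endofpart1), 0
        * = $2000
        .bin 0,0,"part2.bin"    ; whole file, from offset 0
```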
Although this example used the program counter itself to
compute the difference between addresses, you can use any
label for this purpose, keeping in mind that only the
program counter determines where relative addresses within
assembled code are resolved.
ENVIRONMENT
xa utilises the following environment variables, if they
exist:
XAINPUT
Include file path; components should be separated by
`,'.
XAOUTPUT
Output file path.
NOTES'N'BUGS
The R65C02 instructions ina (often rendered inc a) and dea
(dec a) must be rendered as bare inc and dec instructions
respectively.
The 65816 instructions mvn and mvp use two 8-bit
parameters, the only instructions in the entire instruction
set to do so. Older versions of xa took a single 16-bit
absolute value. As of 2.4.0, this old syntax is no longer
accepted.
Forward-defined labels -- that is, labels that are defined
after the current instruction is processed -- cannot be
optimized into zero page instructions even if the label does
end up being defined as a zero page location, because the
assembler does not know the value of the label in advance
during the first pass when the length of an instruction is
computed. On the second pass, a warning will be issued when
an instruction that could have been optimized cannot be
because of this limitation. (Obviously, this does not apply
to branching or jumping instructions because they're not
optimizable anyhow, and those instructions that can only
take an 8-bit parameter will always be cast to an 8-bit
quantity.) If the label cannot otherwise be defined ahead
of the instruction, the backtick prefix ` may be used to
force further optimization no matter where the label is
defined as long as the instruction supports it.
Indiscriminately forcing the issue can be fraught with
peril, however, and is not recommended; to discourage this,
the assembler will complain about its use in addressing mode
situations where no ambiguity exists, such as indirect
indexed, branching and so on.
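The three cases can be sketched as follows, with hypothetical labels zpa and zpb. The comments describe the behaviour this section leads one to expect, not the output of an actual run.

```asm
zpa     = $10
        lda zpa         ; already known, so zero page form
        lda zpb         ; forward reference, absolute form plus a warning
        lda `zpb        ; backtick forces the zero page form
zpb     = $11
```

Defining zero page labels before their first use, as with zpa, avoids the need for the backtick altogether.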
SEE ALSO
file65(1), ldo65(1), reloc65(1), uncpk(1), dxa(1)
AUTHOR
This manual page was written by David Weinehall
<tao@acc.umu.se>, Andre Fachat <fachat@web.de> and Cameron
Kaiser <ckaiser@floodgap.com>. Original xa package
(C)1989-1997 Andre Fachat. Additional changes (C)1989-2024
Andre Fachat, Jolse Maginnis, David Weinehall, Cameron
Kaiser. The official maintainer is Cameron Kaiser.
OVER 30 YEARS OF XA
Yay us?
WEBSITE
http://www.floodgap.com/retrotech/xa/