Variables, DEF FN Definitions and Arrays Storage in Amstrad CPC/Locomotive BASIC

(The information in this article comes from reverse engineering Amstrad CPC BASIC. You can find the reverse engineered source code in my CPC BASIC source code repository. You can find an example BASIC program which ‘walks’ the storage areas in the Examples folder of that repository. See also the CPC Wiki page for more technical information on CPC BASIC).

Variables and DEF FN definitions are stored in the ‘variables storage area’. Arrays are stored in the ‘arrays storage area’. Both of these areas are organised as a number of linked lists using a similar format.

The ‘variables storage area’ holds data for variables and DEF FNs in 27 linked lists. There is a separate linked list for each letter of the alphabet, corresponding to the initial letter of the variables stored within, and a separate list for all DEF FNs (no matter what their initial letter).

The ‘arrays storage area’ holds three separate linked lists, one for each data type: strings, integers and reals.

Links between items in each list are offsets relative to the byte before the start of their respective storage areas. The start of the storage areas can be found at the following memory addresses (each entry is a word):

Storage area | BASIC 1.1 | BASIC 1.0 (CPC464)
Variables and DEF FNs | &AE68 | &AE85
Table 1: Location containing the addresses of the start of variables and arrays storage areas. (BASIC 1.1 is used in all versions of the CPC except the CPC464, i.e. the CPC464+ also used BASIC 1.1.)
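
As a rough illustration of how the relative links resolve, here is a minimal Python sketch. It assumes `memory` is a 64K snapshot of CPC RAM held in a bytes-like object, which is my assumption rather than anything the firmware provides; the helper names `read_word` and `link_to_address` are hypothetical. Words are read low byte first, as usual on the Z80.

```python
# Address of the word holding the start of the variables/DEF FN storage area
# (values from Table 1).
VARS_START_PTR = {"1.1": 0xAE68, "1.0": 0xAE85}

def read_word(memory, address):
    """Read a 16-bit little-endian word from a memory snapshot."""
    return memory[address] | (memory[address + 1] << 8)

def link_to_address(area_start, link):
    """Convert a relative link into an absolute address.

    Links are offsets from the byte before the start of the storage area,
    and a link of zero means 'no more items', so None is returned for it.
    """
    if link == 0:
        return None
    return (area_start - 1) + link
```

Measuring from the byte before the area means an item at the very first byte of the area has a link of 1, which leaves 0 free to act as the end-of-list marker.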

(As an aside, when a DEF FN is executed its parameters are ‘pushed’ onto the execution stack using exactly the same format as in the variables storage area, but with a single list for all variables rather than a separate list per letter. The only difference is that the ‘offsets’ used to link items are absolute addresses (or offsets from &0000, depending on your point of view). The parameters are pushed onto the stack at the start of execution of the FN, and the code that finds a variable searches both the FN variable list and the main variable list.)

In memory the arrays storage area immediately follows the variables storage area (which itself immediately follows the tokenised program storage area).

New items are added to the end of the relevant storage area (i.e. at higher memory addresses). If items are added to the variables/DEF FNs storage area then the entire arrays storage area is moved up to accommodate them (hence the need to use relative links). As code is added to the program, both the variables and arrays storage areas are moved up in memory. (And the complete areas are moved down in memory if code is removed.) If arrays are ERASEd, the arrays at higher addresses are moved down to ‘fill’ the gap.

The start (‘head’) of each of the lists can be found at the following locations. Each entry is a relative offset as described above.

List item | BASIC 1.1 | BASIC 1.0 (CPC464) | Notes
Variables “A”-“Z” | &ADB7 | &ADD0 | A word entry for each letter corresponding to the initial letter of the variable name. The first entry is for the letter ‘A’, second for the letter ‘B’ and so on.
DEF FNs | &ADEB | &AE04 | DEF FN definitions (NOT invocations)
Real arrays | &ADED | &AE06 |
Integer arrays | &ADEF | &AE08 |
String arrays | &ADF1 | &AE10 |
Table 2: Storage locations for the list headers for variables and arrays.
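
Since the “A”-“Z” entry is a table of 26 consecutive words, the head for a given variable can be located with a little arithmetic. The sketch below is illustrative only; the function name `variable_head_address` and the `LIST_HEADS` dictionary are hypothetical, and the addresses are those from Table 2.

```python
# Base addresses of the list heads (values from Table 2).
LIST_HEADS = {
    "1.1": {"variables": 0xADB7, "def_fn": 0xADEB},
    "1.0": {"variables": 0xADD0, "def_fn": 0xAE04},
}

def variable_head_address(name, version="1.1"):
    """Return the address of the list-head word for a variable name.

    There is one word per initial letter, 'A' first, so the head for a
    given letter sits 2 * (letter index) bytes after the base address.
    """
    first = name[0].upper()
    if not "A" <= first <= "Z":
        raise ValueError("variable names must start with a letter")
    return LIST_HEADS[version]["variables"] + 2 * (ord(first) - ord("A"))
```

As a sanity check on the table, the word after the ‘Z’ entry (base + 52) lands exactly on the DEF FNs head in both BASIC versions.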

Every list item uses the same basic header format, which is then followed by different data depending on the item type. The header format is as follows:

Field name | Field size (bytes) | Field type | Notes
Link | 2 | Word | Link to the next list item (i.e. next lower in memory). Stored as a relative offset from the byte before the start of the storage area. &0000 if there are no more items in the list.
Name | One byte per character | ASCII7 | The variable (etc.) name, stored in upper case with the high bit (bit 7) of the last character set. The name is converted to upper case by simply clearing bit 5 of each character. This has the effect of also changing the ASCII codes for digits and periods (which can also be used in variable names). Note that no validation of the variable name is done during the storage process; validation is done when the program is tokenised.
Data type | 1 | Byte | The data type of the variable (etc.). See below.
Data | See below | See below | The ‘payload’ of the item. For variables (and DEF FNs) this is the address returned by the ‘@’ operator.
Table 3: Header format for items in variable and array linked lists.
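
To make the name encoding concrete, here is a small Python sketch of the scheme described in the Name field notes: clear bit 5 of every character, then set bit 7 on the last one. The function names `encode_name` and `read_name` are hypothetical, and the sketch assumes plain ASCII input.

```python
def encode_name(name):
    """Encode a name as described in Table 3.

    Bit 5 of every character is cleared (which upper-cases letters but
    also alters digits and periods), then bit 7 is set on the last byte.
    """
    raw = bytearray(ord(ch) & 0xDF for ch in name)
    raw[-1] |= 0x80
    return bytes(raw)

def read_name(memory, address):
    """Read a stored name starting at address.

    Returns (name bytes with bit 7 cleared, address of the byte after
    the name), stopping at the first byte that has bit 7 set.
    """
    chars = bytearray()
    while True:
        byte = memory[address]
        address += 1
        chars.append(byte & 0x7F)
        if byte & 0x80:
            return bytes(chars), address
```

For example, `encode_name("ab1")` gives the bytes &41, &42, &91: the digit ‘1’ (&31) becomes &11 once bit 5 is cleared, and then &91 with the terminator bit set. A tool that prints stored names would need to set bit 5 again on such bytes to recover the original digit or period.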

The ‘data type’ field is encoded as follows. Note that the lower nybble of the value is one less than the internal data type used by BASIC, and thus one less than the number of bytes of storage required for each data type. Also note that these values are NOT the same as those used within the tokenised program code, and that arrays still have a type stored within the header even though there are separate lists for each data type.

Value | Meaning
&01 | Integer variable, or integer array
&02 | String variable, or string array
&04 | Real variable, or real array
&41 | DEF FN returning an integer value
&42 | DEF FN returning a string value
&44 | DEF FN returning a real value
Table 4: Data type values and their meanings in variable and array linked list headers
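
The values in Table 4 can be decoded mechanically. The sketch below treats &40 as a ‘DEF FN’ flag, which is my reading of the table values rather than something confirmed by the source; the function name `decode_type` is hypothetical.

```python
TYPE_NAMES = {0x01: "integer", 0x02: "string", 0x04: "real"}
DEF_FN_FLAG = 0x40  # assumption: inferred from &41/&42/&44 in Table 4

def decode_type(type_byte):
    """Split a data type byte into its parts.

    value_size is the storage size of a value of that type (lower
    nybble plus one). For a DEF FN it describes the return type, not
    the size of the item's data field.
    """
    low = type_byte & 0x0F
    return {
        "type": TYPE_NAMES[low],
        "is_def_fn": bool(type_byte & DEF_FN_FLAG),
        "value_size": low + 1,
    }
```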

For variables and DEF FNs the data field is as follows:

Item type | Data field size (bytes) | Notes
Integer | 2 | 16-bit integer value
String | 3 | ‘Standard’ string descriptor, i.e. a single byte length and a word pointer to the ASCII data. For an empty string the length byte will be zero and the pointer value will be invalid (and also zero). The ASCII data is usually stored within the ‘strings area’, but for string constants (i.e. variables which have been assigned a constant value, as opposed to a calculated value) the pointer will point to the location within the tokenised program code where the value was declared/assigned.
Real | 5 | Real value stored in the standard 5-byte floating point format
DEF FN | 2 | Pointer to the DEF FN declaration within the tokenised BASIC program. Specifically, this address points to the opening ‘(’ of the DEF FN’s parameter list, or the ‘=’ sign if the DEF FN has no parameters.
Table 5: Data items stored within the variable/DEF FNs linked lists. (The ‘data’ field in Table 3)
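
Putting the header and data formats together, one list can be walked roughly as follows. This is a sketch in Python over a 64K memory snapshot, not the code BASIC itself uses. The function names are hypothetical, integers are assumed to be signed and little-endian, and reals and DEF FN pointers are returned as raw bytes rather than decoded.

```python
def read_word(memory, address):
    """Read a 16-bit little-endian word."""
    return memory[address] | (memory[address + 1] << 8)

def walk_variable_list(memory, head_address, area_start):
    """Yield (name, type_byte, data_address) for each item in one list."""
    link = read_word(memory, head_address)
    while link != 0:
        address = (area_start - 1) + link   # links are relative to area_start - 1
        next_link = read_word(memory, address)
        address += 2
        name = bytearray()
        while True:                         # the name ends at the byte with bit 7 set
            byte = memory[address]
            address += 1
            name.append(byte & 0x7F)
            if byte & 0x80:
                break
        type_byte = memory[address]
        yield name.decode("latin-1"), type_byte, address + 1
        link = next_link

def decode_data(memory, type_byte, data_address):
    """Decode integer and string data fields; return other types as raw bytes."""
    if type_byte == 0x01:
        value = read_word(memory, data_address)
        return value - 0x10000 if value & 0x8000 else value  # assumed signed
    if type_byte == 0x02:
        length = memory[data_address]
        pointer = read_word(memory, data_address + 1)
        return bytes(memory[pointer:pointer + length]).decode("latin-1")
    # DEF FN items hold a 2-byte pointer, reals hold 5 bytes.
    size = 2 if type_byte & 0x40 else (type_byte & 0x0F) + 1
    return bytes(memory[data_address:data_address + size])
```

Calling `walk_variable_list(memory, head, area_start)` with the head address for one letter visits that letter's variables from the most recently linked item back to the oldest. Names come back with bit 5 cleared, so digits and periods appear as control characters unless the caller restores them.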

For arrays the data area is as follows:

Field name | Field size (bytes) | Field type | Notes
Data size | 2 | Word | The size of the array element data, equivalent to the total number of array elements multiplied by the number of bytes of storage required for each element (see the variables data field details above). If we use dim_n to describe the number of elements in dimension n, then this value equals (dim_0 * dim_1 [etc] * element size).
Dimension count | 1 | Byte | The number of dimensions in the array
Dimension sizes | 2*(Dimension count) | (List of) word | List of the number of elements in each dimension, stored in reverse order of declaration. I.e. the statement DIM x(5,6,7) will result in values here of (8,7,6). (Note that array bounds start at zero and finish at the DIMmed value, so DIM a(10) will result in an array with bounds [0..10] and, therefore, 11 elements.)
Element data | As ‘Data size’ field | List of <element> | The actual data stored in the array. The size of each element is as given for variable data in the previous table.
Table 6: Fields in the data area for an item in an arrays linked list.
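
As a worked example of the array fields, the Python sketch below computes the values Table 6 describes for a given DIM statement and parses them back out of a memory snapshot. It follows the description above literally (in particular, ‘Data size’ covers only the element data). The function names `describe_array` and `parse_array_data` are hypothetical.

```python
def read_word(memory, address):
    """Read a 16-bit little-endian word."""
    return memory[address] | (memory[address + 1] << 8)

def describe_array(dims, element_size):
    """Compute the Table 6 fields for DIM name(dims...).

    Each dimension has (bound + 1) elements because bounds start at
    zero, and the sizes are stored in reverse order of declaration.
    """
    sizes = [bound + 1 for bound in reversed(dims)]
    data_size = element_size
    for count in sizes:
        data_size *= count
    return {
        "data_size": data_size,
        "dimension_count": len(sizes),
        "dimension_sizes": sizes,
    }

def parse_array_data(memory, address):
    """Parse the array data area that follows the item header.

    Returns (data_size, dimension_sizes, element_data_address).
    """
    data_size = read_word(memory, address)
    count = memory[address + 2]
    sizes = [read_word(memory, address + 3 + 2 * i) for i in range(count)]
    return data_size, sizes, address + 3 + 2 * count
```

For an integer array declared with DIM x(5,6,7), `describe_array((5, 6, 7), 2)` gives dimension sizes of [8, 7, 6] and a data size of 8 * 7 * 6 * 2 = 672 bytes.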
