(The information in this article comes from reverse engineering Amstrad CPC BASIC. You can find the reverse engineered source code in my CPC BASIC source code repository. You can find an example BASIC program which ‘walks’ the storage areas in the Examples folder of that repository. See also the CPC Wiki page for more technical information on CPC BASIC).
Variables and DEF FN definitions are stored in the ‘variables storage area’. Arrays are stored in the ‘arrays storage area’. Both of these areas are organised as a number of linked lists using a similar format.
The ‘variable storage area’ holds data for variables and DEF FNs in 27 linked lists. There is a separate linked list for each letter of the alphabet – corresponding to the initial letter of the variables stored within – and a separate list for all DEF FNs (no matter what their initial letter).
The ‘arrays storage area’ holds three separate linked lists, one for each data type: strings, integers and reals.
Links between items in each list are offsets relative to the byte before the start of their respective storage areas. The start of the storage areas can be found at the following memory addresses (each entry is a word):
|Storage area||BASIC 1.1||BASIC 1.0 (CPC464)|
|Variables and DEF FNs||&AE68||&AE85|
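As a rough illustration of the offset arithmetic, the Python sketch below turns a relative link into an absolute address. It assumes a 64K memory snapshot held in a `bytearray` and little-endian words (usual on the Z80). The helper names `read_word`, `variables_area_start` and `resolve_link` are hypothetical and are not taken from the BASIC source.

```python
from typing import Optional

# Addresses of the word holding the start of the variables/DEF FN area
# (values from the table above).
VARIABLES_AREA_PTR = {"1.1": 0xAE68, "1.0": 0xAE85}


def read_word(mem, addr: int) -> int:
    """Read a 16-bit word, assuming little-endian byte order."""
    return mem[addr] | (mem[addr + 1] << 8)


def variables_area_start(mem, version: str = "1.1") -> int:
    """Return the start address of the variables storage area."""
    return read_word(mem, VARIABLES_AREA_PTR[version])


def resolve_link(area_start: int, offset: int) -> Optional[int]:
    """Convert a relative link to an absolute address.

    Links are relative to the byte before the start of the area,
    and an offset of &0000 marks the end of a list.
    """
    if offset == 0:
        return None
    return area_start - 1 + offset
```

With this convention an offset of 1 refers to the first byte of the storage area.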
(As an aside, when a DEF FN is executed the parameters are ‘pushed’ onto the execution stack using the exact same format as in the variables storage area, but with a single list for all variables rather than a separate list per letter. The only difference is that the ‘offsets’ used to link items are absolute values (or offsets from &0000, depending on your point of view). The parameters are pushed onto the stack at the start of execution of the FN, and the code to find a variable searches both the FN variable list and the main variable list.)
In memory the arrays storage area immediately follows the variables storage area (which itself immediately follows the tokenised program storage area).
New items are added to the end of the relevant storage area (i.e. at higher memory addresses). If items are added to the variables/DEF FNs storage area then the entire arrays storage area is moved up to accommodate them (hence the need to use relative links). As code is added to the program, both the variables and arrays storage areas are moved up in memory (and the complete areas are moved down in memory if code is removed). If arrays are ERASEd, the arrays at higher addresses are moved down to ‘fill’ the gap.
The start (‘head’) of each of the lists can be found at the following locations. Each entry is a relative offset as described above.
|List item||BASIC 1.1||BASIC 1.0 (CPC464)||Notes|
|Variables “A”-“Z”||&ADB7||&ADD0||A word entry for each letter corresponding to the initial letter of the variable name. The first entry is for the letter ‘A’, second for the letter ‘B’ and so on.|
|DEF FNs||&ADEB||&AE04||DEF FN definitions (NOT invocations)|
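To make the layout of the list heads concrete, here is a small Python sketch that computes where the head word for a given initial letter lives. The function name `head_address` and the version keys are hypothetical; only the base addresses come from the table above.

```python
# Base address of the "A" list head for each BASIC version (from the table).
VARIABLE_HEADS = {"1.1": 0xADB7, "1.0": 0xADD0}
# Address of the single DEF FN list head (from the table).
DEF_FN_HEAD = {"1.1": 0xADEB, "1.0": 0xAE04}


def head_address(initial: str, version: str = "1.1") -> int:
    """Return the address of the list-head word for a variable's initial letter."""
    index = ord(initial.upper()) - ord("A")
    if not 0 <= index < 26:
        raise ValueError("variable names must start with a letter A-Z")
    # One word (two bytes) per letter, starting with 'A'.
    return VARIABLE_HEADS[version] + 2 * index
```

As a consistency check, the word following the ‘Z’ entry (`head_address("Z") + 2`) works out to the DEF FN head address in both versions, so the 27 list heads appear to sit in one contiguous block.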
Every list item uses the same basic header format, which is then followed by different data depending on the item type. The header format is as follows:
|Field name||Field size (bytes)||Field type||Notes|
|Link||2||word||Link to the next list item (i.e. next lower in memory). Stored as a relative offset from the byte before the start of the storage area. &0000 if there are no more items in the list.|
|Name||One byte per character||ASCII7||The variable (etc) name, stored in upper case with the high bit (bit 7) of the last character set. The variable name is converted to upper case by simply clearing bit 5. This has the effect of also changing the ASCII codes for numbers and periods (which can also be used in variable names). Note that no validation of the variable name is done during the storage process; validation is done when the program is tokenised.|
|Data type||1||Byte||The data type of the variable (etc). See below|
|Data||See below||See below||The ‘payload’ of the item. For variables (and DEF FNs) this is the address returned by the ‘@’ operator.|
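The header described above could be decoded along the following lines. This is a minimal Python sketch, assuming a memory snapshot in a `bytes` or `bytearray` object and little-endian words; `parse_header` is a hypothetical name. Restoring bit 5 on character codes below &20 is my own assumption about how to undo the upper-casing of digits and periods for display purposes.

```python
def parse_header(mem, addr: int):
    """Decode one list item header starting at absolute address addr.

    Returns (link, name, data_type, data_addr), where data_addr is the
    address of the first byte of the item's data field.
    """
    # Link: relative offset of the next item, &0000 at the end of the list.
    link = mem[addr] | (mem[addr + 1] << 8)

    pos = addr + 2
    chars = []
    while True:
        byte = mem[pos]
        pos += 1
        code = byte & 0x7F          # strip the end-of-name marker (bit 7)
        if code < 0x20:             # a digit or period that lost bit 5
            code |= 0x20
        chars.append(chr(code))
        if byte & 0x80:             # bit 7 set: this was the last character
            break

    data_type = mem[pos]
    return link, "".join(chars), data_type, pos + 1
```

For example, a variable named `A1` would be stored as the bytes &41, &91: the ‘1’ (&31) becomes &11 once bit 5 is cleared, and then &91 once bit 7 is set to mark the end of the name.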
The ‘data type’ field is encoded as follows. Note that the lower nybble of the value is one less than the internal data type used by BASIC, and thus one less than the number of bytes of storage required for each data type. Also note that these values are NOT the same as those used within the tokenised program code, and that arrays still have a type stored within the header even though there are separate lists for each data type.
|&01||Integer variable, or integer array|
|&02||String variable, or string array|
|&04||Real variable, or real array|
|&41||DEF FN returning an integer value|
|&42||DEF FN returning a string value|
|&44||DEF FN returning a real value|
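The encoding in this table can be unpacked with a few bit operations. The Python sketch below is illustrative only: `describe_type` is a hypothetical helper, and treating bit 6 (&40) as a ‘DEF FN’ flag is an inference from the values in the table rather than something confirmed by the BASIC source.

```python
TYPE_NAMES = {0x01: "integer", 0x02: "string", 0x04: "real"}


def describe_type(type_byte: int):
    """Split a data type byte into (type name, value size in bytes, is DEF FN)."""
    low = type_byte & 0x0F
    if low not in TYPE_NAMES:
        raise ValueError(f"unexpected data type byte: {type_byte:#04x}")
    # The lower nybble is one less than the storage size of a value.
    value_size = low + 1
    # Assumption: bit 6 distinguishes DEF FNs from variables and arrays.
    is_def_fn = bool(type_byte & 0x40)
    return TYPE_NAMES[low], value_size, is_def_fn
```

Note that the value size describes a value of that type. The data field of a DEF FN item is a 2-byte pointer regardless of its return type.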
For variables and DEF FNs the data field is as follows:
|Item type||Data field size (bytes)||Notes|
|Integer||2||16-bit integer value|
|String||3||‘Standard’ string descriptor, i.e. a single byte length and a word pointer to the ASCII data. For an empty string the length byte will be zero and the pointer value will be invalid (and also zero). The ASCII data is usually stored within the ‘strings area’, but for string constants (i.e. variables which have been assigned a constant value, as opposed to a calculated value) the pointer will point to the location within the tokenised program code where the value was declared/assigned.|
|Real||5||Real value stored in the standard 5-byte floating point format|
|DEF FN||2||Pointer to the DEF FN declaration within the tokenised BASIC program. Specifically this address points to the opening ‘(’ of the DEF FN’s parameter list, or the ‘=’ sign if the DEF FN has no parameters.|
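As a sketch of how the integer and string data fields might be read from a memory snapshot, the Python below uses the standard `struct` module. It assumes little-endian byte order and that integers are signed 16-bit values; `read_integer` and `read_string` are hypothetical names. Reals are left out because decoding the 5-byte floating point format is beyond what this section describes.

```python
import struct


def read_integer(mem, addr: int) -> int:
    """Read a 16-bit integer variable (assumed signed, little-endian)."""
    return struct.unpack_from("<h", mem, addr)[0]


def read_string(mem, addr: int) -> str:
    """Follow a 3-byte string descriptor: length byte, then word pointer."""
    length, pointer = struct.unpack_from("<BH", mem, addr)
    if length == 0:
        # Empty string: the pointer is not valid and must not be followed.
        return ""
    raw = bytes(mem[pointer:pointer + length])
    return raw.decode("ascii", errors="replace")
```

The same `read_string` works whether the pointer leads into the strings area or into the tokenised program, since the descriptor format is identical in both cases.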
For arrays the data area is as follows:
|Field name||Field size (bytes)||Field type||Notes|
|Data size||2||Word||The size of the array element data, equivalent to the total number of array elements multiplied by the number of bytes of storage required for each element (see the variables data field details above). If we use dim_n to describe the number of elements in a dimension, where n is the dimension number, then this value equals (dim_0 * dim_1 * [etc] * element size).|
|Dimension count||1||Byte||The number of dimensions in the array|
|Dimension sizes||2*(Dimension count)||(List of) word||List of the number of elements in each dimension, stored in reverse order of declaration. For example, the statement DIM x(5,6,7) will result in values here of (8,7,6). (Note that array bounds start at zero and finish at the DIMmed value, so DIM a(10) will result in an array with bounds [0..10] and, therefore, 11 elements).|
|Element data||As ‘Data size’ field||List of <element>||The actual data stored in the array. The size of each element is as given for variable data in the previous table.|
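The arithmetic behind the array fields can be checked with a short Python sketch. It builds the bytes that would precede the element data for a given DIM statement, assuming little-endian words; `array_fields` is a hypothetical helper and not part of BASIC.

```python
import struct
from math import prod


def array_fields(bounds, element_size: int):
    """Return (dimension sizes, data size, encoded fields) for DIM x(bounds...).

    bounds are the DIMmed upper bounds in declaration order,
    element_size is 2 (integer), 3 (string) or 5 (real).
    """
    # Each dimension holds bound + 1 elements, stored in reverse order.
    dims = [bound + 1 for bound in reversed(bounds)]
    data_size = prod(dims) * element_size
    # Data size (word), dimension count (byte), then one word per dimension.
    encoded = struct.pack("<HB", data_size, len(dims))
    encoded += struct.pack(f"<{len(dims)}H", *dims)
    return dims, data_size, encoded
```

For a real array declared with DIM x(5,6,7) this gives dimension sizes of (8,7,6) and a data size of 6 * 7 * 8 * 5 = 1680 bytes.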