[This article is a copy of comments in source code for the Def.IL unit.]
The intermediate language (IL) is the core of the Quiche compiler. The parser
takes the source code and converts it to IL. The code generator takes the IL and
converts it to assembly code.
Each IL record stores an operation and up to three parameters. Some operations
extend over multiple IL records, for example when making function calls. The
three parameters are referred to as Param1, Param2 and Dest. Dest can also be
referred to as Param3 when using some of the system operations.
Operations (or operators) divide into two main classes: system operations and
other (non-system) operations. System operations are generally those which move
data without transforming it, for example initialising variables, branching
and dispatching function calls.
The non-system operations are usually those which do transform data. These are
operations such as maths, comparisons, and logic as well as intrinsics (basic
operations which use function syntax but which (often) result in simple inline
code in the output).
IL Format for Non-system Operations
The non-system operations are represented in the IL using a common format
whereas the system operations use the IL in a more flexible fashion.
The IL for non-system operations takes the form:
Dest Operation Param1 [Param2]
I.e. the Operation takes in one or two parameters – referred to as source
parameters – and stores the result in the dest(ination) parameter.
Thus the source code:
B := A + 1
is represented in IL as
Operation: Add
Param1: Source the value from the A variable
Param2: An immediate (literal) value of 1
Dest: Store the result to the B variable
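The record above could be modelled roughly as follows. This is a minimal Python sketch, not the actual Def.IL declarations: the names ILItem, Param, ParamKind and describe are hypothetical and chosen only to mirror the terms used in this article.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional, Union


class ParamKind(Enum):
    """Hypothetical subset of the parameter kinds described in this article."""
    IMMEDIATE = auto()   # an immediate (literal) value
    VAR_SOURCE = auto()  # read the value of a variable
    VAR_DEST = auto()    # store the result to a variable


@dataclass
class Param:
    kind: ParamKind
    value: Union[int, str]  # a literal, or a variable name (illustrative only)


@dataclass
class ILItem:
    op: str                  # operation name, e.g. "Add"
    param1: Param            # first source parameter
    param2: Optional[Param]  # second source parameter, if the operation has one
    dest: Param              # where the result is stored


def describe(item: ILItem) -> str:
    """Render an item in the 'Dest Operation Param1 [Param2]' form."""
    parts = [str(item.dest.value), item.op, str(item.param1.value)]
    if item.param2 is not None:
        parts.append(str(item.param2.value))
    return " ".join(parts)


# IL for the source code: B := A + 1
add_item = ILItem(
    op="Add",
    param1=Param(ParamKind.VAR_SOURCE, "A"),
    param2=Param(ParamKind.IMMEDIATE, 1),
    dest=Param(ParamKind.VAR_DEST, "B"),
)

print(describe(add_item))  # prints: B Add A 1
```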
Parameter Kinds
Each parameter has a ‘kind’ which specifies where the value is being sourced
from or stored to, and what sort of value is required. Some of the key parameter
kinds are:
Immediate: an immediate (literal) value.
VarSource: Reading a value from a variable.
VarDest: Storing the result to a variable.
VarAddr: The address of a variable.
VarPtr: The value pointed to by a variable (i.e. dereferencing a pointer).
Branch: Data for an unconditional branch.
CondBranch: Data for a conditional branch.
Some parameter kinds are only valid when used with specific operations, for
example the Branch and CondBranch kinds can only be used in branching operations.
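The restriction on parameter kinds can be pictured as a lookup from kind to permitted operations. The sketch below is illustrative only: the enum mirrors the kinds listed above, but the operation names in BRANCHING_OPS and the kind_allowed helper are assumptions, since this article does not list the actual operations or say how the check is made.

```python
from enum import Enum, auto


class ParamKind(Enum):
    IMMEDIATE = auto()    # an immediate (literal) value
    VAR_SOURCE = auto()   # reading a value from a variable
    VAR_DEST = auto()     # storing the result to a variable
    VAR_ADDR = auto()     # the address of a variable
    VAR_PTR = auto()      # the value pointed to by a variable
    BRANCH = auto()       # data for an unconditional branch
    COND_BRANCH = auto()  # data for a conditional branch


# Placeholder operation names; the real operation set is not listed here.
BRANCHING_OPS = {"Branch", "CondBranch"}

# Kinds that are only valid with specific operations.
RESTRICTED_KINDS = {
    ParamKind.BRANCH: BRANCHING_OPS,
    ParamKind.COND_BRANCH: BRANCHING_OPS,
}


def kind_allowed(op: str, kind: ParamKind) -> bool:
    """Return True if this parameter kind may be used with this operation."""
    allowed_ops = RESTRICTED_KINDS.get(kind)
    # Kinds with no entry in the table are unrestricted in this sketch.
    return allowed_ops is None or op in allowed_ops
```

Under these assumptions, kind_allowed("Add", ParamKind.BRANCH) is False, while kind_allowed("Branch", ParamKind.BRANCH) is True.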
Extending IL Items
As mentioned above, some operations can extend over multiple IL records. The
operations which allow this come in pairs, one of which has the word Extended
appended. Thus the IL for a function call uses the operations FuncCall and
FuncCallExtended.
The Extended form of the operation indicates that the data extends into the
following IL record. There can be several of these extended IL items in a row,
but the sequence must end with a non-extended operation of the same type.
Thus a function call which takes four arguments will require two IL records. The
first will contain data relating to the first three parameters and be of type
FuncCallExtended. The second will contain data for the fourth parameter and be
of type FuncCall.
Note that the exact formatting of the data for a function call within the IL
will depend on the calling convention in use. In general, the formatting of
IL data for system operations, including extended operations, is specified by
the operations themselves and is not dictated by the IL format.
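The four-argument example can be sketched as a function that packs arguments into records of three slots each. This is a simplification under stated assumptions: it treats every slot (Param1, Param2, Dest) as holding one argument and ignores calling conventions and return values, which the real IL layout depends on. The name split_call_args is hypothetical.

```python
from typing import List, Tuple

SLOTS_PER_RECORD = 3  # Param1, Param2 and Dest (Param3)


def split_call_args(args: List[str]) -> List[Tuple[str, List[str]]]:
    """Pack call arguments into (operation, slots) records, three per record."""
    chunks = [
        args[i:i + SLOTS_PER_RECORD]
        for i in range(0, len(args), SLOTS_PER_RECORD)
    ]
    if not chunks:
        # Assumption: a call with no arguments still needs one closing record.
        chunks = [[]]

    records = []
    for index, chunk in enumerate(chunks):
        is_last = index == len(chunks) - 1
        # Every record except the last uses the Extended form of the operation.
        op = "FuncCall" if is_last else "FuncCallExtended"
        records.append((op, chunk))
    return records


for op, slots in split_call_args(["a", "b", "c", "d"]):
    print(op, slots)
# prints:
# FuncCallExtended ['a', 'b', 'c']
# FuncCall ['d']
```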
Other IL Data
Various other pieces of data are also stored within the intermediate language
record. These include:
- The index of the current code block (a code block is a section of code with no
branches).
- Information about data types.
- Data relating to the source code (eg line number).
- Compiler flags in effect at that position in the source code (eg overflow
and range checking status).
Parameters include extra information such as the CPU register in which the data
needs to be placed (or in which the result will be found). Much
of this data is added by, and used by, the code generator.
IL Data Storage
IL data is stored in a list format. This allows any IL item to be referenced
through its index position in the list and allows the IL data to be easily
traversed. The index position is the data stored by the Branch and CondBranch
param kinds, with the value pointing to the destination IL item.
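A small sketch of index-based branching, assuming a plain Python list stands in for the IL list. The ILItem fields, the operation names and the branch_destination helper are hypothetical; only the idea that a branch stores the index of its destination item comes from the text above.

```python
from typing import List, NamedTuple, Optional


class ILItem(NamedTuple):
    op: str                          # placeholder operation name
    branch_to: Optional[int] = None  # index of the destination IL item, if any


il_list: List[ILItem] = [
    ILItem("Assign"),               # index 0
    ILItem("Branch", branch_to=3),  # index 1: jumps to index 3
    ILItem("Assign"),               # index 2
    ILItem("Add"),                  # index 3: the branch destination
]


def branch_destination(il: List[ILItem], index: int) -> ILItem:
    """Follow the index stored in a branching item to its destination item."""
    target = il[index].branch_to
    if target is None:
        raise ValueError(f"IL item {index} is not a branch")
    return il[target]


print(branch_destination(il_list, 1).op)  # prints: Add
```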
IL Functions
The IL has functions to append a new IL item to the IL list, insert an IL item
into the list (a function which is rarely used), and retrieve an item given
its index.
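The three functions might look something like the following. This is a hedged sketch using a Python list as the backing store; the class name ILList and the method names append, insert and get are assumptions, not the names used in the Def.IL unit.

```python
from typing import Any, List


class ILList:
    """Hypothetical container mirroring the three IL functions described above."""

    def __init__(self) -> None:
        self._items: List[Any] = []

    def append(self, item: Any) -> int:
        """Add an item to the end of the list and return its index."""
        self._items.append(item)
        return len(self._items) - 1

    def insert(self, index: int, item: Any) -> None:
        """Insert an item at the given index, shifting later items up by one."""
        self._items.insert(index, item)

    def get(self, index: int) -> Any:
        """Retrieve the item at the given index."""
        return self._items[index]

    def __len__(self) -> int:
        return len(self._items)


il = ILList()
first = il.append("Assign")  # index 0
last = il.append("Add")      # index 1
il.insert(1, "Branch")       # "Add" moves from index 1 to index 2
print(il.get(1), il.get(2))  # prints: Branch Add
```

In this sketch an insert changes the index of every later item, so any index stored elsewhere (such as a branch destination) would no longer point at the same item unless it were adjusted.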
Links
Given the core nature of the IL it has relevance to large swathes of the code
base. Key areas of relevance would be the Parser, the Operators, Types, and the
code generator.