The easiest way to extend the compiler is by adding more logic to it, see 2.1. For writing logic functions you also have to iterate over enhanced syntax trees, see 2.2 and 2.3. You might also want to use available tools, see 2.4. If you want to improve an existing compilation step or add a completely new one, see 2.5.
For debugging you can:
use CapJitSetDebugLevel
(1.11-2),
use ENHANCED_SYNTAX_TREE_CODE
(2.2-2) to view the tree as GAP code (ENHANCED_SYNTAX_TREE_CODE
( tree )),
use CapJitPrettyPrintSyntaxTree
(2.4-1) to view the tree in a more readable formatting (CapJitPrettyPrintSyntaxTree
( tree )),
use the debug
and debug_path
record entries of logic templates (see CapJitAddLogicTemplate
(2.1-2)).
Warning: When writing logic functions and templates keep in mind that wrapped arguments are outlined, see CapJitOutlinedWrappedArguments
(2.5-19). This means that for example a logic template of the form CreateCapCategoryObjectWithAttributes( cat, attr, MyFunction( x ) )
will never match because MyFunction( x )
is outlined to a local variable.
‣ CapJitAddLogicFunction ( func ) | ( function ) |
Adds the function func to the list of logic functions. For a list of pre-installed logic functions, which can be used as guiding examples, see CompilerForCAP/gap/Logic.gi
. Technically, func should accept an (enhanced) syntax tree and return an (enhanced) syntax tree. Semantically, func should use some kind of "logic" to transform the tree. For example, func could look for calls of CallFuncList
and replace them by calls to the actual function. Note: Often it is easier to use a logic template (see CapJitAddLogicTemplate
(2.1-2)) than a logic function.
‣ CapJitAddLogicTemplate ( template ) | ( function ) |
Adds the logic template template to the list of logic templates. For a list of pre-installed logic templates, which can be used as guiding examples, see CompilerForCAP/gap/LogicTemplates.gi
. Logic templates are records with the following entries:
src_template
and dst_template
(required): strings containing valid GAP code defining expressions
variable_names
(required): a list of strings (which must not be keywords in GAP)
variable_filters
(optional): a list of filters or data types (see CapJitInferredDataTypes
(2.5-10)) with the same length as variable_names
, defaults to a list of IsObject
new_funcs
(optional): a list of lists of strings, defaults to the empty list
number_of_applications
(optional): a positive integer or infinity, defaults to infinity
apply_in_proof_assistant_mode
(optional): one of the strings "yes"
, "no"
, or "only"
, defaults to "yes"
debug
(optional): a boolean
debug_path
(optional): a path
Semantics: src_template
is a piece of code which should be replaced by dst_template
:
The function CapJitAppliedLogicTemplates
(2.5-18) tries to find occurences of src_template
in a tree and potentially replaces them by dst_template
.
When trying to find an occurence of src_template
in a tree, all strings occuring in the list variable_names
are considered as variables, i.e., they match any value in the tree. If a variable occurs multiple times, the corresponding parts of the tree must be equal. The template is only applied if all values match the corresponding filters or data types in variable_filters
.
For each function in dst_template
, CapJitAppliedLogicTemplates
(2.5-18) tries to find a corresponding function in src_template
. The functions are matched by comparing the lists of names of function arguments. If for a function in dst_template
no corresponding function in src_template
exists, you have to add the list of names of function arguments of this function to new_funcs
.
If number_of_applications
is a positive integer n, the logic template will be applied at most n times and ignored afterwards.
apply_in_proof_assistant_mode
determines if the logic template is applied in proof assistant mode, is not applied in proof assistant mode, or is only applied in proof assistant mode but not in normal mode.
debug
can be set to true
to print more information while CapJitAppliedLogicTemplates
(2.5-18) tries to apply the template. (Note: this causes informational break loops which are not actual errors).
debug_path
can be set to a specific path to get exact information why the subtree at this path does or does not match src_template
.
Notes:
src_template
is only replaced by dst_template
if the result is well-defined, i.e., if all function variables reference only functions in their function stack. This can be used to move "static" variables (i.e. variables not depending on local variables) outside of functions. Example: consider a template with src_template
given by Sum( List( L, x -> x^2 * value ) )
and dst_template
given by Sum( List( L, x -> x^2 ) ) * value
(assuming distributivity). This replacement is only valid if value
is independent of x
. However, we do not need to make this explicit at any point, because if value
depends on x
, the result Sum( List( L, x -> x^2 ) ) * value
is not well-defined, so the template is not applied anyway.
If src_template
cannot be expressed as valid GAP code, the component src_template_tree
can be set. In that case, src_template
is not parsed and src_template_tree
is used when trying to find a match. Variables in the sense of variable_names
have to be given as syntax trees of type SYNTAX_TREE_VARIABLE
with a unique id
. Setting src_template
and variable_names
is still required to have a readable representation of the template. If dst_template
cannot be expressed as valid GAP code, it can be handled in an analogous manner.
To simplify the handling of syntax trees, the CAP compiler enhances syntax trees in the following ways:
Lists are transformed (in the expected way) into records of type SYNTAX_TREE_LIST with integer keys (and an additional key length
).
All node types starting with STAT_SEQ_STAT are replaced by STAT_SEQ_STAT.
All node types starting with EXPR_FUNCCALL_ are replaced by EXPR_FUNCCALL.
All node types starting with EXPR_PROCCALL_ are replaced by EXPR_PROCCALL.
All node types starting with STAT_FOR are replaced by STAT_FOR.
Nested STAT_SEQ_STATs are flattened.
A final STAT_RETURN_VOID in functions is dropped.
Branches of STAT_IF etc. are given the type BRANCH_IF.
If the body of a BRANCH_IF is not a STAT_SEQ_STAT, the body is wrapped in a STAT_SEQ_STAT.
The key-value pairs of EXPR_RECs are given the type REC_KEY_VALUE_PAIR.
A globally unique ID is assigned to each function.
Return statements are replaced by assignments to a special local variable "RETURN_VALUE".
The handling of local variables and higher variables is unified by the concept of function variables: function variables (FVARs) reference local variables in functions via the function id (func_id
) and their name (name
), which is determined from the list of arguments/local variables of the function.
All statements after if ... then return ...; fi;
are moved into an else
branch.
The syntax tree types in CAP_JIT_INTERNAL_SYNTAX_TREE_TO_OPERATION_TRANSLATIONS
(e.g. EXPR_ELM_MAT
) are replaced by calls to corresponding operations (so that they do not require special handling).
if/elif/else statements must all end with the assignment to the same local variable and must have an else clause. They are coded using a new type EXPR_CASE
: Trees of type EXPR_CASE
have a key branches
which holds a list of trees of type CASE_BRANCH
. Trees of type CASE_BRANCH
have two keys condition
and value
, which get populated with the condition of the if branch and the right hand side of the last variable assignment in the body of the branch. Caution: The remaining statements in the body of the if branch are moved outside of the if/elif/else statement!
Statements which are not assignments to local variables or if/else statements as well as nested if/else statements are not supported.
EXPR_FUNC
s are coded as EXPR_DECLARATIVE_FUNC
s: The latter has not key nloc
. Additionally, instead of stats
, the latter has a key bindings
of type FVAR_BINDING_SEQ
. A syntax tree of type FVAR_BINDING_SEQ
describes a set(!) of bindings (= assignments) of local variables. It has a key names
, a list of bound names. Values of bindings can be added/obtained/unbound via CapJitAddBinding
, CapJitValueOfBinding
, and CapJitUnbindBinding
. When the tree is converted to a function again, the set of bindings is endowed with a order which respects the relation "uses".
Local variables must not be assigned more than once (this includes function arguments, which are assigned at least once, namely when the function is called). An exception is made for "rapid reassignments": if the same variable is assigned and then reassigned immediately in the next statement, the right-hand side of the first assignment is inserted into the right-hand side of the second assignment.
‣ ENHANCED_SYNTAX_TREE ( func ) | ( function ) |
Returns: a record
Returns an enhanced syntax tree of the plain function func (see above). If the option globalize_hvars
is set to true
, higher variables pointing to variables in the environment of func are assigned to global variables and referenced via these global variables in the tree. Otherwise, an error is thrown if such higher variables exist. If a list is given as the option given_arguments
, references to the i
-th argument of the function are replaced by references to a global variable with the value of given_arguments[i]
(only in case this position is bound).
‣ ENHANCED_SYNTAX_TREE_CODE ( tree ) | ( function ) |
Returns: a function
Converts the enhanced syntax tree tree to a function.
‣ CapJitIterateOverTree ( tree, pre_func, result_func, additional_arguments_func, additional_arguments ) | ( function ) |
Returns: see description
Iterates recursively over a syntax tree and calls pre_func and result_func for each node. Overview:
pre_func allows to modify a (sub-)tree before the recursion over its children. For example, you could detect occurrences of if true then <body> fi;
and simply return the body to simplify the tree.
result_func allows to construct the return value of a (sub-)tree from the return values of its children. For example, if you want to check if a node of a certain type occurs in the tree, return true
if tree has the type or any of the children returned true
, otherwise return false
.
additional_arguments[_func] allows to create and pass additional data to the children of tree, for example the path or the function stack.
Details:
First, pre_func is called with the following arguments: tree and additional_arguments. If it returns fail
, the recursion is skipped and result_func is called immediately with result
set to fail
(see below). Otherwise it must return a syntax tree, which is then used as the value of tree for the remaining computation. If you do not need this function, use ReturnFirst
.
Secondly, for each child of tree, additional_arguments_func is called with the following arguments: tree, the key of the child, and additional_arguments. If tree is a list, the key of a child is its list index. If tree is a record, the key of a child is the corresponding record name of tree. The return value is used in the next step.
Next, the recursion starts: for each child, CapJitIterateOverTree
is called again with the following arguments: the child, pre_func, result_func, additional_arguments_func, and the return value of the call of additional_arguments_func in the step above.
The results of the recursive calls are stored in the variable result
: If tree is a list, result
is also a list and the i
-th entry of this list is the return value of the result_func
of the i
-th child. If tree is a record, result
is also a record and result.(key)
is the return value of the result_func
of the child named key
.
Next, result_func is called with the following arguments: tree, result
, keys
(a list containing the children's keys), and additional_arguments. The return value should be the result of the current tree formed by combining the results of the children. For an example see CapJitResultFuncCombineChildren
(2.4-3).
Finally, the return value of result_func is returned.
Note: This function on its own does not modify the tree. However, you can make modifications in pre_func, result_func, and additional_arguments_func. If you do not want to make these modifcations in-place, you can replace a (sub-)tree by a modified version in pre_func and combine the modified (sub-)trees again using CapJitResultFuncCombineChildren
(2.4-3) as result_func.
‣ CapJitIterateOverTreeWithCachedBindingResults ( tree, pre_func, result_func, additional_arguments_func, additional_arguments ) | ( function ) |
Same input and output as CapJitIterateOverTree
(2.3-1), but the results of bindings are cached and
if we reach an EXPR_REF_FVAR
the result of the corresponding binding is given as result
for result_func,
pre_func
will not be called for FVAR_BINDING_SEQ
s,
the result
for FVAR_BINDING_SEQ
s will only contain used bindings, and
the keys
for FVAR_BINDING_SEQ
s will only contain the names of used bindings and will be ordered compatible with the "uses" relation on bindings, i.e. if a binding contains references to another binding, the name of the other binding will come first in keys
.
WARNING: Calls to CapJitIterateOverTreeWithCachedBindingResults
must not be nested if this results in a function being visited twice.
‣ CapJitPrettyPrintSyntaxTree ( tree ) | ( function ) |
Displays an enhanced syntax tree in a more useful way. For example, prints the type of a node on top.
‣ CapJitIsCallToGlobalFunction ( tree, condition ) | ( function ) |
Returns: a boolean
Checks if tree is an EXPR_FUNCCALL with funcref
EXPR_GVAR such that gvar
fulfills condition. If condition is a string, gvar
must be equal to the string. Othwerwise, condition must be a function returning a boolean when applied to gvar
.
‣ CapJitResultFuncCombineChildren ( tree, result, additional_arguments ) | ( function ) |
Returns: a list or a record
Can be used as a result_func
for CapJitIterateOverTree
(2.3-1). Replaces tree.(key)
(resp. tree[key]
) by result.(key)
(resp. result[key]
) for all keys of children of tree and returns the result. See CapJitIterateOverTree
(2.3-1) for more details.
‣ CapJitContainsRefToFVAROutsideOfFuncStack ( tree, initial_func_id_stack ) | ( function ) |
Returns: a boolean
Checks if tree (with function ID stack initial_func_id_stack) contains an FVAR which references a function outside of its function stack. Such a tree is not well-defined.
‣ CapJitGetOrCreateGlobalVariable ( value ) | ( function ) |
Returns: a string
Assigns value to a global variable and returns the name of the global variable. If value has already been assigned to a global variable by this function before, simply returns the name of that global variable.
‣ CapJitFindNodeDeep ( tree, condition_func ) | ( function ) |
Returns: a list or fail
Finds a node in tree for which condition_func returns true
and returns the path of this node. For each node, condition_func is called with the node and current path as arguments, and must return a boolean. If multiple nodes are found, children are preferred over their parents (i.e. a "deep" node is returned). If no node can be found, fail
is returned.
‣ CapJitFindNodes ( tree, condition_func ) | ( function ) |
Returns: a list
Finds all nodes in tree for which condition_func returns true
. For each node, condition_func is called with the node and current path as arguments, and must return a boolean. Returns a list of paths of nodes for which this call yields true.
‣ CapJitGetNodeByPath ( tree, path ) | ( function ) |
Returns: a record
Gets the node of tree with path path. Throws an error if no such node exists.
‣ CapJitRemovedReturnFail ( tree ) | ( function ) |
Returns: a record
Removes removes any statement of the form if condition then return fail; fi;
(or similar) from the statements of a tree of type EXPR_FUNC
. Throws an error if it cannot find such a statement.
‣ CapJitPrettyPrintFunction ( func ) | ( function ) |
Returns: a string
Pretty prints the function func and returns the result. func must be a regular function, i.e. not an operation or a kernel function.
‣ CapJitCopyWithNewFunctionIDs ( tree ) | ( function ) |
Returns: a record
Returns a structural copy of the enhanced syntax tree which is 1:1 except that all functions have new, unused IDs.
‣ CapJitIsEqualForEnhancedSyntaxTrees ( tree1, tree2 ) | ( function ) |
Returns: a boolean
Returns true
if the enhanced syntax trees tree1 and tree2 are equal up to:
renaming of arguments,
replacement of function IDs,
changing the names of global variables while still referencing the identical value.
‣ CapJitAddBinding ( bindings, name, value ) | ( function ) |
Returns: a record
Adds a binding for name with value value to a syntax tree bindings of type FVAR_BINDING_SEQ
.
‣ CapJitValueOfBinding ( bindings, name ) | ( function ) |
Returns: a record
Gets the value of the binding named name from a syntax tree bindings of type FVAR_BINDING_SEQ
.
‣ CapJitUnbindBinding ( bindings, name ) | ( function ) |
Unbinds the the binding named name from a syntax tree bindings of type FVAR_BINDING_SEQ
.
‣ CapJitReplacedEXPR_REF_FVARByValue ( tree, func_id, name, value ) | ( function ) |
Returns: a record
Replaces all subtrees of the enhanced syntax tree tree which are of type EXPR_REF_FVAR with given func_id and name by value.
‣ CapJitGetNextUnusedVariableID ( func ) | ( function ) |
Returns: an integer
Returns the minimal positive integer n such that no name of a local variable of the function func (given as an enhanced syntax tree) ends with Concatenation( "_", String(
m ) )
for any m \geq n.
‣ CapJitDataTypeIsNestedListOfSimpleLiterals ( data_type ) | ( function ) |
Returns whether the data type is a nested list of integers, booleans, strings, or characters. The nesting can be arbitrarily deep, including the degenerate case of no list at all.
‣ PrintWithCurrentlyCompiledFunctionLocation ( args... ) | ( function ) |
Prints args... followed by the location of the currently compiled function.
‣ DisplayWithCurrentlyCompiledFunctionLocation ( obj ) | ( function ) |
Displays obj followed by the location of the currently compiled function.
‣ ErrorWithCurrentlyCompiledFunctionLocation ( args... ) | ( function ) |
Prints args... as an error followed by the location of the currently compiled function.
‣ EvalStringStrict ( string ) | ( function ) |
Same as EvalString
, but enters a break-loop when encountering syntax errors.
‣ ConcatenationOfStringsAsEnumerationWithAnd ( list_of_strings ) | ( function ) |
Concatenates [ "apples", "bananas", "oranges" ]
as "apples, bananas, and oranges"
.
‣ CapJitCleanedUpHoistedAndDeduplicatedExpressions ( tree ) | ( function ) |
Returns: a record
Cleans up variable names after hoisting and deduplication:
Assignments of the form hoisted_xxx := deduped_yyy;
are deleted and all references to hoisted_xxx
are replaced by references to deduped_yyy
.
If hoisted_xxx
is used more than once, it is renamed to deduped_xxx
. Note: Even if a hoisted value is used more than once during CapJitHoistedExpressions
(2.5-8), it is not necessarily used more than once after CapJitDeduplicatedExpressions
(2.5-5) because the places where it was used might have been deduplicated themselves.
‣ CapJitAppliedCompilerHints ( tree, category, apply_irreversible_optimizations ) | ( function ) |
Returns: a record
Applies compiler hints (see 1.6) to tree. Depending on apply_irreversible_optimizations, only reversible optimizations or all optimizations are applied.
‣ CapJitReplacedGlobalVariablesByCategoryAttributes ( tree, category ) | ( function ) |
Returns: a record
Applies the compiler hint category_attribute_names
(see 1.6) to tree. Assumes that category is the first argument of the function defined by tree.
‣ CapJitReplacedSourceAndRangeAttributes ( tree, category ) | ( function ) |
Returns: a record
Applies the compiler hint source_and_range_attributes_from_morphism_attribute
(see 1.6) to tree.
‣ CapJitDeduplicatedExpressions ( tree ) | ( function ) |
Returns: a record
Deduplicates expressions occuring at least twice in the enhanced syntax tree tree.
‣ CapJitDroppedHandledEdgeCases ( tree ) | ( function ) |
Returns: a record
Idea: If the same edge case is handled multiple times in the tree by checking a condition and returning a value, all condition checks except the first can be dropped. Details: Keeps a record of conditions which immediately lead to a return, i.e., statements of the form if condition1 then return value; fi;
. If another statement of the form if condition2 then return value; fi;
is found later in the tree and if condition2 = true
implies condition1 = true
, the second statement is dropped.
‣ CapJitDroppedUnusedBindings ( tree ) | ( function ) |
Returns: a record
Drops bindings (and names) of variables in functions in tree which are never referenced.
‣ CapJitHoistedExpressions ( tree ) | ( function ) |
Returns: a record
Hoists expressions which are part of but indepedent of inner functions to outer functions. Assumes that there are no unused bindings (i.e., you should use CapJitDroppedUnusedBindings
(2.5-7) first).
‣ CapJitHoistedBindings ( tree ) | ( function ) |
Returns: a record
Hoists bindings which are part of but indepedent of inner functions to outer functions. Assumes that there are no unused bindings (i.e., you should use CapJitDroppedUnusedBindings
(2.5-7) first).
‣ CapJitInferredDataTypes ( tree ) | ( function ) |
Returns: a record
Tries to infer the data types of expressions in the enhanced syntax tree tree of a function with data types. A data type is a record with component filter
(a GAP filter) and, depending on the filter, additional components:
IsFunction
with additional component signature
: The signature is a list with two entries. The first entry is a list of data types of the inputs, the second entry is the data type of the output.
IsList
with additional component element_type
: The type of the elements of the list. Only homogeneous lists are supported.
IsNTuple
with additional component element_types
: The types of the elements of the tuple.
filters implying IsRing
, IsRingElement
, IsHomalgMatrix
with additional component ring
: The ring instance (to which the element or matrix belongs). Exception: If the filter IsInt
describes a technical'' integer (e.g. an index) and not an integer used in a mathematical context, the component
ring
should not be set.
filters implying IsCapCategory
, IsCapCategoryObject
, IsCapCategoryMorphism
, or IsCapCategoryTwoCell
with additional component category
: The category instance (to which the object, morphism or two cell belongs). WARNING: IsString
implies IsList
, so when creating a data type with filter implying IsString
one must set element_type
to a data type with filter IsChar
, or use IsStringRep
instead of IsString
.
‣ CapJitInlinedArguments ( tree ) | ( function ) |
Returns: a record
Example: transforms (function(x) return x; end)(1)
into (function() local x; x := 1; return x; end)()
. Details: Searches for function calls of resolved functions. Assigns the argument values to local variables at the beginning of the function, and drops the arguments (i.e., makes the function a 0-ary function).
‣ CapJitInlinedBindings ( tree ) | ( function ) |
Returns: a record
Example: transforms function() local x; x := 1; return x^2; end
into function() return 1^2; end()
. Details: Replaces references to local variables of a function by the value of the corresponding binding of the function. If the option inline_var_refs_only
is set to true
, this is only done if the value is a reference to a (local or global) variable. If the option inline_fully
is NOT set to true
, wrapped arguments are not inlined (see CapJitOutlinedWrappedArguments
(2.5-19)). Simple values like integers are always inlined. Also drops the inlined bindings.
‣ CapJitInlinedBindingsToVariableReferences ( tree ) | ( function ) |
Returns: a record
Short hand for CapJitInlinedBindings(
tree : inline_var_refs_only := true )
.
‣ CapJitInlinedBindingsFully ( tree ) | ( function ) |
Returns: a record
Short hand for CapJitInlinedBindings(
tree : inline_fully := true )
.
‣ CapJitInlinedFunctionCalls ( tree ) | ( function ) |
Returns: a record
Example: transforms function() local x; x := (y -> y + 2)(1); return x; end
into function() local x, y, r; y := 1; r := y + 2; x := r; return x; end
. Details: Searches for function calls of a resolved function in the right hand side of a variable assignment or a return statement. Inserts the body of the function right before the variable assignment / return statement to avoid the function call. Assumes that arguments of function calls are inlined (i.e., you should use CapJitInlinedArguments
(2.5-11) first). Due to the nature of a return
statement breaking the execution and having no goto
keyword in GAP, only functions
ending with a return
statement, or
ending with an if-(elif)-else-statement with return
statements at the end of all branches
and not containing other return
statements can be inlined.
‣ CapJitInlinedSimpleFunctionCalls ( tree ) | ( function ) |
Returns: a record
Replaces function calls of the form (function() return value; end)()
by value
. Assumes that arguments of function calls are inlined (i.e., you should use CapJitInlinedArguments
(2.5-11) first).
‣ CapJitAppliedLogic ( tree ) | ( function ) |
Returns: a record
Applies all logic functions (see CapJitAddLogicFunction
(2.1-1)) and logic templates (see CapJitAppliedLogicTemplates
(2.5-18)) to tree.
‣ CapJitAppliedLogicTemplates ( tree ) | ( function ) |
Returns: a record
Applies all logic templates (see CapJitAddLogicTemplate
(2.1-2)) to tree.
‣ CapJitOutlinedWrappedArguments ( tree ) | ( function ) |
Returns: a record
Outlines wrapped arguments to local variables. This includes:
the attribute value(s) in CreateCapCategoryObjectWithAttributes
and AsCapCategoryObject
the attribute values (including the arguments source
and range
) in CreateCapCategoryMorphismWithAttributes
and AsCapCategoryMorphism
the arguments of NTuple
(excluding the first argument)
‣ CapJitResolvedGlobalVariables ( tree ) | ( function ) |
Returns: a record
Resolves global variables (except those which are listed in CAP_JIT_NON_RESOLVABLE_GLOBAL_VARIABLE_NAMES
):
Replaces a global variable referencing a plain function by the syntax tree of this function in case the function is annotated with the pragma CAP_JIT_RESOLVE_FUNCTION
.
Computes attributes of categories stored in global variables and places the results into global variables again.
‣ CapJitResolvedOperations ( tree ) | ( function ) |
Returns: a record
Tries to resolve operations in tree:
Operations of CAP categories are resolved by taking one of the functions added to the category via an Add
function.
Operations announced to the compiler via InstallMethodForCompilerForCAP
or InstallOtherMethodForCompilerForCAP
(see the documentation of CAP) are resolved via the number of arguments.
generated by GAPDoc2HTML