The compiler was started as a just-in-time compiler, that is, it needed some arguments to infer types of variables. Although the compiler now supports precompilation and type signatures can be explicitly given, it still is a just-in-time compiler in the following sense: Only methods for operations getting a CAP category as the first argument can be found by the compiler, and an instance of a CAP category is needed for method selection.
The compiler uses GAP's syntax trees for optimizing functions. The term tree always refers to the syntax tree of the function to be compiled. Note that a node of the tree always knows its children, so technically it is also a tree. That is, the terms tree, subtree, and node technically describe the same thing but we use them for different emphases.
We often replace a node in the tree by another tree representing the "value" of the original node. Examples:
Replace a global variable referencing an integer, a string, or a boolean by EXPR_INT, EXPR_STRING, EXPR_TRUE or EXPR_FALSE.
Replace a global variable referencing a plain function by the syntax tree of this function.
Replace a record access of a global function by the value of this record access.
Replace an operation by a concrete method.
We call this resolving the global variable, operation, etc. Note that this does not change the basic "layout" of the tree.
On the contrary, in the following examples we change the "layout" of the tree:
If we have a function call of a resolved function, we can assign the argument values to local variables at the beginning of the function. This way we can avoid passing arguments completely.
If a function call of a resolved function occurs in the right hand side of a variable assignment, we can insert the body of the resolved function right before the variable assignment. This way we can avoid the function call.
We can replace all references to a local variable by the right hand side of the variable assignment and then drop the assignment.
We call this inlining the function arguments, functions, or variable assignments.
The compilation process has two phases: the resolving phase and the rule phase.
During the resolving phase, operations and global variables are resolved.
CAP operations can only be resolved if an instance of a CAP category to which the operation is applied is known. Then, the functions added to the category via Add
functions are considered. Note that caching, pre functions, etc. are bypassed.
You can also use InstallMethodForCompilerForCAP
or InstallOtherMethodForCompilerForCAP
(see the documentation of CAP) to make methods which accept a CAP category as the first argument known to the compiler. The same restrictions as for CAP operations apply.
References to global functions are resolved if the function name is not listed in CAP_JIT_NON_RESOLVABLE_GLOBAL_VARIABLE_NAMES
and if the function is annotated with the pragma CAP_JIT_RESOLVE_FUNCTION
.
If a call of a global function or operation (occuring as the right hand side of a variable assignment or as the return value of a function) is annotated with the pragma CAP_JIT_NEXT_FUNCCALL_DOES_NOT_RETURN_FAIL
, the compiler assumes that the resolved function never returns fail
and thus simply removes any code of the form if condition then return fail; fi;
(or similar) from the resolved function. The pragma has to be placed right before the variable assignment or the return statement.
If the pragma CAP_JIT_DROP_NEXT_STATEMENT
occurs in the resolved function, the statement following the pragma is removed.
For details see CapJitResolvedOperations
(2.5-21) and CapJitResolvedGlobalVariables
(2.5-20). If no operation or global variable can be resolved anymore, we continue with the rule phase.
In the rule phase, the tree is optimized using several rules and techniques. Function arguments, functions, and assignments to local variables are inlined. Unused variables are dropped. Handled edge cases are dropped, that is, if the same edge case is caught multiple times via if condition then return ...; fi;
statements, only the first such statement is kept. Finally, "logic" is applied to the tree. For example, calls of CallFuncList
are replaced by calls to the actual function. The logic can be extended by the user, see chapter 2.
For all details, see the list of compilations steps in 2.5.
There are some requirements for the steps described above to be correct.
The code must be purely functional, that is, it must have the following properties:
The code must not depend on side effects (otherwise dropping "unused" variables or inlining variables could change results).
Only assignments to local variables, if/elif/else statements, and return statements are allowed as statements in functions.
All branches of any given if/elif/else statement must end with a return statement or the assignment to the same local variable (so if/elif/else statements can be handled as expressions).
If statements must always have an else statement, except if all branches end with a return statement. In the latter case, the statements following the if statement are automatically put into an else statement.
Local variables must be assigned at most once (this includes function arguments, which are assigned at least once, namely when the function is called). An exception is made for "rapid reassignments": if the same variable is assigned and then reassigned immediately in the next statement, this only counts as a single assignment.
Since the compiler might change the order of statements, checking if a local variable (or list entry etc.) is bound is not safe.
GAP has strict semantics, that is, the evaluation of an expression first evaluates all subexpressions, and if an error occurs during the evaluation of a subexpression, the evaluation of the outer expression also stops. However, CompilerForCAP has some optimizations which are only valid with non-strict semantics. These optimizations produce subexpression which are not computed in the original code and are not needed for the final result. With GAP's strict semantics, those subexpressions are still evaluated despite not being needed for the final result. Hence, those subexpressions must not throw an error when evaluated. In particular:
For performance optimizations, the compiler might move expressions out of if/else statements, e.g. if they occur inside a (potentially expensive) List
call. Thus, code inside if/else statements must execute without error even if moved outside of the if/else statement.
More generally, hoisting expressions might lead to subexpressions being evaluated which were not being evaluated before. For example, the subexpression mat[i]
in the uncompiled code i -> List( [ ], j -> mat[i][j] )
is never evaluated due to the domain of j
being empty, but CompilerForCAP will hoist mat[i]
out of the List
call, so after compilation mat[i]
might be evaluated.
Functions which are applied to an element of a list must be applicable to all elements of this list.
Furthermore, there are some technical restrictions:
Nested if/elif/else statements are not allowed.
Function arguments or local variables must not be called RETURN_VALUE
, as this name is used internally.
Non-dense lists are not supported.
There is no detection for recursive function calls, so resolving such a function call leads to an infinite loop.
Also note that technically the result of the compilation is only valid for the concrete CAP category which was used to resolve CAP operations and get typing information. That is, for parametric categories you must check manually for which parameters the result is valid.
The main mode of application of the compiler is precompiling categories and storing the result in a file for later use via CapJitPrecompileCategory
(1.4-1).
‣ CapJitPrecompileCategory ( category_constructor, given_arguments, package_name, compiled_category_name ) | ( function ) |
Compiles operations of the CAP category returned by the function category_constructor applied to given_arguments. The result is available via a global function called ADD_FUNCTIONS_FOR_
compiled_category_name which is written to the file package_name/precompiled_categories/
compiled_category_name.gi
and adds the compiled functions to the category given as the only argument. For testing purposes, there is also a global function called compiled_category_name which takes the same arguments as category_constructor and returns the category with the compiled functions installed (note: category_constructor is called with no_precompiled_code := true
, which might lead to uncompiled code being present in lower categories of a tower of categories). If a list of operations is given via the option operations
, only operations in this list are precompiled. Else all installed operations (excluding operations which are part of suggested dependencies) of the category are precompiled. Furthermore, the options number_of_objectified_objects_in_data_structure_of_object
, number_of_objectified_morphisms_in_data_structure_of_object
, number_of_objectified_objects_in_data_structure_of_morphism
, and number_of_objectified_morphisms_in_data_structure_of_morphism
can optionally be set to allow the compiler to warn about about bad compilation results. For example, number_of_objectified_objects_in_data_structure_of_object
should be the number of calls to CreateCapCategoryObjectWithAttributes
or AsCapCategoryObject
required for creating an object in the tower defined by category_constructor, e.g. 1 if the tower has height one, or 2 if the tower has height two and the objects of the category at the top are given by objects in the category one level below. Warnings will be displayed if this number is exceeded (because this implies that not all wrap-unwrap-pairs could be canceled). Technical requirements:
category_constructor must be a regular function, i.e. not an operation or a kernel function.
category_constructor must support the option FinalizeCategory
. WARNING: When using attributes you might run into errors because the options are only respected the first time you call the attribute getter. To catch such a situation, category_constructor is applied to given_arguments twice and if the results are identical (IsIdenticalObj
) an error is raised.
CAP operations returning fail
are excluded from the compilation because they usually do not fulfill all requirements of the compiler.
‣ CapJitPrecompileCategoryAndCompareResult ( category_constructor, given_arguments, package_name, compiled_category_name ) | ( function ) |
Calls CapJitPrecompileCategory
(1.4-1) with the given arguments and displays a warning if this changes the contents of package_name/precompiled_categories/
compiled_category_name.gi
.
One can also compile a function func
manually via CapJitCompiledFunction
(1.5-2).
‣ CapJitCompiledCAPOperationAsEnhancedSyntaxTree ( cat, operation_name ) | ( function ) |
A special version of CapJitCompiledFunctionAsEnhancedSyntaxTree
(1.5-3) compiling the operation given by operation_name in cat.
‣ CapJitCompiledFunction ( func[, type_signature] ) | ( function ) |
Returns: a function
Returns a compiled version of the function func with signature type_signature (see CapJitInferredDataTypes
(2.5-10)). If func is an operation or a kernel function, it is returned unchanged. If type_signature is not given, all steps which require knowledge about the types of variables are skipped. If the first argument of func is a CAP category, the type signature can also be given by three separate arguments: an instance of a CAP category, a list of input filters (as in filter_list
in the method name record) and an output filter (as in return_type
in the method name record). If a full type signature is not available but the first argument of func is a CAP category, an instance of a CAP category can be given as the second argument. In this case, the category is used to get the type information required to resolve CAP operations.
‣ CapJitCompiledFunctionAsEnhancedSyntaxTree ( func, category[, type_signature] ) | ( function ) |
Returns: a record
Similar to CapJitCompiledFunction
(1.5-2) with the following differences:
an enhanced syntax tree is returned instead of a function,
func must not be an operation or a kernel function because those cannot properly be represented as a syntax tree,
the second argument must always be a CAP category or fail
,
the post-processing step, e.g. the application of compiler hints, is *not* executed.
You can give hints to the compiler to improve the result of the compilation. Compiler hints are attached to the category by making category!.compiler_hints
a record with one or more of the following keys:
category_attribute_names
: a list of names of attributes of the category. If a global variable in the compiled code has the same value (w.r.t. IsIdenticalObj
) as one of the given attributes of the category, the global variable is replaced by the attribute getter applied to the category. Background: During the compilation, the compiler resolves attributes occuring in the code and stores them in global variables called CAP_JIT_INTERNAL_GLOBAL_VARIABLE_n
(for integers n
starting from 1). If these variables cannot be replaced, the resulting code is only valid in the current session.
source_and_range_attributes_from_morphism_attribute
: a record with keys object_attribute_name
, morphism_attribute_name
, source_attribute_getter_name
, and range_attribute_getter_name
. Replaces the attribute object_attribute_name
of the source (resp. range) of a morphism by source_attribute_getter_name
(resp. range_attribute_getter_name
) applied to the attribute morphism_attribute_name
of the morphism. Can be used if objects and morphisms can be easily created from the morphism datum anyway. Note: Some simple expressions like integers are NOT replaced.
Note: The compiler can only discover the hints if the category is the first argument of the function.
You can use StopCompilationAtCategory
to prevent the compiler from inlining and optimizing code of a given category. You can revert this decision using ContinueCompilationAtCategory
.
‣ StopCompilationAtCategory ( category ) | ( function ) |
Stops the compiler from inlining and optimizing code of category.
‣ ContinueCompilationAtCategory ( category ) | ( function ) |
Allows the compiler to inline and optimize code of category (this is the default).
‣ StopCompilationAtPrimitivelyInstalledOperationsOfCategory ( category ) | ( function ) |
Stops the compiler from inlining and optimizing code of primitively installed operations of category. Warning: Due to caching of compiled CAP operations, this has to be called before any compilation involving category.
‣ ContinueCompilationAtPrimitivelyInstalledOperationsOfCategory ( category ) | ( function ) |
Allows the compiler to inline and optimize code of primitively installed operations of category (this is the default).
‣ CapJitDisableDataTypeInference ( ) | ( function ) |
‣ CapJitEnableDataTypeInference ( ) | ( function ) |
In case of errors, the automatic inference of data types can be disabled (and re-enabled later on).
‣ CapJitEnableStepByStepCompilation ( ) | ( function ) |
‣ CapJitDisableStepByStepCompilation ( ) | ( function ) |
Enables or disables step-by-step compilation for demonstration or debugging purposes. If enabled, only the first level of CAP operations and known methods is resolved (instead of resolving recursively). Caveats:
This does not work if fully compiled operations are already cached, so this should be used before starting any compilation.
This does not work if the CAP category is not immediately available from the input (e.g. because it is a range category of a homomorphism structure of the input category).
‣ CapJitEnableProofAssistantMode ( ) | ( function ) |
‣ CapJitDisableProofAssistantMode ( ) | ( function ) |
(experimental) Enables or disables the (experimental) proof assistant mode. For example, in this mode the compiler will display warnings if the code involves CAP operations which are not known to be compatible with the congruence of morphisms, and expressions will not be hoisted or deduplicated.
‣ StateLemma ( description, claim, category, filter_strings, preconditions ) | ( operation ) |
(experimental) States a new categorical lemma to be proven in proof assistant mode.
description is a human readable version of the statement
claim is a function formalizing the statement, that is, returning true
if and only if the statement holds
category is the category for which the statement shall be proven; the category must be the first argument of claim
filter_strings is a list of filter strings as in the method name record
preconditions is a list of records of the form rec( src_template := <string>, dst_template := <string> )
; the template strings are interpreted in the context of claim
and can be used to describe relations between the arguments of claim
The aim is to reduce the function claim to {...} -> true
by applying logic templates via ApplyLogicTemplate
(1.10-5).
‣ PrintLemma ( ) | ( operation ) |
(experimental) Prints the state of the current lemma.
‣ AttestValidInputs ( ) | ( operation ) |
(experimental) Lets the user attest that all arguments in function calls in the state of the current lemma form valid inputs for the functions, for example, that two morphisms passed to PreCompose
are actually composable. This allows CompilerForCAP to simplify type checks, for example, that the return value of the call to PreCompose
is a well-defined morphism.
‣ ApplyLogicTemplate ( logic_template ) | ( operation ) |
(experimental) Applies a logic template to the state of the current lemma. If number_of_applications
is not set, it defaults to 1
.
‣ ApplyLogicTemplateNTimes ( n, logic_template ) | ( operation ) |
(experimental) Applies a logic template n times to the state of the current lemma. Note: number_of_applications
must not be set.
‣ AssertLemma ( ) | ( operation ) |
(experimental) Asserts that the current lemma is proven, that is, that the state of the current lemma is {...} -> true
. If this is the case, the current lemma is cleared, which allows a new lemma to be stated.
‣ StateProposition ( category, proposition_id ) | ( operation ) |
(experimental) States a new proposition to be proven for the category category. Valid IDs are the entries of RecNames( CAP_JIT_INTERNAL_PROOF_ASSISTANT_PROPOSITIONS )
. A proposition is a collection of lemmata which can be stated using StateNextLemma
(1.10-9) and can then be proven in the same way as a lemma stated using StateLemma
(1.10-2).
‣ StateNextLemma ( ) | ( operation ) |
(experimental) States the next lemma of the current proposition.
‣ AssertProposition ( ) | ( operation ) |
(experimental) Asserts that the current proposition is proven, that is, that all lemmata of the current proposition are proven.
You can increase the info level of InfoCapJit
to get information about the compilation process.
‣ InfoCapJit | ( info class ) |
Info class used for info messsages of the CAP compiler.
‣ CapJitSetDebugLevel ( level ) | ( function ) |
Set the debug level. At level 0, no debug information is printed. At level 1 (or higher), compiled functions are printed at the beginning of the compilation, before the resolving phase, before the rule phase, before the post-processing starts, and at the end of the compilation. At level 2 (or higher), compiled functions are printed after every compilation step. These levels are not fixed and may change in the future.
Q: Where do the category attributes in the compiled code come from?
A: During compilation, attributes of categories are resolved, that is, their values are stored in global variables without keeping track of which category they belong to. After the compilation, CompilerForCAP tries to replace those global variables again by attributes registered via category_attribute_names
to the current category (for details, see 1.6). If some attributes of the current category are categories themselves, their attributes are also taken into account (recursively). In this case, if two categories cat1
and cat2
have attributes Attr1
and Attr2
, respectively, with the same value, it is undefined whether Attr1( cat1 )
or Attr2( cat2 )
will be used. So even if only Attr2( cat2 )
appeared in the original code, Attr1( cat1 )
might appear in the compiled code.
Q: Why is my function not resolved?
A: The function might be listed in CAP_JIT_NON_RESOLVABLE_GLOBAL_VARIABLE_NAMES
or it might be missing an annotation with the pragma CAP_JIT_RESOLVE_FUNCTION
. Additionally, a function can only be resolved if it appears as a global variable in the tree during the resolving phase of the compilation. That is, it must be referenced from a global variable from the beginning on, or after global variables are resolved by CapJitResolvedGlobalVariables
(2.5-20). Possibly you have adapt CapJitResolvedGlobalVariables
(2.5-20) to your setting.
Q: Why is do I get the error "a local variable with name <name> is assigned more than once (not as a part of a rapid reassignment), this is not supported"?
A: For reasons of correctness, variables cannot be inlined if a variable is assigned more than once in the body of a function (this includes function arguments which are assigned at least once, namely when the function is called). An exception is made for "rapid reassignments": if the same variable is assigned and then reassigned immediately in the next statement, this only counts as a single assignment.
Q: Why do I get one of the following errors: "tree includes statements or expressions which indicate possible side effects", "tree contains an assignment of a higher variable with initial name <name>, this is not supported", or "tree contains for loop over non-local variable, this is not supported" ?
A: We can only drop unused variables, inline variables, etc. if we assume that the code contains no side effects. Statements like STAT_PROCCALL or assignment to higher variables cause (or at least indicate) side effects, so continuing with the compilation would probably not lead to a correct result.
Q: Why is a given logic function or logic template not applied?
A: There are various possible reasons for this:
Compiler hints, hoisting, and deduplication are applied after the rule phase, so even if the logic would apply to then final result, it might not apply to the intermediate result.
The compiler might not be able to decide a required equality of expressions because it cannot decide the equality of arbitrary functions.
If the expression is part of a call to CAP_JIT_INCOMPLETE_LOGIC
, this shows that it has not fully run through all logic functions/templates.
generated by GAPDoc2HTML