Native methods for CSEC machine #74
Draft
Introduction
Native methods in Java provide FFI functionality to call foreign functions. Native methods are especially useful in the CSEC machine to implement primitive functions, and in some cases, are the only real way to do so.
Motivation
The motivation for implementing native methods initially stemmed from Object.hashCode, which itself is a native method.
Native methods have clear uses in some other cases as well, such as displaying or printing to standard output. Other specific uses for native methods are presently unclear, but having the ability to break out of the CSEC machine is certainly not a detriment.
Implementation
Notation note: we refer generally to a Java syntactic native method by the term native method, and the foreign function that is eventually called when the native method is invoked by the term foreign function.
CSEC Handling
Two control items (labelled C1, C2) have been modified to allow for native methods to be properly handled.
(C1) Method Declarations (AST Node MethodDeclaration) will now dynamically link a foreign function to a native method declaration. Closures (the Closure type) now distinguish between three types of declarations (previously two), with the newest being NativeDeclaration. Constructors cannot be native in Java.
NativeDeclarations mirror the MethodDeclaration type, except that they do not have a method body and instead hold a pointer to the resolved foreign function.
Implementation note: even though MethodDeclaration and ConstructorDeclaration, both of which are contained within the Closure type, are defined in the AST (in ast/types/classes.ts), we choose to define NativeDeclaration in ec-evaluator/types.ts instead, to localise these changes.
(C2) Invocations (Instruction invoke) will now distinguish the new Closure type, and handle it specially. In particular, the instruction retains most of its usual functionality: it will appropriately initialise a new environment frame for the function to be executed (see native method architecture below), and push a return instruction before directly calling the foreign function with the full CSEC machine state.
Native Method Architecture
We offer full and complete power to native methods, passing in the entire control, stash, and environment to foreign functions they call. This matches the power that foreign functions are expected to have.
The CSEC machine still handles argument resolution and returning. This is because FFI calls should function like normal functions to the CSEC machine, except that it ignores whatever happens while the foreign function is running. To this end, common foreign functions are expected to:
Note that the foreign function, using common utility methods that the rest of the CSEC machine uses, will have to appropriately deserialise and serialise the values. In particular, it must be able to destructure AST nodes that it expects and reconstruct valid AST nodes that the CSEC machine expects.
Because the full control, stash, and environment are passed to the foreign function, foreign functions can effectively do anything to the CSEC machine state. However, foreign functions are generally not permitted to expect anything except their arguments being defined in the current environment.
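As a rough illustration of the division of labour described above, the following is a minimal sketch and not the code in this PR. All type and function names (Value, Frame, NativeClosure, invokeNative, increment) are hypothetical, the machine state is heavily simplified, and the three-argument shape of ForeignFunction is an assumption based on the description of passing the control, stash, and environment.

```typescript
// Hypothetical, simplified stand-ins for the evaluator's real types.
type Value = number | string;
type Instr = { kind: "return" };
type Frame = Map<string, Value>;

// Assumed shape: a foreign function receives the whole machine state
// and returns nothing.
type ForeignFunction = (
  control: Instr[],
  stash: Value[],
  environment: Frame[], // innermost frame is last
) => void;

// Hypothetical closure variant for a native declaration: no method body,
// only a reference to the resolved foreign function.
interface NativeClosure {
  kind: "native";
  params: string[]; // parameter identifiers from the declaration
  fn: ForeignFunction;
}

// The machine binds arguments and arranges the return itself;
// the foreign function only runs in between.
function invokeNative(
  control: Instr[],
  stash: Value[],
  environment: Frame[],
  closure: NativeClosure,
  args: Value[],
): void {
  const frame: Frame = new Map();
  closure.params.forEach((name, i) => frame.set(name, args[i]));
  environment.push(frame);
  control.push({ kind: "return" });
  closure.fn(control, stash, environment);
}

// Example foreign function: reads its argument by identifier from the
// current frame and leaves its result on the stash.
const increment: ForeignFunction = (_control, stash, environment) => {
  const frame = environment[environment.length - 1];
  stash.push((frame.get("x") as number) + 1);
};

const control: Instr[] = [];
const stash: Value[] = [];
const environment: Frame[] = [];
invokeNative(control, stash, environment, { kind: "native", params: ["x"], fn: increment }, [41]);
```

In this sketch the foreign function touches only its own argument and the stash, which is the well-behaved case; nothing in the types prevents it from rewriting the control or the environment.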
The dictionary of foreign functions is keyed by fully-qualified method descriptors, which identify the correct foreign function. These are of the form:
Note that foreign functions always have the same type:
In particular, foreign functions always return nothing. Their results are to be injected directly into the stash that is provided, for use by the CSEC machine (and the program it is currently running).
The identifier is included in such a symbolic reference for the express purpose of making foreign functions easier to implement. Because foreign functions must retrieve their arguments from the environment, their implementations must be aware of the correct parameter identifiers.
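To make the role of the parameter identifiers concrete, here is a minimal sketch of a descriptor-keyed lookup. It is illustrative only: the descriptor string below is an invented format chosen for readability, not the form used by this PR, and foreignFunctions, resolveForeignFunction, and the simplified types are hypothetical names.

```typescript
// Hypothetical, simplified stand-ins for the evaluator's real types.
type Value = number | string;
type Frame = Map<string, Value>;
type ForeignFunction = (
  control: unknown[],
  stash: Value[],
  environment: Frame[], // innermost frame is last
) => void;

// Hypothetical dictionary from descriptor to foreign function.
const foreignFunctions = new Map<string, ForeignFunction>();

// The key format here is made up for this sketch. It includes the parameter
// identifiers "a" and "b", which the implementation relies on below.
const MAX_DESCRIPTOR = "Math.max(int a, int b): int";

foreignFunctions.set(MAX_DESCRIPTOR, (_control, stash, environment) => {
  const frame = environment[environment.length - 1];
  const a = frame.get("a") as number;
  const b = frame.get("b") as number;
  // No return value: the result is pushed onto the stash.
  stash.push(Math.max(a, b));
});

// Linking step: look up the foreign function for a native declaration.
function resolveForeignFunction(descriptor: string): ForeignFunction {
  const fn = foreignFunctions.get(descriptor);
  if (fn === undefined) {
    throw new Error(`No foreign function registered for ${descriptor}`);
  }
  return fn;
}

// Usage: the machine has already bound the arguments in the current frame.
const stash: Value[] = [];
const frame: Frame = new Map<string, Value>([["a", 3], ["b", 7]]);
resolveForeignFunction(MAX_DESCRIPTOR)([], stash, [frame]);
```

If the key omitted the identifiers, the implementation would have no stable way to know which names to look up in the frame, which is the point the paragraph above makes.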