Specification of Concurrent Virtual Machine

Compared to Source §3, this concurrent system has the following changes:

  • The addition of three primitive concurrent functions.

For details, see Section “Concurrency Support” below.

Please refer to Appendix A for explanations about compilation rules and VM transition rules.

Overview

This concurrent system executes (potentially multiple) pieces of code concurrently in multiple threads. Communication between threads is achieved by updating the values of memory shared between threads. Each thread is a collection of registers os, pc, e, and rs; at the expiration of the time quantum, the concurrent system executes the next thread in the thread queue.

To comply with the textbook, the thread that calls concurrent_execute also continues to execute the rest of its code concurrently with the threads. Furthermore, there is no join primitive concurrent function.

However, this design departs from the textbook, as the system does not execute parallel threads. Instead, it executes concurrent threads.

Concurrency Support

The following concurrent functions are supported:

  • concurrent_execute(f_1, ..., f_n): primitive, sets up multiple threads for concurrent execution. For each nullary function f_i that returns undefined, a thread t_i is set up that executes the code in the body of f_i. The thread that called concurrent_execute also executes concurrently with all t_i. Returns undefined. This is an atomic operation.
  • test_and_set(a): primitive, assumes the head of array a is a boolean b. Sets the head of a to true. Returns b. This is an atomic operation.
  • clear(a): primitive, sets the head of array a to false. Returns undefined. This is an atomic operation.
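
As a brief illustration of how these primitives combine, here is a sketch (not part of the specification) of a mutual-exclusion lock built from test_and_set and clear, written in TypeScript with hypothetical declare statements standing in for the three primitives:

```typescript
// Hypothetical declarations for the three primitives, for type-checking only;
// in the concurrent system above they are provided as primitives.
declare function concurrent_execute(...fs: Array<() => undefined>): undefined;
declare function test_and_set(a: [boolean, ...unknown[]]): boolean;
declare function clear(a: [boolean, ...unknown[]]): undefined;

// The head of this array acts as a lock flag: true means "held".
const lock: [boolean] = [false];
let counter = 0;

function increment(): undefined {
    while (test_and_set(lock)) {} // spin until the previous value was false
    counter = counter + 1;        // critical section, protected by the lock
    clear(lock);                  // release the lock
    return undefined;
}

// Two threads increment the shared counter; the lock prevents lost updates.
concurrent_execute(increment, increment);
```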

EXECUTE Rules

Notes

  • For simplicity, the heap is not represented in the rules.

Compiling


$$\Rule{E_1 \translateaux s_1 \qquad \cdots \qquad E_n \translateaux s_n}{\texttt{concurrent_execute}(E_1, \cdots , E_n) \translateaux s_1. \cdots .s_n.\texttt{CALLV id n}}$$
Each of s_1, ..., s_n is a string of VM instructions that loads a closure onto the operand stack, and id is the opcode of EXECUTE.

Running

There are additional structures in our VM:

  1. tq, a register which holds a queue of threads.
  2. to, a register, initialized to 0, that indicates how many instructions are left for the current thread to run.

The state representing our VM will have two more corresponding structures: (os, pc, e, rs, tq, to). The initial state of our VM has a pc pointing at the first instruction, empty os, e, rs, and tq, and a zero timeout: (⟨⟩, pc, ⟨⟩, ⟨⟩, ⟨⟩, 0).
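
For concreteness, here is a minimal TypeScript sketch of this extended machine state; all type and field names are assumptions made for the sketch, not part of the specification:

```typescript
// A minimal sketch (not the reference implementation) of the extended machine state.
type Value = number | boolean | Closure;
type Environment = Map<string, Value>;

// A closure records a code address, the parameter names, and the captured
// environment (for the nullary functions passed to concurrent_execute,
// params is empty).
interface Closure { pc: number; params: string[]; env: Environment; }

// A runtime stack frame: the return address plus the operand stack and
// environment to reinstall after the call.
interface StackFrame { returnPc: number; os: Value[]; env: Environment; }

// A thread is a four-tuple of its own operand stack, program counter,
// environment and runtime stack.
interface Thread { os: Value[]; pc: number; env: Environment; rs: StackFrame[]; }

// The full machine state adds the thread queue tq and the timeout to.
interface MachineState {
  os: Value[];
  pc: number;
  env: Environment;   // the register e
  rs: StackFrame[];
  tq: Thread[];       // queue of paused threads
  to: number;         // instructions left in the current time quantum
}

// Initial state: a pc pointing at the first instruction, everything else
// empty, and a zero timeout.
function initialState(pc: number): MachineState {
  return { os: [], pc, env: new Map(), rs: [], tq: [], to: 0 };
}
```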

Some notes on program execution:

  1. When the program calls concurrent_execute, it executes the rest of its code along with the concurrent threads, therefore the sequential program becomes a concurrent thread.
  2. Consider the case when a program does not use any primitive concurrent functions. To avoid distinguishing between the case when the program is sequential and the program is concurrent, therefore simplifying the rules, the sequential program must still execute like a concurrent thread. This means sequential execution may time out, and may be pushed and popped from the thread queue. However, implementations of the VM may, as an optimization, avoid pushing and popping while executing sequential code.
  3. As a result, all programs here are concurrent programs, even if they do not call concurrent_execute.
  4. None of the programs here return a result: since every program runs as a concurrent thread and concurrent threads return nothing, a concurrent program as a whole has no result.

Design choices:

  1. Our concurrent system simulates a uniform distribution of execution traces, in the sense that running the same program infinitely many times should give a uniform distribution of program execution traces. This is in contrast to actual concurrent systems, where some traces occur with very low probability. There, the programmer may not even be aware that such low probability traces exist. So it is possible for the programmer to believe their program is correct, until they encounter a bug in one of the low probability execution traces. To avoid this, our concurrent system presents to the programmer each trace with equal probability. The programmer is then exposed more to low probability execution traces, and thinks more about low probability execution traces when reasoning about their code. We achieved this by inserting uniform randomness into the scheduling algorithm.
  2. With round-robin scheduling, where resource starvation is impossible, there are three possible ways of inserting uniform randomness, but some introduce starvation (in each way of inserting uniform randomness, we assume all other parts of the scheduling are non-random):
    Random removal from the thread queue
    allows starvation, since it is possible for some thread to never be removed, and thus never be scheduled.
    Random insertion into the thread queue
    allows starvation, since among all concurrent threads t_i, whenever some concurrent thread t_n performs a nested call of concurrent_execute that spawns children, it is possible for t_n's children to be scheduled in front of all t_i in the thread queue. If this scenario repeats for nested calls to concurrent_execute in each of t_n's children, then none of the t_i will ever be scheduled.
    Random time quanta
    does not allow starvation, since no priority is assigned to concurrent threads, so the ordering of existing concurrent threads in the thread queue is respected.

    Therefore, to avoid starvation, we choose to insert uniform randomness by allocating uniformly random time quanta to concurrent threads. This choice of inserting uniform randomness has the additional benefit of also being fair when the execution time of a single run goes to infinity: the expected amount of time allocated to each concurrent thread is equal.

Thread timeout

$$\Rule{}{ \begin{aligned} &(\textit{os}_1, \textit{pc}_1, \textit{e}_1, \textit{rs}_1, (\textit{os}_2, \textit{pc}_2, \textit{e}_2, \textit{rs}_2).\textit{tq}, 0)
\transition &(\textit{os}_2, \textit{pc}_2, \textit{e}_2, \textit{rs}_2, \textit{tq}.(\textit{os}_1, \textit{pc}_1, \textit{e}_1, \textit{rs}_1), c) \end{aligned}}$$ If a thread times out and has not finished execution (has not executed the RET statement), then it is enqueued on the thread queue, and the next thread is dequeued from the thread queue, with a random timeout value c .

The above rule assumes there is at least one thread in the thread queue. To cover all cases, here is the rule for zero threads in the thread queue:


$$\Rule{}{(\textit{os}, \textit{pc}, \textit{e}, \textit{rs}, \langle \rangle, 0) \transition (\textit{os}, \textit{pc}, \textit{e}, \textit{rs}, \langle \rangle, c)}$$
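
As a concrete illustration of these two rules, here is a TypeScript sketch (continuing the types sketched in the Running section); MAX_QUANTUM is an arbitrary illustrative bound, since the rules only require the new quantum c to be uniformly random:

```typescript
// Illustrative upper bound on the quantum; the rules only require c to be
// a uniformly random quantum, not any particular range.
const MAX_QUANTUM = 30;

function randomQuantum(): number {
  // Uniformly random, strictly positive time quantum c.
  return 1 + Math.floor(Math.random() * MAX_QUANTUM);
}

// Sketch of the two timeout rules: swap threads when the quantum expires,
// or simply grant a fresh quantum when the thread queue is empty.
function handleTimeout(state: MachineState): MachineState {
  if (state.to > 0) return state;               // not timed out yet
  if (state.tq.length === 0) {
    return { ...state, to: randomQuantum() };   // zero threads in the queue
  }
  const current: Thread = { os: state.os, pc: state.pc, env: state.env, rs: state.rs };
  const [next, ...rest] = state.tq;             // dequeue the next thread
  return { ...next, tq: [...rest, current], to: randomQuantum() }; // enqueue current at the back
}
```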

Running thread


$$\Rule{s(\textit{pc}) ≠ \texttt{RET} \qquad \textit{to} > 0}{(\textit{os}, \textit{pc}, \textit{e}, \textit{rs}, \textit{tq}, \textit{to}) \transition (\textit{os’}, \textit{pc’}, \textit{e’}, \textit{rs’}, \textit{tq}, \textit{to}-1)}$$
where the primed values are obtained exactly as in normal VM code execution; the timeout is nonzero and is decremented.
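
To show how the timeout gates ordinary execution, here is a sketch of the outer loop, assuming hypothetical step and finished helpers for ordinary SVML instructions (they are not defined in this document):

```typescript
// Hypothetical helpers, assumed for this sketch only: step performs one
// ordinary SVML instruction (producing the primed values), and finished
// detects the final "Ending our VM" configuration.
declare function step(state: MachineState): MachineState;
declare function finished(state: MachineState): boolean;

function run(initial: MachineState): MachineState {
  let state = initial;
  while (!finished(state)) {
    if (state.to === 0) {
      state = handleTimeout(state);                 // switch threads, fresh random quantum
    } else {
      state = { ...step(state), to: state.to - 1 }; // one instruction, timeout decrements
    }
  }
  return state;
}
```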

Running thread, returning from function


$$\Rule{s(\textit{pc}) = \texttt{RET} \qquad \textit{to} > 0 \qquad \textit{rs} ≠ \langle \rangle}{(\textit{os}, \textit{pc}, \textit{e}, \textit{rs}, \textit{tq}, \textit{to}) \transition (\textit{os’}, \textit{pc’}, \textit{e’}, \textit{rs’}, \textit{tq}, \textit{to}-1)}$$
where the primed values are obtained exactly as in normal VM code execution; the timeout is nonzero and is decremented. Note: here the thread executes a RET statement inside a function, so it pops rs and restores the caller's state as in normal VM code execution.

Starting EXECUTE

$$\Rule{s(\textit{pc}) = \texttt{EXECUTE n} \qquad \textit{to} > 0}{ \begin{aligned} &((\textit{pc}_1, \textit{e}_1). \cdots .(\textit{pc}_n, \textit{e}_n).\textit{os}, \textit{pc}, \textit{e}, \textit{rs}, \langle \rangle, \textit{to})
\transition &(\textit{os}, \textit{pc}, \textit{e}, \textit{rs}, (\langle \rangle, \textit{pc}_1, \textit{e}_1, \langle \rangle). \cdots .(\langle \rangle, \textit{pc}_n, \textit{e}_n, \langle \rangle), \textit{to}-1) \end{aligned}}$$ Closures on the operand stack (two-tuples of pc_i and e_i) are converted into threads t_i. Each thread t_i is a four-tuple of its own os_i, pc_i, e_i, and rs_i; initially, thread t_i has an empty os_i and an empty rs_i. The thread that calls concurrent_execute also continues to execute concurrently with the other threads: its os, pc, e, and rs remain in the machine state after the transition arrow, and its timeout decrements. Note: we decrement the timeout instead of setting a random timeout, since setting a random timeout makes starvation possible: a thread that does nothing but call concurrent_execute infinitely many times would then always be the one running.
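
A sketch of this step in TypeScript, assuming the types above and that the dispatch loop has already advanced pc past the instruction; the treatment of the thread queue here (appending the new threads at the back of the existing queue) is one possible reading of the rule:

```typescript
// Sketch of EXECUTE n (n >= 1): the top n closures become new threads with
// empty operand and runtime stacks; the caller keeps running.
function executeStep(state: MachineState, n: number): MachineState {
  // The top of the operand stack is the end of the array; reverse so that
  // the spawned threads appear in the order t_1 ... t_n.
  const closures = state.os.slice(-n).reverse() as Closure[];
  const newThreads: Thread[] = closures.map(c => ({
    os: [],          // each new thread starts with an empty operand stack
    pc: c.pc,
    env: c.env,
    rs: [],          // ... and an empty runtime stack
  }));
  return {
    ...state,
    os: state.os.slice(0, -n),         // pop the n closures
    tq: [...state.tq, ...newThreads],  // enqueue the new threads
    to: state.to - 1,                  // decrement (never reset) the caller's timeout
  };
}
```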

Returning from thread


$$\Rule{s(\textit{pc}_1) = \texttt{RET} \qquad \textit{to} > 0 \qquad \textit{rs}_1 = \langle \rangle}{(\textit{os}_1, \textit{pc}_1, \textit{e}_1, \textit{rs}_1, (\textit{os}_2, \textit{pc}_2, \textit{e}_2, \textit{rs}_2).\textit{tq}, \textit{to}) \transition (\textit{os}_2, \textit{pc}_2, \textit{e}_2, \textit{rs}_2, \textit{tq}, c)}$$
If a thread executes the RET statement, and the runtime stack is empty, then the thread is not enqueued on the thread queue, and the next thread is dequeued from the thread queue, with a random timeout value c .

The above rule assumes there is at least one thread in the thread queue. To cover all cases, the rule for zero threads in the thread queue is given in the next subsection.
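
A sketch of this step, again assuming the types and the randomQuantum helper above; the finished thread is simply discarded:

```typescript
// Sketch of "Returning from thread": the current thread has executed RET
// with an empty runtime stack, so it is dropped, and the next thread is
// dequeued with a fresh random quantum. (The empty-queue case is the
// "Ending our VM" rule below.)
function returnFromThread(state: MachineState): MachineState {
  if (state.tq.length === 0) return state;   // covered by the next subsection
  const [next, ...rest] = state.tq;
  return { ...next, tq: rest, to: randomQuantum() };
}
```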

Ending our VM


$$\Rule{s(\textit{pc}) = \texttt{RET} \qquad \textit{to} > 0 \qquad \textit{rs} = \langle \rangle \qquad \textit{tq} = \langle \rangle}{(\textit{os}, \textit{pc}, \textit{e}, \textit{rs}, \textit{tq}, \textit{to}) \transition (\textit{os}, \textit{pc}, \textit{e}, \textit{rs}, \textit{tq}, \textit{to}-1)}$$
If a thread executes the RET statement, and both the runtime stack and the thread queue are empty, and the timeout is nonzero, then the timeout decrements, and our VM stops.

TEST_AND_SET and CLEAR Rules

Notes

  • For simplicity, the heap and all registers other than os and pc are not represented in the rules.
  • test_and_set and clear are atomic operations.

Compiling


$$\Rule{E \translateaux s}{\texttt{test_and_set}(E) \translateaux s.\texttt{CALLV id 1}}$$
where E evaluates to an array whose head is a boolean, and id is the opcode of TEST_AND_SET.


$$\Rule{E \translateaux s}{\texttt{clear}(E) \translateaux s.\texttt{CALLV id 1}}$$
where E evaluates to an array, and id is the opcode of CLEAR.

Running


$$\Rule{s(\textit{pc}) = \texttt{TEST_AND_SET}}{(a.\textit{os},\textit{pc}) \transition (b.\textit{os},\textit{pc} + 1)}$$
where a is the address of an array stored on the heap. The head of this array is initially b, where b is a boolean. After this rule executes, the head of this array is set to true.


$$\Rule{s(\textit{pc}) = \texttt{CLEAR}}{(a.\textit{os},\textit{pc}) \transition (\textit{os},\textit{pc} + 1)}$$
where a is the address of an array stored on the heap. The head of this array is updated to false.
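
A sketch of these two steps, assuming the MachineState type above and a hypothetical heap represented as a map from addresses to arrays; atomicity comes for free because the VM executes a single instruction per transition:

```typescript
// Hypothetical heap representation for this sketch: addresses map to arrays.
type Heap = Map<number, Value[]>;

// TEST_AND_SET: pop the array address a, read the boolean head b,
// set the head to true, and push b.
function testAndSetStep(state: MachineState, heap: Heap): MachineState {
  const address = state.os[state.os.length - 1] as number;  // a on top of os
  const arr = heap.get(address)!;
  const old = arr[0] as boolean;                             // b
  arr[0] = true;
  return { ...state, os: [...state.os.slice(0, -1), old], pc: state.pc + 1 };
}

// CLEAR: pop the array address a and set the head of the array to false.
function clearStep(state: MachineState, heap: Heap): MachineState {
  const address = state.os[state.os.length - 1] as number;
  heap.get(address)![0] = false;
  return { ...state, os: state.os.slice(0, -1), pc: state.pc + 1 };
}
```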

Appendix A

Inference line

The horizontal inference line plays the role of "if … then …" in our earlier presentations of the rules. In general, in an inductive definition of a set X, an inference rule of the form


$$\Rule{x_1 \quad \cdots \quad x_n}{x}$$
stands for the rule: if $x_1, \ldots, x_n \in X$, then $x \in X$.

Compilation rules

The translation from SourceA to SVML is accomplished by a function \translate : SourceA → SVML, which uses the auxiliary translation function \translateaux.

The auxiliary translation function \translateaux is defined by many rules, some of which we have already covered in this document: the rules for concurrent_execute, test_and_set, and clear. The other rules for the auxiliary translation function \translateaux will not be covered in this document. Instead, please refer to the document Source Virtual Machine Language.

VM transition rules

The machine that we will use to execute SVML programs is a variation of a push-down automaton. Let us fix a specific program s. The machine M_s that executes s is given as an automaton that transforms a given machine state to another state. The machine state is represented by so-called registers. In the case of SVML, we need two registers: the program counter, denoted by the symbol pc, and the operand stack, denoted by the symbol os.

The program counter is used to point to a specific instruction in s, starting from position 0. For example, if pc = 2, and s is the program
$$[\texttt{LDCN 1},\ \texttt{LDCN 2},\ \texttt{PLUS},\ \texttt{LDCN 3},\ \texttt{TIMES},\ \texttt{DONE}]$$
then s(pc) = PLUS.

Operand Stack. The operand stack is a sequence of values from Num + Bool + Fun.

Compilation of Names. Similar to the approach of the previous chapter, we implement names by environments. To this aim, we add a register e to the machine state. Register e represents the environment with respect to which names are evaluated. Environments map names to denotable values. Thus, in an environment e in which x refers to the number 1, the value of x can be accessed by applying e to x: e(x) = 1.

Execution of Function Application. According to the translation of function application, the instruction CALL n will find its arguments in reverse order on the operand stack, followed by the operator, which, according to the previous paragraph, is represented by a closure. To implement static scoping, the machine must take the environment of the closure and extend it by a binding of the formal parameters to the actual arguments. Thus, the following rule is our first attempt to describe the execution of CALL n.


$$\Rule{s(\textit{pc}) = \texttt{CALL } n}{
  \begin{aligned}
  &(v_n.\ldots v_1.(\textit{address},x_1\cdots x_n,e’).\textit{os},\textit{pc},e)
  \transition &(\textit{os},\textit{address},e’[x_1 ← v_1]\cdots[x_n ← v_n])
  \end{aligned}}$$

There is, however, a major difficulty with this rule. What should happen when a function returns? In other words, what should the machine do when it encounters the instruction RTN after executing the function body? In particular, what should be the program counter, operand stack and environment after returning from a function? Of course, the program counter, operand stack and environment must be restored to their state before the function call.

In order to keep program execution in a simple loop, we need to make this return information explicit. Since functions can call other functions before returning, the natural data structure for this return information is a stack. We call this stack the runtime stack. The runtime stack, denoted by rs, will be the fourth (and last) register that we add to our machine state. Each entry in the runtime stack contains the address of the instruction to return to, and the operand stack os and environment e to be reinstalled after the function call. Such a triplet (address, os, e) is called a runtime stack frame, or simply stack frame.

Function application pushes a new stack frame on the runtime stack, in addition to the actions described in the first attempt above. Thus, the actual rule for CALL n is as follows.


$$\Rule{s(\textit{pc}) = \texttt{CALL } n}{
  \begin{aligned}
  &(v_n.\ldots v_1.(\textit{address},x_1\cdots x_n,e’).\textit{os},\textit{pc},e,\textit{rs})
  \transition &(\langle \rangle,\textit{address},e’[x_1 ← v_1]\cdots[x_n ← v_n],(\textit{pc}+1,\textit{os},e).\textit{rs})
  \end{aligned}}$$
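
As a sketch of this rule, assuming the types above (the top of the operand stack is the end of the array, so the arguments sit above the closure, and the closure carries its parameter names):

```typescript
// Sketch of CALL n: push the stack frame (pc + 1, os, e) onto the runtime
// stack, extend the closure's environment with the argument bindings, and
// start the callee with an empty operand stack.
function callStep(state: MachineState, n: number): MachineState {
  const args = n > 0 ? state.os.slice(-n) : [];             // v_1 ... v_n
  const closure = state.os[state.os.length - n - 1] as Closure;
  const callerOs = state.os.slice(0, -(n + 1));             // operand stack below the call
  // Extend the closure's environment by binding formal parameters to arguments.
  const calleeEnv = new Map(closure.env);
  closure.params.forEach((x, i) => calleeEnv.set(x, args[i]));
  return {
    ...state,
    os: [],                                                 // callee starts with an empty os
    pc: closure.pc,                                         // jump to the function body
    env: calleeEnv,
    rs: [{ returnPc: state.pc + 1, os: callerOs, env: state.env }, ...state.rs],
  };
}
```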

Now, we can describe the behavior of the machine M s as a transition function \transition , which transforms machine states to machine states, and which is defined by many rules, some of which we have already covered in this document: the rules for EXECUTE, TEST_AND_SET, and CLEAR. The other rules for the transition function \transition will not be covered in this document. Instead, please refer to the document Source VM Instruction Set.