Compared to Source:
- The addition of three primitive concurrent functions.
For details, see Section “Concurrency Support” below.
Please refer to Appendix A for explanations of the compilation rules and VM transition rules.
This concurrent system executes code concurrently in (potentially multiple) threads. Communication between threads is achieved by updating the values of memory shared between threads. Each thread is a collection of registers: an operand stack \textit{os}, a program counter \textit{pc}, an environment \textit{e}, and a runtime stack \textit{rs}.
To comply with the textbook, the thread that calls \texttt{concurrent\_execute} also continues to execute the rest of its code concurrently with the spawned threads. Furthermore, there is no \texttt{join} primitive concurrent function. However, this design departs from the textbook in one respect: the system does not execute parallel threads. Instead, it executes concurrent threads.
The following primitive concurrent functions are supported:
- $\texttt{concurrent\_execute}(\texttt{f}_1, \ldots, \texttt{f}_n)$: sets up multiple threads for concurrent execution. For each nullary function $\texttt{f}_i$ that returns \texttt{undefined}, a thread is set up that executes the code in the body of $\texttt{f}_i$. The thread that called \texttt{concurrent\_execute} also executes concurrently with all of them. Returns \texttt{undefined}. This is an atomic operation.
- \texttt{test\_and\_set(a)}: assumes the head of array \texttt{a} is a boolean \texttt{b}. Sets the head of \texttt{a} to \texttt{true}. Returns \texttt{b}. This is an atomic operation.
- \texttt{clear(a)}: sets the head of array \texttt{a} to \texttt{false}. Returns \texttt{undefined}. This is an atomic operation.
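Since Source is a sublanguage of JavaScript, the behavior of the two array-based primitives can be sketched in plain JavaScript. This is a single-threaded model only: atomicity comes for free here, whereas the real system guarantees it under concurrency. The \texttt{acquire}/\texttt{release} helpers are illustrative and not part of the language.

```javascript
// Model of test_and_set: read the head of the array (assumed boolean),
// set it to true, and return the old value. Atomic in the real system.
function test_and_set(a) {
  const old = a[0];
  a[0] = true;
  return old;
}

// Model of clear: reset the head of the array to false.
function clear(a) {
  a[0] = false;
  return undefined;
}

// Typical usage: a one-element array as a mutex (hypothetical helpers).
const mutex = [false];
function acquire(m) { while (test_and_set(m)) {} }  // spin until we saw false
function release(m) { clear(m); }

acquire(mutex);   // first caller sees false and enters
// ... critical section would go here ...
release(mutex);
```

The spin lock works because only the thread that observes \texttt{false} proceeds; every later caller observes \texttt{true} and keeps spinning until \texttt{clear} is called.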
- For simplicity, the heap is not represented in the rules.
$$\Rule{E_1 \translateaux s_1 \qquad \cdots \qquad E_n \translateaux s_n}{\texttt{concurrent\_execute}(E_1, \cdots , E_n) \translateaux s_1. \cdots .s_n.\texttt{EXECUTE } n}$$
A call of \texttt{concurrent\_execute} thus compiles to the dedicated instruction \texttt{EXECUTE}, rather than to the generic \texttt{CALLV id }$n$ used for other primitive functions.
There are two additional structures in our VM:
- \textit{tq}, a register which holds a queue of threads.
- \textit{to}, a register initialized with a random timeout value $c$, that indicates how many instructions are left for the currently running thread to run.
The state representing our VM therefore has two more corresponding components: it is a tuple $(\textit{os}, \textit{pc}, \textit{e}, \textit{rs}, \textit{tq}, \textit{to})$.
Some notes on program execution:
- When the program calls \texttt{concurrent\_execute}, it executes the rest of its code along with the concurrent threads; the sequential program therefore becomes a concurrent thread.
- Consider the case when a program does not use any primitive concurrent functions. To avoid distinguishing between the case when the program is sequential and the case when it is concurrent, thereby simplifying the rules, the sequential program must still execute like a concurrent thread. This means sequential execution may time out, and may be pushed onto and popped from the thread queue. However, implementations of the VM may, as an optimization, avoid this pushing and popping while executing sequential code.
- As a result, all programs here are concurrent programs, even if they do not call \texttt{concurrent\_execute}.
- None of the programs here return a result, because concurrent programs should not return a result, since concurrent threads return nothing.
Design choices:
- Our concurrent system simulates a uniform distribution of execution traces, in the sense that running the same program infinitely many times should give a uniform distribution of program execution traces. This is in contrast to actual concurrent systems, where some traces occur with very low probability. There, the programmer may not even be aware that such low-probability traces exist, so it is possible for the programmer to believe their program is correct until they encounter a bug in one of the low-probability execution traces. To avoid this, our concurrent system presents each trace to the programmer with equal probability. The programmer is then exposed more often to low-probability execution traces, and thinks more about them when reasoning about their code. We achieve this by inserting uniform randomness into the scheduling algorithm.
- With round-robin scheduling, where resource starvation is impossible, there are three possible ways of inserting uniform randomness, but some introduce starvation (in each way of inserting uniform randomness, we assume all other parts of the scheduling are non-random):
  - Random removal from the thread queue introduces starvation, since it is possible for some thread to never be removed, and thus never be scheduled.
  - Random insertion into the thread queue introduces starvation: among concurrent threads $t_1, \ldots, t_n$, whenever some thread $t_i$ performs a nested call of \texttt{concurrent\_execute} that spawns children, it is possible for $t_i$'s children to be inserted in front of all of $t_1, \ldots, t_n$ in the thread queue. If this scenario repeats for nested calls to \texttt{concurrent\_execute} in each of $t_i$'s children, then none of $t_1, \ldots, t_n$ will ever be scheduled.
  - Random time quanta do not introduce starvation, since no priority is assigned to concurrent threads, and the ordering of existing concurrent threads in the thread queue is respected.
Therefore, to avoid starvation, we choose to insert uniform randomness by allocating uniformly random time quanta to concurrent threads. This choice of inserting uniform randomness has the additional benefit of also being fair when the execution time of a single run goes to infinity: the expected amount of time allocated to each concurrent thread is equal.
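The chosen scheduling discipline can be sketched in JavaScript, modeling each thread as a generator that yields once per executed instruction. The queue discipline (dequeue from the front, enqueue at the back) and the fresh random quantum per dispatch follow the description above; the bound \texttt{MAX\_QUANTUM} is an illustrative assumption, not part of the specification.

```javascript
// Round-robin scheduling with uniformly random time quanta.
const MAX_QUANTUM = 5;  // assumed upper bound on a quantum

function run(threads) {
  const queue = [...threads];   // the thread queue tq
  const trace = [];             // which thread executed each step
  while (queue.length > 0) {
    const t = queue.shift();    // dequeue the next thread
    // fresh uniformly random timeout in 1..MAX_QUANTUM
    let to = 1 + Math.floor(Math.random() * MAX_QUANTUM);
    let done = false;
    while (to > 0 && !done) {
      const step = t.gen.next();            // execute one "instruction"
      if (step.done) { done = true; }
      else { trace.push(t.id); }
      to = to - 1;                          // decrement the timeout
    }
    if (!done) { queue.push(t); }           // timed out: back of the queue
  }
  return trace;
}

// A thread that performs n instructions, then finishes.
function* count(n) { for (let i = 0; i < n; i = i + 1) { yield i; } }

const trace = run([{ id: "a", gen: count(7) }, { id: "b", gen: count(7) }]);
```

Because only the quantum is random while the queue order is respected, every thread is dispatched again after at most one full round, so no thread starves; over many runs the interleaving points vary uniformly.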
$$\Rule{}{
\begin{aligned}
&(\textit{os}_1, \textit{pc}_1, \textit{e}_1, \textit{rs}_1, (\textit{os}_2, \textit{pc}_2, \textit{e}_2, \textit{rs}_2).\textit{tq}, 0) \\
\transition &(\textit{os}_2, \textit{pc}_2, \textit{e}_2, \textit{rs}_2, \textit{tq}.(\textit{os}_1, \textit{pc}_1, \textit{e}_1, \textit{rs}_1), c)
\end{aligned}}$$
If a thread times out (its timeout reaches $0$) and has not finished execution (has not executed the \texttt{RET} statement), then it is enqueued at the back of the thread queue, and the next thread is dequeued from the front of the thread queue, with a fresh random timeout value $c$.
The above rule assumes there is at least one thread in the thread queue. To cover all cases, here is the rule for zero threads in the thread queue:
$$\Rule{}{(\textit{os}, \textit{pc}, \textit{e}, \textit{rs}, \langle \rangle, 0) \transition (\textit{os}, \textit{pc}, \textit{e}, \textit{rs}, \langle \rangle, c)}$$
$$\Rule{s(\textit{pc}) \neq \texttt{RET} \qquad \textit{to} > 0}{(\textit{os}, \textit{pc}, \textit{e}, \textit{rs}, \textit{tq}, \textit{to}) \transition (\textit{os}', \textit{pc}', \textit{e}', \textit{rs}', \textit{tq}, \textit{to}-1)}$$
While the timeout is nonzero and the current instruction is not \texttt{RET}, the thread executes one instruction exactly as in the sequential machine, and the timeout decrements.
$$\Rule{s(\textit{pc}) = \texttt{RET} \qquad \textit{to} > 0 \qquad \textit{rs} \neq \langle \rangle}{(\textit{os}, \textit{pc}, \textit{e}, \textit{rs}, \textit{tq}, \textit{to}) \transition (\textit{os}', \textit{pc}', \textit{e}', \textit{rs}', \textit{tq}, \textit{to}-1)}$$
This rule covers a \texttt{RET} statement inside a function: the runtime stack is not empty, and the thread does the normal thing of popping a frame from the runtime stack, as in the sequential machine.
$$\Rule{s(\textit{pc}) = \texttt{EXECUTE } n \qquad \textit{to} > 0}{
\begin{aligned}
&((\textit{pc}_1, \textit{e}_1). \cdots .(\textit{pc}_n, \textit{e}_n).\textit{os}, \textit{pc}, \textit{e}, \textit{rs}, \textit{tq}, \textit{to}) \\
\transition\ &(\textit{os}, \textit{pc}+1, \textit{e}, \textit{rs}, \textit{tq}.(\langle \rangle, \textit{pc}_1, \textit{e}_1, \langle \rangle). \cdots .(\langle \rangle, \textit{pc}_n, \textit{e}_n, \langle \rangle), \textit{to}-1)
\end{aligned}}$$
Closures representing threads are popped from the operand stack, and for each of them a corresponding thread, with an empty operand stack and an empty runtime stack, is appended to the thread queue. The thread that called \texttt{concurrent\_execute} also continues to execute concurrently with the other threads, until its own timeout expires. Hence even a thread that spawns children and calls \texttt{concurrent\_execute} infinitely many times cannot starve the others: every thread already in the queue will always be run.
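The \texttt{EXECUTE} transition can be sketched as a function on an explicit machine state. The field names mirror the rule's tuple $(\textit{os}, \textit{pc}, \textit{e}, \textit{rs}, \textit{tq}, \textit{to})$; advancing \textit{pc} past the instruction is an assumption of this sketch, and the heap is elided as in the rules.

```javascript
// EXECUTE n: pop n closures (pc_i, e_i) from the operand stack and
// append one fresh thread per closure to the thread queue.
function execute(state, n) {
  const closures = state.os.slice(0, n);           // (pc_1,e_1) ... (pc_n,e_n)
  const newThreads = closures.map(c =>
    ({ os: [], pc: c.pc, e: c.e, rs: [] }));       // empty os and rs
  return {
    os: state.os.slice(n),      // the closures are consumed
    pc: state.pc + 1,           // assumed: advance past EXECUTE
    e: state.e,
    rs: state.rs,
    tq: state.tq.concat(newThreads),  // spawned threads join the queue
    to: state.to - 1,           // one instruction of the quantum used up
  };
}
```

Note that the calling thread keeps its own registers: spawning is just a queue operation plus popping the closures.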
$$\Rule{s(\textit{pc}_1) = \texttt{RET} \qquad \textit{to} > 0 \qquad \textit{rs}_1 = \langle \rangle}{(\textit{os}_1, \textit{pc}_1, \textit{e}_1, \textit{rs}_1, (\textit{os}_2, \textit{pc}_2, \textit{e}_2, \textit{rs}_2).\textit{tq}, \textit{to}) \transition (\textit{os}_2, \textit{pc}_2, \textit{e}_2, \textit{rs}_2, \textit{tq}, c)}$$
If a thread executes the \texttt{RET} statement and its runtime stack is empty, then the thread has finished execution. It is not enqueued on the thread queue; instead, the next thread is dequeued from the thread queue, with a fresh random timeout value $c$.
The above rule assumes there is at least one thread in the thread queue. To cover all cases, here is the rule for zero threads in the thread queue:
$$\Rule{s(\textit{pc}) = \texttt{RET} \qquad \textit{to} > 0 \qquad \textit{rs} = \langle \rangle \qquad \textit{tq} = \langle \rangle}{(\textit{os}, \textit{pc}, \textit{e}, \textit{rs}, \textit{tq}, \textit{to}) \transition (\textit{os}, \textit{pc}, \textit{e}, \textit{rs}, \textit{tq}, \textit{to}-1)}$$
If a thread executes the \texttt{RET} statement, both the runtime stack and the thread queue are empty, and the timeout is nonzero, then the timeout decrements, and our VM stops.
- For simplicity, registers and the heap are not represented in the rules, except \textit{os} and \textit{pc}.
- \texttt{test\_and\_set} is an atomic operation.
$$\Rule{E \translateaux s}{\texttt{test\_and\_set}(E) \translateaux s.\texttt{TEST\_AND\_SET}}$$
A call of \texttt{test\_and\_set} thus compiles to the dedicated instruction \texttt{TEST\_AND\_SET}, rather than to the generic \texttt{CALLV id 1}.
$$\Rule{E \translateaux s}{\texttt{clear}(E) \translateaux s.\texttt{CLEAR}}$$
A call of \texttt{clear} likewise compiles to the dedicated instruction \texttt{CLEAR}.
$$\Rule{s(\textit{pc}) = \texttt{TEST\_AND\_SET}}{(a.\textit{os},\textit{pc}) \transition (b.\textit{os},\textit{pc} + 1)}$$
where $b$ is the boolean stored at the head of the array $a$ before the instruction sets that head to \texttt{true}.
$$\Rule{s(\textit{pc}) = \texttt{CLEAR}}{(a.\textit{os},\textit{pc}) \transition (\textit{os},\textit{pc} + 1)}$$
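The two instruction-level transitions can be sketched as a single step function on the operand stack and program counter, matching the rules above. A JavaScript array stands in for the heap-allocated array $a$; the function name \texttt{step} and the string encoding of instructions are assumptions of this sketch.

```javascript
// One machine step for the TEST_AND_SET and CLEAR instructions,
// operating on (os, pc) as in the transition rules (heap elided).
function step(instr, os, pc) {
  if (instr === "TEST_AND_SET") {
    const [a, ...rest] = os;        // a is on top of the operand stack
    const b = a[0];                 // old head of the array
    a[0] = true;                    // heap update: head of a becomes true
    return { os: [b, ...rest], pc: pc + 1 };  // b replaces a on the stack
  }
  if (instr === "CLEAR") {
    const [a, ...rest] = os;
    a[0] = false;                   // heap update: head of a becomes false
    return { os: rest, pc: pc + 1 };  // CLEAR pushes nothing
  }
  throw new Error("unknown instruction: " + instr);
}
```

Both branches perform their read and write in one step, which is how the model expresses atomicity: no other thread can be scheduled between the read of $a$'s head and its update.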
A rule
$$\Rule{x_1 \quad \cdots \quad x_n}{x}$$
states that the conclusion $x$ holds whenever all premises $x_1, \ldots, x_n$ hold; the horizontal inference line plays the role of an implication from the premises to the conclusion.
The translation from SourceA to SVML is accomplished by a translation function, defined by rules of the form shown above. The auxiliary translation function $\translateaux$ covers the new primitive concurrent functions \texttt{concurrent\_execute}, \texttt{test\_and\_set}, and \texttt{clear}. The other rules for the auxiliary translation function carry over unchanged from the sequential language.
The machine that we will use to execute SVML programs is a variation of a \emph{push-down automaton}. Let us fix a specific SVML program $s$, represented as a sequence of instructions.
The program counter \textit{pc} is used to point to a specific instruction in $s$. For example, if
$$\begin{aligned}
s = \lbrack\ &\texttt{LDCN 1},\\
&\texttt{LDCN 2},\\
&\texttt{PLUS},\\
&\texttt{LDCN 3},\\
&\texttt{TIMES},\\
&\texttt{DONE}\ \rbrack
\end{aligned}$$
and $\textit{pc} = 2$, then the program counter points to the instruction \texttt{PLUS}.
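The arithmetic fragment of the machine can be sketched as a tiny interpreter driven by \textit{pc} and \textit{os}. This is a sketch covering only \texttt{LDCN}, \texttt{PLUS}, \texttt{TIMES}, and \texttt{DONE}; the string encoding of instructions and the function name \texttt{runSVML} are assumptions, not the document's representation.

```javascript
// Minimal evaluator for the arithmetic SVML fragment: pc walks the
// program, os holds intermediate values, DONE yields the top of os.
function runSVML(s) {
  let pc = 0;
  let os = [];                          // operand stack, top at index 0
  while (true) {
    const [op, arg] = s[pc].split(" ");
    if (op === "LDCN") {                // push a numeric constant
      os = [Number(arg), ...os]; pc = pc + 1;
    } else if (op === "PLUS") {         // pop two operands, push their sum
      const [b, a, ...r] = os; os = [a + b, ...r]; pc = pc + 1;
    } else if (op === "TIMES") {        // pop two operands, push product
      const [b, a, ...r] = os; os = [a * b, ...r]; pc = pc + 1;
    } else {                            // DONE: result is the top of os
      return os[0];
    }
  }
}

const result = runSVML(["LDCN 1", "LDCN 2", "PLUS", "LDCN 3", "TIMES", "DONE"]);
// computes (1 + 2) * 3
```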
\paragraph{Operand Stack}
The operand stack is a sequence of values; the machine instructions pop their arguments from it and push their results onto it.
\paragraph{Compilation of Names}
Similar to the approach of the previous chapter, we implement names by environments. To this aim, we add a register \textit{e} that holds the current environment.
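Environments and the extension operation $e[x \leftarrow v]$ used in the rules below can be modeled as immutable maps; the representation as JavaScript objects and the helper names are assumptions of this sketch.

```javascript
// e[x <- v]: extend environment e with a binding of x to v.
// Object spread builds a new environment; the old one is untouched.
const extend = (e, x, v) => ({ ...e, [x]: v });

// Look up the value bound to name x in environment e.
const lookup = (e, x) => e[x];

const e0 = {};                    // the empty environment
const e1 = extend(e0, "y", 7);    // e0[y <- 7]
```

Immutability matters here: after a function call extends the callee's environment, the caller's saved environment must remain valid on the runtime stack.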
\paragraph{Execution of Function Application}
According to the translation of function application, the instruction \texttt{CALL }$n$ expects the $n$ argument values on top of a closure on the operand stack. A first attempt at its rule is:
$$\Rule{s(\textit{pc}) = \texttt{CALL } n}{
\begin{aligned}
&(v_n.\ldots v_1.(\textit{address},x_1\cdots x_n,e').\textit{os},\textit{pc},e) \\
\transition\ &(\textit{os},\textit{address},e'[x_1 \leftarrow v_1]\cdots[x_n \leftarrow v_n])
\end{aligned}}$$
There is, however, a major difficulty with this rule. What should happen when a function returns? In other words, what should the machine do when it encounters the instruction \texttt{RET}?
In order to keep program execution in a simple loop, we need to make this return information explicit. Since functions can call other functions before returning, the natural data structure for this return information is a stack. We call this stack the \emph{runtime stack}. The runtime stack, denoted by \textit{rs}, is a sequence of frames, each of which saves a return address, an operand stack, and an environment.
Function application pushes a new stack frame on the runtime stack, in addition to the actions described in the first attempt above. Thus, the actual rule for \texttt{CALL }$n$ is:
$$\Rule{s(\textit{pc}) = \texttt{CALL } n}{
\begin{aligned}
&(v_n.\ldots v_1.(\textit{address},x_1\cdots x_n,e').\textit{os},\textit{pc},e,\textit{rs}) \\
\transition\ &(\langle \rangle,\textit{address},e'[x_1 \leftarrow v_1]\cdots[x_n \leftarrow v_n],(\textit{pc}+1,\textit{os},e).\textit{rs})
\end{aligned}}$$
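The \texttt{CALL }$n$ rule with the runtime stack can be sketched as a state-transition function. Field names follow the rule's components; the closure layout \texttt{\{address, params, e\}} is an assumed representation.

```javascript
// CALL n: pop n arguments and a closure, push the frame (pc+1, os, e)
// on the runtime stack, and jump into the function body with a fresh
// operand stack and the extended closure environment e'[x_i <- v_i].
function call(state, n) {
  const args = state.os.slice(0, n).reverse();  // [v_1, ..., v_n] (v_n was on top)
  const closure = state.os[n];                  // (address, x_1...x_n, e')
  let e = closure.e;
  for (let i = 0; i < n; i = i + 1) {
    e = { ...e, [closure.params[i]]: args[i] }; // e'[x_i <- v_i]
  }
  return {
    os: [],                                     // fresh operand stack
    pc: closure.address,                        // jump to the body
    e: e,
    rs: [{ pc: state.pc + 1, os: state.os.slice(n + 1), e: state.e },
         ...state.rs],                          // push the return frame
  };
}
```

On \texttt{RET}, the inverse happens: the top frame is popped and its saved \textit{pc}, \textit{os}, and \textit{e} are restored, with the return value pushed onto the restored operand stack.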
Now, we can describe the behavior of the machine for the new instructions \texttt{EXECUTE}, \texttt{TEST\_AND\_SET}, and \texttt{CLEAR}.
The other rules for the transition function carry over unchanged from the sequential machine.