Evaluating Expressions, part 7 – Transactions and Exceptions
In part 7 of our series implementing programming language constructs with actors, we implement parallel execution of block statements. Parallel execution motivates the use of single-assignment data-flow variables. We also introduce transactions and exception handling. The only extension required to our grammar from part 6 is the inclusion of a THROW
statement:
stmt ::= 'LET' <eqtn> | 'SEND' <expr> 'TO' <expr> | 'CREATE' <ident> 'WITH' <expr> | 'BECOME' <expr> | 'THROW' <expr> | <expr>; ...
Parallel Execution
The implementation of statement blocks in part 6 was strictly sequential. Each statement completed before the next was executed. But, statements should really be executed in parallel. We can accomplish parallel execution by changing the statement-pair behavior.
LET stmt_pair_beh(h_stmt, t_stmt) = \(cust, #exec, env, sponsor).[ CREATE call WITH call_pair_beh(h_stmt, t_stmt) SEND (k_call, #exec, env, sponsor) TO call CREATE k_call WITH \result.[ SEND stmt_pair_result(result) TO cust ] ] LET stmt_pair_result = \result.( CASE result OF (#ok, #ok) : #ok (#fail, _) : #fail (_, #fail) : #fail (#ok, beh) : beh (beh, #ok) : beh _ : #fail END )
When stmt_pair_beh receives an #exec
request, it creates a call (defined in part 3) to execute the head statement h_stmt and the tail statement t_stmt concurrently. The customer for the call k_call receives the result, which is a pair of values indicating the status of each statement execution. These status values are combined, using the stmt_pair_result function, into a final result value, which is sent to the original customer cust.
Data-flow Environments
In part 6 our statement blocks used dynamic_env_beh to execute within an environment that allowed binding of new variables and rebinding of existing variables. Since each statement was executed sequentially, the order of mutations to the environment was well defined. Now that statements are executed in parallel, there are potential race conditions between variable bindings and usage of those variables. We will resolve these race conditions by implementing single-assignment data-flow variables.
Static analysis of a statement block can easily determine which identifiers will be bound by execution of the block. We use this list of identifiers to create a block environment which declares initially unbound data-flow variables for each identifier.
LET block_stmt_beh(vars, stmt) = \(cust, #exec, env, sponsor).[ CREATE env' WITH block_env_beh(env) SEND (vars, env') TO extend CREATE extend WITH \(vlist, xenv).[ CASE vlist OF (first, rest) : [ SEND (k_declare, #declare, first) TO xenv CREATE k_declare WITH \xenv'.[ SEND (rest, xenv') TO extend ] ] _ : [ SEND (cust, #exec, xenv, sponsor) TO stmt ] ] ]
When block_stmt_beh receives an #exec
request, it creates a block environment env’. A temporary actor extend is created to traverse the (NIL
-terminated) list of identifiers vars that will be bound by the block. For each identifier first, a #declare
request is sent to the block environment xenv and the extended environment is passed back to extend along with the remaining identifiers rest. Once all of the data-flow variables have been declared, the extended environment xenv is used to send an #exec
request to the statement stmt.
Block Environment
Block environments add a #declare
request to the environment protocol. The #declare
request extends the environment with an initially unbound data-flow variable, associated with an identifier.
LET block_env_beh(next) = \(cust, req).[ CASE req OF (#declare, ident) : [ CREATE next' WITH unbound_env_beh(SELF, ident, NIL, next) BECOME block_env_beh(next') SEND SELF TO cust ] _ : [ SEND (cust, req) TO next ] END ]
When block_env_beh receives a #declare
request, it creates a new unbound data-flow variable next’ associated with the identifier ident and inserts it as the new next binding for this environment. The extended block environment SELF
is sent to the customer cust. All other requests are forwarded on to the next actor in the environment chain.
Data-flow Variable
An unbound data-flow variable buffers #lookup
requests until it receives a #bind
request. Once the binding has occurred, and all buffered requests have been satisfied, it behaves like a regular bound variable. This mechanisms allow execution of multiple concurrent statements, each of which may #lookup
or #bind
the identifier, and ensures that the #bind
request completes before any of the #lookup
requests are completed. Data dependencies between independent sub-expressions are thus automatically resolved.
LET unbound_env_beh(scope, ident, waiting, next) = \(cust, req).[ CASE req OF (#lookup, $ident) : [ # wait for binding BECOME unbound_env_beh(scope, ident, (cust, waiting), next) ] (#bind, $ident, value) : [ BECOME bound_env_beh(scope, ident, value, next) SEND waiting TO NEW \list.[ # deliver value to waiting IF $list = (first, rest) [ SEND value TO first SEND rest TO SELF ] ] SEND scope TO cust ] _ : [ SEND (cust, req) TO next ] END ]
When unbound_env_beh receives a #lookup
request matching the identifier ident, it adds the customer cust to the list of waiting customers. Note that there is no response to the customer at this point in time.
When unbound_env_beh receives a #bind
request matching the identifier ident, it becomes bound to the value and sends responses to each of the customers in the waiting list. The enclosing block environment scope is sent to the #bind
customer as the environment “extended” by the binding.
All other requests are forwarded on to the next actor in the environment chain.
Bound Variable
A single-assignment variable can only be bound once within a particular scope. In order to support this, the bound environment behavior env_beh from part 5 is extended to handle #bind
requests.
LET bound_env_beh(scope, ident, value, next) = \(cust, req).[ CASE req OF (#lookup, $ident) : [ SEND value TO cust ] (#bind, $ident, $value) : [ SEND scope TO cust ] (#bind, $ident, value') : [ SEND ? TO cust ] _ : [ SEND (cust, req) TO next ] END ]
When bound_env_beh receives a #bind
request matching the identifier ident, there are two cases to consider. If the value matches the bound value, then the scope is sent to the customer cust, indicating that the binding was successful (although redundant). If the value does not match, then the undefined value ?
is sent to the customer cust, indicating an error (the environment containing the conflicting bindings is “undefined”).
Exceptions and Transactions
We’ve worked very hard to give a reasonable meaning to practically every combination of language elements. For example, we’ve defined all functions to be “total” by providing a specific “undefined” value ?
. However, it seems inevitable that some of our imperative statements will “fail” under certain conditions. When they do, we need a way to signal an exception.
When an exception occurs during the execution of an actor behavior, we ensure that none of the effects (SEND
, CREATE
and BECOME
) are visible to other actors. This amounts to establishing a kind of transaction within which the actor behavior executes. The execution sponsor (described in part 6) is a natural place to implement these transactional semantics.
As a first example, consider the LET
statement (from part 5). When the equation in a LET
statement fails to unify (as discussed in part 4), the LET
statements “fails”. We modify the behavior of the LET
statement to “throw” an exception when it fails.
LET let_stmt_beh(eqtn) = \(cust, #exec, env, sponsor).[ SEND (k_env, #unify, env) TO eqtn CREATE k_env WITH \env'.[ CASE env' OF ? : [ SEND (cust, #throw, #conflict, eqtn) TO sponsor ] _ : [ SEND #ok TO cust ] END ] ]
When let_stmt_beh receives an #exec
request, and the equation eqtn responds to the #unify
request with ?
(indicating unification failure), a #throw
request is sent to the sponsor, which will eventually send #fail
to the customer cust.
Throw Statement
The THROW
statement allows a behavior to explicitly signal an exception. An arbitrary expression is evaluated to produce an exception value that can provide information about the cause of the exception.
LET throw_stmt_beh(expr) = \(cust, #exec, env, sponsor).[ SEND (k_throw, #eval, env) TO expr # evaluate exception expression CREATE k_throw WITH \exception.[ SEND (cust, #throw, exception) TO sponsor ] ]
When throw_stmt_beh receives an exec
request, it sends an #eval
request to the exception expression expr. The customer for this request k_throw receives the exception value and sends it in a #throw
request to the sponsor. The sponsor is expected to eventually send #fail
to the customer cust.
Transaction Sponsor
In part 6 of this series we introduced the concept of a sponsor. A sponsor can monitor and control access to computational resources. When an actor wants to CREATE
another actor, or SEND
a message, we delegate that action to the sponsor. This allows the sponsor to implement the transactional semantics we need in order to properly support exceptions.
When an actor begins processing a message, we provide a sponsor which captures the actions requested by the actor. We keep a list of actors created and messages sent during execution of the actor’s behavior. If the behavior completes successfully, we commit the transaction, obtaining the final list of actors and messages.
LET ok_xact_beh(mlist, alist) = \(cust, req).[ CASE req OF #new : [ CREATE actor WITH new_actor_beh BECOME ok_xact_beh(mlist, (actor, alist)) SEND actor TO cust ] (#send, msg, target) : [ BECOME ok_xact_beh(((msg, target), mlist), alist) SEND #ok TO cust ] (#throw, exception) : [ # raise an exception BECOME fail_xact_beh(exception) SEND #fail TO cust ] #commit : [ SEND (mlist, alist) TO cust ] #revert : [ SEND (#exception, #ok) TO cust ] END ]
When ok_xact_beh receives a #new
request, it creates a new uninitialized actor, adds it to the list of actors created, and sends it to the customer cust.
When ok_xact_beh receives a #send
request, it adds the message msg and target actor to the list of pending messages, and sends #ok
to the customer cust.
When ok_xact_beh receives a #throw
request, it becomes fail_xact_beh (a failed transaction) and sends #fail
to the customer cust.
When ok_xact_beh receives a #commit
request, it sends the list of pending messages mlist and the list of created actors alist to the customer cust.
When ok_xact_beh receives a #revert
request, it sends an exception to the customer cust indicating that a non-failed transaction was reverted. This should not happen. We expect a successful transaction to be committed instead.
If an exception is thrown during execution of the actor’s behavior, any actors created or messages sent are forgotten (and reclaimed by garbage collection). We retain the exception value and ignore further CREATE
and SEND
actions. The executing actor’s behavior will eventually #fail
, causing us to revert the transaction, obtaining the value of the exception thrown.
LET fail_xact_beh(exception) = \(cust, req).[ CASE req OF #new : [ SEND NEW new_actor_beh TO cust ] #commit : [ SEND (NIL, NIL) TO cust ] #revert : [ SEND (#exception, exception) TO cust ] _ : [ SEND #fail TO cust ] END ]
When fail_xact_beh receives a #new
request, it creates a new uninitialized actor and sends it to the customer cust. This is needed to allow processing of concurrent statements in the actor behavior to complete. The created actor is forgotten.
When fail_xact_beh receives a #commit
request, it sends (NIL, NIL)
to the customer cust indicating that the transaction had no effect. This should not happen. We expect a failed transaction to be reverted instead.
When fail_xact_beh receives a #revert
request, it sends the exception to the customer cust.
When fail_xact_beh receives any other request, it sends #fail
to the customer cust.
Monitor
When a message is dispatched to an actor behavior the customer provided is a monitor that eventually receives an #ok
or #fail
status. If the status is #ok
, we commit the transaction held by the sponsor, dispatching the pending messages. If the status is #fail
, we revert the transaction. In both cases the effects of dispatching this particular message/target pair is sent to the configuration in a #trace
message (for logging and debugging).
LET monitor_beh(config, msg, target, sponsor) = \result.[ CASE result OF #ok : [ SEND (k_ok, #commit) TO sponsor CREATE k_ok WITH \(mlist, alist).[ SEND (#trace, (msg, target), (mlist, alist)) TO config SEND mlist TO SELF BECOME \list.[ IF $list = ((m, a), rest) [ SEND (#send, m, a) TO config SEND rest TO SELF ] ] ] ] #fail : [ SEND (k_fail, #revert) TO sponsor CREATE k_fail WITH \(#exception, exception).[ SEND (#trace, (msg, target), (#exception, exception)) TO config ] ] END ]
When monitor_beh receives a result of #ok
, it sends a #commit
request to the sponsor. The customer k_ok receives the list of pending messages mlist and the list of created actors alist, and sends them in a #trace
message to the configuration config. The pending message list is dispatched by sending #send
requests to the configuration config for each message m and target actor a.
When monitor_beh receives a result of #fail
, it sends a #revert
request to the sponsor. The customer k_fail receives the exception, and sends it in a #trace
message to the configuration config.
Configuration
The configuration is responsible for actually performing the actions requested, either by an actor’s behavior, or injected from outside the configuration. When a configuration dispatches a message, it creates a sponsor to capture the transaction and a monitor to handle the final status of the dispatch process.
LET config_beh(log) = \msg.[ CASE msg OF (#new, cust) : [ CREATE actor WITH new_actor_beh SEND actor TO cust SEND (#trace, ?, (NIL, (actor, NIL))) TO SELF ] (#send, msg, target) : [ CREATE sponsor WITH ok_xact_beh(NIL, NIL) CREATE monitor WITH monitor_beh(SELF, msg, target, sponsor) SEND (monitor, sponsor, msg) TO target ] (#trace, action, result) : [ SEND (action, result) TO log ] END ]
When config_beh receives a #new
message, it creates a new uninitialized actor and sends it to the customer cust. It also sends a #trace
message to itself, indicating that an actor was created outside of an actor behavior (thus the action is “undefined”).
When config_beh receives a #send
message, it creates a new transaction sponsor and a monitor, and uses them to dispatch the message msg to the target actor. Note that the actor behavior (described in part 6) is serialized, so concurrently dispatched messages are handled properly.
When config_beh receives a #trace
message, it sends the action and result to the log, where the provenance of all actions may be recorded.
Conclusion
Over the course of this 7-part series we have incrementally constructed a meta-circular description of the Humus programming language. We started with a simple, but Turing-complete, functional core in part 1. Special handling of conditional expressions was explored in part 2. Composite values, built from pairs, and parallel evaluation were presented in part 3. The pure-functional language of expressions was completed in part 4, with the introduction of pattern-matching equations, a powerful alternative to assignment. Mutable state was introduced in part 5, as a means of handling mutually-recursive references. With this strong foundation laid, part 6 presented an implementation of the actor primitives (SEND
, CREATE
and BECOME
). And in this final, part 7, we have rounded out the language with transactions and exceptions, allowing fully-concurrent semantics even within the behavior of an actor.
Along the way we have seen many patterns and techniques for building non-trivial systems with actors. We have used concurrent, rather than sequential, composition pervasively. We have established and evolved a variety of protocols between actors, allowing significantly different implementations to be substituted in a modular fashion. One of these, the “environment”, was shown at least three semantically-distinct ways, to support the needs of static-binding, dynamic-binding and single-assignment data-flow variables. Hopefully, this exploration will help you better understand the elegance and strength of Humus as a pure-actor language.
Tags: actor, concurrency, data-flow, evaluation, exception, execution, language, parallel, protocol, synchronization, transaction
[…] When an actor’s behavior function is applied to a message, it results in a block statement that represents the actions the actor will take in response to the message. The block establishes a dynamic environment for any bindings created by the statements. The statements are chained together and executed sequentially (parallel execution is described in part 7). […]