Transaction API
The transaction API is an optional, Golem specific API that allows you to control the transactional and persistent behavior of a program.
The low-level API is language independent and provides direct control over Golem's transactional features. We also provide higher level, language specific libraries to some of the supported languages. Using these libraries is more safe and convenient than using the low-level API directly.
High-level APIs
Rust
The Rust-specific high-level transaction API is exposed as part of the golem-rust (opens in a new tab) crate.
Commit oplog
The oplog_commit
function waits until the oplog is committed to its persistent storage. The function takes a single argument, replicas
, with the desired number of storage replicas the worker's journal is replicated to. The function will block until the oplog is committed to the specified number of replicas, or, if this number is larger than the available number of replicas, until it is written to all the replicas.
Temporarily change the persistence level
The use_persistence_level
(returning a guard object that restores it on drop) and the with_persistence_level
(taking a function as a parameter) functions temporarily change the persistence level of the oplog.
The following persistence levels are available:
Level | Description |
---|---|
PersistenceLevel::PersistNothing | Nothing is persisted; all operations will be repeated during the worker's recovery |
PersistenceLevel::PersistRemoteSideEffects | All operations that affects remote entities such as HTTP requests, RPC calls etc. are persisted |
PersistenceLevel::Smart | Let Golem decide |
Temporarily change the retry policy
The use_retry_policy
(returning a guard object that restores it on drop) and the with_retry_policy
(taking a function as a parameter) functions temporarily change the retry policy applied when the component crashes.
The retry policy itself is used for any failure that occurs during the execution of the program, including fatal user errors (for example a Rust panic!
) and external service failures (for example a network error). The policy defines an exponential backoff, and is defined by the following parameters:
#[derive(Clone, Copy)]
pub struct RetryPolicy {
/// The maximum number of retries before the worker becomes permanently failed
pub max_attempts: u32,
/// The minimum delay between retries (applied to the first retry)
pub min_delay: Duration,
/// The maximum delay between retries
pub max_delay: Duration,
/// Multiplier applied to the delay on each retry to implement exponential backoff
pub multiplier: u32,
}
Temporarily change the idempotence mode
The use_idempotence_mode
(returning a guard object that restores it on drop) and the with_idempotence_mode
(taking a function as a parameter) functions temporarily change the idempotence mode.
The idempotence mode defines how the worker behaves when it receives the same operation multiple times. The following modes are available:
When idempotence is enabled, the worker will assume that the external operations are idempotent, and apply at-least-once semantics when recovering from a failure. This means that Golem only stores results of external calls that already either succeeded or failed - if there was no such result when the worker crashes, the operation will be repeated.
When idempotence is disabled, the worker implements at-most-once semantics. In case a failure occurs after the external request was sent, but before receiving a result, the worker becomes permanently failed as there is no way to know if the non-idempontent external call was performed.
The generate_idempotency_key
function generates an idempotency key (a Uuid
) that can be used to deduplicate external operations. The function is guaranteed to always return the same value at a given point in the program, so it can be passed to external services in order to ensure that the operation is idempotent.
Atomic regions
The atomically
function executes a given closure atomically. This means that if the worker crashes during the execution of the closure, the closure will be re-executed after the worker recovers.
Fallible transactions
Many times external operations (such as HTTP calls to remote hosts) need to be executed transactionally. If some of the operations failed the transaction need to be rolled back - compensation actions need to undo whatever the already successfully performed operations did.
A fallible transaction only deals with domain errors. Within the transaction every operation that succeeds gets recorded. If an operation fails, all the recorded operations get compensated in reverse order before the transaction block returns with a failure.
A fallible transaction can be executed using the fallible_transaction
function, by passing a closure that can execute operations on the open transaction (see below).
Infallible transactions
An infallible transaction must always succeed - in case of a failure or interruption, it gets retried. If there is a domain error, the compensation actions are executed before the retry.
An infallible transaction can be executed using the infallible_transaction
function, by passing a closure that can execute operations on the open transaction (see below).
Defining operations
An operation is something that can be executed within a fallible or infallible transaction.
It is defined with the following trait:
/// Represents an atomic operation of the transaction which has a rollback action.
///
/// Implement this trait and use it within a `transaction` block.
/// Operations can also be constructed from closures using `operation`.
pub trait Operation: Clone {
type In: Clone;
type Out: Clone;
type Err: Clone;
/// Executes the operation which may fail with a domain error
fn execute(&self, input: Self::In) -> Result<Self::Out, Self::Err>;
/// Executes a compensation action for the operation.
fn compensate(&self, input: Self::In, result: Self::Out) -> Result<(), Self::Err>;
}
There are multiple ways to define an operation:
-
Implement the trait manually
-
Use the
operation
function to create an operation from a pair of closures
pub fn operation<In: Clone, Out: Clone, Err: Clone>(
execute_fn: impl Fn(In) -> Result<Out, Err> + 'static,
compensate_fn: impl Fn(In, Out) -> Result<(), Err> + 'static,
) -> impl Operation<In = In, Out = Out, Err = Err>
- Use the
golem_operation
macro
The #[golem_operation(compensation=xyz)]
annotation can be applied to a function that takes any number of inputs, and returns a Result
. The compensation parameter must point to another function which can have one of the following forms:
- no parameter
- single parameter, getting the result of the successful operation
- multiple parameters where the first one is the result of the successful operation, and the rest of them are the inputs of the operation
When using this macro, it generates associated functions for the transaction so they can be directly called within the transaction in the following way:
#[golem_operation(compensation = compensation_step)]
fn transaction_step(step: u64) -> Result<bool, String> {
println!("Step {step}");
Ok(remote_call(step))
}
fn compensation_step(_: bool, step: u64) -> Result<(), String> {
println!("Compensating step {step}");
remote_call_undo(step);
Ok(())
}
fallible_transaction(|tx| {
tx.transaction_step(1)?;
tx.transaction_step(2)?;
Ok(11)
})?
Low-level API
The low-level API is exposed as part of the golem:api@0.2.0
WIT package.
The following table lists all the functions available:
Function | Description |
---|---|
get-oplog-index | Returns the current oplog index. |
set-oplog-index | Restarts execution from a previous oplog index. |
oplog-commit | Explicitly waits until the oplog is committed. |
mark-begin-operation | Marks the beginning of an atomic operation. |
mark-end-operation | Commits an atomic operation. |
get-retry-policy | Returns the current retry policy. |
set-retry-policy | Sets the retry policy. |
get-oplog-persistence-level | Returns the current oplog persistence level. |
set-oplog-persistence-level | Sets the oplog persistence level. |
get-idempotence-mode | Returns the current idempotence mode. |
set-idempotence-mode | Sets the idempotence mode. |
generate-idempotency-key | Generates an idempotency key. |
Get/set oplog index
The get-oplog-index
function returns a value of oplog-index
which can be passed to the set-oplog-index
function later. When set-oplog-index
is called, the execution of the program stops immediately, followed by recovering its state up to the provided index (pointing to where get-oplog-index
was called).
Each external operation followed by get-oplog-index
will be executed again.
This is true for set-oplog-index
as well, so it always should be called only conditionally based on some side-effecting call performed between get-oplog-index
and set-oplog-index
. Otherwise the program will be stuck in an infinite loop.
Commit oplog
The oplog-commit
function waits until the oplog is committed to its persistent storage. The function takes a single argument, replicas
, with the desired number of storage replicas the worker's journal is replicated to. The function will block until the oplog is committed to the specified number of replicas, or, if this number is larger than the available number of replicas, until it is written to all the replicas.
Atomic operations
The mark-begin-operation
and mark-end-operation
functions allow you to mark the beginning and end of an atomic operation. If any failure occurs before mark-end-operation
is called, all the external operations such as HTTP calls will be repeated when the operation is retried. Once mark-end-operation
is called, the result of all these operations gets persisted and will not be repeated in the future when the worker gets recovered.
Retry policies
The get-retry-policy
function returns the worker's current retry policy, which can be overridden using set-retry-policy
.
The retry policy itself is used for any failure that occurs during the execution of the program, including fatal user errors (for example a Rust panic!
) and external service failures (for example a network error). The policy defines an exponential backoff, and is defined by the following parameters:
/// Configures how the executor retries failures
record retry-policy {
/// The maximum number of retries before the worker becomes permanently failed
max-attempts: u32,
/// The minimum delay between retries (applied to the first retry)
min-delay: duration,
/// The maximum delay between retries
max-delay: duration,
/// Multiplier applied to the delay on each retry to implement exponential backoff
multiplier: u32
}
Persistence level
The get-oplog-persistence-level
function returns the worker's current oplog persistence level, which can be overridden using set-oplog-persistence-level
.
There are three persistence levels:
Level | Description |
---|---|
persist-nothing | Nothing is persisted; all operations will be repeated during the worker's recovery |
persist-remote-side-effects | All operations that affects remote entities such as HTTP requests, RPC calls etc. are persisted |
smart | Let Golem decide |
Disabling persistence for a region of the program can be used to make sure some operations are always repeated no matter how many times the worker gets loaded. An example could be registering or logging that the worker is loaded in memory to an external service.
The default setting is smart
.
Idempotence
The get-idempotence-mode
function returns the worker's current idempotence mode, which can be overridden using set-idempotence-mode
. The default setting is true
.
When idempotence is enabled, the worker will assume that the external operations are idempotent, and apply at-least-once semantics when recovering from a failure. This means that Golem only stores results of external calls that already either succeeded or failed - if there was no such result when the worker crashes, the operation will be repeated.
When idempotence is disabled, the worker implements at-most-once semantics. In case a failure occurs after the external request was sent, but before receiving a result, the worker becomes permanently failed as there is no way to know if the non-idempontent external call was performed.
The generate-idempotency-key
function generates an idempotency key (a UUID) that can be used to deduplicate external operations. The function is guaranteed to always return the same value at a given point in the program, so it can be passed to external services in order to ensure that the operation is idempotent.