Tarantool 3.3
Release date: November 29, 2024
Releases on GitHub: v. 3.3.0
The 3.3 release of Tarantool adds the following main product features and improvements for the Community and Enterprise editions:
- Community Edition (CE)
- Improvements around queries with offsets.
- Improvement in Raft implementation.
- Persistent replication state.
- New C API for sending work to the TX thread from user threads.
- JSON cluster configuration schema.
- New on_event callback in application roles.
- API for user-defined alerts.
- Isolated instance mode.
- Automatic instance expulsion.
- New configuration option for Lua memory size.
- Enterprise Edition (EE)
- Offset-related improvements in read views.
- Supervised failover improvements.
Tarantool 3.3 brings a number of improvements around queries with offsets.
The performance of the tree index select() with offset and of the count() method was improved. Previously, the algorithm complexity had a linear dependency on the provided offset size (O(offset)) or on the number of tuples to count. Now the algorithm complexity is O(log(size)), where size is the number of tuples in the index, and it no longer depends on the offset value or on the number of tuples to count.
The index and space entities get a new offset_of method that returns the position of the tuple matching the given key, relative to the given iterator direction:

-- index: {{1}, {3}}
index:offset_of({3}, {iterator = 'eq'})  -- returns 1: [1, <3>]
index:offset_of({3}, {iterator = 'req'}) -- returns 0: [<3>, 1]
A new offset parameter has been added to the index:pairs() method, allowing you to skip the first tuples of the iteration.
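For example, a minimal sketch in the spirit of the offset_of example above, assuming the same index with tuples {1} and {3} and that the offset value is passed in the options table alongside the iterator:

-- Skip the first tuple of the iteration and print the rest.
for _, tuple in index:pairs({}, {iterator = 'ge', offset = 1}) do
    print(tuple[1]) -- prints 3 for the index {{1}, {3}}
end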
The same improvements are also introduced to read views in the Enterprise Edition:
- Improved performance of the tree index read view select() with offset.
- A new offset_of() method of index read views.
- A new offset parameter in the index_read_view:pairs() method.
To better match the canonical Raft algorithm design, Tarantool no longer rolls back synchronous transactions on timeout (upon reaching replication.synchro_timeout). In the new implementation, transactions can only be rolled back by a new leader after it is elected; otherwise, they wait for a quorum indefinitely.
Given this change in behavior, a new replication_synchro_timeout compat option is introduced. To try the new behavior, set this option to new:
In YAML configuration:

compat:
  replication_synchro_timeout: new

In Lua code:

tarantool> require('compat').replication_synchro_timeout = 'new'
---
...
There is also a new replication.synchro_queue_max_size configuration option that limits the total size of transactions in the master synchronous queue. The default value is 16 megabytes.
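For example, the effective value can be checked at runtime through the config module (a minimal sketch using config:get(), which reads an applied configuration option):

local config = require('config')
-- Returns the configured limit; the default is 16 megabytes.
print(config:get('replication.synchro_queue_max_size'))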
New public C API functions tnt_tx_push() and tnt_tx_flush() allow sending work to the TX thread from any other thread:
- tnt_tx_push() schedules the given callback to be executed with the provided arguments.
- tnt_tx_flush() sends all pending callbacks for execution in the TX thread. Execution starts in the same order as the callbacks were pushed.
The Tarantool cluster configuration schema is now available in the JSON format. A schema lists the configuration options of a specific Tarantool version together with their descriptions. As of the Tarantool 3.3 release date, schemas for the released Tarantool 3.x versions are available.
Additionally, the latest schema reflects the configuration schema currently in development (the master branch).
Use these schemas to add code completion for YAML configuration files and get hints with option descriptions in your IDE, or validate your configurations, for example, with check-jsonschema:
$ check-jsonschema --schemafile https://download.tarantool.org/tarantool/schema/config.schema.3.3.0.json config.yaml
There is also a new API for generating the JSON configuration schema as a Lua table – the config:jsonschema() function.
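For example, a minimal sketch that dumps the generated schema as JSON (the built-in json module is used here only for serialization):

local config = require('config')
local json = require('json')
-- Generate the configuration schema as a Lua table and serialize it.
local schema = config:jsonschema()
print(json.encode(schema))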
Now application roles can have on_event callbacks. They are executed every time a box.status system event is broadcast or the configuration is updated. The callback has three arguments:
- config – the current configuration.
- key – an event that has triggered the callback: config.apply or box.status.
- value – the value of the box.status system event.
Example:
return {
    name = 'my_role',
    validate = function() end,
    apply = function() end,
    stop = function() end,
    on_event = function(config, key, value)
        local log = require('log')
        log.info('on_event is triggered by ' .. key)
        -- value holds the box.status event payload; tostring() guards against non-string values.
        log.info('is_ro: ' .. tostring(value.is_ro))
        log.info('roles_cfg.my_role.foo: ' .. tostring(config.foo))
    end,
}
Now developers can raise their own alerts from their application or application roles.
For this purpose, a new API is introduced into the config module. The config:new_alerts_namespace() function creates a new alerts namespace – a named container for user-defined alerts:
local config = require('config')
local alerts = config:new_alerts_namespace('my_alerts')
Alerts namespaces provide methods for managing alerts within them. All user-defined alerts raised in all namespaces are shown in box.info.config.alerts.
To raise an alert, use the namespace methods add() or set(). The difference between them is that set() accepts a key that can be used to refer to the alert later: to overwrite or discard it. An alert is a table with one mandatory field message (its value is logged) and arbitrary user-defined fields:
-- Raise a new alert.
alerts:add({
    message = 'Test alert',
    my_field = 'my_value',
})

-- Raise a new alert with a key.
alerts:set("my_alert", {
    message = 'Test alert',
    my_field = 'my_value',
})
You can discard alerts individually by keys using the unset() method, or all at once using clear():
alerts:unset("my_alert")
alerts:clear()
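Raised alerts, both user-defined and built-in, can then be inspected in the instance console, for example:

tarantool> box.info.config.alerts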
Since version 3.3, Tarantool allows DDL operations before calling box.schema.upgrade()
during an upgrade if the source schema version is 2.11.1 or later. This allows,
for example, granting execute access to user-defined functions in the cluster configuration
before the schema is upgraded.
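For illustration, a minimal Lua sketch of such a pre-upgrade DDL, assuming a hypothetical function my_func and an already existing user app_user (the release notes describe the same kind of grant done through the cluster configuration):

-- Register a user-defined function and grant execute access to it
-- while the schema is still at the pre-upgrade version (2.11.1 or later).
box.schema.func.create('my_func', {if_not_exists = true})
box.schema.user.grant('app_user', 'execute', 'function', 'my_func', {if_not_exists = true})
-- Upgrade the schema only after the required DDL is done.
box.schema.upgrade()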
A new instance-level configuration option isolated puts an instance into the isolated mode. In this mode, an instance doesn't accept updates from other members of its replica set or other iproto requests. It also performs no background data modifications and remains in read-only mode.
groups:
  group-001:
    replicasets:
      replicaset-001:
        instances:
          instance-001: {}
          instance-002: {}
          instance-003:
            isolated: true
Use the isolated mode to temporarily isolate instances for maintenance, debugging, or other actions that should not affect other cluster instances.
A new configuration section replication.autoexpel enables automatic expulsion of instances after they are removed from the YAML configuration.
replication:
  autoexpel:
    enabled: true
    by: prefix
    prefix: '{{ replicaset_name }}'
The section includes three options:
- enabled: whether the automatic expulsion logic is enabled in the cluster.
- by: a criterion for selecting instances that can be expelled automatically. In version 3.3, the only available criterion is prefix.
- prefix: a prefix with which an instance name should start to make automatic expulsion possible.
A new configuration option lua.memory specifies the maximum amount of memory for Lua script execution, in bytes. For example, this configuration sets the Lua memory limit to 4 GB:
lua:
  memory: 4294967296
The default limit is 2 GB.
Tarantool 3.3 brings a number of supervised failover improvements:
- Support for Tarantool-based stateboard as an alternative to etcd.
- Instance priority configuration: a new failover.priority configuration section. This section specifies the instances' relative order of being appointed by a coordinator: bigger values mean higher priority.

  failover:
    replicasets:
      replicaset-001:
        priority:
          instance-001: 5
          instance-002: -5
          instance-003: 4

  Additionally, there is a failover.learners section that lists instances that should never be appointed as replica set leaders:

  failover:
    replicasets:
      replicaset-001:
        learners:
          - instance-004
          - instance-005

- Automatic failover configuration update.
- Failover logging configuration with the new configuration options failover.log.to and failover.log.file:

  failover:
    log:
      to: file # or stderr
      file: var/log/tarantool/failover.log
Learn more about supervised failover in Supervised failover.
The Tarantool persistence mechanism uses two types of files: snapshots and write-ahead log (WAL) files. These files are also used for replication: read-only replicas receive data changes from the replica set leader by reading these files.
The garbage collector cleans up obsolete snapshots and WAL files, but it doesn't remove files that are still in use for replication. To make such a check possible, replica set leaders track which files are in use as part of the replication state. However, this information was not persisted, which could lead to issues after a leader restart: the garbage collector could delete WAL files even if there were replicas that still needed to read them. The wal.cleanup_delay configuration option was used to prevent such situations.
Since version 3.3, leader instances persist the information about WAL files in use in a new system space _gc_consumers. After a restart, the replication state is restored, and the WAL files needed for replication are protected from garbage collection. This eliminates the need to keep all WAL files after a restart, so the wal.cleanup_delay option is now deprecated.
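For illustration, the persisted consumer state can be inspected on the leader like any other system space (a minimal sketch; it assumes read access to system spaces and does not describe the exact tuple format):

tarantool> box.space._gc_consumers:select()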