Commit Briefs

b7b095a156 Sergey Bronnikov

httpc: replace ibuf_alloc with xibuf_alloc (ligurio/gh-xxxx-httpc-xibuf_alloc, origin/ligurio/gh-xxxx-httpc-xibuf_alloc)

The value returned by `ibuf_alloc` is not checked for NULL, so a NULL would be passed to `memcpy()` if the allocation failed. The patch fixes that by replacing `ibuf_alloc` with the `xibuf_alloc` macro, which never returns NULL.

Found by Svace.

NO_CHANGELOG=codehealth
NO_DOC=codehealth
NO_TEST=codehealth
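The "never returns NULL" pattern described in this commit can be sketched in plain C. This is only an illustration: `xalloc_or_abort` and `copy_bytes` are hypothetical names, and the sketch uses `malloc` rather than Tarantool's `ibuf`, so it is not the actual `xibuf_alloc` implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not Tarantool's xibuf_alloc): allocate `size`
 * bytes or terminate the process, so callers never observe NULL.
 */
void *
xalloc_or_abort(size_t size)
{
	/* Request at least one byte so a zero-size call cannot yield NULL. */
	void *ptr = malloc(size != 0 ? size : 1);
	if (ptr == NULL) {
		fprintf(stderr, "can't allocate %zu bytes\n", size);
		abort();
	}
	return ptr;
}

/*
 * Copy `len` bytes of `src` into a fresh NUL-terminated buffer.
 * The destination passed to memcpy() is never NULL here.
 */
char *
copy_bytes(const char *src, size_t len)
{
	assert(src != NULL);
	char *dst = xalloc_or_abort(len + 1);
	memcpy(dst, src, len);
	dst[len] = '\0';
	return dst;
}
```

With such a wrapper the caller needs no NULL check before `memcpy()`, at the cost of terminating the process on allocation failure.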


4a866f64d6 Serge Petrenko

limbo: speed up synchronous transaction queue processing

This patch optimizes the process of collecting ACKs from replicas for synchronous transactions. Before this patch, collecting confirmations was slow in some cases: it could be necessary to go through the entire limbo again every time the next ACK was received from a replica. This was especially noticeable with a large number of parallel synchronous requests.

For example, in the 1mops_write bench with parameters --fibers=6000 --ops=1000000 --transaction=1, performance increases by 13-18 times on small clusters of 2-4 nodes and 2 times on large clusters of 31 nodes.

Closes #9917

NO_DOC=performance improvement
NO_TEST=performance improvement


58f3c93b66 Serge Petrenko

vclock: introduce `vclock_nth_element` and `vclock_count_ge`

Two new vclock methods have been added: `vclock_nth_element` and `vclock_count_ge`.

* `vclock_nth_element` takes n and returns whatever element would occur in nth position if vclock were sorted. This method is very useful for synchronous replication because it can be used to find out the lsn of the last confirmed transaction: it's simply the result of calling this method with argument {vclock_size - replication_synchro_quorum} (provided that vclock_size >= replication_synchro_quorum, otherwise it is obvious that no transaction has yet been confirmed).
* `vclock_count_ge` takes lsn and returns the number of components whose value is greater than or equal to lsn. This can be useful to understand how many replicas have already received a transaction with a given lsn.

Part of #9917

NO_CHANGELOG=Will be added in another commit
NO_DOC=internal
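The semantics of the two methods can be illustrated with a small C sketch over a plain array of LSNs. The `demo_*` names, the fixed-size array and the sort-based implementation are assumptions made for illustration; the sketch also assumes 0-based positions and ascending order, and it does not reproduce the real `struct vclock` or the algorithm used in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a vclock: a plain array of per-replica LSNs. */
enum { DEMO_VCLOCK_MAX = 32 };

static int
demo_cmp_lsn(const void *a, const void *b)
{
	int64_t x = *(const int64_t *)a;
	int64_t y = *(const int64_t *)b;
	return (x > y) - (x < y);
}

/* Element that would be at 0-based index n if lsns[] were sorted ascending. */
int64_t
demo_nth_element(const int64_t *lsns, int size, int n)
{
	assert(size > 0 && size <= DEMO_VCLOCK_MAX);
	assert(n >= 0 && n < size);
	int64_t sorted[DEMO_VCLOCK_MAX];
	memcpy(sorted, lsns, (size_t)size * sizeof(*lsns));
	qsort(sorted, (size_t)size, sizeof(*sorted), demo_cmp_lsn);
	return sorted[n];
}

/* Number of components whose value is >= lsn. */
int
demo_count_ge(const int64_t *lsns, int size, int64_t lsn)
{
	int count = 0;
	for (int i = 0; i < size; i++)
		count += lsns[i] >= lsn;
	return count;
}

/*
 * LSN acknowledged by at least `quorum` components, or -1 if the
 * vclock has fewer components than the quorum.
 */
int64_t
demo_confirmed_lsn(const int64_t *lsns, int size, int quorum)
{
	assert(quorum >= 1);
	if (size < quorum)
		return -1;
	return demo_nth_element(lsns, size, size - quorum);
}
```

For LSNs {10, 7, 12} and a quorum of 2, the sorted order is 7, 10, 12, so the element at index 3 - 2 = 1 is 10, and exactly two components are greater than or equal to 10.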


f0f9647d8b Serge Petrenko

replication: prohibit roll back due to `replication_synchro_timeout`

To better match the canonical Raft design, this patch prohibits automatic transaction rollback due to `replication.synchro_timeout`. A new compat option has been added for this purpose.

The compat option is named `compat.replication_synchro_timeout` and is `'old'` by default. When set to `'new'`, the `replication.synchro_timeout` option has slightly different semantics: transactions are no longer rolled back at this timeout, and `replication.synchro_timeout` is used only to wait for confirmation in promote/demote and gc-checkpointing. If some transaction in limbo did not have time to commit within `replication_synchro_timeout`, the corresponding operation (promote/demote or gc-checkpointing) can be aborted automatically; in this aspect, the behavior of the option is no different from what it was before. If `'old'` is set, the option has the same semantics as before.

To make it possible to tell from the code which value `compat.replication_synchro_timeout` is set to (`'old'` or `'new'`), a special Boolean tweak `replication_synchro_timeout_enabled` was introduced.

Note that PROMOTE and DEMOTE can still roll back a transaction. Only the ability to roll back by timeout has been prohibited.

Closes #7486

@TarantoolBot document
Title: new compat option: 'compat.replication_synchro_timeout'
Product: Tarantool
Since: 3.3
Root document: New page - https://www.tarantool.io/en/doc/latest/reference/reference_lua/compat/replication_synchro_timeout/

The `compat` module allows you to choose between:

* the old behavior: unconfirmed synchronous transactions are rolled back after a `replication.synchro_timeout`.
* the new behavior: a synchronous transaction can remain in the synchro queue indefinitely until it reaches a quorum of confirmations. `replication.synchro_timeout` is used only to wait for confirmation in promote/demote and gc-checkpointing. If some transaction in limbo did not have time to commit within `replication_synchro_timeout`, the corresponding operation (promote/demote or gc-checkpointing) can be aborted automatically.


e319c21ca3 Serge Petrenko

limbo: introduce limits on synchro queue

Two new fields have been added to the limbo structure: the `size` counter and the `max_size` limit (both in bytes). The corresponding configuration parameter has also been added: `replication.synchro_queue_max_size`. The counter is increased on every enqueued `txn_limbo_entry` and decreased once an entry leaves the `txn_limbo.queue`.

Also, the `approx_len` field has been added to the `txn_limbo_entry` structure, so that when an entry is added to or deleted from the queue, the size of the corresponding journal entry is available.

This limitation only applies to the master queue. Once the size of the master queue reaches the maximum value, txn_limbo blocks incoming requests until some of the transactions in the queue have a quorum of confirmations and there is free space.

This limitation does not apply during the recovery process. Otherwise tarantool might fail while processing the xlog files if the limbo queue size exceeded `replication.synchro_queue_max_size`, and the user would have to pick a suitable value of the `replication.synchro_queue_max_size` option in order to recover from their xlogs.

The size limit isn't strict, i.e. if there's at least one free byte, the whole entry fits and no blocking is involved.

Part of #7486

NO_CHANGELOG=Will be added in another commit

@TarantoolBot document
Title: new configuration option: 'replication.synchro_queue_max_size'
Product: Tarantool
Since: 3.3
Root document: https://www.tarantool.io/en/doc/latest/reference/configuration/configuration_reference/

`replication.synchro_queue_max_size` puts a limit on the total size of transactions in the master synchronous queue.

`replication.synchro_queue_max_size` is measured in number of bytes to be written (0 means unlimited, which was the default behaviour before). This option affects only the behavior of the master, and defaults to 16 megabytes.

Now that `replication.synchro_queue_max_size` is set on the master node, tarantool will discard new transactions that try to queue after the limit is reached. If a transaction had to be discarded, the user will get the error message "The synchronous transaction queue is full". This limitation does not apply during the recovery process.

The current synchro queue size can be obtained using `box.info.synchro.queue.size`:

```lua
tarantool> box.info.synchro
---
- queue:
    owner: 1
    size: 60
    busy: false
    len: 1
    term: 2
  quorum: 2
...
```

[box-info-synchro] https://www.tarantool.io/en/doc/latest/reference/reference_lua/box_info/synchro/
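The non-strict size limit can be sketched as follows. This is a simplified model under stated assumptions: `struct demo_queue` and the `demo_queue_*` functions are hypothetical, a full queue is reported by returning `false` instead of blocking or raising an error, and none of this is the actual `txn_limbo` code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the queue accounting; names are illustrative. */
struct demo_queue {
	int64_t size;     /* bytes currently queued */
	int64_t max_size; /* limit in bytes, 0 means unlimited */
};

/* The limit is not strict: one free byte is enough to admit any entry. */
bool
demo_queue_has_space(const struct demo_queue *q)
{
	return q->max_size == 0 || q->size < q->max_size;
}

/* Account for a new entry of approx_len bytes; false if the queue is full. */
bool
demo_queue_push(struct demo_queue *q, int64_t approx_len)
{
	if (!demo_queue_has_space(q))
		return false;
	q->size += approx_len;
	return true;
}

/* Account for an entry of approx_len bytes leaving the queue. */
void
demo_queue_pop(struct demo_queue *q, int64_t approx_len)
{
	assert(q->size >= approx_len);
	q->size -= approx_len;
}
```

In this model a queue with a 100-byte limit that already holds 99 bytes still accepts a 50-byte entry, after which further entries are refused until enough earlier ones leave the queue.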


78cfc5ef4c Vladimir Davydov

net_box: fix a crash when a trigger deletes itself

In 6d88274 we rewrote the Lua module `internal.trigger` that is used in `net.box` and possibly some external modules. The old implementation held all triggers in a Lua table, so deleting a trigger on the fly wasn't a problem. Now we store all triggers in a list and simply iterate over it when the triggers are fired, so the trigger list is no longer resistant to modifications on the fly. To fix this, let's simply use `trigger_run`, which is resistant to list modifications.

Closes #10622

NO_DOC=bugfix
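The general problem can be illustrated with a minimal C sketch of a singly linked trigger list in which a callback may unlink itself. The technique shown, saving the `next` pointer before invoking the callback, is a deliberately simplified stand-in and is not how Tarantool's `trigger_run` is implemented; all `demo_*` names are hypothetical. It only covers a trigger removing itself, not a trigger removing its successor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical trigger node; not Tarantool's struct trigger. */
struct demo_trigger {
	struct demo_trigger *next;
	void (*run)(struct demo_trigger **head, struct demo_trigger *self);
	int calls;
};

/* Unlink trigger t from the list starting at *head, if present. */
void
demo_trigger_remove(struct demo_trigger **head, struct demo_trigger *t)
{
	for (struct demo_trigger **p = head; *p != NULL; p = &(*p)->next) {
		if (*p == t) {
			*p = t->next;
			t->next = NULL;
			return;
		}
	}
}

/* Fire all triggers; safe when a callback removes its own node. */
void
demo_trigger_run_all(struct demo_trigger **head)
{
	assert(head != NULL);
	struct demo_trigger *cur = *head;
	while (cur != NULL) {
		/* Saved first: the callback may reset cur->next. */
		struct demo_trigger *next = cur->next;
		cur->run(head, cur);
		cur = next;
	}
}

/* Callback that only counts invocations. */
void
demo_count(struct demo_trigger **head, struct demo_trigger *self)
{
	(void)head;
	self->calls++;
}

/* Callback that counts the invocation and then deletes itself. */
void
demo_count_and_remove_self(struct demo_trigger **head,
			   struct demo_trigger *self)
{
	self->calls++;
	demo_trigger_remove(head, self);
}
```

Reading `cur->next` only after the callback returned would stop the iteration early in this model, because the self-removing trigger clears its own `next` pointer.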


37bf64b7d7 Serge Petrenko

test: fix flaky gh_9918_synchro_queue_additional_info_test.lua

This patch eliminates flakiness in the `gh_9918_synchro_queue_additional_info_test.lua` test. The problem was that the test did not wait for the connection between master and replica to be established, so the master node was in the "orphan" state at the time `box.ctl.promote()` was called. As a result, the master node became the owner of the limbo but was still read-only. To fix this, the patch calls `cg.replica_set:wait_for_fullmesh()` on the line before `box.ctl.promote()`.

Closes #10463

NO_DOC=test fix
NO_CHANGELOG=test fix


2bcac013b4 Serge Petrenko

third_party: fix nanoarrow install directory

Currently, in the case of an in-source build, the nanoarrow library is compiled in `<tarantool_git>/third_party/nanoarrow/`, although it is better to compile it in `<tarantool_git>/build/nanoarrow/`, because the `build` directory is ignored by `.gitignore`.

Follow-up #10508

NO_DOC=build
NO_TEST=build
NO_CHANGELOG=build


2eee236f1e Alexander Turenko

config: use failover priorities in leader selection

Closes #10552

@TarantoolBot document
Title: Use priorities when bootstrapping replica in supervised mode

Now, when working in `replication.failover = supervised` mode, the instance priorities specified in the `failover.replicasets.<replicaset-name>.priority` section are used to select the bootstrap leader when using `bootstrap_strategy: auto`. The replica with the highest priority is chosen as the bootstrap leader. If more than one instance has the highest priority, the first one in alphabetical order by name is chosen.

Example:

```yaml
replication:
  failover: supervised

failover:
  replicasets:
    replicaset-001:
      priority:
        instance-002: 5
        instance-003: -3
        instance-004: 5
```

Setting up the config like this will make Tarantool choose `instance-002` as the bootstrap leader.
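The selection rule described above (highest priority first, ties broken alphabetically by instance name) can be sketched in C. The `demo_instance` type and `demo_pick_bootstrap_leader` function are hypothetical and exist only for illustration; the sketch does not model instances without an explicit priority, and it is not the config module's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one instance in a replicaset. */
struct demo_instance {
	const char *name;
	int priority;
};

/*
 * Pick the instance with the highest priority; among equals, pick the
 * alphabetically first name. Returns NULL for an empty set.
 */
const struct demo_instance *
demo_pick_bootstrap_leader(const struct demo_instance *instances,
			   size_t count)
{
	assert(instances != NULL || count == 0);
	const struct demo_instance *best = NULL;
	for (size_t i = 0; i < count; i++) {
		const struct demo_instance *cur = &instances[i];
		if (best == NULL || cur->priority > best->priority ||
		    (cur->priority == best->priority &&
		     strcmp(cur->name, best->name) < 0))
			best = cur;
	}
	return best;
}
```

Applied to the priorities from the example (`instance-002: 5`, `instance-003: -3`, `instance-004: 5`), the function returns `instance-002` regardless of the order in which the instances are listed.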


23b21a4cdb Alexander Turenko

gcov: fix clang build

Fix the macros and cmake recipe for gcov to make it possible to build tarantool with the clang compiler and gcov enabled.

This is similar to commit bd813168467e ("gcov: use __gcov_dump + __gcov_reset instead of __gcov_flush"), but in clang those changes were made in version 12:
https://github.com/llvm/llvm-project/commit/5809a32e7c2d79a9a463eb9c15cde994b42e3002

Closes #10612

NO_CHANGELOG=internal
NO_DOC=internal
NO_TEST=internal


Tree

.editorconfig
.gdbinit
.gitattributes
.github/
.gitignore
.gitmodules
.luacheckrc
.pack.mk
.test.mk
AUTHORS
CMakeLists.txt
CONTRIBUTING.md
Doxyfile
Doxyfile.API.in
FreeBSD/
LICENSE
README.FreeBSD
README.MacOSX
README.OpenBSD
README.md
TODO
apk/
asan/
changelogs/
cmake/
debian/
doc/
docker/
extra/
patches/
perf/
rpm/
rump/
src/
static-build/
test/
test-run
third_party/
tools/

README.md

# Tarantool

[![Actions Status][actions-badge]][actions-url]
[![Code Coverage][coverage-badge]][coverage-url]
[![OSS Fuzz][oss-fuzz-badge]][oss-fuzz-url]
[![Telegram][telegram-badge]][telegram-url]
[![GitHub Discussions][discussions-badge]][discussions-url]
[![Stack Overflow][stackoverflow-badge]][stackoverflow-url]

[Tarantool][tarantool-url] is an in-memory computing platform consisting of a
database and an application server.

It is distributed under [BSD 2-Clause][license] terms.

Key features of the application server:

* Heavily optimized Lua interpreter with incredibly fast tracing JIT compiler,
  based on LuaJIT 2.1.
* Cooperative multitasking, non-blocking IO.
* [Persistent queues][queue].
* [Sharding][vshard].
* [Cluster and application management framework][cartridge].
* Access to external databases such as [MySQL][mysql] and [PostgreSQL][pg].
* A rich set of built-in and standalone [modules][modules].

Key features of the database:

* MessagePack data format and MessagePack based client-server protocol.
* Two data engines: a 100% in-memory engine with complete WAL-based persistence
  and a custom LSM-tree implementation for use with large data sets.
* Multiple index types: HASH, TREE, RTREE, BITSET.
* Document oriented JSON path indexes.
* Asynchronous master-master replication.
* Synchronous quorum-based replication.
* RAFT-based automatic leader election for the single-leader configuration.
* Authentication and access control.
* ANSI SQL, including views, joins, referential and check constraints.
* [Connectors][connectors] for many programming languages.
* The database is a C extension of the application server and can be turned
  off.

Supported platforms are Linux (x86_64, aarch64), Mac OS X (x86_64, M1), FreeBSD
(x86_64).

Tarantool is ideal for data-enriched components of scalable Web architecture:
queue servers, caches, stateful Web applications.

To download and install Tarantool as a binary package for your OS or using
Docker, please see the [download instructions][download].

To build Tarantool from source, see detailed [instructions][building] in the
Tarantool documentation.

To find modules, connectors and tools for Tarantool, check out our [Awesome
Tarantool][awesome-list] list.

Please report bugs to our [issue tracker][issue-tracker]. We also warmly
welcome your feedback on the [discussions][discussions-url] page and questions
on [Stack Overflow][stackoverflow-url].

We accept contributions via pull requests. Check out our [contributing
guide][contributing].

Thank you for your interest in Tarantool!

[actions-badge]: https://github.com/tarantool/tarantool/workflows/release/badge.svg
[actions-url]: https://github.com/tarantool/tarantool/actions
[coverage-badge]: https://coveralls.io/repos/github/tarantool/tarantool/badge.svg?branch=master
[coverage-url]: https://coveralls.io/github/tarantool/tarantool?branch=master
[telegram-badge]: https://img.shields.io/badge/Telegram-join%20chat-blue.svg
[telegram-url]: http://telegram.me/tarantool
[discussions-badge]: https://img.shields.io/github/discussions/tarantool/tarantool
[discussions-url]: https://github.com/tarantool/tarantool/discussions
[stackoverflow-badge]: https://img.shields.io/badge/stackoverflow-tarantool-orange.svg
[stackoverflow-url]: https://stackoverflow.com/questions/tagged/tarantool
[oss-fuzz-badge]: https://oss-fuzz-build-logs.storage.googleapis.com/badges/tarantool.svg
[oss-fuzz-url]: https://oss-fuzz.com/coverage-report/job/libfuzzer_asan_tarantool/latest
[tarantool-url]: https://www.tarantool.io/en/
[license]: LICENSE
[modules]: https://www.tarantool.io/en/download/rocks
[queue]: https://github.com/tarantool/queue
[vshard]: https://github.com/tarantool/vshard
[cartridge]: https://github.com/tarantool/cartridge
[mysql]: https://github.com/tarantool/mysql
[pg]: https://github.com/tarantool/pg
[connectors]: https://www.tarantool.io/en/download/connectors
[download]: https://www.tarantool.io/en/download/
[building]: https://www.tarantool.io/en/doc/latest/dev_guide/building_from_source/
[issue-tracker]: https://github.com/tarantool/tarantool/issues
[contributing]: CONTRIBUTING.md
[awesome-list]: https://github.com/tarantool/awesome-tarantool/