Commits
- Commit:
1e440c53f1d23f4ee4cbb049fc7c30d64bff0d6a
- From:
- Sergey Bronnikov <sergeyb@tarantool.org>
- Date:
vinyl test
- Commit:
25382617b95722da7a57ed58bbef3ce528177ab8
- From:
- Serge Petrenko <sergepetrenko@tarantool.org>
- Via:
- Kirill Yukhin <kyukhin@tarantool.org>
- Date:
replication: append NOP as the last tx row
Since we stopped sending local space operations in replication, the last
tx row has to be global in order to preserve tx boundaries on replica.
If the last row happens to be a local one, the replica never receives
the tx end marker, which yields the following error:
`ER_UNSUPPORTED: replication does not support interleaving
transactions`.
To fix the problem, append a global NOP row at the tx end if
it happens to end on a local row.
Follow-up #4114
Closes #4928
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
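The idea of the commit above ("append NOP as the last tx row") can be
illustrated with a small model. This is only a sketch under stated
assumptions, not the Tarantool implementation: the Row type, the
is_commit flag, finish_tx() and replicated_view() are hypothetical names,
and the sketch assumes that a fully local tx needs no NOP because none of
its rows are replicated.

```python
from dataclasses import dataclass
from typing import List

LOCAL_REPLICA_ID = 0  # replica id 0 marks local-space rows (see the wal commit)


@dataclass
class Row:
    replica_id: int
    op: str
    is_commit: bool = False  # hypothetical "tx end marker" on the last row


def finish_tx(rows: List[Row], own_replica_id: int) -> List[Row]:
    """Return a copy of the tx rows with the end marker on a global row.

    If the tx has at least one global row but ends on a local one,
    a global NOP row is appended so that the marker survives replication.
    """
    rows = [Row(r.replica_id, r.op) for r in rows]
    has_global = any(r.replica_id != LOCAL_REPLICA_ID for r in rows)
    if has_global and rows[-1].replica_id == LOCAL_REPLICA_ID:
        rows.append(Row(own_replica_id, "NOP"))
    if rows:
        rows[-1].is_commit = True
    return rows


def replicated_view(rows: List[Row]) -> List[Row]:
    """Rows a replica would receive: local-space rows are not sent."""
    return [r for r in rows if r.replica_id != LOCAL_REPLICA_ID]


# A tx that ends on a local row: the NOP carries the end marker.
tx = [Row(1, "INSERT"), Row(LOCAL_REPLICA_ID, "REPLACE")]
sent = replicated_view(finish_tx(tx, own_replica_id=1))
assert sent[-1].op == "NOP" and sent[-1].is_commit
```

Without the appended NOP, the only row carrying the end marker in this
model would be the local REPLACE, which is filtered out before sending.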
- Commit:
f41d1ddd5faf95483f66e3dfeb31ea51b4c7a997
- From:
- Serge Petrenko <sergepetrenko@tarantool.org>
- Via:
- Kirill Yukhin <kyukhin@tarantool.org>
- Date:
wal: fix tx boundaries
In order to preserve transaction boundaries in replication protocol, wal
assigns each tx row a transaction sequence number (tsn). Tsn is equal to
the lsn of the first transaction row.
Starting with commit 7eb4650eecf1ac382119d0038076c19b2708f4a1, local
space requests are assigned a special replica id, 0, and have their own
lsns. These operations are not replicated.
If a transaction starting with a local space operation ends up in the
WAL, it gets a tsn equal to the lsn of the local space request. Then,
during replication, the local space request is omitted, and the replica
receives the global part of the transaction with a seemingly random tsn,
yielding an ER_PROTOCOL error:
"Transaction id must be equal to LSN of the first row in the transaction".
To fix the problem, set tsn to the lsn of the first global row in the
transaction. For fully local transactions, assign tsn as before.
Follow-up #4114
Part-of #4928
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
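The tsn rule from the commit above ("wal: fix tx boundaries") can be
sketched as follows. This is a minimal model, not the WAL code: rows are
represented as hypothetical (replica_id, lsn) pairs, and assign_tsn() and
assign_tsn_old() are illustrative names.

```python
from typing import List, Tuple

LOCAL_REPLICA_ID = 0  # local space requests use replica id 0

TxRow = Tuple[int, int]  # (replica_id, lsn), a simplified row model


def assign_tsn_old(rows: List[TxRow]) -> int:
    """Previous rule: tsn is the lsn of the first tx row."""
    return rows[0][1]


def assign_tsn(rows: List[TxRow]) -> int:
    """Fixed rule: tsn is the lsn of the first global row.

    A fully local tx keeps the previous rule.
    """
    for replica_id, lsn in rows:
        if replica_id != LOCAL_REPLICA_ID:
            return lsn
    return rows[0][1]


# A tx starting with a local row; the replica only sees the global rows.
tx = [(LOCAL_REPLICA_ID, 5), (1, 100), (1, 101)]
received = [row for row in tx if row[0] != LOCAL_REPLICA_ID]

# The replica expects tsn to equal the lsn of the first row it received.
assert assign_tsn(tx) == received[0][1]
assert assign_tsn_old(tx) != received[0][1]  # the ER_PROTOCOL case
```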
- Commit:
9fcbbb3e7d5e9f5a876ee27a7bf93303321e26b2
- From:
- Serge Petrenko <sergepetrenko@tarantool.org>
- Via:
- Kirill Yukhin <kyukhin@tarantool.org>
- Date:
applier: fix tx boundary check for half-applied txns
In case there are 2 "new" instances, running tarantool 2.2+,
master and replica, and one "old" instance, running an earlier tarantool
version, in a full-mesh cluster, it may happen that the "new" replica
receives part of a tx from an "old" instance, and the remaining part
from a "new" instance.
Since "new" instances preserve tx boundaries, the "new" replica would skip
the remainder of the tx, assuming it has already applied the full tx if it
has applied the first tx row. This leads to gaps in the "new" replica's WAL
and to skipping the remaining part of the tx forever.
Fix this behaviour to apply the full tx in mixed clusters even if its
beginning is already applied.
Closes #5125
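The difference between the two checks in the commit above ("applier: fix
tx boundary check for half-applied txns") can be modelled roughly like
this. It is one possible reading of the commit message, not the applier
code: rows are hypothetical (replica_id, lsn) pairs, the vclock is a plain
dict, and both function names are made up for illustration.

```python
from typing import Dict, List, Tuple

TxRow = Tuple[int, int]  # (replica_id, lsn), a simplified row model


def rows_to_apply_first_row_check(tx: List[TxRow],
                                  vclock: Dict[int, int]) -> List[TxRow]:
    """Behaviour described as buggy: judge the whole tx by its first row."""
    replica_id, lsn = tx[0]
    if lsn <= vclock.get(replica_id, 0):
        return []  # the rest of the tx is dropped as well
    return list(tx)


def rows_to_apply(tx: List[TxRow], vclock: Dict[int, int]) -> List[TxRow]:
    """Skip only the rows already covered by the vclock, keep the rest."""
    return [(rid, lsn) for rid, lsn in tx if lsn > vclock.get(rid, 0)]


# The first row (lsn 10) arrived earlier from an "old" instance.
tx = [(1, 10), (1, 11), (1, 12)]
vclock = {1: 10}

assert rows_to_apply_first_row_check(tx, vclock) == []      # gap: 11, 12 lost
assert rows_to_apply(tx, vclock) == [(1, 11), (1, 12)]      # tx completed
```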
- Commit:
4c7d8281502b1a26c59ffc9a87e082e7e8826932
- From:
- Alexander V. Tikhonov <avtikhon@tarantool.org>
- Via:
- Kirill Yukhin <kyukhin@tarantool.org>
- Date:
test: fix flaky box/net.box_readahead_gh-3958 test
Issue:
[014] --- box/net.box_readahead_gh-3958.result Mon Jun 15 15:33:23 2020
[014] +++ box/net.box_readahead_gh-3958.reject Tue Jun 16 02:24:04 2020
[014] @@ -46,6 +46,7 @@
[014] ...
[014] test_run:wait_log('default', 'readahead limit is reached', 1024, 0.1)
[014] ---
[014] +- readahead limit is reached
[014] ...
[014] s:drop()
[014] ---
[014]
[014] Last 15 lines of Tarantool Log file [Instance "box"][/tarantool/test/var/014_box/box.log]:
[014] 2020-06-16 02:24:03.792 [5585] main/121/console/unix/: I> set 'read_only' configuration option to false
[014] 2020-06-16 02:24:03.834 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.835 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.835 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.836 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.836 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.836 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.836 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.837 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.837 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.837 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.951 [5585] main/121/console/unix/: space.h:336 E> ER_NO_SUCH_INDEX_ID: No index #1 is defined in space '_space'
[014] 2020-06-16 02:24:04.180 [5585] main/121/console/unix/: I> set 'readahead' configuration option to 128
[014] 2020-06-16 02:24:04.183 [5585] main/121/console/unix/: I> set 'readahead' configuration option to 102400
[014] 2020-06-16 02:24:04.189 [5585] main/453/console/unix/: I> set 'readahead' configuration option to 16320
The root cause of the issue was the test
'box/net.box_call_blocks_gh-946.test.lua', previously run on the same
worker. In this case the wait_log/grep_log test_run functions mistakenly
check the log output of the previous test and find the searched string
there. To avoid this, the tests can be swapped in the worker's run queue,
in which case both tests pass. The log output with the tests swapped:
2020-06-17 10:57:39.881 [69372] main C> entering the event loop
2020-06-17 10:57:39.896 [69372] main/119/console/unix/: I> set 'readahead' configuration option to 128
2020-06-17 10:57:39.898 [69372] main/119/console/unix/: I> set 'readahead' configuration option to 102400
2020-06-17 10:57:40.003 [69372] main/156/console/unix/: I> set 'readahead' configuration option to 16320
2020-06-17 10:57:40.053 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.056 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.056 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.058 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.058 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.061 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.061 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.062 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.062 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.063 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.067 [69372] main C> got signal 15 - Terminated
Also found that after the 'readahead' warnings from the first test,
further printing of this message to the log file is blocked because it is
suppressed. To fix this issue, the default server must be restarted at
the very start of the test.
Closes #5082
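The stale match described in the commit above can be reproduced with a
simplified model of grepping the tail of a shared log. This is a sketch
only: tail_grep() is a hypothetical stand-in and not test-run's
wait_log/grep_log, the log lines are shortened for illustration, and it
assumes that after a restart the test sees no lines from the previous
test.

```python
def tail_grep(log_text: str, pattern: str, nbytes: int) -> bool:
    """Report whether pattern occurs in the last nbytes of the log."""
    if nbytes <= 0:
        return False
    tail = log_text.encode()[-nbytes:].decode(errors="ignore")
    return pattern in tail


PATTERN = "readahead limit is reached"

# Shortened lines left in the log by the previously run test.
previous_test = "W> stopping input on connection, readahead limit is reached\n" * 3
# Lines written by the current test, which expects no such warning yet.
current_test = "I> set 'readahead' configuration option to 16320\n"

# Same worker, same log: the warning of the previous test is found.
assert tail_grep(previous_test + current_test, PATTERN, 1024)

# Log that holds only the current test's output: no false match.
assert not tail_grep(current_test, PATTERN, 1024)
```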