You are here

Galera Cluster and Antivirus Scanner on Linux

Today we had to investigate in a very strange behaviour of IST and SST on a MariaDB Galera Cluster.

The symptom was, that some Galera Cluster nodes took a very long time to start. Up to 7 minutes. So the customer was concluding that the Galera Cluster node does an SST instead of an IST and was asking why the SST happens.

It have to be mentioned here, that the MariaDB error log is very confusing about whether it is an SST or an IST. So the customer was confused and concluded, that MariaDB Galera Cluster was doing an SST instead of IST.

Further confusing was that this behaviour was not consistently on all 3 nodes and not consistently on the 3 stages production, test and integration.

First we had to clear if the Galera node was doing an IST or an SST to exclude problems with Galera Cache or event Bugs in MariaDB Galera Cluster. For this we were running our famous insert_test.sh and did some node restarts with forcing SST and without.

As a Galera Cluster operator you must mandatorily be capable to determine which one of both State Transfers happens from the MariaDB error log:

MariaDB Error Log with IST on Joiner

2017-12-12 22:29:33 140158145914624 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 204013)
2017-12-12 22:29:33 140158426741504 [Note] WSREP: State transfer required: 
        Group state: e2fbbca5-df26-11e7-8ee2-bb61f8ff3774:204013
        Local state: e2fbbca5-df26-11e7-8ee2-bb61f8ff3774:201439
2017-12-12 22:29:33 140158426741504 [Note] WSREP: New cluster view: global state: e2fbbca5-df26-11e7-8ee2-bb61f8ff3774:204013, view# 7: Primary, number of nodes: 3, my index: 2, protocol version 3
2017-12-12 22:29:33 140158426741504 [Warning] WSREP: Gap in state sequence. Need state transfer.
2017-12-12 22:29:33 140158116558592 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '127.0.0.1' --datadir '/home/mysql/database/magal-101-b/data/'  --defaults-file '/home/mysql/database/magal-101-b/etc/my.cnf'  --parent '16426' --binlog '/home/mysql/database/magal-101-b/binlog/laptop4_magal-101-b__binlog' '
2017-12-12 22:29:33 140158426741504 [Note] WSREP: Prepared SST request: rsync|127.0.0.1:4444/rsync_sst
2017-12-12 22:29:33 140158426741504 [Note] WSREP: REPL Protocols: 7 (3, 2)
2017-12-12 22:29:33 140158426741504 [Note] WSREP: Assign initial position for certification: 204013, protocol version: 3
2017-12-12 22:29:33 140158203852544 [Note] WSREP: Service thread queue flushed.
2017-12-12 22:29:33 140158426741504 [Note] WSREP: IST receiver addr using tcp://127.0.0.1:5681
2017-12-12 22:29:33 140158426741504 [Note] WSREP: Prepared IST receiver, listening at: tcp://127.0.0.1:5681
2017-12-12 22:29:33 140158145914624 [Note] WSREP: Member 2.0 (Node B) requested state transfer from 'Node C'. Selected 1.0 (Node C)(SYNCED) as donor.
2017-12-12 22:29:33 140158145914624 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 204050)
2017-12-12 22:29:33 140158426741504 [Note] WSREP: Requesting state transfer: success, donor: 1
2017-12-12 22:29:33 140158426741504 [Note] WSREP: GCache history reset: old(e2fbbca5-df26-11e7-8ee2-bb61f8ff3774:0) -> new(e2fbbca5-df26-11e7-8ee2-bb61f8ff3774:204013)
2017-12-12 22:29:33 140158145914624 [Note] WSREP: 1.0 (Node C): State transfer to 2.0 (Node B) complete.
2017-12-12 22:29:33 140158145914624 [Note] WSREP: Member 1.0 (Node C) synced with group.
WSREP_SST: [INFO] Joiner cleanup. rsync PID: 16663 (20171212 22:29:34.474)
WSREP_SST: [INFO] Joiner cleanup done. (20171212 22:29:34.980)
2017-12-12 22:29:34 140158427056064 [Note] WSREP: SST complete, seqno: 201439
2017-12-12 22:29:35 140158427056064 [Note] WSREP: Signalling provider to continue.
2017-12-12 22:29:35 140158427056064 [Note] WSREP: SST received: e2fbbca5-df26-11e7-8ee2-bb61f8ff3774:201439
2017-12-12 22:29:35 140158426741504 [Note] WSREP: Receiving IST: 2574 writesets, seqnos 201439-204013
2017-12-12 22:29:35 140158426741504 [Note] WSREP: IST received: e2fbbca5-df26-11e7-8ee2-bb61f8ff3774:204013
2017-12-12 22:29:35 140158145914624 [Note] WSREP: 2.0 (Node B): State transfer from 1.0 (Node C) complete.
2017-12-12 22:29:35 140158145914624 [Note] WSREP: Shifting JOINER -> JOINED (TO: 204534)
2017-12-12 22:29:35 140158145914624 [Note] WSREP: Member 2.0 (Node B) synced with group.
2017-12-12 22:29:35 140158145914624 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 204535)
2017-12-12 22:29:35 140158426741504 [Note] WSREP: Synchronized with group, ready for connections

MariaDB Error Log with SST on Joiner

2017-12-12 22:32:15 139817123833600 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 239097)
2017-12-12 22:32:15 139817401395968 [Note] WSREP: State transfer required: 
        Group state: e2fbbca5-df26-11e7-8ee2-bb61f8ff3774:239097
        Local state: 00000000-0000-0000-0000-000000000000:-1
2017-12-12 22:32:15 139817401395968 [Note] WSREP: New cluster view: global state: e2fbbca5-df26-11e7-8ee2-bb61f8ff3774:239097, view# 9: Primary, number of nodes: 3, my index: 2, protocol version 3
2017-12-12 22:32:15 139817401395968 [Warning] WSREP: Gap in state sequence. Need state transfer.
2017-12-12 22:32:15 139817094477568 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '127.0.0.1' --datadir '/home/mysql/database/magal-101-b/data/'  --defaults-file '/home/mysql/database/magal-101-b/etc/my.cnf'  --parent '25291' --binlog '/home/mysql/database/magal-101-b/binlog/laptop4_magal-101-b__binlog' '
2017-12-12 22:32:15 139817401395968 [Note] WSREP: Prepared SST request: rsync|127.0.0.1:4444/rsync_sst
2017-12-12 22:32:15 139817401395968 [Note] WSREP: REPL Protocols: 7 (3, 2)
2017-12-12 22:32:15 139817401395968 [Note] WSREP: Assign initial position for certification: 239097, protocol version: 3
2017-12-12 22:32:15 139817178507008 [Note] WSREP: Service thread queue flushed.
2017-12-12 22:32:15 139817401395968 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (e2fbbca5-df26-11e7-8ee2-bb61f8ff3774): 1 (Operation not permitted)
         at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
2017-12-12 22:32:15 139817123833600 [Note] WSREP: Member 2.0 (Node B) requested state transfer from 'Node C'. Selected 1.0 (Node C)(SYNCED) as donor.
2017-12-12 22:32:15 139817123833600 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 239136)
2017-12-12 22:32:15 139817401395968 [Note] WSREP: Requesting state transfer: success, donor: 1
2017-12-12 22:32:15 139817401395968 [Note] WSREP: GCache history reset: old(00000000-0000-0000-0000-000000000000:0) -> new(e2fbbca5-df26-11e7-8ee2-bb61f8ff3774:239097)
2017-12-12 22:32:17 139817123833600 [Note] WSREP: 1.0 (Node C): State transfer to 2.0 (Node B) complete.
2017-12-12 22:32:17 139817123833600 [Note] WSREP: Member 1.0 (Node C) synced with group.
WSREP_SST: [INFO] Joiner cleanup. rsync PID: 25520 (20171212 22:32:17.846)
WSREP_SST: [INFO] Joiner cleanup done. (20171212 22:32:18.352)
2017-12-12 22:32:18 139817401710528 [Note] WSREP: SST complete, seqno: 239153
2017-12-12 22:32:18 139817132226304 [Note] WSREP: (ebfd9e9c, 'tcp://127.0.0.1:5680') turning message relay requesting off
2017-12-12 22:32:22 139817401710528 [Note] WSREP: Signalling provider to continue.
2017-12-12 22:32:22 139817401710528 [Note] WSREP: SST received: e2fbbca5-df26-11e7-8ee2-bb61f8ff3774:239153
2017-12-12 22:32:22 139817123833600 [Note] WSREP: 2.0 (Node B): State transfer from 1.0 (Node C) complete.
2017-12-12 22:32:22 139817123833600 [Note] WSREP: Shifting JOINER -> JOINED (TO: 239858)
2017-12-12 22:32:22 139817123833600 [Note] WSREP: Member 2.0 (Node B) synced with group.
2017-12-12 22:32:22 139817123833600 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 239866)
2017-12-12 22:32:22 139817401395968 [Note] WSREP: Synchronized with group, ready for connections

After we cleared that it really was an IST and that it was not a SST because of some other reasons the question rose: Why does an IST of only a few thousand transactions was taking 420 seconds. And this was not always the case...

So we were looking with top at the Donor and the Joiner during IST and we found that on the Donor node the Antivirus software was heavily using CPU (2 x 50%) and otherwise the system was doing nothing for a while and then suddenly started to transfer data over the network (possibly IST?).
Later we found, that the MariaDB datadir (/var/lib/mysql) was not excluded from the Antivirus software. And finally it looks like the Antivirus software was not properly configured by its Master server because the Antivirus software agent was from a cloned VM and not reinitialized. So the Antivirus Master server seems to be confused because there are 2 Antivirus software agents with the same ID.

Another very surprising situation which we did not expect was, that IST was much heavier influenced by the Antivirus software than SST. SST finished in a few seconds while IST took 420 seconds.

Conclusion: Be careful when using Antivirus software in combination with MariaDB Galera Cluster databases and exclude at least all database directories from virus scanning. If you want to be sure to avoid side effects (noisy neighbours) disable the Antivirus software on the database server at all and make sure by other means, that no virus is reaching your precious MariaDB Galera Cluster...

Taxonomy upgrade extras: