Galera Cluster last inactive check and VMware snapshots

From time to time, during Galera Cluster customer engagements, we see the following warning in the MySQL error log, which I find scary:

[Warning] WSREP: last inactive check more than PT1.5S ago (PT7.06159S), skipping check

PT1.5S and PT7.06159S are ISO 8601 durations: the last inactive check happened longer ago than the expected 1.5 seconds, in this case about 7.1 seconds. We mostly see this in VMware set-ups. Further enquiry with the Galera developers did not give a satisfying answer:

This can be seen on bare metal as well - with poorly configured mysqld, O/S, or simply being overloaded. All it means is that this thread could not get CPU time for 7.1 seconds. You can imagine that access to resources in virtual machines is even harder (especially I/O) than on bare metal, so you will see this in virtual machines more often.

This is not a Galera specific issue (it just reports being stuck, other mysqld threads are equally stuck) so there are no configuration options for that. You simply must make sure that your system and mysqld are properly configured, that there is enough RAM (buffer pool not over-provisioned), that there is swap, that there are proper I/O drivers installed on the guest and so on.

Basically, Galera runs in virtual machines as well as the virtual machine approximates bare metal.
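
On the MySQL side, the points mentioned above can be checked quickly from the client. A minimal sketch with standard statements; which values are sane depends entirely on the sizing of the virtual machine:

-- InnoDB buffer pool size in bytes; it must not be over-provisioned
-- compared to the RAM assigned to the virtual machine:
SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_size';

-- Galera group communication timeouts (evs.suspect_timeout,
-- evs.inactive_timeout, ...) which decide how quickly a stalled
-- node is declared dead by its peers:
SHOW GLOBAL VARIABLES LIKE 'wsrep_provider_options';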

We were still suspecting that this is somehow VMware related, and this week we had the chance to investigate. At 01:36 node Galera2 lost the connection to the cluster and became NON-PRIMARY, which is a bad sign:

150401  1:36:15 [Warning] WSREP: last inactive check more than PT1.5S ago (PT5.08325S), skipping check
150401  1:36:15 [Note] WSREP: (09c6b2f2, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://192.168.42.2:4567
150401  1:36:16 [Note] WSREP: view(view_id(NON_PRIM,09c6b2f2,30) memb {
        09c6b2f2,0
} joined {
} left {
} partitioned {
        ce6bf2e1,0
        d1f9bee0,0
})
150401  1:36:16 [Note] WSREP: view(view_id(NON_PRIM,09c6b2f2,31) memb {
        09c6b2f2,0
} joined {
} left {
} partitioned {
        ce6bf2e1,0
        d1f9bee0,0
})
150401  1:36:16 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
150401  1:36:16 [Note] WSREP: Flow-control interval: [16, 16]
150401  1:36:16 [Note] WSREP: Received NON-PRIMARY.
150401  1:36:16 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 26304132)
150401  1:36:16 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
150401  1:36:16 [Note] WSREP: Flow-control interval: [16, 16]
150401  1:36:16 [Note] WSREP: Received NON-PRIMARY.
150401  1:36:16 [Warning] WSREP: Send action {(nil), 328, TORDERED} returned -107 (Transport endpoint is not connected)
150401  1:36:16 [Note] WSREP: New cluster view: global state: dcca768c-b5ad-11e3-bbc0-fb576fb3c451:26304132, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 3
150401  1:36:17 [Note] WSREP: (09c6b2f2, 'tcp://0.0.0.0:4567') reconnecting to d1f9bee0 (tcp://192.168.42.1:4567), attempt 0

After some investigation with the FromDual Performance Monitor for MySQL and MariaDB, I suspected that the database backup (mysqldump) could be the reason. It was not. But the customer explained that after the database backup they take a VMware snapshot.

And when we compared our problem with the backup log file:

2015/04/01 01:35:08 [3] backup.fromdual.com: Creating a snapshot of galera3
2015/04/01 01:35:16 [3] backup.fromdual.com: Created a snapshot of galera3
2015/04/01 01:35:23 [3] backup.fromdual.com: galera3: backup the changed blocks of disk 'Festplatte 1' using NBD transport
2015/04/01 01:36:10 [3] backup.fromdual.com: galera3: saving the Change Block Tracking's reference for disk 'Festplatte 1'
2015/04/01 01:36:10 [3] backup.fromdual.com: Removing Arkeia's snapshot of galera3

we can see that our problem started pretty much with the end of the VMware snapshot (01:36:10 + 5.08 s = 01:36:15). By the way: for this kind of investigation it is always good to have an ntp daemon running for time synchronization. Otherwise the investigation becomes much harder...
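
The little timestamp arithmetic can also be reproduced directly in the MySQL client, for example (a sketch using the stall duration PT5.08325S reported in the warning):

SELECT SUBTIME('2015-04-01 01:36:15', '00:00:05.08325') AS stall_started;
-- -> 2015-04-01 01:36:09.91675, i.e. almost exactly when Arkeia
--    removed its snapshot of galera3 (01:36:10).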

Some more and deeper investigation shows that from time to time we lose nodes during VMware snapshots (galera3). They recover quickly because they can do an IST (Incremental State Transfer). In the worst case we can lose 2 of the 3 nodes at the same time; the remaining node then loses quorum and the whole Galera Cluster is gone.
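
To see quickly whether a node has survived or already rejoined after such a snapshot window, the usual wsrep status variables can be queried on each node, for example:

SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';       -- Primary / non-Primary
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';         -- should be 3 again
SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';  -- Synced, Donor, Joiner, ...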

192.168.42.3 / node Galera3

2015-04-10 01:44:00 [3] backup.fromdual.com: Creating a snapshot of galera3
2015-04-10 01:44:08 [3] backup.fromdual.com: Created a snapshot of galera3
2015-04-10 01:44:15 [3] backup.fromdual.com: galera3: backup the changed blocks of disk 'Festplatte 1' using NBD transport
2015-04-10 01:45:39 [3] backup.fromdual.com: galera3: saving the Change Block Tracking's reference for disk 'Festplatte 1'
2015-04-10 01:45:39 [3] backup.fromdual.com: Removing Arkeia's snapshot of galera3

150410  1:44:07 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera1:4567 tcp://galera2:4567
150410  1:44:07 [Warning] WSREP: last inactive check more than PT1.5S ago (PT7.06159S), skipping check
150410  1:44:08 [Note] WSREP: Received NON-PRIMARY.
150410  1:44:10 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 30399299)
150410  1:44:11 [Warning] WSREP: Gap in state sequence. Need state transfer.
150410  1:44:11 [Note] WSREP: Prepared IST receiver, listening at: tcp://galera3:4568
150410  1:44:11 [Note] WSREP: Member 0.0 (galera3) requested state transfer from '*any*'. Selected 2.0 (galera2)(SYNCED) as donor.
150410  1:44:11 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 30399309)
150410  1:44:11 [Note] WSREP: Requesting state transfer: success, donor: 2
150410  1:44:11 [Note] WSREP: 2.0 (galera2): State transfer to 0.0 (galera3) complete.
150410  1:44:11 [Note] WSREP: Member 2.0 (galera2) synced with group.
150410  1:44:11 [Note] WSREP: Receiving IST: 8 writesets, seqnos 30399291-30399299
150410  1:44:11 [Note] WSREP: IST received: dcca768c-b5ad-11e3-bbc0-fb576fb3c451:30399299
150410  1:44:11 [Note] WSREP: 0.0 (galera3): State transfer from 2.0 (galera2) complete.
150410  1:44:11 [Note] WSREP: Shifting JOINER -> JOINED (TO: 30399309)
150410  1:44:11 [Note] WSREP: Member 0.0 (galera3) synced with group.
150410  1:44:11 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 30399309)
150410  1:44:11 [Note] WSREP: Synchronized with group, ready for connections
150410  1:44:13 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') turning message relay requesting off
150410  1:45:42 [Warning] WSREP: last inactive check more than PT1.5S ago (PT2.47388S), skipping check
150410  1:45:43 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera1:4567 tcp://galera2:4567
150410  1:45:44 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') reconnecting to 54de92f8 (tcp://galera1:4567), attempt 0
150410  1:45:44 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') reconnecting to c9d964d3 (tcp://galera2:4567), attempt 0
150410  1:45:48 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') turning message relay requesting off

150410  1:47:26 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera1:4567
150410  1:47:27 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') reconnecting to 54de92f8 (tcp://galera1:4567), attempt 0
150410  1:47:31 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') turning message relay requesting off

192.168.42.1 / node Galera1

2015-04-10 01:47:24 [3] backup.fromdual.com: Creating a snapshot of galera1
2015-04-10 01:47:29 [3] backup.fromdual.com: Created a snapshot of galera1
2015-04-10 01:47:40 [3] backup.fromdual.com: galera1: backup the changed blocks of disk 'Festplatte 1' using NBD transport
2015-04-10 01:48:43 [3] backup.fromdual.com: galera1: saving the Change Block Tracking's reference for disk 'Festplatte 1'
2015-04-10 01:48:44 [3] backup.fromdual.com: Removing Arkeia's snapshot of galera1
150410  1:44:02 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera3:4567
150410  1:44:04 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') reconnecting to 158f71de (tcp://galera3:4567), attempt 0
150410  1:44:12 [Note] WSREP: Member 0.0 (galera3) requested state transfer from '*any*'. Selected 2.0 (galera2)(SYNCED) as donor.

150410  1:45:43 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera3:4567
150410  1:45:44 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') reconnecting to 158f71de (tcp://galera3:4567), attempt 0
150410  1:45:48 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') turning message relay requesting off

150410  1:47:27 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.66452S), skipping check
150410  1:47:27 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera3:4567
150410  1:47:30 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') turning message relay requesting off

192.168.42.2 / node Galera2

2015-04-10 02:09:55 [3] backup.fromdual.com: Creating a snapshot of galera2
2015-04-10 02:09:58 [3] backup.fromdual.com: Created a snapshot of galera2
2015-04-10 02:10:05 [3] backup.fromdual.com: galera2: backup the changed blocks of disk 'Festplatte 1' using NBD transport
2015-04-10 02:10:53 [3] backup.fromdual.com: galera2: saving the Change Block Tracking's reference for disk 'Festplatte 1'
2015-04-10 02:10:54 [3] backup.fromdual.com: Removing Arkeia's snapshot of galera2

150410  1:44:02 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera3:4567
150410  1:44:03 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') reconnecting to 158f71de (tcp://galera3:4567), attempt 0

150410  1:44:08 [Warning] WSREP: discarding established (time wait) 158f71de (tcp://192.168.42.3:4567)
150410  1:44:11 [Note] WSREP: Member 0.0 (galera3) requested state transfer from '*any*'. Selected 2.0 (galera2)(SYNCED) as donor.
150410  1:44:13 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') turning message relay requesting off

150410  1:45:43 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera3:4567
150410  1:45:44 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') reconnecting to 158f71de (tcp://galera3:4567), attempt 0
150410  1:45:48 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') turning message relay requesting off

150410  1:47:26 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera1:4567
150410  1:47:27 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') reconnecting to 54de92f8 (tcp://galera1:4567), attempt 0
150410  1:47:30 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') turning message relay requesting off

150410  2:09:57 [Warning] WSREP: last inactive check more than PT1.5S ago (PT1.83618S), skipping check

The backups are done with the following 2 options enabled:

Possibly this is the reason, and one should disable those features in combination with Galera. Further investigation is going on. In the worst case, VMware snapshotting with Galera should be avoided.
