Rolling upgrade of Galera 1.0 to 1.1

A few days ago Codership announced Galera v1.1, the new version of its synchronous replication cluster for MySQL. Before we look at the new Rolling Online Schema Upgrade (OSU) feature, let us look at how to upgrade to the new Galera release.

A rolling upgrade of your synchronous Galera Replication Cluster from version 1.0 to 1.1 is quite easy as long as you stay on the same MySQL release series (5.5).

To keep your database service available during the upgrade, you should have at least 3 Galera nodes in your cluster.

For further details please also look at MySQL/Galera cluster upgrade.

Hint: If you can do without a rolling upgrade, better avoid it and take your whole Galera Cluster down for the upgrade.

Check the version

First check the version you are currently running:

SHOW GLOBAL VARIABLES LIKE 'version';
+---------------+-----------------------+
| Variable_name | Value                 |
+---------------+-----------------------+
| version       | 5.5.15-wsrep_21.1-log |
+---------------+-----------------------+

We can see that we are using MySQL 5.5.15 with the wsrep API Version 21 and the wsrep patch 21.1.

SHOW GLOBAL STATUS LIKE 'wsrep_provider_version';
+------------------------+-------------+
| Variable_name          | Value       |
+------------------------+-------------+
| wsrep_provider_version | 21.1.0(r86) |
+------------------------+-------------+

Here we see that we are using the Galera Replicator (plugin) 1.0 (revision 86) based on the wsrep API Version 21.

Some rules about Galera versioning. There are 4 different version numbers we should care about:

  • MySQL version (5.5.15)
  • Wsrep API version (21, 22, ...)
    The wsrep API versions will always be single, monotonically increasing numbers: 21, 22, ... They indicate API compatibility between MySQL and Galera.
  • Wsrep Patch version (21.1)
    The wsrep patch versions have the form 21.1, where the 21 represents the API version and the 1 represents the bug-fix level of that API version.
  • Galera Replicator (= provider, plugin) (1.0(r86))
    Galera versions have the form <major>.<minor>, where minor stands for bug-fixes and small features and major for major features that involve a lot of code change.

Galera 22.1.1 is backward compatible with Galera 21.1.0. I was told that Galera should be backward compatible with at least ONE previous version. So 1.0 should be compatible with 0.8, 1.1 with 1.0, 1.2 with 1.1, 2.0 with 1.2, and so on.
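
To make the versioning rules a bit more tangible, here is a small shell sketch that splits a wsrep_provider_version string into its parts. It assumes the layout <wsrep API>.<Galera major>.<Galera minor>(r<revision>), which is how the two values shown in this article read; the function name parse_provider_version is made up for this example.

```shell
# Split a wsrep_provider_version value such as "22.1.1(r95)" into its parts.
# Assumption: layout is <wsrep API>.<Galera major>.<Galera minor>(r<revision>).
# Prints nothing if the string does not match that layout.
parse_provider_version() {
    echo "$1" | sed -n \
        's/^\([0-9][0-9]*\)\.\([0-9][0-9]*\)\.\([0-9][0-9]*\)(r\([0-9][0-9]*\))$/api=\1 galera=\2.\3 revision=\4/p'
}

parse_provider_version '21.1.0(r86)'   # prints: api=21 galera=1.0 revision=86
parse_provider_version '22.1.1(r95)'   # prints: api=22 galera=1.1 revision=95
```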

Preparation

Download the packages for your preferred installation method from here:

In my case a binary tarball was provided only for Codership-MySQL but not for the Galera plugin v1.1, so I extracted the plugin from the Debian package as follows:

dpkg-deb -x galera-22.1.1-amd64.deb /tmp/oli/
cp /tmp/oli/usr/bin/garbd /home/mysql/product/mysql-5.5.17-wsrep-22.3/bin
cp /tmp/oli/usr/lib/galera/libgalera_smm.so \
/home/mysql/product/mysql-5.5.17-wsrep-22.3/lib/plugin/

For RPMs it should work in a similar way:

rpm2cpio package.rpm | cpio -idmv

Precautions

Make sure that no DDL statements are executed during the upgrade from 5.1 to 5.5!

Upgrade

Then upgrade your Galera Cluster one node at a time. For each node proceed as follows:

  • Shift load away from this node.
  • Shutdown node (/etc/init.d/mysql stop)
  • Uninstall or remove the old Galera plugin.
  • Uninstall or remove the old Codership-MySQL Binaries
  • Install the new Codership-MySQL binaries with the wsrep API version 22
  • Install the new Galera plugin v1.1
  • Check if wsrep_provider in my.cnf is pointing to the correct new location.
  • Start node (/etc/init.d/mysql start)
  • Check if node came up properly:
    SHOW GLOBAL STATUS LIKE 'wsrep%';
    +----------------------------+--------------------------------------+
    | Variable_name              | Value                                |
    +----------------------------+--------------------------------------+
    | wsrep_local_state          | 4                                    |
    | wsrep_local_state_comment  | Synced (6)                           |
    | wsrep_cluster_size         | 3                                    |
    | wsrep_cluster_status       | Primary                              |
    | wsrep_connected            | ON                                   |
    | wsrep_local_index          | 1                                    |
    | wsrep_provider_version     | 22.1.1(r95)                          |
    | wsrep_ready                | ON                                   |
    +----------------------------+--------------------------------------+
  • If this is the case, shift load back to this node.
    If you have run into trouble by this point, we recommend solving the problems first and NOT continuing with the upgrade procedure. Otherwise you risk losing your complete service.
  • If you reached this step, you can upgrade the next node in your Galera Cluster.
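
The "check if the node came up properly" step can be scripted. The following is a minimal sketch, not an official Codership tool: the function name wsrep_node_ok is made up, and the choice of the four status variables and their expected values is an assumption based on the healthy output shown above.

```shell
# Read tab-separated "name<TAB>value" lines on stdin and succeed only if the
# node looks healthy. The four checks are an assumption taken from the
# healthy output shown in the article.
wsrep_node_ok() {
    awk -F '\t' '
        $1 == "wsrep_local_state"    { state  = $2 }
        $1 == "wsrep_cluster_status" { status = $2 }
        $1 == "wsrep_connected"      { conn   = $2 }
        $1 == "wsrep_ready"          { ready  = $2 }
        END {
            ok = (state == "4" && status == "Primary" && conn == "ON" && ready == "ON")
            exit (ok ? 0 : 1)
        }'
}

# Demo with canned input, so that the sketch runs without a database:
printf 'wsrep_local_state\t4\nwsrep_cluster_status\tPrimary\nwsrep_connected\tON\nwsrep_ready\tON\n' \
    | wsrep_node_ok && echo "node looks healthy"

# Against a live node the input would come from the mysql client, e.g.:
#   mysql --batch --skip-column-names -e "SHOW GLOBAL STATUS LIKE 'wsrep%'" | wsrep_node_ok
```

Only shift load back to the node when the check succeeds.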

Once you have upgraded all nodes in the Galera Cluster, you should notice in the MySQL error log that the protocol version switches automatically from 1 to 2:

111212 17:32:24 [Note] WSREP: Quorum results:
        version    = 1,
        component  = PRIMARY,
        conf_id    = 6,
...
111212 17:34:33 [Note] WSREP: Quorum results:
        version    = 2,
        component  = PRIMARY,
        conf_id    = 7,
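
To see this switch without reading the whole error log, you can extract the version line that follows each "Quorum results" message. This is only a sketch that assumes the log layout shown above; the function name quorum_versions is made up, and the sample log is the excerpt from this article.

```shell
# Print the protocol version reported after each "Quorum results:" message.
# Assumption: the error log uses the layout shown in the article.
quorum_versions() {
    awk '
        /WSREP: Quorum results:/ { in_block = 1; next }
        in_block && /version[ \t]*=/ {
            v = $0
            sub(/^.*=[ \t]*/, "", v)   # drop everything up to and including "="
            sub(/,.*$/, "", v)         # drop the trailing comma
            print v
            in_block = 0
        }'
}

# Demo with the log excerpt from above; prints 1 and then 2:
quorum_versions <<'EOF'
111212 17:32:24 [Note] WSREP: Quorum results:
        version    = 1,
        component  = PRIMARY,
        conf_id    = 6,
111212 17:34:33 [Note] WSREP: Quorum results:
        version    = 2,
        component  = PRIMARY,
        conf_id    = 7,
EOF
```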

Configuration for Rolling restart

In Galera Cluster configurations you often see that the cluster is still set to its initial start configuration, which is inappropriate for a rolling restart:

Galera configuration for an initial Cluster start

Galera node 1: wsrep_cluster_address: gcomm://
Galera node 2: wsrep_cluster_address: gcomm://192.168.1.101:4567
Galera node 3: wsrep_cluster_address: gcomm://192.168.1.102:4567

In this case Node 2 and 3 are OK for a rolling restart but Galera Node 1 will fail to restart.

Galera configuration for normal operations and a Cluster rolling restart

This is the Galera configuration we recommend for normal operations and a rolling restart:

Galera node 1: wsrep_cluster_address: gcomm://192.168.1.103:4567
Galera node 2: wsrep_cluster_address: gcomm://192.168.1.101:4567
Galera node 3: wsrep_cluster_address: gcomm://192.168.1.102:4567

Every Galera node points to its "left" neighbour.
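
For larger clusters this ring can be generated instead of typed by hand. The sketch below assumes that all nodes listen on port 4567 as in the example above; the function name ring_addresses is made up for illustration.

```shell
# Print one wsrep_cluster_address per node so that every node points to its
# "left" neighbour; the first node wraps around to the last one.
# Assumption: all nodes use port 4567, as in the article's example.
ring_addresses() {
    port=4567
    for last in "$@"; do :; done   # after this loop $last holds the final argument
    prev=$last
    i=1
    for ip in "$@"; do
        echo "Galera node $i: wsrep_cluster_address: gcomm://$prev:$port"
        prev=$ip
        i=$((i + 1))
    done
}

# Reproduces the three lines of the recommended configuration above:
ring_addresses 192.168.1.101 192.168.1.102 192.168.1.103
```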

Upgrade from MySQL 5.1/Galera 0.8 to MySQL 5.5/Galera 1.1

Upgrading from MySQL 5.1/Galera 0.8 to MySQL 5.5/Galera 1.1 has to be done in 2 steps because Codership only provides backward compatibility for jumps of a single version (0.8 -> 1.0 -> 1.1 -> 1.2 -> 2.0).

We have 2 possibilities now:

5.1/0.8 -> 5.1/1.0 -> 5.5/1.1

or

5.1/0.8 -> 5.5/1.0 -> 5.5/1.1

Which one you choose is up to you.

At the moment a rolling upgrade on a running system is impossible without a State Snapshot Transfer (SST). So be prepared that it takes a while and causes some load on the systems.

In our case we chose the route via 5.5/1.0 (the 2nd option).

To upgrade from 5.1/0.8 to 5.5/1.0 proceed as follows:

  • Shift load away from this node to the other 2 nodes.
  • Shutdown this node (/etc/init.d/mysql stop)
  • Set wsrep_provider = none in my.cnf
  • Uninstall or remove the old Galera plugin.
  • Uninstall or remove the old Codership-MySQL Binaries
  • Install the new Codership-MySQL binaries
  • Install the new Galera plugin
  • Start this node (/etc/init.d/mysql start)
  • Then you will get some error messages:
    111214 11:44:59 [ERROR] Missing system table mysql.proxies_priv; please run mysql_upgrade to create it
    111214 11:44:59 [ERROR] Native table 'performance_schema'.'events_waits_current' has the wrong structure
    ...
    111214 11:44:59 [ERROR] Native table 'performance_schema'.'file_instances' has the wrong structure
    111214 11:44:59 [Note] Event Scheduler: Loaded 0 events
    111214 11:44:59 [Note] WSREP: wsrep_load(): loading provider library 'none'
  • Run mysql_upgrade (see the MySQL upgrade instructions). Keep in mind that a MySQL binary upgrade is not officially supported/recommended. This is not a problem here, because the SST with mysqldump does a logical restore anyway.
  • Set wsrep_provider in my.cnf to the new plugin location.
  • Prepare an SST upgrade script on all potential donor nodes:
    cp wsrep_sst_mysqldump wsrep_sst_mysqldump_upgrade
  • Change the script wsrep_sst_mysqldump_upgrade so that it dumps all databases except the mysql database:
    diff wsrep_sst_mysqldump wsrep_sst_mysqldump_upgrade
    59c59
    < --skip-comments --flush-privileges --all-databases"
    ---
    > --skip-comments --flush-privileges --databases test foodmart"
  • Caution: Be careful with Stored Procedures, Stored Functions, Triggers and Events! This upgrade procedure will NOT work completely if you use any of those MySQL features. It will also not work completely if the mysql schema differs between your Galera nodes (for whatever reason).
  • Set wsrep_sst_method = mysqldump_upgrade in my.cnf
  • Restart this node so that the new my.cnf settings take effect (/etc/init.d/mysql stop, then /etc/init.d/mysql start). Keep in mind that one of the remaining Galera nodes will act as the SST donor and is not available for queries during the synchronization!
  • Check if the node came up properly: SHOW GLOBAL STATUS LIKE 'wsrep%';
  • If this is the case, shift load back to this node.
    If you have run into trouble by this point, we recommend solving the problems first and NOT continuing with the upgrade procedure. Otherwise you risk losing your complete service.
  • Set wsrep_sst_method back to its original value (mysqldump).
  • If you reached this step, you can upgrade the next node in your Galera Cluster.
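
The database list for the modified SST script (test and foodmart in our case) does not have to be typed by hand. The sketch below builds it from the output of SHOW DATABASES; the function name user_databases is made up. Besides mysql it also drops information_schema and performance_schema, on the assumption that you do not want to name them explicitly in a --databases list.

```shell
# Read one database name per line on stdin and print all of them except the
# system schemas on a single line, ready to be used after "--databases".
# Assumption: information_schema and performance_schema should be left out
# along with mysql.
user_databases() {
    grep -v -x -E 'mysql|information_schema|performance_schema' | paste -s -d ' ' -
}

# Demo with canned input; prints: foodmart test
printf 'information_schema\nfoodmart\nmysql\nperformance_schema\ntest\n' | user_databases

# Against a live donor node the input would come from the mysql client, e.g.:
#   mysql --batch --skip-column-names -e 'SHOW DATABASES' | user_databases
```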

Once you have managed to upgrade from MySQL 5.1/Galera 0.8 to MySQL 5.5/Galera 1.0, you can follow the procedure described above to upgrade to Galera 1.1.

Findings

To identify our different Galera Clusters we name them:

wsrep_cluster_name             = "Galera-0.8 wsrep-21"

In this upgrade scenario this naming convention is far from optimal, because the name of our Galera Cluster should change along with the version. But the value of wsrep_cluster_name must be the same on all Galera nodes, otherwise a node cannot join the cluster (this makes sure that a Galera node does not connect to the wrong Galera Cluster by accident).
To change the wsrep_cluster_name parameter you have to bring down the whole Galera Cluster. Changing it during a rolling restart is currently not possible.
Hopefully this constraint will be lifted in a later Galera version.
