Feed Aggregator
Active/active failover cluster with MySQL Replication
Electing a slave as new master and aligning the other slaves to the new master
In a simple MySQL Replication set-up you have high-availability (HA) on the read side (r). But for the master which covers all the writes (w) and the time critical read (rt) there is no HA implemented. For some situations this can be OK. For example if you have rarely writes or if you can wait until a new Master is set up.
But in other cases you need a fast failover to a new master.
In the following article it is shown how to implement the election of a new master and how to align the slaves to the new master.
We can have two possible scenarios:
- This scenario assumes, that every slave can become the new master.
- This scenario assumes, that only one dedicated slave will become master.
The advantages and disadvantages of both scenarios:
Scenario 1
+ You can choose the slave which is the most actual one.
- Higher possibility of errors if not automatized.
- You do not need an extra spare slave.
- More bin log writing on all Slaves. …
Taxonomy upgrade extras: english, high availability, mysql, replication, cluster, active-active, failover,
Typical automated MySQL maintenance jobs
[http://www.blogger.com/profile/13385339200643211924 knetknight] said…
Just what I needed. I have become the maintainer of a small but previously neglected mysql server and was looking for exactly this kind of info. It serves one database which feeds an intranet application. The database/tables hadn’t had any maintenance in three years so I found some errors in it initially and am now in the process of automating basic maintenance.
I’ve currently got a nightly job which runs in this order:
flush query cache.
check table.
analyze table.
optimize table.
repair table.
Steps 2 - 5 are run against all the tables of the database that serves an intranet site. I’m fairly certain that check and analyze are not all that helpful in an AUTOMATED routine unless that routine has some logic to figure out what to do with the result codes. i.e. I may only need to run repair if check/analyze find errors but right now my automated process is running repair each night regardless.
Any thoughts?
Taxonomy upgrade extras:
Typical automated MySQL maintenance jobs, query cache
The following maintenance jobs are typically run against a MySQL database:
- Backup
- Clean-up binary logs
- Optimize tables
- Purge query cache
- Rotate binary logs
Backup
A backup is not a typical maintenance job. But it behaves more or less like one. The backup should be done regularly depending on the restore/PITR (Point in Time Recovery) requirements.
Make sure, that in the backup all the necessary files (data files, transaction log files, configuration files and binary log files) are included. To prove that the backup process is working properly a regular restore should be performed. This can ideally be combined with the set-up of new database instances for developers or testing.
Clean-up the binary logs
The binary logs can be cleaned-up in two ways:
a
) Passive by MySQL itself:# my.cnf expire_logs_days = 7b
) Active by the customers environment:mysql> PURGE MASTER LOGS TO 'binarylog.000999'; mysql> PURGE MASTER LOGS BEFORE '2008-07-29 22:46:26';
Make sure NO binary logs are purged which …
Taxonomy upgrade extras: english, operations, mysql, mainenance jobs, query cache,
Some more details about DiskSyncSize
Jon Stephens said…
According to the information I have, it was added in 5.1.12, not 5.1.23.
In any case, it is available in all recent MySQL Cluster NDB 6.2 and 6.3 releases.
Taxonomy upgrade extras:
My thoughts about MySQL (Cluster) replication
According to Johans wishes I write down my concerns about MySQL (Cluster) replication. These items are things I run again and again into it with customers:
- SQL-nodes are still loosing too easy connection to cluster after data node or management node restart (which leads into gaps, see next point). Automatic failover or reconnection is just a dream (maybe it works in about 90% of the cases at least).
- Gaps: Whenever I do a cluster reconfiguration (should not be necessary too often, see loosing connection above) or a mysqld restart I get a gap (these are the planned ones, see also automatic channel failover). Then we have the not planned ones…
I cannot understand, why we are not able to keep at least a certain amount of traffic in a binlog-injector-buffer to avoid the majority of these gaps. This could be something like a Round-Robin buffer which stores all the cluster information which should be sent to the binlog-injector thread. When we configure this buffer to 1 Gbyte we can keep close to 2 minutes …
Taxonomy upgrade extras: english, mysql, replication, mysql cluster,
Some more details about DiskSyncSize
Johan Andersson said…
I guess it is related to DiskCheckpointSpeed nowadays (>= 5.1 ).
johan (johanandersson.blogspot.com)
Taxonomy upgrade extras:
Some more details about DiskSyncSize
The parameter DiskSyncSize is a MySQL Cluster parameter and was added in MySQL 5.1.23.
After the amount of stored bytes of data per file, the data node will fsync (flush) the LCP file to disk, even if a fsync is not needed for consistency.
This is done because the OS will otherwise buffer all the writes, and when a fsync is really needed, it can take a lot of time…
Originally this parameter was hard coded. Now it defaults to 4 Mbyte.
The parameter DiskSyncSize is related to the parameters NoOfDiskPagesToDiskAfterRestartTUP and NoOfDiskPagesToDiskAfterRestartACC which are deprecated right now. It does NOT replace the parameter TimeBetweenLocalCheckpoint.
This parameter should not be changed on any OS (with reasonable settings). With ODIRECT it is not used at all.
Thanks to Jonas for the help.
Taxonomy upgrade extras: english, mysql cluster, disksyncsize, parameter,
With MySQL-Enterprise Montior through firewalls
Sometimes it is nice to show customers the functionality of MySQL-Enterprise Monitor (aka Merlin). I install the agents on the servers and the dashboard runs on my laptop. But very often only ssh is open to these servers.
So how to dig a whole through the firewall for MySQL-Enterprise Monitor?
# ssh -R 18080:localhost:18080 oli@where_the_agent_sits
Maybe trivial for you but for me its hard to remember…
Taxonomy upgrade extras: english, mysql, enterprise monitor, firewall,
Sparse files
What is a sparse file?
“A sparse file is a file where space has been allocated but not actually filled with data. These space is not written to the file system. Instead, brief information about these empty regions is stored, which takes up much less disk space. These regions are only written to disk at their actual size when data is written to them. The file system transparently converts reads from empty sections into blocks filled with zero bytes at runtime.”
[ 1
]
In other words: Files are not as big as expected.
With databases this can be seen often: For example the MySQL Cluster REDO log files are created as sparse files or some ORACLE tablespace files.
But first let us create such a sparse file:
# dd if=/dev/zero of=sparsefile count=0 obs=1 seek=100G
# ls -lah sparsefile
-rw-r--r-- 1 oli users 100G 2007-10-24 11:18 sparsefile
# df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/sda9 5.0G 3.5G 1.2G 75% /home
Funny: How can I have a 100 Gbyte file on a 5 Gbyte …
Taxonomy upgrade extras: english, sparse files,
Ranking in MySQL results
A friend of me asked me long time ago: “How can I have a ranking on a result with MySQL?”. Now I found some time to write it down:
Lets do first some preparation for the example:
CREATE TABLE sales (
id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY
, fruit VARCHAR(32)
, amount DECIMAL
);
INSERT INTO sales
VALUES (NULL, 'apple', 12.75), (NULL, 'orange', 1.89), (NULL, 'pear', 19.23)
, (NULL, 'banana', 4.25), (NULL, 'cherry', 123.75), (NULL, 'plum', 23.15)
;
Now lets query:
SELECT fruit, amount
FROM sales
ORDER BY amount DESC
;
+--------+--------+
| fruit | amount |
+--------+--------+
| cherry | 124 |
| plum | 23 |
| pear | 19 |
| apple | 13 |
| banana | 4 |
| orange | 2 |
+--------+--------+
Hmmmm…., this not yet what we want!
And now with ranking:
SET @rank=0;
SELECT @rank:=@rank+1 AS rank, fruit, amount
FROM sales
ORDER BY amount DESC
;
+------+--------+--------+
| rank | fruit | amount |
+------+--------+--------+
| 1 | cherry …Taxonomy upgrade extras: mysql, ranking, result,
MySQL logon trigger
With MySQL 5.0 the database provides trigger functionality on INSERT, REPLACE, UPDATE and DELETE.
Those of you who know some other RDBMS know, that there are also some system events where one would like to have triggers.
Unfortunately MySQL does not (yet) provide such functionality. This is sad because as database administrator this would be sometimes very helpful.
But you can build your own LOGON and STARTUP trigger.
MySQL provides some hooks for these events…
Complete Story (PDF 160 kbyte).
Taxonomy upgrade extras: english, mysql, logon trigger, trigger, login trigger, audit,
MySQL Cluster restore
Recently the question came up if it is faster to restore a MySQL cluster when all nodes are up or only ONE node from each node group during restore.
The answer from our gurus was: All nodes up during restore! I wanted to find out why. So I set up the following cluster and started to measure:
MySQL Cluster set up

MySQL Cluster backup
The backup is not that interesting. But I made the drawing for possible future use :-) :

MySQL Cluster restore
For the restore there are 4 different ways thinkable:
- Restore with all nodes up and all 4 backup pieces are restored in sequence. (1a)
- Restore with all nodes up and all 4 backup pieces are restored in parallel. (1b)
- Restore with 1 node of each node group down and all 4 backup pieces are restored in sequence. (2a)
- Restore with 1 node of each node group down and all 4 backup pieces are restored in parallel. (2b)

And we got the following times:
| Way | Initial | Stop | Meta | Data | Exit | Total | Start |
|---|---|---|---|---|---|---|---|
| 1a | ~ 30 - 35 s | n.a. | ~ 80 - 85 s | ~ 170 s | < 1 s | ~ 280 - 290 s | n.a. |
| 1b | ~ … |
Taxonomy upgrade extras: english, backup, restore, recovery, mysql, cluster,
MyEnv (MySQL and MariaDB BasEnv)
What is MyEnv?
MyEnv is a tool to run several MySQL or MariaDB database instances on ONE host. You can even run multiple database instances with different binary versions. We call this multi-instance set-ups.
With MyEnv a multi-instance set-up is more comfortable to handle than with mysqld_multi (old) or Systemd services (new) and it provides more useful functionality.
Using MyEnv does not need the use of the root user to operate MySQL and MariaDB database instances. Thus it is appropriate for classical enterprise DBA organizations.
Further multi-instance set-ups is a much more efficient way of consolidating MySQL and MariaDB instances an has less overhead than virtualization solutions. To do proper resource fencing MyEnv provides cgroup integration (container integration (docker) is planned for the future).
MyEnv users are saying…
A large MySQL user (Alexa Ranking: Top 500 world-wide, top 20 in Germany) stated: I found MyEnv accidentally when I was searching for something like this. I liked it …
Taxonomy upgrade extras: environment, virtualization, consolidation, multi instance, saas, myenv, license, container, docker,
MySQL Active - Active Clustering
It is possible to use an active - active shared-disk cluster in MySQL in some cases. For doing this you have to fulfill the following requirements:
- Works with MyISAM tables only.
- POSIX-locking compliant cluster file system on the device (such as OCFS2 or GFS).
- External locking must be enabled.
- The MySQL query cache must be turned off.
- The MySQL delay key write must be turned off.
- OS where file locking is supported in MySQL.

Interested? To read more, follow the link: MySQL active - active Cluster (PDF 157 kbyte).
Related pages
- A very interesting comment from Peter Zaitsev you MUST read. He comments, why this set up you should NOT use: MySQL MyISAM Active Active Clustering - looking for trouble?
And I do fully agree with him, that in 99% of all use cases this set up is not the right choice (and do not believe yours is the 1% :-))! - The MySQL architecture stack
Taxonomy upgrade extras: english, mysql, performance, active-active clustering, cluster,
Transaction performance
Transaction performance relates among other things from I/O performance. This means hard disk performance.
Hard disk performance
When you select a hard disk, an important feature to consider is the performance (speed) of the drive. Hard disks come in a wide range of performance capabilities. As is true of many things, one of the best indicators of a drive’s relative performance is its price. An old saying from the automobile-racing industry is appropriate here: “Speed costs money. How fast do you want to go?”
The performance of a hard disk depends on several delays associated with reading or writing data on a computer’s disk drive.
A measurement, called average access time (AAT), involves the elements, average seek time (AST), average rotational latency (ARL) and transfer time (TT).
Interested? To read more, follow the link: Transaction Performance (PDF 92 kbyte).
Taxonomy upgrade extras: english, performance tuning, performance, transaction,
Round-Robin Database Storage Engine (RRD)
In a round-robin database (RRD) usually time-series data like network bandwidth, temperatures, CPU load etc. is stored. The data is stored in the way that system storage footprint remains constant over time. This avoids resource expensive purge jobs and reduces complexity.

MySQL does NOT yet provide this kind of storage engine. Although some people were thinking about and some prototypes exists.
Nevertheless in this paper it is shown how you can build your own RRD tables: Round-Robin Database Storage Engine (RRD) (PDF 242 kbyte).
Taxonomy upgrade extras: english, mysql, storage engine, rrd, round-robin database,
SATA Flash Solid State Disk up to 160 Gbyte announced!
The price for a 160 Gbyte disk will be around USD 15'000. This is still a bit expensive. But the access time is around 0.5 ms (both for reading and writing) which is around 10 times faster than a normal 15'000 rpm SCSI disk! The disk has NO cache because it is a cache itself (according to the supplier. Maybe this will change in the future). And the lifetime of a cell is > 5 mio writes. For the same performance one needs usually an array of around 10 disks. If your database is heavily write-I/O bound you should consider this solution.
I/O-Systems
Will NAND Flash / Solid State Disk (SSD) will be the future for Database I/O systems?
Actually SSD are still limited in size (64 GB) and expensive (EUR 30/GB) and thus cannot yet compete with SCSI or IDE disks. But they also have an advantage. They are fast!
For some uses like databases the price per GB is not that relevant. Also most of the databases fit into one or two 64 GB disks (more than 90%).
When you really get an I/O bottleneck you should consider to …
Taxonomy upgrade extras: english, performance, sata, flash, solid state disk,
MySQL Multi-Master - Single-Slave - Replication
MySQL provides its replication for High Availability (HA) and for read Scale-out. Generally it is known that in a MySQL replication you can only replicate from one Master to many slaves. In this paper it is shown how a set-up can look like to replicate from two masters to one slave.
Caution: Handle this information with care!!!
Taxonomy upgrade extras: multi-master, slave, replication, multi-source,
Profiling MySQL with oprofile
Probably the answer to this question is already known. But we want to prove it and by the way learn to deal with MySQL and oprofile.
Taxonomy upgrade extras: english, mysql, profiling, oprofile,
Materialized Views (MV) with MySQL
Materialised View (MV) is the pre-calculated (materialised) result of a query. Unlike a simple VIEW the result of a Materialised View is stored somewhere, generally in a table. Materialised Views are used when immediate response is needed and the query where the Materialised View bases on would take to long to produce a result. Materialised Views have to be refreshed once in a while. It depends on the requirements how often a Materialised View is refreshed and how actual its content is. Basically a Materialised View can be refreshed immediately or deferred, it can be refreshed fully or to a certain point in time. MySQL does not provide Materialised Views by itself. But it is easy to build Materialised Views yourself.
Taxonomy upgrade extras: mysql, view, performance, materialized views, materialised views,

