You are here
Online DDL vs pt-online-schema-change
One of the most expensive database operations is performing Data Definition Language (DDL, e.g. CREATE, DROP, ALTER, etc.) statements, specially, the ALTER statements because MySQL blocks the entire table for both reads and writes while modifying the table.
For the huge tables, this might take hours to get the table changed which affects the application, so that, a good planning is required for such operations in order to avoid doing these changes during the peak times. For those people who have 24/7 services or limited maintenance window, DDL on huge tables is a really nightmare.
Percona developed a very good tool called pt-online-schema-change (version 2.2.6 at the time of writing this article) to perform such operations online without blocking/affecting the application and read/write operations to the table being changed is available.
Also MySQL made some enhancements for DDL statements and introduced the Online DDL feature in MySQL 5.6.
In this article, I will talk about an overview of both ways (Online DDL & pt-online-schema-change) alongside with an example and which one of them should be used in different scenarios.
This tool is developed by Percona to alter tables without locking them during the ALTER operation.
Simply, this tool creates a new empty table like the original table with the needed structure change, copy the data from the original table in small chunks to the new table, drop the original table and then rename the new table to the original name. During the copy process all new changes to the original table are being applied to the new one because a trigger is created on the original table which ensure that all new changes will be applied on the new table.
For more information about pt-online-schema-change tool, check out the manual documentation.
Altering a table called "test.test1" by adding an index (name_idx) on column "name":
[root@gcservera ~]# pt-online-schema-change --execute --alter "add index name_idx (name)" D=test,t=test1,h=localhost Operation, tries, wait: copy_rows, 10, 0.25 create_triggers, 10, 1 drop_triggers, 10, 1 swap_tables, 10, 1 update_foreign_keys, 10, 1 Altering `test`.`test1`... Creating new table... Created new table test._test1_new OK. Altering new table... Altered `test`.`_test1_new` OK. 2014-02-09T15:33:27 Creating triggers... 2014-02-09T15:33:27 Created triggers OK. 2014-02-09T15:33:27 Copying approximately 1 rows... 2014-02-09T15:33:27 Copied rows OK. 2014-02-09T15:33:27 Swapping tables... 2014-02-09T15:33:27 Swapped original and new tables OK. 2014-02-09T15:33:27 Dropping old table... 2014-02-09T15:33:27 Dropped old table `test`.`_test1_old` OK. 2014-02-09T15:33:27 Dropping triggers... 2014-02-09T15:33:27 Dropped triggers OK. Successfully altered `test`.`test1`.
The output is perfectly describing all steps that the tool is doing in the background.
Limitations of pt-online-schema-change
- A PRIMARY KEY or a unique index should be defined for the table before using this tool because it is required for the DELETE trigger.
- Not supported if the table has already triggers defined.
- The tool become complicate a little if the table has a foreign key constraint and an additional option
--alter-foreign-keys-methodshould be used.
- Also because of the foreign keys, the object names might be changed (indexes names , .. etc).
- In Galera Cluster environment, altering MyISAM tables is not supported and the system variable "wsrep_OSU_method" must be set to "TOI" (total order isolation).
In MySQL 5.5 and 5.1 with the InnoDB plugin, a new feature known as Fast Index Creation was introduced to avoid copying the tables data - when adding or removing secondary indexes - using the optimized CREATE INDEX and DROP INDEX statements.
In MySQL 5.6, the Online DDL method was introduced to allow more changes to be made on the table while accessing and writing to the table being changed is available.
The Online DDL syntax is exactly the same like the normal alter statement after specifying two parameters:ALGORITHM:
- INPLACE: the table change will be made in-place without rebuilding the entire table (in most cases, no copying data to temporary table is required).
- COPY: copying data to a temporary table, rebuilding the table and reconstructing the secondary indexes will be made (equivalent to the traditional method).
- NONE: Read and write operations are allowed during the altering process.
- SHARED: Only read operations are allowed during the altering operations (DML is not allowed).
- EXCLUSIVE: The entire table will be locked for both reading and writing (neither select nor DML are allowed).
The Online DDL is perfectly explained in the online manual documentation, you can check it out here for more information.
Altering a table called "test.test2" by adding an index (name_idx) on column "name":
mysql> alter table test2 -> add index name_idx (name),algorithm=inplace, lock=none; Query OK, 0 rows affected (0.03 sec) Records: 0 Duplicates: 0 Warnings: 0
- Works only with InnoDB (syntax wise it could be used with other storage engines "like MyISAM" but only "algorithm=copy" is allowed which is equivalent to the traditional method).
- Regardless of the locking used (none,shared or exclusive) a brief period at the beginning and at the end of the process is requiring an exclusive lock on the table.
foreign_key_checksshould be disabled when adding/dropping foreign keys to avoid table copying behavior.
- Still some alter operations require table copying or table locking in order to make the change (the old behavior). For more details on which table change require table-copying or table locking, check out this manual page.
- LOCK=NONE is not allowed in the alter table statement if there are ON...CASCADE or ON...SET NULL constraints on the table.
- While the Online DDL will be replicated on the slaves the same like the master (if LOCK=NONE no table-locking will take place on the slaves during the alter execution) but the replication itself will be blocked as the replay process executes in a single thread on the replicas which will cause slave lagging problem.
The following is a comparison results between Online DDL and pt-online-schema-change for some alter operations applied on a table contains 1,078,880 rows:
|Change Operation||Row(s) affected||Is table locked?||Time (sec)||Row(s) affected||Is table locked?||Time (sec)|
|Add Index||0||No||3.76||All rows||No||38.12|
|Drop Index||0||No||0.34||All rows||No||36.04|
|Add Column||0||No||27.61||All rows||No||37.21|
|Rename Column||0||No||0.06||All rows||No||34.16|
|Rename Column + change its data type||All rows||Yes||30.21||All rows||No||34.23|
|Drop Column||0||No||22.41||All rows||No||31.57|
|Change table ENGINE||All rows||Yes||25.30||All rows||No||35.54|
Which method should be used?
Now the question is, which method should we use to perform alter table statements?
While pt-online-schema-change allows read and write operations to the table being altered, it still copies the tables data to a temporary table in the background which adds overhead on the MySQL server. So basically, we should use pt-online-schema-change if the Online DDL will not work efficiently. In other words, if the Online DDL will require copying data to a temporary table (algorithm=copy) and the table will be blocked for long time (lock=exclusive) or when altering huge tables in a replication environment then we should use pt-online-schema-change tool.