You are here

MySQL Cluster overview

This is a chaotic collection of my MySQL Cluster experience...

Content

config.ini template

A generic MySQL Cluster configuration file (config.ini) to start with. It is pretty much what the MySQL Cluster experts recommend right now:

#
# config.ini
#

# This configuration file is fore MySQL Clusters 6.2 and above...

# ----------------------------------------------------------------------

[TCP DEFAULT]

# Default is too small!
SendBufferMemory              = 2M
ReceiveBufferMemory           = 2M


# When this is configured together with section above ndb_mgmd will
# return with erro -1 (255). This is a bug and should be fixed earlier
# or later!
# When you move this section to the bottom it should work.
# You need one TCP section for EACH cluster node pair!
# (for example: 4 nodes = 6 sections)

# [TCP]
#
# NodeId1: 10
# NodeId2: 11
# Hostname1: 10.40.219.161
# Hostname2: 10.40.219.162

# ----------------------------------------------------------------------

[NDB_MGMD DEFAULT]

DataDir                       = /var/lib/mysql/cluster

# ----------------------------------------------------------------------

[NDB_MGMD]

Id                            = 1
HostName                      = 192.168.1.33


#[NDB_MGMD]
#
# Id                            = 2
# HostName                      = 192.168.1.34

# ----------------------------------------------------------------------

[NDBD DEFAULT]

# needs root privileges?
LockPagesInMainMemory         =    1
MemReportFrequency            =  600

# Everything else is not supported
NoOfReplicas                  =    2

# Should be the same on all data node for eas of maintenance
DataDir                       = /var/lib/mysql/cluster
# needs root privileges?
ODirect                       =    1

# Memory sizing either from ndb_size.pl, calculation or ALL DUMP 1000
DataMemory                    =  128M
# IndexMemory = DataMemory / (5 - 10)
IndexMemory                   =   16M

# REDO log sizing
# 256M is the new recommendation
FragmentLogFileSize           =  256M
# 6 x DataMemory / (4 x FragmentLogFileSize)
# >= 6.4
# 6 x DataMemory x NO_OF_LQH_THREADS / (4 * FragmentLogFileSize)
NoOfFragmentLogFiles          =    3

# should be around 16 - 64 MB default is too small (128 MB in extreme cases)
RedoBuffer                    =   32M

# No calculation available
SharedGlobalMemory            =   64M

# 1 second data loss in max.
TimeBetweenGlobalCheckpoints  = 1000

# leave as default to start with
# this is too big under low load (better 6-10)
TimeBetweenLocalCheckpoints   =   20

# On a good I/O system you can set this value higher
Diskcheckpointspeed           =   10M

# Backup and LCP related parameters

# This means full speed
Diskcheckpointspeedinrestart  =  100M

BackupMaxWriteSize            =    1M
BackupLogBufferSize           =    4M
BackupDataBufferSize          =   16M
BackupMemory                  =   20M

# # of expected concurrent trx / # data nodes
# This memory is pre-allocated, thus keep it low!
# leave as default to start with
MaxNoOfConcurrentTransactions =  256

# This memory is pre-allocated, thus keep it low! The default (32k) is much too high!
# Every INSERT/UPDATE/DELETE (Record = Operation) uses 1 kbyte!!!
# MaxNoOfConcurrentOperations > Operations/s (10k - 100k) / number data nodes
# MaxNoOfConcurrentOperations > MaxNoOfConcurrentTransactions x Operations/Transaction / 4
# Leave as default to start with. Use LIMIT <n> to limit number of operation records
# If you want to be on the safe side set it to 100k - 250k
MaxNoOfConcurrentOperations   = 2048

# Defaults of 64 can be fine.
# Each index consumes approx. 15 kbyte per node!
MaxNoOfUniqueHashIndexes      =  160

# Each object consumes approx. 10 kbyte per node!
# Use ndb_size.pl or calculate the number of OI indexes
# Also PK and UK create 1 OI each. Thus this values is at least as big
# as MaxNoOfTables. Usually 2 - 4 times bigger.
# The default of 128 can be too small soon.
MaxNoOfOrderedIndexes         =  256

# Not that each OI, UI and BLOB counts towards one table object
# Usually safe
# Each table consumes approx. 20 kbyte per node!
MaxNoOfTables                 =  128

# Each attribute consumes around 200 bytes of storage per node
# a) 6 x MaxNoOfTables
# b) NoOfTables x AvgAttributsPerTable
MaxNoOfAttributes             = 1024

# Default of 768 is often fine.
# Minmal value = #NoOfTables + #OI + 3 x #UI
MaxNoOfTriggers               =  768

# Default too big.
MaxNoOfLocalScans             =   64

# TimeBetweenEpochs             =  100

# CGE >= 6.3.7
# CompressedLCP                 =    1
# CompressedBackup              =    1

# GCE >= 6.3.4
# RealTimeScheduler             =    1
# SchedulerSpinTimer            =  400
# SchedulerExecutionTimer       =  100

# ----------------------------------------------------------------------

[NDBD]
Id                            =   10
HostName                      = 192.168.1.37
# CGE >= 6.3.7
# LockExecuteThreadToCPU        = 1
# LockMaintThreadsToCPU         = 0 # on same CPU as eth interupts occur


[NDBD]
Id                            =   11
HostName                      = 192.168.1.36
# CGE >= 6.3.7
# LockExecuteThreadToCPU        = 1
# LockMaintThreadsToCPU         = 0 # on same CPU as eth interupts occur

# ----------------------------------------------------------------------

[MYSQLD DEFAULT]


[MYSQLD]

Id                            =   20
HostName                      = 192.168.1.35


# Keep some spare for later
# Prevents cluster restart
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]

my.cnf template

A generic MySQL configuration file (my.cnf) to start with MySQL Cluster.

[mysqld]

# NDB Cluster related parameter
ndbcluster                          = 1

# Use IP no DNS name
ndb_connectstring                   = "host1;host2"

ndb_cluster_connection_pool         = 5

# Recommended default setting changes:

ndb_use_exact_count                 = 0

# this is default
# Some of these parameter are performance relevant but have some
# drawbacks
# multi_range_count                   = 256

# 0, 1, 10, 10 is usually too verbouse
# ndb_extra_logging                   = 1
# ndb_cache_check_time                = 0

# Increase if you are doing many inserts
ndb_autoincrement_prefetch_sz         = 1024
engine_condition_pushdown             = 1

ndb_force_send                        = 1
ndb_index_stat_enable     = 0
# ndb_index_stat_cache_entries        = 32

# Statistics cost more than they give
# ndb_index_stat_enable               = 0
# ndb_index_stat_update_freq          = 20

# Leave as default
# ndb_report_thresh_binlog_epoch_slip = 3
# ndb_report_thresh_binlog_mem_usage  = 10
# ndb_use_copying_alter_table         = 0
# ndb_use_transactions                = 1

Genral rules and/or experience

MySQL Cluster v5.0

  • Avoid ALTER TABLE in MySQL Cluster.
    It needs 2 x the RAM of a table (memory is reclaimed only after a node restart).
    Do ALTER TABLE <table> ENGINE=MyISAM; instead, then the changes and then alter back to the cluster.
  • DROP/TRUCATE reclaims pages.
  • UPDATE/INSERT/DELETE causes memory de-fragmentation (same symptoms like a memory leak) ⇒ requires a node restart from time to time.
  • For production use is still and only v5.0 (exception CGE)!

MySQL Cluster v5.1

  • On disk table ⇒ is stable now.
  • Parallel node recovery is not yet implemented in v5.1 ⇒ maybe 5.2/6.0?
  • Partitioning does not work from the SQL interface (mysqld) but from the NDB-API.
  • No more memory de-fragmentation is done (see above).
  • Cluster to MySQL replication is still buggy! But should work.
  • Cluster to Cluster replication is stable.
  • In Cluster replication:
    binlog_cache_size    = TimeBetweenGlobalCheckpoints/1000 (default 2000)
    (default 32 kbyte)     * total traffic (Mbyte/s) (for example 5 Mbyte/s)
                         = 2000 ms / 1000 ms/s * 5 Mbyte/s
                         ≌ 10 Mybte
    
  • pdflush(kswapd) flushes every 5 seconds pages to disk ⇒ Causes troubles with cluster.
    The kernel parameters vm.page-cluster and vm.lower-zone-protection (later one does not exist on my maybe non-NUMA system) can help:
    /
    # cat /proc/sys/vm/lower_zone_protection
    # cat /proc/meminfo
    # sysctl vm | egrep 'page-cluster|lower-zone-protection'

  • For MySQL Cluster on AMD Hardware set NUMA to OFF in BIOS! ⇒ Preferred hardware: Intel!
  • File system: Use ext3 or XFS.

Literature