You are here

Create a single-node MariaDB ColumnStore test installation

Table of Contents


For a long time I wanted to investigate a bit more into the MariaDB ColumnStore architecture. Now I took the time to start with.

MariaDB ColumnStore is an interesting technology if you want to do reports (OLAP, data warehouse (DWH), BI) over only a few columns on a huge amount of rows and if you have only or mostly batch writes. These data are ideally spread (sharded) over several machines and the results where calculated on those different machines an aggregated. Because the data are stored in columns the locality of your data should be better and thus a more efficient caching (in-memory) should result in faster response times for you reporting queries.

MariaDB ColumnStore can be deployed as a single-node or a multi-node set-up. The first one is considered for testing, the later one for production purposes.

For setting up a testing system I have chosen a single-node set-up on a VirtualBox VM first.

MariaDB ColumnStore hardware requirements

There are different informations available about the minimal recommended hardware resources. The range lasts from 8 to 32 physical cores with 16 to 64 Gibyte of RAM, HDD disks are fine because the system is optimized for block streaming (sequential read and write). At least an 1 Gbit network is recommended. [ 1 ]

For our first test we rely on 1 virtual Machine with 4 Gibyte of RAM (possibly with 2 Gibyte of RAM it still would work?), 10 Gibyte of HDD disk space and 1 vCPU.

O/S (Linux) settings

Check Linux kernel settings:

$ sysctl -a | grep -e '^vm.swap' -e '^vm.vfs' -e '^net.core..mem_max' -e '^net.ipv4.tcp_.mem' -e '^net.ipv4.tcp_no_metrics' -e '^net.core.netdev_max' 
net.core.netdev_max_backlog = 1000
net.core.rmem_max = 212992
net.core.wmem_max = 212992
net.ipv4.tcp_no_metrics_save = 0
net.ipv4.tcp_rmem = 4096        131072  6291456
net.ipv4.tcp_wmem = 4096        16384   4194304
vm.swappiness = 60
vm.vfs_cache_pressure = 100

Check network speed:

$ ethtool enp0s9 | grep Speed
        Speed: 1000Mb/s

Persist the Linux kernel settings:

$ cat >/etc/sysctl.d/90-mariadb-enterprise-columnstore.conf <<_EOF
#
# /etc/sysctl.d/90-mariadb-enterprise-columnstore.conf
#

# minimize swapping
vm.swappiness = 10

# optimize Linux to cache directories and inodes
vm.vfs_cache_pressure = 10

# Increase the TCP max buffer size
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216

# Increase the TCP buffer limits
# min, default, and max number of bytes to use
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216

# don't cache ssthresh from previous connection
net.ipv4.tcp_no_metrics_save = 1

# for 1 GigE, increase this to 2500
# for 10 GigE, increase this to 30000
net.core.netdev_max_backlog = 2500
_EOF

and make them active:

$ sysctl --load=/etc/sysctl.d/90-mariadb-enterprise-columnstore.conf

Disable AppArmor (for installation only?):

$ systemctl status apparmor
$ systemctl stop apparmor
$ systemctl disable apparmor
$ aa-status

Configure Character Enconding (why exactly?):

$ localedef -i en_US -f UTF-8 en_US.UTF-8

Installation of MariaDB ColumnStore

The first problem I had was finding a good an reliable information about where to download the software. Finally I ended up here: MariaDB ColumnStore download. But MariaDB ColumnStore is the wrong tab. You have to choose the MariaDB Community Server tab.

If you prefer, the software can also be downloaded from the command line:

$ wget https://dlm.mariadb.com/1623874/MariaDB/mariadb-10.5.10/repo/ubuntu/mariadb-10.5.10-ubuntu-bionic-amd64-debs.tar

and there are a lot of packages contained in this tar-ball:

$ tar xf mariadb-10.5.10-ubuntu-bionic-amd64-debs.tar
-rw-rw-r--  1 oli  oli  9059256 May  7 04:34 galera-4_26.4.8-bionic_amd64.deb
-rw-rw-r--  1 oli  oli  5955264 May  7 04:34 galera-arbitrator-4_26.4.8-bionic_amd64.deb
-rw-rw-r--  1 oli  oli     1698 May  7 04:35 InRelease
-rw-rw-r--  1 oli  oli   149208 May  7 04:34 libmariadb3_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli     3276 May  7 04:34 libmariadb3-compat_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli     3248 May  7 04:34 libmariadbclient18_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli  5130428 May  7 04:34 libmariadbd19_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli  6226020 May  7 04:34 libmariadbd-dev_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli   159556 May  7 04:34 libmariadb-dev_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli     3452 May  7 04:34 libmariadb-dev-compat_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli     3260 May  7 04:34 libmysqlclient18_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli  6483096 May  7 04:34 mariadb-backup_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli     4672 May  7 04:35 MariaDB-C74CD1D8-public.asc
-rw-rw-r--  1 oli  oli  1604684 May  7 04:34 mariadb-client-10.5_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli     3088 May  7 04:34 mariadb-client_10.5.10+maria~bionic_all.deb
-rw-rw-r--  1 oli  oli   784824 May  7 04:34 mariadb-client-core-10.5_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli     4428 May  7 04:34 mariadb-common_10.5.10+maria~bionic_all.deb
-rw-rw-r--  1 oli  oli  5811572 May  7 04:34 mariadb-plugin-columnstore_10.5.10-5.5.2+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli   558288 May  7 04:34 mariadb-plugin-connect_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli     5216 May  7 04:34 mariadb-plugin-cracklib-password-check_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli     6172 May  7 04:34 mariadb-plugin-gssapi-client_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli     7592 May  7 04:34 mariadb-plugin-gssapi-server_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli  1187228 May  7 04:34 mariadb-plugin-mroonga_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli    63820 May  7 04:34 mariadb-plugin-oqgraph_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli  3122524 May  7 04:34 mariadb-plugin-rocksdb_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli   845428 May  7 04:34 mariadb-plugin-s3_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli   354276 May  7 04:34 mariadb-plugin-spider_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli  4562128 May  7 04:34 mariadb-server-10.5_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli     3216 May  7 04:34 mariadb-server_10.5.10+maria~bionic_all.deb
-rw-rw-r--  1 oli  oli  6932836 May  7 04:34 mariadb-server-core-10.5_10.5.10+maria~bionic_amd64.deb
-rw-rw-r--  1 oli  oli     3260 May  7 04:34 mysql-common_10.5.10+maria~bionic_all.deb
-rw-rw-r--  1 oli  oli    42897 May  7 04:34 Packages
-rw-rw-r--  1 oli  oli     2278 May  7 04:35 README
-rw-rw-r--  1 oli  oli      816 May  7 04:34 Release
-rw-rw-r--  1 oli  oli      833 May  7 04:35 Release.gpg
-rwxrwxr-x  1 oli  oli     1010 May  7 04:35 setup_repository*

Installation of MariaDB ColumnStore was straight forward:

$ apt-get install gnupg2
$ ./setup_repository
$ apt-get update
$ apt-get install mariadb-server mariadb-plugin-columnstore libjemalloc1 mariadb-backup mariadb-client

Configuration of MariaDB ColumnStore

The minimum recommended MariaDB configuration for ColumnStore looks like this:

$ cat >/etc/mysql/mariadb.conf.d/zz-fromdual.cnf <<_EOF
#
# /etc/mysql/mariadb.conf.d/zz-fromdual.cnf
#

[mariadb]

log_error                              = mariadbd-error.log
character_set_server                   = utf8
collation_server                       = utf8_general_ci
columnstore_use_import_for_batchinsert = ALWAYS
_EOF

In the MariaDB ColumnStore documentation you can find all other Server System Variables and Options (columnstore*).

Cross Engine Join User

For Cross Engine Joins a special user is required. These credentials are stored in the file /etc/columnstore/Columnstore.xml and can be queried as follows:

$ mcsGetConfig -v -a | grep Cross
CrossEngineSupport.Host = 127.0.0.1
CrossEngineSupport.Port = 3306
CrossEngineSupport.User = root
CrossEngineSupport.Password = 
CrossEngineSupport.TLSCA = 
CrossEngineSupport.TLSClientCert = 
CrossEngineSupport.TLSClientKey = 

and changed as follows:

$ mcsSetConfig CrossEngineSupport Host 127.0.0.1
$ mcsSetConfig CrossEngineSupport Port 3306
$ mcsSetConfig CrossEngineSupport User cross_engine
$ mcsSetConfig CrossEngineSupport Password secret

Caution: Querying the Cross Engine Join User password can be done by an non-privileged O/S user. Thus I consider this as a security relevant bug. This is known already since 2018 and should be fixed in version 5.6.1 :-( (MCOL-1175, MCOL-4714).

Start the ColumnStore Processes

Starting MariaDB ColumnStore after the configuration changes are quite intuitive:

$ systemctl restart mariadb
$ systemctl restart mariadb-columnstore

After these commands the mariadbd and some ColumnStore processes are started:

$ ps aux | grep -e mysql -e VSZ
USER       PID %CPU %MEM     VSZ   RSS TTY      STAT START   TIME COMMAND
mysql     1380  0.0  2.4 1274896 99020 ?        Ssl  11:20   0:00 /usr/sbin/mariadbd
mysql     1499  0.0  0.4  267392 16904 ?        Sl   11:21   0:00 /usr/bin/workernode DBRM_Worker1
mysql     1507  0.0  0.2  530448 11648 ?        Sl   11:21   0:00 /usr/bin/controllernode
mysql     1522  0.0  0.2 2391288 10560 ?        Sl   11:21   0:00 /usr/bin/PrimProc
mysql     1545  0.0  0.3  300192 14116 ?        Sl   11:21   0:00 /usr/bin/WriteEngineServer
mysql     1560  0.0  0.2  212452 10188 ?        Sl   11:21   0:00 /usr/bin/ExeMgr
mysql     1571  0.0  0.4  342364 18704 ?        Sl   11:21   0:00 /usr/bin/DMLProc
mysql     1592  0.0  0.3  193640 12444 ?        Sl   11:21   0:00 /usr/bin/DDLProc

It looks like each of those processes works already multi-threaded:

$ ps -eLf | grep -e mysql -e PID 
UID        PID  PPID   LWP  C NLWP STIME TTY          TIME CMD
mysql     1380     1  1380  0   10 11:20 ?        00:00:00 /usr/sbin/mariadbd
mysql     1380     1  1381  0   10 11:20 ?        00:00:00 /usr/sbin/mariadbd
mysql     1380     1  1382  0   10 11:20 ?        00:00:00 /usr/sbin/mariadbd
mysql     1380     1  1383  0   10 11:20 ?        00:00:00 /usr/sbin/mariadbd
mysql     1380     1  1384  0   10 11:20 ?        00:00:00 /usr/sbin/mariadbd
mysql     1380     1  1385  0   10 11:20 ?        00:00:00 /usr/sbin/mariadbd
mysql     1380     1  1389  0   10 11:20 ?        00:00:00 /usr/sbin/mariadbd
mysql     1380     1  1391  0   10 11:20 ?        00:00:00 /usr/sbin/mariadbd
mysql     1380     1  1392  0   10 11:20 ?        00:00:00 /usr/sbin/mariadbd
mysql     1380     1  1407  0   10 11:20 ?        00:00:00 /usr/sbin/mariadbd

mysql     1499     1  1499  0    5 11:21 ?        00:00:00 /usr/bin/workernode DBRM_Worker1
mysql     1499     1  1501  0    5 11:21 ?        00:00:00 /usr/bin/workernode DBRM_Worker1
mysql     1499     1  1502  0    5 11:21 ?        00:00:00 /usr/bin/workernode DBRM_Worker1
mysql     1499     1  1503  0    5 11:21 ?        00:00:00 /usr/bin/workernode DBRM_Worker1
mysql     1499     1  1504  0    5 11:21 ?        00:00:00 /usr/bin/workernode DBRM_Worker1

mysql     1507     1  1507  0    6 11:21 ?        00:00:00 /usr/bin/controllernode
mysql     1507     1  1567  0    6 11:21 ?        00:00:00 /usr/bin/controllernode
mysql     1507     1  1573  0    6 11:21 ?        00:00:00 /usr/bin/controllernode
mysql     1507     1  1580  0    6 11:21 ?        00:00:00 /usr/bin/controllernode
mysql     1507     1  1581  0    6 11:21 ?        00:00:00 /usr/bin/controllernode
mysql     1507     1  1596  0    6 11:21 ?        00:00:00 /usr/bin/controllernode

mysql     1522     1  1522  0   18 11:21 ?        00:00:00 /usr/bin/PrimProc
mysql     1522     1  1523  0   18 11:21 ?        00:00:00 /usr/bin/PrimProc
mysql     1522     1  1524  0   18 11:21 ?        00:00:00 /usr/bin/PrimProc
mysql     1522     1  1525  0   18 11:21 ?        00:00:00 /usr/bin/PrimProc
mysql     1522     1  1526  0   18 11:21 ?        00:00:00 /usr/bin/PrimProc
mysql     1522     1  1527  0   18 11:21 ?        00:00:00 /usr/bin/PrimProc
mysql     1522     1  1528  0   18 11:21 ?        00:00:00 /usr/bin/PrimProc
mysql     1522     1  1529  0   18 11:21 ?        00:00:00 /usr/bin/PrimProc
mysql     1522     1  1530  0   18 11:21 ?        00:00:00 /usr/bin/PrimProc
mysql     1522     1  1531  0   18 11:21 ?        00:00:00 /usr/bin/PrimProc
mysql     1522     1  1532  0   18 11:21 ?        00:00:00 /usr/bin/PrimProc
mysql     1522     1  1533  0   18 11:21 ?        00:00:00 /usr/bin/PrimProc
mysql     1522     1  1534  0   18 11:21 ?        00:00:00 /usr/bin/PrimProc
mysql     1522     1  1535  0   18 11:21 ?        00:00:00 /usr/bin/PrimProc
mysql     1522     1  1561  0   18 11:21 ?        00:00:00 /usr/bin/PrimProc
mysql     1522     1  1563  0   18 11:21 ?        00:00:00 /usr/bin/PrimProc
mysql     1522     1  1586  0   18 11:21 ?        00:00:00 /usr/bin/PrimProc
mysql     1522     1  1593  0   18 11:21 ?        00:00:00 /usr/bin/PrimProc

mysql     1545     1  1545  0    2 11:21 ?        00:00:00 /usr/bin/WriteEngineServer
mysql     1545     1  1546  0    2 11:21 ?        00:00:00 /usr/bin/WriteEngineServer

mysql     1560     1  1560  0    6 11:21 ?        00:00:00 /usr/bin/ExeMgr
mysql     1560     1  1562  0    6 11:21 ?        00:00:00 /usr/bin/ExeMgr
mysql     1560     1  1564  0    6 11:21 ?        00:00:00 /usr/bin/ExeMgr
mysql     1560     1  1565  0    6 11:21 ?        00:00:00 /usr/bin/ExeMgr
mysql     1560     1  1572  0    6 11:21 ?        00:00:00 /usr/bin/ExeMgr
mysql     1560     1  1577  0    6 11:21 ?        00:00:00 /usr/bin/ExeMgr

mysql     1571     1  1571  0    3 11:21 ?        00:00:00 /usr/bin/DMLProc
mysql     1571     1  1587  0    3 11:21 ?        00:00:00 /usr/bin/DMLProc
mysql     1571     1  1588  0    3 11:21 ?        00:00:00 /usr/bin/DMLProc
mysql     1592     1  1592  0    3 11:21 ?        00:00:00 /usr/bin/DDLProc
mysql     1592     1  1594  0    3 11:21 ?        00:00:00 /usr/bin/DDLProc
mysql     1592     1  1595  0    3 11:21 ?        00:00:00 /usr/bin/DDLProc

In the database you can run this SELECT query to check which MariaDB ColumnStore version is running:

SQL> SELECT plugin_name, plugin_status, plugin_type_version, plugin_library_version, plugin_license, plugin_maturity, plugin_auth_version
  FROM information_schema.plugins
 WHERE plugin_library LIKE 'ha_column%';
+---------------------+---------------+---------------------+------------------------+----------------+-----------------+---------------------+
| plugin_name         | plugin_status | plugin_type_version | plugin_library_version | plugin_license | plugin_maturity | plugin_auth_version |
+---------------------+---------------+---------------------+------------------------+----------------+-----------------+---------------------+
| Columnstore         | ACTIVE        | 100510.0            | 1.14                   | GPL            | Stable          | 5.5.2               |
| COLUMNSTORE_COLUMNS | ACTIVE        | 100510.0            | 1.14                   | GPL            | Stable          | 5.5.2               |
| COLUMNSTORE_TABLES  | ACTIVE        | 100510.0            | 1.14                   | GPL            | Stable          | 5.5.2               |
| COLUMNSTORE_FILES   | ACTIVE        | 100510.0            | 1.14                   | GPL            | Stable          | 5.5.2               |
| COLUMNSTORE_EXTENTS | ACTIVE        | 100510.0            | 1.14                   | GPL            | Stable          | 5.5.2               |
+---------------------+---------------+---------------------+------------------------+----------------+-----------------+---------------------+
5 rows in set (0.002 sec)

or if the MariaDB ColumnStore is enabled at all with:

SQL> SHOW ENGINES;
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                                                         | Transactions | XA   | Savepoints |
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
| Columnstore        | YES     | ColumnStore storage engine                                                                      | YES          | NO   | NO         |
| MRG_MyISAM         | YES     | Collection of identical MyISAM tables                                                           | NO           | NO   | NO         |
| MEMORY             | YES     | Hash based, stored in memory, useful for temporary tables                                       | NO           | NO   | NO         |
| Aria               | YES     | Crash-safe tables with MyISAM heritage. Used for internal temporary tables and privilege tables | NO           | NO   | NO         |
| MyISAM             | YES     | Non-transactional engine with good performance and small data footprint                         | NO           | NO   | NO         |
| SEQUENCE           | YES     | Generated tables filled with sequential values                                                  | YES          | NO   | YES        |
| InnoDB             | DEFAULT | Supports transactions, row-level locking, foreign keys and encryption for tables                | YES          | YES  | YES        |
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                                                              | NO           | NO   | NO         |
| CSV                | YES     | Stores tables as CSV files                                                                      | NO           | NO   | NO         |
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+

Create the Cross Engine Join User in the database

The Cross Engine Join User specified above must also be created inside the database:

SQL> CREATE USER 'cross_engine'@'127.0.0.1' IDENTIFIED BY 'secret';
SQL> CREATE USER 'cross_engine'@'localhost' IDENTIFIED BY 'secret';

SQL> GRANT SELECT ON *.* TO 'cross_engine'@'127.0.0.1';
SQL> GRANT SELECT ON *.* TO 'cross_engine'@'localhost';

This is basically all you have to do for a running single-node MariaDB ColumnStore installation.

MariaDB ColumnStore schemas

MariaDB ColunStore creates 3 new schemas in the database:

SQL> SHOW SCHEMAS;
+---------------------+
| Database            |
+---------------------+
| calpontsys          |
| columnstore_info    |
| infinidb_querystats |
| information_schema  |
| mysql               |
| performance_schema  |
+---------------------+

We will investigate later what they are used for.

Server System Variables and Server Status Variables

A detailed description of the Server System Variables (columnstore_*) can be found in the MariaDB documentation (old source: infinidb_*).

SQL> SHOW GLOBAL STATUS LIKE 'columnstore%';
+-------------------------+--------+
| Variable_name           | Value  |
+-------------------------+--------+
| Columnstore_commit_hash | source |
| Columnstore_version     | 5.5.2  |
+-------------------------+--------+
2 rows in set (0.001 sec)

SQL> SHOW GLOBAL VARIABLES LIKE 'columnstore%';
+-------------------------------------------------+--------+
| Variable_name                                   | Value  |
+-------------------------------------------------+--------+
| columnstore_cache_flush_threshold               | 500000 |
| columnstore_cache_inserts                       | OFF    |
| columnstore_compression_type                    | SNAPPY |
| columnstore_decimal_scale                       | 8      |
| columnstore_derived_handler                     | ON     |
| columnstore_diskjoin_bucketsize                 | 100    |
| columnstore_diskjoin_largesidelimit             | 0      |
| columnstore_diskjoin_smallsidelimit             | 0      |
| columnstore_double_for_decimal_math             | OFF    |
| columnstore_group_by_handler                    | ON     |
| columnstore_import_for_batchinsert_delimiter    | 7      |
| columnstore_import_for_batchinsert_enclosed_by  | 17     |
| columnstore_local_query                         | 0      |
| columnstore_orderby_threads                     | 16     |
| columnstore_ordered_only                        | OFF    |
| columnstore_replication_slave                   | OFF    |
| columnstore_select_handler                      | ON     |
| columnstore_select_handler_in_stored_procedures | ON     |
| columnstore_string_scan_threshold               | 10     |
| columnstore_stringtable_threshold               | 20     |
| columnstore_um_mem_limit                        | 0      |
| columnstore_use_decimal_scale                   | OFF    |
| columnstore_use_import_for_batchinsert          | ALWAYS |
| columnstore_varbin_always_hex                   | OFF    |
+-------------------------------------------------+--------+
24 rows in set (0.003 sec)

MariaDB ColumnStore also writes to the MariaDB error log:

$ cat mariadbd-error.log 
210511 11:20:47 Columnstore: Started; Version: 5.5.2-2

The MariaDB ColumnStore configuration is stored in /etc/columnstore. Those files can be read by everybody on the system by default. So it possibly makes sense to prohibit access:

$ chown mysql: /etc/columnstore/*
$ chmod o-rw /etc/columnstore/*

We have not seen yet an negative impact after this "hardening":

$ ll
total 100
drwxr-xr-t  2 mysql mysql  4096 May 11 11:17 ./
drwxr-xr-x 78 root  root   4096 May 11 10:12 ../
-rw-r-----  1 mysql mysql 19169 May 11 11:17 Columnstore.xml
-rw-r-----  1 mysql mysql 19145 May 11 10:12 Columnstore.xml-20210511
-rw-rw----  1 mysql mysql 19153 May 11 11:17 Columnstore.xml.columnstoreSave
-rw-r-----  1 mysql mysql 15372 May  5 23:00 ErrorMessage.txt
-rw-r-----  1 mysql mysql  5619 May  5 23:00 MessageFile.txt
-rw-r-----  1 mysql mysql  7456 May  5 23:03 storagemanager.cnf

MariaDB ColumnStore on the file system

It is also interesting to know where on the filesystem the MariaDB ColumnStore files are located. By default they are located under /var/lib/columnstore (Note: Can they be located somewhere else and how?).

$ tree *
data1
├── 000.dir
│   └── 000.dir
│       ├── 003.dir
│       │   ├── 233.dir
│       │   │   └── 000.dir
│       │   │       └── FILE000.cdf
...
│       │   └── 255.dir
│       │       └── 000.dir
│       │           └── FILE000.cdf
│       ├── 004.dir
│       │   ├── 000.dir
│       │   │   └── 000.dir
│       │   │       └── FILE000.cdf
...
│       │   └── 018.dir
│       │       └── 000.dir
│       │           └── FILE000.cdf
│       ├── 007.dir
│       │   ├── 209.dir
│       │   │   └── 000.dir
│       │   │       └── FILE000.cdf
│       │   └── 212.dir
│       │       └── 000.dir
│       │           └── FILE000.cdf
│       ├── 008.dir
│       │   ├── 013.dir
│       │   │   └── 000.dir
│       │   │       └── FILE000.cdf
...
│       │   └── 028.dir
│       │       └── 000.dir
│       │           └── FILE000.cdf
│       ├── 011.dir
│       ├── 012.dir
│       ├── 013.dir
│       ├── 014.dir
│       ├── 015.dir
│       ├── 016.dir
│       ├── 017.dir
│       ├── 018.dir
│       │   ├── 164.dir
│       │   │   └── 000.dir
│       │   │       └── FILE000.cdf
...
│       │   └── 255.dir
│       │       └── 000.dir
│       │           └── FILE000.cdf
│       └── 019.dir
│           ├── 000.dir
│           │   └── 000.dir
│           │       └── FILE000.cdf
...
...
│           └── 221.dir
│               └── 000.dir
│                   └── FILE000.cdf
├── bulkRollback
├── dbroot1-lock
├── systemFiles
│   └── dbrm
│       ├── BRM_savesA_em
│       ├── BRM_savesA_vbbm
│       ├── BRM_savesA_vss
│       ├── BRM_savesB_em
│       ├── BRM_savesB_vbbm
│       ├── BRM_savesB_vss
│       ├── BRM_saves_current
│       ├── BRM_saves_em
│       ├── BRM_saves_journal
│       ├── BRM_saves_vbbm
│       ├── BRM_saves_vss
│       ├── oidbitmap
│       ├── SMTxnID
│       └── tablelocks
└── versionbuffer.cdf
local
└── module
storagemanager
└── storagemanager-lock

688 directories, 353 files

Creating our first MariaDB ColumnStore table

For the very first tests we use our well known test table. But for MariaDB ColumnStore tables you should NOT use the test schema: ColumnStore tables should not be created in the mysql, information_schema, calpontsys or test databases. (Why not test?).
In other documents about MariaDB ColumnStore we can clearly see, that the one who was writing the article was using the test schema [ 2 ]. An other restriction is: ColumnStore stores all object names in lower case. So probably CamelCase table names will not work as expected or they will be converted to lower case.

SQL> SET SESSION default_storage_engine = ColumnStore;

SQL> CREATE TABLE test (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY
, data VARCHAR(128) DEFAULT NULL
, ts   TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);
ERROR 1069 (42000): Too many keys specified; max 0 keys allowed

Oppps! What is this? After some searching I found that MariaDB ColumnStore does not know indexes: There is no need for indexing. See also MCOL-1080. As such indexes typically used to optimize query access for row based systems do not make sense since selectivity is low for such queries.

After removing the Primary Key the next try:

SQL> CREATE TABLE test (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT
, data VARCHAR(128) DEFAULT NULL
, ts   TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);
ERROR 1075 (42000): Incorrect table definition; there can be only one auto column and it must be defined as a key

Also AUTO_INCREMENT seems not to be supported:

SQL> CREATE TABLE test (
  id   INT UNSIGNED NOT NULL
, data VARCHAR(128) DEFAULT NULL
, ts   TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);
ERROR 1178 (42000): The storage engine for the table doesn't support The syntax or the data type(s) is not supported by Columnstore. Please check the Columnstore syntax guide for supported syntax or data types.

Hmmm. What is wrong now? Checking the MariaDB ColumnStore datatypes does not show any obvious problem. Also the the page ColumnStore Create Table was not helpful. Relying on intuition I tried this:

SQL> CREATE TABLE test (
  id   INT UNSIGNED NOT NULL
, data VARCHAR(128) DEFAULT NULL
, ts   TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);
ERROR 1815 (HY000): Internal error: The default value is out of range for the specified data type.

Becoming stressed a bit now this:

SQL> CREATE TABLE test (
  id   INT UNSIGNED NOT NULL
, data VARCHAR(128) DEFAULT NULL
, ts   TIMESTAMP NOT NULL
);
ERROR 1815 (HY000): Internal error: Config::Config: error accessing config file /etc/columnstore/Columnstore.xml

OK. This makes sense. We were a bit to restrictive in the first run above when we were hardening the MariaDB ColumnStore configuration file.

After fixing this we tried again and it finally works...

Query OK, 0 rows affected (1.755 sec)

For comparison only: This CREATE TABLE DDL statement with InnoDB takes about 20 ms.

Migrating MariaDB (InnoDB, MyISAM, Aria) tables is not so straight forward as it could be. So the MariaDB ColumnStore seems to be still a bit picky! Interesting is, that if we look at the CREATE TABLE statement it is shown in a non usable form:

SQL> SHOW CREATE TABLE test\G
*************************** 1. row ***************************
       Table: test
Create Table: CREATE TABLE `test` (
  `id` int(10) unsigned NOT NULL,
  `data` varchar(128) DEFAULT NULL,
  `ts` timestamp NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp()
) ENGINE=Columnstore DEFAULT CHARSET=utf8
1 row in set (0.001 sec)

See also: MCOL-4716, MCOL-1039 and MCOL-4039. This problem seems to be known since a while yet...

This also means that if you want to use mysqldump for migrating the table structure it will not work for restore without editing:

$ mysqldump --user=root --no-data --skip-lock-tables test > /tmp/test_structure_dump.sql

Querying and changing data in MariaDB ColumnStore

Note: This is NOT the preferred way how you should add a massive amount of data into MariaDB ColumnStore!

SQL> INSERT INTO test VALUES (1, 'some data', NULL);
Query OK, 1 row affected (0.224 sec)

SQL> INSERT INTO test VALUES (2, 'some data', NULL);
Query OK, 1 row affected (0.137 sec)

SQL> INSERT INTO test VALUES (3, 'some data', NULL);
Query OK, 1 row affected (0.145 sec)


SQL> SELECT * FROM test;
+----+-----------+---------------------+
| id | data      | ts                  |
+----+-----------+---------------------+
|  1 | some data | 2021-05-11 14:01:31 |
|  2 | some data | 2021-05-11 14:01:35 |
|  3 | some data | 2021-05-11 14:01:38 |
+----+-----------+---------------------+
3 rows in set (0.104 sec)

SQL> DELETE FROM test WHERE id = 2;
Query OK, 1 row affected (0.194 sec)

SQL> SELECT * FROM test;
+----+-----------+---------------------+
| id | data      | ts                  |
+----+-----------+---------------------+
|  1 | some data | 2021-05-11 14:01:31 |
|  3 | some data | 2021-05-11 14:01:38 |
+----+-----------+---------------------+
2 rows in set (0.022 sec)

SQL> UPDATE test SET data = 'new data' WHERE id = 3;
Query OK, 1 row affected (0.222 sec)
Rows matched: 1  Changed: 1  Warnings: 0

SQL> SELECT * FROM test;
+----+-----------+---------------------+
| id | data      | ts                  |
+----+-----------+---------------------+
|  1 | some data | 2021-05-11 14:01:31 |
|  3 | new data  | 2021-05-11 14:03:11 |
+----+-----------+---------------------+
2 rows in set (0.018 sec)

So basic functionality (INSERT, UPDATE, DELETE and SELECT) works. But as we can see, response times for these tiny data sets and single row DML commands are fare above as we are used with other MariaDB Storage Engines (InnoDB, Aria, MyISAM), which is expected... But at least it works. So this is not the best use case for MariaDB ColumnStore. And SELECT * FROM <table> is probably one of the worst things you can do to a ColumnStore in general.

MariaDB ColumnStore using sequences instead of AUTO_INCREMENT

Because AUTO_INCREMENT is not supported in MariaDB ColumnStore we try to use a SEQUENCE instead:

SQL> CREATE SEQUENCE test_s START WITH 5 INCREMENT BY 1;

SQL> INSERT INTO test SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP();
Query OK, 1 row affected (1.403 sec)
Records: 1  Duplicates: 0  Warnings: 0

SQL> INSERT INTO test SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test;
Query OK, 4 rows affected (1.210 sec)
Records: 4  Duplicates: 0  Warnings: 0

Querying the data shows that sequences work but not as expected. They are evaluated once per query and not once per row:

SQL> SELECT * FROM test;
+----+---------------+---------------------+
| id | data          | ts                  |
+----+---------------+---------------------+
|  1 | Some data     | 2021-05-19 16:58:09 |
|  2 | Some data     | 2021-05-19 16:58:14 |
|  3 | Some data     | 2021-05-19 16:58:18 |
|  5 | Some new data | 2021-05-19 16:59:00 |
|  6 | Some new data | 2021-05-19 16:59:20 |
|  6 | Some new data | 2021-05-19 16:59:20 |
|  6 | Some new data | 2021-05-19 16:59:20 |
|  6 | Some new data | 2021-05-19 16:59:20 |
+----+---------------+---------------------+
8 rows in set (0.028 sec)

And if we do this for many rows, we can suddenly see that this becomes much faster than with other MariaDB Storage Engines:

SQL> INSERT INTO test SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test;
Query OK, 131072 rows affected (1.221 sec)
Records: 131072  Duplicates: 0  Warnings: 0

SQL> INSERT INTO test SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test;
Query OK, 262144 rows affected (2.236 sec)
Records: 262144  Duplicates: 0  Warnings: 0

SQL> INSERT INTO test SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test;
Query OK, 524288 rows affected (2.266 sec)
Records: 524288  Duplicates: 0  Warnings: 0

SQL> INSERT INTO test SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test;
Query OK, 1048576 rows affected (4.316 sec)
Records: 1048576  Duplicates: 0  Warnings: 0

SQL> INSERT INTO test SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test;
Query OK, 2097152 rows affected (8.367 sec)
Records: 2097152  Duplicates: 0  Warnings: 0

SQL> INSERT INTO test SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test;
Query OK, 4194304 rows affected (15.472 sec)
Records: 4194304  Duplicates: 0  Warnings: 0

To compare with InnoDB: About above 1 Mio rows seems to be the break even for MariaDB ColumnStore:

SQL> INSERT INTO test_innodb SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test_innodb;
Query OK, 1048576 rows affected (3.499 sec)
Records: 1048576  Duplicates: 0  Warnings: 0

SQL> INSERT INTO test_innodb SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test_innodb;
Query OK, 2097152 rows affected (10.100 sec)
Records: 2097152  Duplicates: 0  Warnings: 0

SQL> INSERT INTO test_innodb SELECT NEXT VALUE FOR test_s, 'Some new data', CURRENT_TIMESTAMP() FROM test_innodb;
Query OK, 4194304 rows affected (21.766 sec)
Records: 4194304  Duplicates: 0  Warnings: 0

MariaDB ColumnStore Query Execution Plan

If we want to see how the queries are executed the usual EXPLAIN command does not help much:

SQL> EXPLAIN SELECT * FROM test;
+------+---------------+-------+------+---------------+------+---------+------+------+-------+
| id   | select_type   | table | type | possible_keys | key  | key_len | ref  | rows | Extra |
+------+---------------+-------+------+---------------+------+---------+------+------+-------+
|    1 | PUSHED SELECT | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | NULL  |
+------+---------------+-------+------+---------------+------+---------+------+------+-------+

Instead we should do it in MariaDB ColumnStore like this:

SQL> SELECT calSetTrace(1);
+----------------+
| calSetTrace(1) |
+----------------+
|              0 |
+----------------+

SQL> SELECT * FROM test;
...
2048 rows in set, 1 warning (0.090 sec)

SQL> SELECT calGetTrace()\G
*************************** 1. row ***************************
calGetTrace(): 
Desc Mode Table TableOID ReferencedColumns PIO LIO PBE Elapsed Rows 
BPS  PM   test  5086     (data,id,ts)      69  44  0   0.066   2048 
TNS  UM   -     -        -                 -   -   -   0.000   2048 

1 row in set (0.000 sec)

If we run the query a second time we can see that the physical I/O (PIO) has been reduced, so everything comes out of memory:

SQL> SELECT * FROM test;
...
2048 rows in set, 1 warning (0.021 sec)

SQL> SELECT calGetTrace()\G
*************************** 1. row ***************************
calGetTrace(): 
Desc Mode Table TableOID ReferencedColumns PIO LIO PBE Elapsed Rows 
BPS  PM   test  5086     (data,id,ts)      0   44  0   0.000   2048 
TNS  UM   -     -        -                 -   -   -   0.000   2048 

1 row in set (0.000 sec)

See also: Query execution and Viewing the ColumnStore query plan.