Chapter 10. MySQL Cluster Disk Data Tables
Table of Contents
Beginning with MySQL 5.1.6, it is possible to store the nonindexed
columns of NDB tables on disk, rather
than in RAM as with previous versions of MySQL Cluster.
As part of implementing MySQL Cluster Disk Data work, a number of improvements were made in MySQL Cluster for the efficient handling of very large amounts (terabytes) of data during node recovery and restart. These include a “no-steal” algorithm for synchronising a starting node with very large data sets. For more information, see the paper Recovery Principles of MySQL Cluster 5.1, by MySQL Cluster developers Mikael Ronström and Jonas Oreland.
MySQL Cluster Disk Data performance can be influenced by a number of configuration parameters. For information about these parameters and their effects, see MySQL Cluster Disk Data configuration parameters and MySQL Cluster Disk Data storage and GCP Stop errors
The performance of a MySQL Cluster that uses Disk Data storage can also be greatly improved by separating data node file systems from undo log files and tablespace data files, which can be done using symbolic links. For more information, see Section 10.2, “Using Symbolic Links with Disk Data Objects”.
MySQL Cluster Disk Data storage is implemented using a number of Disk Data objects. These include the following:
Tablespaces act as containers for other Disk Data objects.
Undo log files undo information required for rolling back transactions.
One or more undo log files are assigned to a log file group, which is then assigned to a tablespace.
Data files store Disk Data table data. A data file is assigned directly to a tablespace.
Undo log files and data files are actual files in the filesystem
of each data node; by default they are placed in
ndb_ in
the node_id_fsDataDir specified in the MySQL
Cluster config.ini file, and where
node_id is the data node's node
ID. It is possible to place these elsewhere by specifying either
an absolute or relative path as part of the filename when creating
the undo log or data file. Statements that create these files are
shown later in this section.
MySQL Cluster tablespaces and log file groups are not implemented as files.
Important
Although not all Disk Data objects are implemented as files,
they all share the same namespace. This means that
each Disk Data object must be uniquely
named (and not merely each Disk Data object of a given type).
For example, you cannot have a tablespace and a log file group
both named dd1.
Assuming that you have already set up a MySQL Cluster with all nodes (including management and SQL nodes) running MySQL 5.1.6 or newer, the basic steps for creating a Cluster table on disk are as follows:
Create a log file group, and assign one or more undo log files to it (an undo log file is also sometimes referred to as an undofile).
Note
In MySQL 5.1 and later, undo log files are necessary only for Disk Data tables. They are no longer used for
NDBCLUSTERtables that are stored only in memory.Create a tablespace; assign the log file group, as well as one or more data files, to the tablespace.
Create a Disk Data table that uses this tablespace for data storage.
Each of these tasks can be accomplished using SQL statements in the mysql client or other MySQL client application, as shown in the example that follows.
We create a log file group named
lg_1usingCREATE LOGFILE GROUP. This log file group is to be made up of two undo log files, which we nameundo_1.logandundo_2.log, whose initial sizes are 16 MB and 12 MB, respectively. (The default initial size for an undo log file is 128 MB.) Optionally, you can also specify a size for the log file group's undo buffer, or allow it to assume the default value of 8 MB. In this example, we set the UNDO buffer's size at 2 MB. A log file group must be created with an undo log file; so we addundo_1.logtolg_1in thisCREATE LOGFILE GROUPstatement:CREATE LOGFILE GROUP lg_1 ADD UNDOFILE 'undo_1.log' INITIAL_SIZE 16M UNDO_BUFFER_SIZE 2M ENGINE NDBCLUSTER;To add
undo_2.logto the log file group, use the followingALTER LOGFILE GROUPstatement:ALTER LOGFILE GROUP lg_1 ADD UNDOFILE 'undo_2.log' INITIAL_SIZE 12M ENGINE NDBCLUSTER;Some items of note:
The
.logfile extension used here is not required. We use it merely to make the log files easily recognisable.Every
CREATE LOGFILE GROUPandALTER LOGFILE GROUPstatement must include anENGINEclause. In MySQL 5.1, the permitted values for this clause areNDBCLUSTERandNDB.Important
In MySQL 5.1.8 and later, there can exist only one log file group in the same MySQL Cluster at any given time.
When you add an undo log file to a log file group using
ADD UNDOFILE ', a file with the namefilename'filenameis created in thendb_directory within thenode_id_fsDataDirof each data node in the cluster, wherenode_idis the node ID of the data node. Each undo log file is of the size specified in the SQL statement. For example, if a MySQL Cluster has 4 data nodes, then theALTER LOGFILE GROUPstatement just shown creates 4 undo log files, 1 each on in the data directory of each of the 4 data nodes; each of these files is namedundo_2.logand each file is 12 MB in size.UNDO_BUFFER_SIZEis limited by the amount of system memory available.For more information about the
CREATE LOGFILE GROUPstatement, seeCREATE LOGFILE GROUPSyntax. For more information aboutALTER LOGFILE GROUP, seeALTER LOGFILE GROUPSyntax.
Now we can create a tablespace, which contains files to be used by MySQL Cluster Disk Data tables for storing their data. A tablespace is also associated with a particular log file group. When creating a new tablespace, you must specify the log file group which it is to use for undo logging; you must also specify a data file. You can add more data files to the tablespace after the tablespace is created; it is also possible to drop data files from a tablespace (an example of dropping data files is provided later in this section).
Assume that we wish to create a tablespace named
ts_1which useslg_1as its log file group. This tablespace is to contain two data files nameddata_1.datanddata_2.dat, whose initial sizes are 32 MB and 48 MB, respectively. (The default value forINITIAL_SIZEis 128 MB.) We can do this using two SQL statements, as shown here:CREATE TABLESPACE ts_1 ADD DATAFILE 'data_1.dat' USE LOGFILE GROUP lg_1 INITIAL_SIZE 32M ENGINE NDBCLUSTER; ALTER TABLESPACE ts_1 ADD DATAFILE 'data_2.dat' INITIAL_SIZE 48M ENGINE NDBCLUSTER;The
CREATE TABLESPACEstatement creates a tablespacets_1with the data filedata_1.dat, and associatests_1with log file grouplg_1. TheALTER TABLESPACEadds the second data file (data_2.dat).Some items of note:
As is the case with the
.logfile extension used in this example for undo log files, there is no special significance for the.datfile extension; it is used merely for easy recognition of data files.When you add a data file to a tablespace using
ADD DATAFILE ', a file with the namefilename'filenameis created in thendb_directory within thenode_id_fsDataDirof each data node in the cluster, wherenode_idis the node ID of the data node. Each undo log file is of the size specified in the SQL statement. For example, if a MySQL Cluster has 4 data nodes, then theALTER TABLESPACEstatement just shown creates 4 undo log files, 1 each on in the data directory of each of the 4 data nodes; each of these files is nameddata_2.datand each file is 48 MB in size.All
CREATE TABLESPACEandALTER TABLESPACEstatements must contain anENGINEclause; only tables using the same storage engine as the tablespace can be created in the tablespace. In MySQL 5.1, the only permitted values for this clause areNDBCLUSTERandNDB.For more information about the
CREATE TABLESPACEandALTER TABLESPACEstatements, seeCREATE TABLESPACESyntax, andALTER TABLESPACESyntax.
Now it is possible to create a table whose nonindexed columns are stored on disk in the tablespace
ts_1:CREATE TABLE dt_1 ( member_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, last_name VARCHAR(50) NOT NULL, first_name VARCHAR(50) NOT NULL, dob DATE NOT NULL, joined DATE NOT NULL, INDEX(last_name, first_name) ) TABLESPACE ts_1 STORAGE DISK ENGINE NDBCLUSTER;The
TABLESPACE ... STORAGE DISKoption tells theNDBCLUSTERstorage engine to use tablespacets_1for disk data storage.Note
Beginning with MySQL Cluster NDB 6.2.5 and MySQL Cluster NDB 6.3.2, it is also possible to specify whether an individual column is stored on disk or in memory by using a
STORAGEclause as part of the column's definition in aCREATE TABLEorALTER TABLEstatement.STORAGE DISKcauses the column to be stored on disk, andSTORAGE MEMORYcauses in-memory storage to be used. SeeCREATE TABLESyntax, for more information.Once table
ts_1has been created as shown, you can performINSERT,SELECT,UPDATE, andDELETEstatements on it just as you would with any other MySQL table.For table
dt_1as it has been defined here, only thedobandjoinedcolumns are stored on disk. This is because there are indexes on theid,last_name, andfirst_namecolumns, and so data belonging to these columns is stored in RAM. In MySQL 5.1, only nonindexed columns can be held on disk; indexes and indexed column data continue to be stored in memory. This tradeoff between the use of indexes and conservation of RAM is something you must keep in mind as you design Disk Data tables.
Performance note. The performance of a cluster using Disk Data storage is greatly improved if Disk Data files are kept on a separate physical disk from the data node file system. This must be done for each data node in the cluster to derive any noticeable benefit.
You may use absolute and relative file system paths with
ADD UNDOFILE and ADD
DATAFILE. Relative paths are calculated relative to the
data node's data directory. You may also use symbolic links; see
Section 10.2, “Using Symbolic Links with Disk Data Objects”, for more
information and examples.
A log file group, a tablespace, and any Disk Data tables using these must be created in a particular order. The same is true for dropping any of these objects:
A log file group cannot be dropped as long as any tablespaces are using it.
A tablespace cannot be dropped as long as it contains any data files.
You cannot drop any data files from a tablespace as long as there remain any tables which are using the tablespace.
Beginning with MySQL 5.1.12, it is no longer possible to drop files created in association with a different tablespace than the one with which the files were created. (Bug#20053)
For example, to drop all the objects created so far in this section, you would use the following statements:
mysql>DROP TABLE dt_1;mysql>ALTER TABLESPACE ts_1->DROP DATAFILE 'data_2.dat'->ENGINE NDBCLUSTER;mysql>ALTER TABLESPACE ts_1->DROP DATAFILE 'data_1.dat'->ENGINE NDBCLUSTER;mysql>DROP TABLESPACE ts_1->ENGINE NDBCLUSTER;mysql>DROP LOGFILE GROUP lg_1->ENGINE NDBCLUSTER;
These statements must be performed in the order shown, except that
the two ALTER TABLESPACE ... DROP DATAFILE
statements may be executed in either order.
You can obtain information about data files used by Disk Data
tables by querying the FILES table in
the INFORMATION_SCHEMA database. An extra
“NULL row” was added to this table
in MySQL 5.1.14 for providing additional information about undo
log files. For more information and examples of use, see
The INFORMATION_SCHEMA FILES Table.
The performance of a MySQL Cluster that uses Disk Data storage can be greatly improved by separating data node file systems from undo log files and tablespace data files and placing these on different disks. While there is currently no direct support for this in MySQL Cluster, it is possible to achieve this separation using symbolic links.
Each data node in the cluster creates a file system in the
directory named
ndb_
under the data node's
node_id_fsDataDir
as defined in the config.ini file. In this
example, we assume that each data node host has 3 disks, aliased
as /data0, /data1, and
/data2, and that the cluster's
config.ini includes the following:
[ndbd default] DataDir= /data0
Our objective is to place all Disk Data log files in
/data1, and all Disk Data data files in
/data2, on each data node host.
Note
In this example, we assume that the cluster's data node hosts are all using Linux operating systems. For other platforms, you may need to substitute you operating system's commands for those shown here.
To accomplish this, perform the following steps:
Under the data node file system create symbolic links pointing to the other drives:
shell>
cd /data0/ndb_2_fsshell>lsD1 D10 D11 D2 D8 D9 LCP shell>ln -s /data0 dnlogsshell>ln -s /data1 dndataYou should now have two symbolic links:
shell>
ls -l --hide=D*lrwxrwxrwx 1 user group 30 2007-03-19 13:58 dndata -> /data1 lrwxrwxrwx 1 user group 30 2007-03-19 13:59 dnlogs -> /data2We show this only for the data node with node ID 2; however, you must do this for each data node.
Now, in the mysql client, create a log file group and tablespace using the symbolic links, as shown here:
mysql>
CREATE LOGFILE GROUP lg1->ADD UNDOFILE 'dnlogs/undo1.log'->INITIAL_SIZE 150M->UNDO_BUFFER_SIZE = 1M->ENGINE=NDBCLUSTER;mysql>CREATE TABLESPACE ts1->ADD DATAFILE 'dndata/data1.log'->USE LOGFILE GROUP lg1->INITIAL_SIZE 1G->ENGINE=NDBCLUSTER;Verify that the files were created and placed correctly as shown here:
shell>
cd /data1shell>ls -ltotal 2099304 -rw-rw-r-- 1 user group 157286400 2007-03-19 14:02 undo1.dat shell>cd /data2shell>ls -ltotal 2099304 -rw-rw-r-- 1 user group 1073741824 2007-03-19 14:02 data1.datIf you are running multiple data nodes on one host, you must take care to avoid having them try to use the same space for Disk Data files. You can make this easier by creating a symbolic link in each data node filesystem. Suppose you are using
/data0for both data node filesystems, but you wish to have the Disk Data files for both nodes on/data1. In this case, you can do something similar to what is shown here:shell>
cd /data0shell>ln -s ndb_2_fs/dd /data1/dn2shell>ln -s ndb_3_fs/dd /data1/dn3shell>ls -l --hide=D* ndb_2_fslrwxrwxrwx 1 user group 30 2007-03-19 14:22 dd -> /data1/dn2 shell>ls -l --hide=D* ndb_3_fslrwxrwxrwx 1 user group 30 2007-03-19 14:22 dd -> /data1/dn3Now you can create a logfile group and tablespace using the symbolic link, like this:
mysql>
CREATE LOGFILE GROUP lg1->ADD UNDOFILE 'dd/undo1.log'->INITIAL_SIZE 150M->UNDO_BUFFER_SIZE = 1M->ENGINE=NDBCLUSTER;mysql>CREATE TABLESPACE ts1->ADD DATAFILE 'dd/data1.log'->USE LOGFILE GROUP lg1->INITIAL_SIZE 1G->ENGINE=NDBCLUSTER;Verify that the files were created and placed correctly as shown here:
shell>
cd /data1shell>lsdn2 dn3 shell>ls dn2undo1.log data1.log shell>ls dn3undo1.log data1.log
The following items apply to Disk Data storage requirements:
Variable-length columns of Disk Data tables take up a fixed amount of space. For each row, this is equal to the space required to store the largest possible value for that column.
For general information about calculating these values, see Data Type Storage Requirements.
You can obtain an estimate the amount of space available in data files and undo log files by querying the
INFORMATION_SCHEMA.FILEStable. For more information and examples, see TheINFORMATION_SCHEMA FILESTable.Note
The
OPTIMIZE TABLEstatement does not have any effect on Disk Data tables.In a Disk Data table, the first 256 bytes of a
TEXTorBLOBcolumn are stored in memory; only the remainder is stored on disk.Each row in a Disk Data table uses 8 bytes in memory to point to the data stored on disk. This means that, in some cases, converting an in-memory column to the disk-based format can actually result in greater memory usage. For example, convering a
CHAR(4)column from memory-based to disk-based format increases the amount ofDataMemoryused per row from 4 to 8 bytes.
Important
Starting the cluster with the --initial option
does not remove Disk Data files. You must
remove these manually prior to performing an initial restart of
the cluster.
