Appendix C. Restrictions and Limits
The discussion here describes restrictions that apply to the use of MySQL features such as subqueries.
Known bug: If you compare a NULL value to a subquery using ALL, ANY, or SOME, and the subquery returns an empty result, the comparison might evaluate to the nonstandard result of NULL rather than to TRUE or FALSE. This is fixed in MySQL 5.0.36 and 5.1.16.
A subquery's outer statement can be any one of: SELECT, INSERT, UPDATE, DELETE, SET, or DO.
Subquery optimization for IN is not as effective as for the = operator or for the IN(value_list) operator.
A typical case for poor IN subquery performance is when the subquery returns a small number of rows but the outer query returns a large number of rows to be compared to the subquery result.
The problem is that, for a statement that uses an IN subquery, the optimizer rewrites it as a correlated subquery. Consider the following statement that uses an uncorrelated subquery:
SELECT ... FROM t1 WHERE t1.a IN (SELECT b FROM t2);
The optimizer rewrites the statement to a correlated subquery:
SELECT ... FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.b = t1.a);
If the inner and outer queries return M and N rows, respectively, the execution time becomes on the order of O(M×N), rather than O(M+N) as it would be for an uncorrelated subquery.
An implication is that an IN subquery can be much slower than a query written using an IN(value_list) operator that lists the same values that the subquery would return.
In general, you cannot modify a table and select from the same table in a subquery. For example, this limitation applies to statements of the following forms:
DELETE FROM t WHERE ... (SELECT ... FROM t ...);
UPDATE t ... WHERE col = (SELECT ... FROM t ...);
{INSERT|REPLACE} INTO t (SELECT ... FROM t ...);
Exception: The preceding prohibition does not apply if you are using a subquery for the modified table in the FROM clause. Example:
UPDATE t ... WHERE col = (SELECT * FROM (SELECT ... FROM t...) AS _t ...);
Here the prohibition does not apply because a subquery in the FROM clause is materialized as a temporary table, so the relevant rows in t have already been selected by the time the update to t takes place.
Row comparison operations are only partially supported:
For expr IN (subquery), expr can be an n-tuple (specified via row constructor syntax) and the subquery can return rows of n-tuples.
For expr op {ALL|ANY|SOME} (subquery), expr must be a scalar value and the subquery must be a column subquery; it cannot return multiple-column rows.
In other words, for a subquery that returns rows of n-tuples, this is supported:
(val_1, ..., val_n) IN (subquery)
But this is not supported:
(val_1, ..., val_n) op {ALL|ANY|SOME} (subquery)
The reason for supporting row comparisons for IN but not for the others is that IN is implemented by rewriting it as a sequence of = comparisons and AND operations. This approach cannot be used for ALL, ANY, or SOME.
Row constructors are not well optimized. The following two expressions are equivalent, but only the second can be optimized:
(col1, col2, ...) = (val1, val2, ...)
col1 = val1 AND col2 = val2 AND ...
Subqueries in the FROM clause cannot be correlated subqueries. They are materialized (executed to produce a result set) before evaluating the outer query, so they cannot be evaluated per row of the outer query.
The optimizer is more mature for joins than for subqueries, so in many cases a statement that uses a subquery can be executed more efficiently if you rewrite it as a join.
An exception occurs for the case where an IN subquery can be rewritten as a SELECT DISTINCT join. Example:
SELECT col FROM t1 WHERE id_col IN (SELECT id_col2 FROM t2 WHERE condition);
That statement can be rewritten as follows:
SELECT DISTINCT col FROM t1, t2 WHERE t1.id_col = t2.id_col2 AND condition;
But in this case, the join requires an extra DISTINCT operation and is not more efficient than the subquery.
Possible future optimization: MySQL does not rewrite the join order for subquery evaluation. In some cases, a subquery could be executed more efficiently if MySQL rewrote it as a join. This would give the optimizer a chance to choose between more execution plans. For example, it could decide whether to read one table or the other first.
Example:
SELECT a FROM outer_table AS ot WHERE a IN (SELECT a FROM inner_table AS it WHERE ot.b = it.b);
For that query, MySQL always scans outer_table first and then executes the subquery on inner_table for each row. If outer_table has a lot of rows and inner_table has few rows, the query probably will not be as fast as it could be.
The preceding query could be rewritten like this:
SELECT a FROM outer_table AS ot, inner_table AS it WHERE ot.a = it.a AND ot.b = it.b;
In this case, we can scan the small table (inner_table) and look up rows in outer_table, which will be fast if there is an index on (ot.a,ot.b).
Possible future optimization: A correlated subquery is evaluated for each row of the outer query. A better approach is that if the outer row values do not change from the previous row, do not evaluate the subquery again. Instead, use its previous result.
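The previous-row rule described above can be illustrated outside the server with a small sketch. It is not MySQL code. The function name evaluate_with_previous_row_cache, the sample rows, and the Python callable standing in for the correlated subquery are all hypothetical.

```python
from typing import Callable, Hashable, Iterable, List, Tuple


def evaluate_with_previous_row_cache(
    outer_rows: Iterable[Tuple[Hashable, ...]],
    subquery: Callable[[Tuple[Hashable, ...]], object],
) -> Tuple[List[object], int]:
    """Evaluate `subquery` only when the outer values differ from the previous row.

    Returns the per-row results and the number of real subquery evaluations.
    """
    results: List[object] = []
    evaluations = 0
    have_previous = False
    previous_key = None
    previous_result = None
    for key in outer_rows:
        if not have_previous or key != previous_key:
            # Outer values changed (or this is the first row): evaluate again.
            previous_result = subquery(key)
            evaluations += 1
            previous_key = key
            have_previous = True
        # Otherwise the previous result is reused unchanged.
        results.append(previous_result)
    return results, evaluations


# Hypothetical data: the "subquery" counts matching rows in inner_b.
inner_b = [1, 2, 2, 3]
outer_b = [(1,), (1,), (2,), (2,), (2,), (4,)]
counts, evaluations = evaluate_with_previous_row_cache(
    outer_b, lambda key: sum(1 for b in inner_b if b == key[0])
)
print(counts, evaluations)
```

In this sketch the cache helps only when equal outer values arrive consecutively, for example when the outer rows happen to be read in sorted order.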
Possible future optimization: A subquery in the FROM clause is evaluated by materializing the result into a temporary table, and this table does not use indexes. This does not allow the use of indexes in comparison with other tables in the query, although that might be useful.
Possible future optimization: If a subquery in the FROM clause resembles a view to which the merge algorithm can be applied, rewrite the query and apply the merge algorithm so that indexes can be used. The following statement contains such a subquery:
SELECT * FROM (SELECT * FROM t1 WHERE t1.t1_col) AS _t1, t2 WHERE t2.t2_col;
The statement can be rewritten as a join like this:
SELECT * FROM t1, t2 WHERE t1.t1_col AND t2.t2_col;
This type of rewriting would provide two benefits:
It avoids the use of a temporary table for which no indexes can be used. In the rewritten query, the optimizer can use indexes on t1.
It gives the optimizer more freedom to choose between different execution plans. For example, rewriting the query as a join allows the optimizer to use t1 or t2 first.
Possible future optimization: For IN, = ANY, <> ANY, = ALL, and <> ALL with uncorrelated subqueries, use an in-memory hash for a result or a temporary table with an index for larger results. Example:
SELECT a FROM big_table AS bt WHERE non_key_field IN (SELECT non_key_field FROM table WHERE condition)
In this case, we could create a temporary table:
CREATE TABLE t (key (non_key_field)) (SELECT non_key_field FROM table WHERE condition)
Then, for each row in big_table, do a key lookup in t based on bt.non_key_field.
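The difference between re-evaluating the subquery per outer row and probing a prebuilt lookup structure can be sketched as follows. This is an illustration only: the function names and sample values are hypothetical, a Python set stands in for the in-memory hash or indexed temporary table, and SQL NULL semantics are ignored.

```python
def in_by_rescanning(outer_values, inner_values):
    """Re-scan the inner result for every outer value: O(M x N) comparisons."""
    return [v for v in outer_values if any(v == w for w in inner_values)]


def in_by_hash_lookup(outer_values, inner_values):
    """Build a hash set once, then probe it per outer value: O(M + N) on average."""
    lookup = set(inner_values)  # one pass over the M inner rows
    return [v for v in outer_values if v in lookup]  # one probe per outer row


# Hypothetical data standing in for big_table.non_key_field and the subquery result.
big_table = [5, 17, 3, 42, 17, 8]
subquery_result = [17, 8, 99]

rescanned = in_by_rescanning(big_table, subquery_result)
hashed = in_by_hash_lookup(big_table, subquery_result)
print(rescanned == hashed, hashed)
```

Both functions return the same rows. Only the amount of work differs, which is the point of the proposed optimization.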
Identifiers are stored in mysql database tables (user, db, and so forth) using utf8, but identifiers can contain only characters in the Basic Multilingual Plane (BMP). Supplementary characters are not allowed in identifiers.
The ucs2 character set has the following restrictions:
It cannot be used as a client character set, which means that it does not work for SET NAMES or SET CHARACTER SET. (See Section 9.1.4, “Connection Character Sets and Collations”.)
It is currently not possible to use LOAD DATA INFILE to load data files that use this character set.
FULLTEXT indexes cannot be created on a column that uses this character set. However, you can perform IN BOOLEAN MODE searches on the column without an index.
The REGEXP and RLIKE operators work in byte-wise fashion, so they are not multi-byte safe and may produce unexpected results with multi-byte character sets. In addition, these operators compare characters by their byte values, and accented characters may not compare as equal even if a given collation treats them as equal.
This section lists current limits in MySQL 4.1.
In MySQL 4.1, the maximum number of tables that can be referenced in a single join is 61. This also applies to the number of tables that can be referenced in the definition of a view.
There is a hard limit of 4096 columns per table, but the effective maximum may be less for a given table. The exact limit depends on several interacting factors, listed in the following discussion.
Every table has a maximum row size of 65,535 bytes. This maximum applies to all storage engines, but a given engine might have additional constraints that result in a lower effective maximum row size.
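As a rough illustration of how the 65,535-byte row size bounds the column count, the arithmetic can be sketched as follows. The helper name max_columns is hypothetical. The sketch assumes up to three bytes per utf8 character and two length bytes for a VARCHAR(255) value, and it ignores NULL flags and any engine-specific overhead.

```python
MAX_ROW_SIZE = 65535  # maximum row size in bytes


def max_columns(bytes_per_column: int, max_row_size: int = MAX_ROW_SIZE) -> int:
    """Return how many columns of the given width fit in one row."""
    if bytes_per_column <= 0:
        raise ValueError("bytes_per_column must be positive")
    return max_row_size // bytes_per_column


# CHAR(255) CHARACTER SET utf8: 255 characters at up to 3 bytes each.
char_255_utf8 = 255 * 3
# VARCHAR(255) CHARACTER SET utf8: same data width plus 2 length bytes (assumed).
varchar_255_utf8 = 255 * 3 + 2

print(char_255_utf8, max_columns(char_255_utf8))
print(varchar_255_utf8, max_columns(varchar_255_utf8))
```

The division is rounded down because a partial column cannot be stored.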
The maximum row size constrains the number of columns because the total width of all columns cannot exceed this size. For example, utf8 characters require up to three bytes per character, so for a CHAR(255) CHARACTER SET utf8 column, the server must allocate 255 × 3 = 765 bytes per value. Consequently, a table cannot contain more than 85 such columns (65,535 / 765, rounded down).
Storage for variable-length columns includes length bytes, which are assessed against the row size. For example, a VARCHAR(255) CHARACTER SET utf8 column takes two bytes to store the length of the value, so each value can take up to 767 bytes.
BLOB and TEXT columns each count only 9 to 12 bytes (one to four length bytes plus eight bytes) toward the row-size limit because their contents are stored separately.
Declaring columns NULL can reduce the maximum number of columns allowed. NULL columns require additional space in the row to record whether or not their values are NULL.
For MyISAM and ISAM tables, each NULL column takes one bit extra, rounded up to the nearest byte. The maximum row length in bytes can be calculated as follows:
row length = 1 + (sum of column lengths) + (number of NULL columns + delete_flag + 7)/8 + (number of variable-length columns)
delete_flag is 1 for tables with static row format. Static tables use a bit in the row record for a flag that indicates whether the row has been deleted. delete_flag is 0 for dynamic tables because the flag is stored in the dynamic row header.
These calculations do not apply for InnoDB tables, for which storage size is no different for NULL columns than for NOT NULL columns.
Each table has an .frm file that contains the table definition. The .frm file size limit is fixed at 64KB. If a table definition reaches this size, no more columns can be added. The expression that checks information to be stored in the .frm file against the limit looks like this:
if (info_length+(ulong) create_fields.elements*FCOMP+288+ n_length+int_length+com_length > 65535L || int_count > 255)
The relevant factors in this expression are:
info_length is space needed for “screens.” This is related to MySQL's Unireg heritage.
create_fields.elements is the number of columns.
FCOMP is 17.
n_length is the total length of all column names, including one byte per name as a separator.
int_length is related to the list of values for SET and ENUM columns.
com_length is the total length of column and table comments.
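To see how these factors interact, the check can be transcribed into a small function. This is a sketch of the expression quoted above, not the server source. The function name and the sample inputs are hypothetical, and int_count, which the list above does not describe, is passed through as an opaque counter.

```python
FCOMP = 17          # per-column cost, as stated above
FRM_LIMIT = 65535   # limit used in the quoted expression


def frm_definition_too_large(info_length, column_count, n_length,
                             int_length, com_length, int_count):
    """Mirror the quoted check: True means the definition exceeds the limit."""
    total = (info_length + column_count * FCOMP + 288
             + n_length + int_length + com_length)
    return total > FRM_LIMIT or int_count > 255


# Hypothetical table: 1000 columns with 10-character names (plus 1 separator byte each).
small = frm_definition_too_large(
    info_length=0, column_count=1000, n_length=1000 * 11,
    int_length=0, com_length=0, int_count=0)

# Hypothetical table: 3000 columns with 20-character names (plus 1 separator byte each).
large = frm_definition_too_large(
    info_length=0, column_count=3000, n_length=3000 * 21,
    int_length=0, com_length=0, int_count=0)

print(small, large)
```

With these assumed inputs, the first definition totals 28,288 bytes and passes, while the second totals 114,288 bytes and fails.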
Thus, using long column names can reduce the maximum number of columns, as can the inclusion of ENUM or SET columns, or use of column or table comments.
Individual storage engines might impose additional restrictions that limit table column count.
The following limitations apply only to the Windows platform:
The number of open file descriptors on Windows is limited to a maximum of 2048, which may limit the ability to open a large number of tables simultaneously. This limit is due to the compatibility functions used to open files on Windows that use the POSIX compatibility layer.
This limitation will also cause problems if you try to set open_files_limit to a value greater than the 2048 file limit.
On Windows 32-bit platforms it is not possible to use more than 2GB of RAM within a single process, including MySQL. This is because the physical address limit on Windows 32-bit is 4GB and the default setting within Windows is to split the RAM between kernel (2GB) and user/applications (2GB).
You can increase this limit to 3GB by specifying the /3GB option in the boot.ini file. This changes the kernel/application memory split to 1GB and 3GB respectively. This boot option is available on Windows XP, Windows Server 2003, and Windows Server 2008.
To use more memory than this you will need to use a 64-bit version of Windows.
When using MyISAM tables, you cannot use Windows aliases (links) to place the data files on another volume and then link back to the main MySQL datadir location.
This facility is often used to move the data and index files to a RAID or other fast solution, while retaining the main .frm files in the default data directory configured with the datadir option.
The timers within MySQL used on Windows are of a lower precision than the timers used on Linux.
There is no 64-bit OLEDB Provider for ODBC (MSDASQL) in any 64-bit Windows operating system up to and including Windows Vista. In practical terms this means that you can't use the MySQL ODBC driver from ADO and other users of OLEDB.
