MySQL - Indexes


https://www.xaprb.com/blog/2006/07/02/innotop-mysql-innodb-monitor/
https://www.vividcortex.com/blog/2014/11/03/mysql-query-performance-statistics-in-the-performance-schema/
https://dev.mysql.com/doc/refman/5.6/en/performance-schema-query-profiling.html
https://dev.mysql.com/doc/refman/5.5/en/performance-schema-table-descriptions.html
http://stackoverflow.com/questions/3874199/how-to-store-historical-data
https://dev.mysql.com/doc/refman/5.7/en/mysql-indexes.html
http://blog.pu-gh.com/2015/01/14/mysql-datetime-indexes-and-intervals/
https://logicalread.com/improve-perf-mysql-index-columns-mc12/#.WO51NojyuCs
https://logicalread.com/impact-of-adding-mysql-indexes-mc12/#.WPLtcI7_r_Q
https://logicalread.com/limitations-in-mysql-indexes-mc12/#.WPLtdI7_r_Q
https://logicalread.com/mysql-index-usages-mc12/#.WPLtfI7_r_Q
https://logicalread.com/mysql-storage-engines-mc12/#.WPLv0Y7_r_Q
https://logicalread.com/mysql-index-cardinality-mc12/#.WPLwdI7_r_Q
https://logicalread.com/mysql-multi-column-indexes-mc12/#.WPLtgI7_r_Q
https://logicalread.com/optimize-mysql-indexes-mc12/#.WPLthY7_r_Q
https://logicalread.com/mysql-performance-tuning-with-indexes-mc13/#.WPLti47_r_Q

http://billauer.co.il/blog/2009/03/mysql-datetime-epoch-unix-time/
https://webmonkeyuk.wordpress.com/2010/09/22/what-makes-a-good-mysql-index-part-1-column-size/
https://webmonkeyuk.wordpress.com/2010/09/27/what-makes-a-good-mysql-index-part-2-cardinality/
http://www.dbforums.com/showthread.php?1683544-Datetime-Indexing-Problem
http://use-the-index-luke.com/sql/where-clause/obfuscation/dates

http://stackoverflow.com/questions/15425230/is-it-a-good-idea-to-index-datetime-field-in-mysql

What are the reasons for using indexes?

  1. Indexes allow rows to be found more efficiently. For large tables, the presence of an index can make the difference between a query that performs quickly and one that is unacceptably slow.
  2. We can use an index to enforce a uniqueness constraint, ensuring that duplicate values do not occur.

How can we show available indexes?

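SHOW INDEX lists the indexes of a table. You can use db_name.tbl_name as an alternative to the tbl_name FROM db_name syntax, so the second and third statements below are equivalent. The final query lists the index names for every table in a schema: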

SHOW INDEX FROM yourtable;
SHOW INDEX FROM mytable FROM mydb;
SHOW INDEX FROM mydb.mytable;
SELECT DISTINCT
    TABLE_NAME,
    INDEX_NAME
FROM INFORMATION_SCHEMA.STATISTICS
WHERE TABLE_SCHEMA = 'your_schema';

The "SHOW CREATE TABLE tableName" statement also displays the indexes, arguably in a more readable form.

How can we add primary key index?

ALTER TABLE HeadOfState ADD PRIMARY KEY (ID);

How can we add composite index?

ALTER TABLE HeadOfState ADD INDEX (LastName, FirstName);

How can we add indexes to existing tables?

To add an index to a table, we can use ALTER TABLE or CREATE INDEX. ALTER TABLE is the most flexible.

ALTER TABLE HeadOfState ADD PRIMARY KEY (ID);
ALTER TABLE HeadOfState ADD INDEX (LastName, FirstName);

MySQL allows multiple actions to be performed with a single ALTER TABLE statement. One common use for multiple actions is to add several indexes to a table at the same time, which is more efficient than adding each one separately. To add several indexes at the same time:

ALTER TABLE HeadOfState ADD PRIMARY KEY (ID), ADD INDEX (LastName,FirstName);

The syntax for CREATE INDEX:

CREATE UNIQUE INDEX IDIndex ON HeadOfState (ID);
CREATE INDEX index_name ON HeadOfState (LastName, FirstName);

With CREATE INDEX, we must provide a name for the index. With ALTER TABLE, MySQL creates an index name automatically if you don't provide one.

Unlike ALTER TABLE, the CREATE INDEX statement can create only a single index per statement. In addition, only ALTER TABLE supports the use of PRIMARY KEY.

How can we create an index using only the first N characters of a given column?

CREATE INDEX indexName ON tableName (columnName(n));
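
One way to pick N is to compare how many distinct values a prefix of a given length keeps against the full column. The sketch below is only an illustration: it reuses the HeadOfState table from this page, and the prefix lengths and the index name LastNamePrefix are assumptions chosen for the example.

```sql
-- Fraction of rows that each prefix length can tell apart.
-- The closer a prefix ratio is to full_column, the less selectivity the prefix gives up.
SELECT
    COUNT(DISTINCT LEFT(LastName, 4)) / COUNT(*) AS prefix_4,
    COUNT(DISTINCT LEFT(LastName, 8)) / COUNT(*) AS prefix_8,
    COUNT(DISTINCT LastName)          / COUNT(*) AS full_column
FROM HeadOfState;

-- If prefix_8 comes out close to full_column, an 8-character prefix may be enough.
-- LastNamePrefix is a hypothetical index name.
CREATE INDEX LastNamePrefix ON HeadOfState (LastName(8));
```

A shorter prefix makes the index smaller, at the price of more rows sharing the same index entry.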

How can we create indexes at table creation time?

CREATE TABLE HeadOfState(
    ID INT NOT NULL,
    LastName CHAR(30) NOT NULL,
    FirstName CHAR(30) NOT NULL,
    CountryCode CHAR(3) NOT NULL,
    Inauguration DATE NOT NULL,
    INDEX (Inauguration)
);

The keyword KEY may be used instead of INDEX.

To include multiple columns in an index (that is, to create a composite index), list all the column names within the parentheses, separated by commas:

CREATE TABLE HeadOfState(
    ID INT NOT NULL,
    LastName CHAR(30) NOT NULL,
    FirstName CHAR(30) NOT NULL,
    CountryCode CHAR(3) NOT NULL,
    Inauguration DATE NOT NULL,
    INDEX (LastName,FirstName)
);

How can we drop primary key or index?

ALTER TABLE HeadOfState DROP PRIMARY KEY;
ALTER TABLE HeadOfState DROP INDEX index_name;

How can we delete existing indexes?

ALTER TABLE HeadOfState DROP PRIMARY KEY;

To drop another kind of index, you must specify its name. If you don't know the index name, you can use SHOW CREATE TABLE:

SHOW CREATE TABLE HeadOfState\G
ALTER TABLE HeadOfState DROP INDEX index_name;
DROP INDEX index_name ON table_name;

To drop a PRIMARY KEY with DROP INDEX, refer to the index name (PRIMARY), but use a quoted identifier because this name is a reserved word:

DROP INDEX `PRIMARY` ON t;

Unlike ALTER TABLE, the DROP INDEX statement can drop only a single index per statement.

What are different types of indexes?

  1. A primary index is an index for which each index value differs from every other and uniquely identifies a single row in the table. A primary key cannot contain NULL values.
  2. A unique index is similar to a primary key, except that it can be allowed to contain NULL values. Each non-NULL value uniquely identifies a single row in the table.
  3. A non-unique index is an index in which any key value may occur multiple times.
  4. A FULLTEXT index is specially designed for text searching.
  5. A SPATIAL index is designed for spatial data types.

There is one exception to the uniqueness of values in a UNIQUE index: If a column in the index may contain NULL values, multiple NULL values are allowed. This differs from the behavior for non-NULL values.
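
The NULL exception can be seen with a small experiment. This is a sketch only; the contact table and its columns are hypothetical names invented for the example.

```sql
-- Hypothetical table: email is nullable and carries a UNIQUE index.
CREATE TABLE contact(
    id INT NOT NULL PRIMARY KEY,
    email VARCHAR(100) NULL,
    UNIQUE (email)
);

INSERT INTO contact (id, email) VALUES (1, NULL);
INSERT INTO contact (id, email) VALUES (2, NULL);            -- accepted: NULL values do not collide
INSERT INTO contact (id, email) VALUES (3, 'a@example.com');
INSERT INTO contact (id, email) VALUES (4, 'a@example.com'); -- rejected with a duplicate-entry error
```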

A PRIMARY KEY is similar to a UNIQUE index. The differences between the two are as follows:

  1. A PRIMARY KEY cannot contain NULL values; a UNIQUE index can. If a unique-valued index must be allowed to contain NULL values, you must use a UNIQUE index, not a PRIMARY KEY.
  2. Each table may have only one index defined as a PRIMARY KEY. The internal name for a PRIMARY KEY is always PRIMARY, and there can be only one index with a given name. It is possible to have multiple UNIQUE indexes for a table.

To index a column as a PRIMARY KEY, use the keywords PRIMARY KEY rather than UNIQUE and declare the column NOT NULL to make sure that it cannot contain NULL values.

A UNIQUE index that does not allow NULL values is functionally equivalent to a PRIMARY KEY index.

Composite indexes can be created for any type of index.

A table can have multiple indexes.

To create a unique-valued index, use the UNIQUE keyword instead of INDEX:

CREATE TABLE HeadOfState(
    ID INT NOT NULL,
    LastName CHAR(30) NOT NULL,
    FirstName CHAR(30) NOT NULL,
    CountryCode CHAR(3) NOT NULL,
    Inauguration DATE NOT NULL,
    UNIQUE (ID)
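);

The two definitions of t below are alternatives, one declaring a PRIMARY KEY and one declaring a UNIQUE index:

CREATE TABLE t(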
    ID INT NOT NULL,
    name CHAR(30) NOT NULL,
    PRIMARY KEY (id)
);

CREATE TABLE t(
    ID INT NOT NULL,
    name CHAR(30) NOT NULL,
    UNIQUE (id)
);

For a single-column primary key or unique index, you can add the keywords PRIMARY KEY or UNIQUE directly to the end of the column definition:

CREATE TABLE t(
    ID INT NOT NULL PRIMARY KEY,
    name CHAR(30) NOT NULL
);

CREATE TABLE t(
    ID INT NOT NULL UNIQUE,
    name CHAR(30) NOT NULL
);

You can declare a PRIMARY KEY or UNIQUE index as a composite index that spans multiple columns. In this case, the index must be declared using a separate clause. (You cannot add the PRIMARY KEY or UNIQUE keywords to the end of a column definition because the index would apply only to that column.) The following definition creates a primary key on the last_name and first_name columns using a PRIMARY KEY clause:

CREATE TABLE people(
    last_name CHAR(30) NOT NULL,
    first_name CHAR(30) NOT NULL,
    PRIMARY KEY (last_name, first_name)
);

This primary key definition allows any given last name or first name to appear multiple times in the table, but no combination of last and first name can occur more than once.
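
A few inserts against the people table defined above illustrate the rule. The names used as data are made up for the example.

```sql
INSERT INTO people (last_name, first_name) VALUES ('Smith', 'Anna');
INSERT INTO people (last_name, first_name) VALUES ('Smith', 'John');  -- same last name: accepted
INSERT INTO people (last_name, first_name) VALUES ('Jones', 'Anna');  -- same first name: accepted
INSERT INTO people (last_name, first_name) VALUES ('Smith', 'Anna');  -- same pair again: rejected as a duplicate key
```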

For all index types other than PRIMARY KEY, you can name an index by including the name just before the column list:

CREATE TABLE HeadOfState(
    ID INT NOT NULL,
    LastName CHAR(30) NOT NULL,
    FirstName CHAR(30) NOT NULL,
    CountryCode CHAR(3) NOT NULL,
    Inauguration DATE NOT NULL,
    INDEX NameIndex (LastName, FirstName),
    UNIQUE IDIndex (ID)
);

If you don't provide a name for an index, MySQL assigns a name for you based on the name of the first column in the index. For a PRIMARY KEY, you provide no name because the name is always PRIMARY. A consequence of this fact is that you cannot define more than one PRIMARY KEY per table because indexes, like columns, must have unique names.

Index names are displayed by the SHOW CREATE TABLE or SHOW INDEX statement.
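
To see the automatic naming in action, one can create two unnamed indexes that start with the same column. The table name_demo is a hypothetical name used only for this sketch.

```sql
-- Two unnamed indexes whose first column is a.
CREATE TABLE name_demo(
    a INT NOT NULL,
    b INT NOT NULL,
    INDEX (a),
    INDEX (a, b)
);

-- The Key_name column should show a for the first index and a_2 for the second,
-- because MySQL appends a numeric suffix when the default name is already taken.
SHOW INDEX FROM name_demo;
```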

How can we choose the right indexing algorithm?

When we create an index, it is possible to specify the indexing algorithm to be used. Among the common storage engines, the only one for which this choice matters is the MEMORY engine, which manages in-memory tables. For engines such as InnoDB and MyISAM, the syntax is recognized but ignored, and a BTREE index is created.

MEMORY tables use hash indexes by default. This index algorithm provides very fast lookups for all operations that use a unique index. However, hash indexes are usable only for comparisons that use the = or <=> operator. Also, for non-unique indexes, operations that change the indexed values (including DELETE statements) can become relatively slow when there are many duplicate index values.

If you will have only unique indexes on a MEMORY table, you should create them as HASH indexes. Because HASH indexes are the default for MEMORY tables, you can do so when defining an index either by specifying an explicit USING HASH clause or by omitting the index algorithm specification entirely.

CREATE TABLE lookup(
    ID INT,
    INDEX USING HASH (id)
) ENGINE = MEMORY;

On the other hand, if a MEMORY table contains only non-unique indexes for which you expect that there will be many duplicate values in the index key, a BTREE index is preferable. BTREE indexes also are usable if the indexed column will be used with comparison operators other than = or <=>. For example, BTREE can be used for range searches such as id < 100 or id BETWEEN 200 AND 300. To create an index that uses the BTREE algorithm:

CREATE TABLE lookup(
    id INT,
    INDEX USING BTREE (id)
) ENGINE = MEMORY;

If you have already created the table, you can add a new index:

ALTER TABLE lookup ADD INDEX USING BTREE (id);
CREATE INDEX index_name USING BTREE ON lookup (id);
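
A rough way to compare the two algorithms is to build one MEMORY table of each kind and inspect the query plans. This is a sketch under assumptions: the table names lookup_hash and lookup_btree are hypothetical, and the plans are only meaningful once the tables hold representative data.

```sql
-- Hypothetical side-by-side tables, one per index algorithm.
CREATE TABLE lookup_hash(
    id INT,
    INDEX USING HASH (id)
) ENGINE = MEMORY;

CREATE TABLE lookup_btree(
    id INT,
    INDEX USING BTREE (id)
) ENGINE = MEMORY;

-- The Index_type column reports the algorithm actually in use.
SHOW INDEX FROM lookup_hash;
SHOW INDEX FROM lookup_btree;

-- Equality lookup: a hash index is usable here.
EXPLAIN SELECT * FROM lookup_hash WHERE id = 42;

-- Range lookup: compare the plan for the hash table with the plan for the BTREE table.
EXPLAIN SELECT * FROM lookup_hash WHERE id BETWEEN 200 AND 300;
EXPLAIN SELECT * FROM lookup_btree WHERE id BETWEEN 200 AND 300;
```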

Can MySQL use an index when a function is applied to the indexed column?

Consider this SQL statement:

SELECT news_id, news_title, news_hit
FROM news
WHERE DATE(news_date) = '2007-03-14' ORDER BY news_hit DESC LIMIT 10;

MySQL cannot use an ordinary index on a column when the WHERE condition applies a function to that column. Here it is the DATE() function that stops the query from using an index on news_date. We can rewrite the statement so that the column is compared directly against a range:

SELECT news_id, news_title, news_hit
FROM news
WHERE news_date >= '2007-03-14 00:00:00' AND news_date < '2007-03-15 00:00:00' ORDER BY news_hit DESC LIMIT 10;

which can use the index on news_date. The half-open range is used instead of BETWEEN because BETWEEN includes its upper bound and would also match rows stamped exactly 2007-03-15 00:00:00.
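
EXPLAIN is a convenient way to check whether a given form of the query can use the index. The sketch below assumes the news table from the example and creates an index whose name, idx_news_date, is hypothetical. It compares the column against a half-open range so that rows stamped exactly at midnight of the following day are not included.

```sql
-- Hypothetical index on the date column.
CREATE INDEX idx_news_date ON news (news_date);

-- Function applied to the column: look at the key column of the EXPLAIN output.
EXPLAIN SELECT news_id, news_title, news_hit
FROM news
WHERE DATE(news_date) = '2007-03-14';

-- Bare column compared against a half-open range covering the same day.
EXPLAIN SELECT news_id, news_title, news_hit
FROM news
WHERE news_date >= '2007-03-14 00:00:00'
  AND news_date <  '2007-03-15 00:00:00';
```

In the first plan the key column is expected to be NULL for an ordinary index, while the second plan can list idx_news_date.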

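How can we combine multiple ALTER TABLE operations into one statement?

Separate the clauses with commas. The two statements below are independent examples; the first one renames the table, so they are not meant to be run one after the other: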

ALTER TABLE HeadOfState 
  RENAME TO CountryLeader, 
  MODIFY ID BIGINT UNSIGNED NOT NULL, 
  ADD Salutation CHAR(30) NULL AFTER FirstName;
ALTER TABLE HeadOfState 
  ADD PRIMARY KEY (ID), 
  ADD INDEX (LastName, FirstName);

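How can we temporarily disable indexes during a large data import?

If we are using a MyISAM table, we can temporarily disable its non-unique indexes before the import and re-enable them afterwards with ALTER TABLE mytable ENABLE KEYS: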

ALTER TABLE mytable DISABLE KEYS;

If we are using an InnoDB table, the above DDL produces a warning: "Table storage engine for 'mytable' doesn't have this option". There is a good reason why DISABLE KEYS does not work on an InnoDB table: InnoDB is not designed to use it, and MyISAM is. Here is what happens when you reload a mysqldump of a MyISAM table:

  1. The dump contains a CREATE TABLE for the MyISAM table, followed by a write lock.
  2. Before the bulk inserts are run, ALTER TABLE … DISABLE KEYS is called. This stops MySQL from updating the non-unique indexes of the MyISAM table.
  3. Then the bulk inserts are done. While they run, the PRIMARY KEY and all UNIQUE keys are still maintained; only the non-unique indexes are left untouched. Before UNLOCK TABLES, ALTER TABLE … ENABLE KEYS is called to rebuild all non-unique indexes in one pass.
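
The reload sequence described above can be sketched by hand as follows. This is a simplified illustration, not the exact text mysqldump writes; the columns id and name are hypothetical.

```sql
-- Assumes mytable is an existing MyISAM table with hypothetical columns id and name.
LOCK TABLES mytable WRITE;

-- Stop maintaining the non-unique indexes during the load.
ALTER TABLE mytable DISABLE KEYS;

INSERT INTO mytable (id, name) VALUES (1, 'first'), (2, 'second'), (3, 'third');

-- Rebuild the non-unique indexes in one pass.
ALTER TABLE mytable ENABLE KEYS;

UNLOCK TABLES;
```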

IMHO this operation was not coded into the InnoDB Storage Engine because all keys in a non-unique index come with the primary key entry from gen_clust_index (aka Clustered Index). That would be a very expensive operation since building a non-unique index would require O(n log n) running time to retrieve each unique key to attach to a non-unique key. In light of this, posting a warning about trying to DISABLE KEYS/ENABLE KEYS on an InnoDB table is far easier than coding exceptions to the mysqldump for any special cases involving non-MyISAM storage engines.

I regularly build very large InnoDB tables with lots of indexes (hundreds of gigabytes / 10+ different indexes). There is a huge reduction in creation time when I INSERT into tables with no indexes. Obviously, "huge" depends greatly on the ultimate size of your table, the number and type of indexes, the types of data elements, etc. I usually build the table with nothing but the primary key (not needed if you're just loading data that already has a primary key id), and then build the indexes at the end. You may not notice a difference at 5MM rows and 1.5GB, but as your tables grow you will.

For an InnoDB table, try setting the following in the session before the import:

SET autocommit=0; 
SET unique_checks=0; 
SET foreign_key_checks=0;
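
These settings only help if the import is wrapped by them and they are restored afterwards. A minimal sketch of the full sequence, assuming the data being loaded is already known to contain no duplicate keys and no broken foreign key references:

```sql
SET autocommit = 0;
SET unique_checks = 0;
SET foreign_key_checks = 0;

-- ... bulk INSERT or LOAD DATA statements go here ...

COMMIT;

-- Restore the defaults so later statements in this session are checked again.
SET unique_checks = 1;
SET foreign_key_checks = 1;
SET autocommit = 1;
```

With the checks switched off, MySQL does not verify the loaded rows, so bad data goes in silently.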

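DISABLE KEYS / ENABLE KEYS changes the state of the table itself rather than a session setting, so the effect is visible to every connection. Use:

SHOW KEYS IN my_table;

to check whether the indexes are disabled; a disabled index shows "disabled" in the Comment column.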

If the above tips do not seem to help, try dropping the indexes and adding them back after the import is done.
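
Dropping and re-adding indexes around an import might look like the sketch below. The index names idx_a and idx_b and the columns a and b are hypothetical; in practice they should be taken from SHOW CREATE TABLE.

```sql
-- Drop the secondary indexes in a single statement (hypothetical names).
ALTER TABLE mytable DROP INDEX idx_a, DROP INDEX idx_b;

-- ... run the import here ...

-- Recreate them in a single statement once the data is loaded.
ALTER TABLE mytable ADD INDEX idx_a (a), ADD INDEX idx_b (b);
```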

To reduce the cost of recalculating the indexes, insert the data either with LOAD DATA INFILE or with multi-row INSERT statements, like:

INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);

to insert several rows with one statement. How many rows can be inserted with one statement depends on the max_allowed_packet setting.
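
The current packet limit can be inspected before sizing the batches, and a file-based load avoids the limit on statement size altogether. In this sketch the file path and the CSV layout are assumptions, and the server's secure_file_priv setting may restrict which directories can be read.

```sql
-- Shows the limit in bytes for a single statement or packet.
SHOW VARIABLES LIKE 'max_allowed_packet';

-- Hypothetical comma-separated file with one row per line and three fields per row.
LOAD DATA INFILE '/tmp/tbl_name.csv'
INTO TABLE tbl_name
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(a, b, c);
```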

  1. https://dev.mysql.com/doc/refman/5.5/en/optimizing-innodb-bulk-data-loading.html
  2. http://stackoverflow.com/questions/9524938/how-to-disable-index-in-innodb

Does MySQL support function based index?

The answer depends on the version of MySQL that we are using.

As far as I know, MySQL 5.6 does not have this functionality. It is possible to index only the leading part of a column (this has been supported for a long time), but not a substring starting at the second or a later character, nor any more complex expression. For example, the following creates an index using the first five characters of a name:

create index name_first_five on cust_table (name(5));

For more complex expressions, you can achieve a similar effect by having another column with the indexable data in it, then using insert/update triggers to ensure it's populated correctly. Other than the wasted space for redundant data, that's pretty much the same thing. Although it technically violates 3NF, that's mitigated by the use of triggers to keep the data in sync.
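
The trigger approach could be sketched as follows for the cust_table example. The column name_tail, its width, the trigger names and the index name are all hypothetical, and the expression (the name without its first character) is only a stand-in for whatever needs to be indexed.

```sql
-- Extra column that holds the indexable value (hypothetical name and width).
ALTER TABLE cust_table ADD COLUMN name_tail VARCHAR(100);

-- Fill the column for rows that already exist.
UPDATE cust_table SET name_tail = SUBSTRING(name, 2);

-- Keep the column in sync on every insert and update.
CREATE TRIGGER cust_table_bi BEFORE INSERT ON cust_table
FOR EACH ROW SET NEW.name_tail = SUBSTRING(NEW.name, 2);

CREATE TRIGGER cust_table_bu BEFORE UPDATE ON cust_table
FOR EACH ROW SET NEW.name_tail = SUBSTRING(NEW.name, 2);

-- Index the derived column and query it directly.
CREATE INDEX name_tail_idx ON cust_table (name_tail);
SELECT * FROM cust_table WHERE name_tail = 'mith';
```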

Since MySQL 5.7.6, we can use a generated column to hold the substring and put an index on it. Consider the following DDL statement:

CREATE TABLE SomeTable (
    id CHAR(10),
    sub_id CHAR(8) AS (SUBSTRING(id, 1, 8)) STORED,
    INDEX (sub_id)
);

In the above DDL, the value stored in the sub_id column is generated automatically from the id column.
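
A query benefits from the index when it filters on the generated column. In addition, MySQL 8.0.13 and later accept an expression directly as an index key part, which removes the need for the extra column. The sketch below shows both; the table SomeTable8, the index name id_prefix and the literal values are hypothetical.

```sql
-- Filtering on the generated column from the DDL above.
SELECT * FROM SomeTable WHERE sub_id = 'ABCDEFGH';

-- MySQL 8.0.13 or later: index the expression itself.
-- Note the extra pair of parentheses around the expression.
CREATE TABLE SomeTable8 (
    id CHAR(10),
    INDEX id_prefix ((SUBSTRING(id, 1, 8)))
);

-- The query has to use the same expression as the index definition.
SELECT * FROM SomeTable8 WHERE SUBSTRING(id, 1, 8) = 'ABCDEFGH';
```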

What is the purpose of PROCEDURE ANALYSE() and how can we use it to optimize performance?

We can use PROCEDURE ANALYSE() to help us determine whether columns can be redefined to smaller data types. It can also determine whether a column contains only a small number of values and could be defined as ENUM. To tell PROCEDURE ANALYSE() not to suggest long ENUM definitions, pass it two arguments indicating the maximum number of elements and number of characters allowed in the definitions. Note that PROCEDURE ANALYSE() is deprecated as of MySQL 5.7.18 and was removed in MySQL 8.0, so this applies to older versions only:

mysql> SELECT * FROM CountryLanguage PROCEDURE ANALYSE(10,256)\G