MySQL - Using EXPLAIN

http://phpmaster.com/using-explain-to-write-better-mysql-queries/

EXPLAIN works with SELECT, but can be used for UPDATE and DELETE statements too. Write a SELECT statement that has the same WHERE clause as the UPDATE or DELETE, and use EXPLAIN to analyze the SELECT.

To use EXPLAIN, write your SELECT query as you normally would, but place the keyword EXPLAIN in front of it:

EXPLAIN SELECT 1\G;

EXPLAIN produces several columns of information. NULL in the possible_keys and key columns shows that no index is considered available. The rows column of the EXPLAIN output show effect of this difference. Its value indicates the number of rows that MySQL estimates it will need to examine while processing the query.

For the first query, the value is 239, which happens to be the number of rows in the Country table. In other words, MySQL would scan all rows in the table, which is inefficient. For the second query, only one row need to be examined. This is because MySQL can use the table's primary key to go directly to the single relevant row.

EXPLAIN outputs one row for each table used in the query. The value of the 'type' column show how MySQL will read the corresponding table. The value of ALL indicates a full scan of all rows.

The number of row combinations is given by the product of values in the 'rows' column.

Good columns to index are those that you typically use for searching, filtering, grouping, or sorting records.

ALTER TABLE CountryList ADD PRIMARY KEY (Code)
ALTER TABLE CityList ADD INDEX (CountryCode)

Observe that EXPLAIN now lists the tables in a different order. CityList appears first, which indicates that MySQL will read rows from that table first and use them to search for matches in the second table, CountryList. The change in table processing order reflects the query optimizer's use of the index information that is now available for executing the query.

The type value of eq_ref indicates that an equality test is performed by referring to the column named in the ref field

The possible_keys value of PRIMARY show that the optimizer sees the primary key as a candidate for optimizing the query, and the key field indicates that it will actually use the primary key when executing the query.

The result from EXPLAIN shows that indexing CountryList.Code as a primary key improves query performance. However, it still indicates a full scan of CityList table. The optimizer sees that the index on CountryCode is available, but the key value of NULL indicates that it will not be used. Does that means the index on the CountryCode column is of no value? It depends. For this query, the index is not used. In general, however, it's good to index joined columns, so you likely would find for other queries on the CityList table that the index does help.

The WHERE clause of the query restricts CityList rows based on their Population values, so try creating an index on that column:

ALTER TABLE CityList ADD INDEX (Population);

The values of rows are only estimates. The optimizer cannot give an exact count without actually executing the query.

EXPLAIN produces one row of output for each table named in each SELECT of the analyzed statement (A statement can have more than one SELECT if it uses subqueries or UNION).

The columns produced by EXPLAIN:

'id' indicates which SELECT in the analyzed statement that the EXPLAIN output refers to.

'select_type' categorizes the SELECT referred to by the output row. The values for this column: SIMPLE (this is a simple SELECT statement involving no subqueries or unions), PRIMARY (the outer SELECT), UNION (the second or later SELECT in a union), DEPENDENT UNION (the second or later SELECT in a union that is dependent on the outer query), UNION RESULT (the result of a union), SUBQUERY (the first SELECT in a subquery), DEPENDENT SUBQUERY (the first SELECT in a subquery that is dependent on the outer query), DERIVED (subquery in the FROM clause)

'table' is the name of the table to which the information in the row applies. The order of the tables indicates the order in which MySQL will read the tables to process the query. This is not necessary the order in which you name them in the FROM clause, because the optimizer attempts to determine which order will result in the most efficient processing.

'type' indicates the join type. The value is a measure of how efficiently MySQL can scan the table. The value in the 'type' column of EXPLAIN output indicates the join type, but joins may be performed with varying degrees of efficiency. The 'type' value provides a measure of this efficiency by indicating the basis on which rows are selected from each table. Possible values (from best to worst):

  • system (the table has exactly one row)
  • const (the table has exactly one matching row. This is similar to system, except that the table may have other non-matching rows. The EXPLAIN output from the query with WHERE Code='FRA' is an example of this)
  • eq_ref (exactly one row is read from the table for each combination of rows from the tables listed earlier by EXPLAIN. This is common for joins where MySQL can use a primary key to identify table rows)
  • ref (several rows may be read from the table for each combination of rows from the tables listed earlier by EXPLAIN. This is similar to eq_ref, but can occur when a non-unique index is used to identify table rows or when only a leftmost prefix of an index is used. For example, the CountryLanguage table has a primary key on the CountryCode and Language columns. If you search using only a CountryCode value, MySQL can use that column as leftmost prefix, but there might be several rows for a country if multiple languages are spoken there)
  • ref_or_null (similar to ref, but MySQL also looks for rows that contain NULL)
  • index_merge (MySQL uses an index merge algorithm)
  • unique_subquery (similar to ref, but used for IN subqueries that select from the primary key column of a single table)
  • index_subquery (similar to unique_subquery, but used for IN subqueries that select from an indexed column of a single table)
  • range (the index is used to select rows that fall within a given range of index values. This is common for inequality comparisons such as id < 10)
  • index (MySQL perform a full scan, but it scans the index rather than the data rows)
  • ALL (a full table scan of all data rows. Typically, this indicates that no optimizations are done and represents the worst case. It is particularly unfortunate when tables listed later in EXPLAIN output have a join type of ALL because that indicates a table scan for every combination of rows selected from the tables processed earlier in the join).

'possible_keys' indicates which of the table's indexes MySQL considers to be candidates for identifying rows that satisfy the query. This value can be a list of one or more index names, or NULL if there are no candidates. The word PRIMARY indicates that MySQL considers the table's primary key to be candidate.

'key' indicates the optimizer's decision about which of the candidate indexes listed in possible_keys will yield most efficient query execution. If the key value is NULL, it means no index was chosen. This might happen either because there were no candidate or because the optimizer believes it will be just as fast to scan the table rows as to use any of the possible indexes. A table scan might be chosen over an index scan if the table is small, or because the index would yield to high a percentage of the rows in the table to be of much use.

'key_len' indicates how many bytes of index rows are used. From this value, you can derive how many columns from the index are used. For example, if you have an index consisting of three INT column, each index row contains three 4-byte values. If key_len is 12, you know that the optimizer use all three columns of the index when processing the query. If key_len is 4 or 8, it uses only the first one or two columns (that is, it uses a leftmost prefix of the index). If you've indexed partial values of string columns, take that into account when assessing the key_len. Supposed that you have a composite index on two CHAR(8) columns that indexes only the first 4 bytes of each column. In this case, a key_len value of 8 means that both column of the index would be used, not just the first column.

'ref' indicates which indexed column or columns are used to choose rows from the table. 'const' means key values in the index are compared to a constant expression, such as in Code='FRA'. NULL indicates that neither a constant nor another column is being used, indicating selection by an expression or range of values. It might also indicate that the column does not contain the value specified by the constant expression. If neither NULL nor 'const' is displayed, a table_name.column_name combination will be shown, indicating that the optimizer is looking at column_name in the rows returned from table_name to identify rows for the current table.

'rows' is the optimizer's estimate of how many rows from the table it will need to examine. The value is an approximation because, in general, MySQL cannot know the exact number of rows without actually executing the query. For a multiple-table query, the product of the values of the 'rows' column is an estimate of the total number number of row combinations that need to be read. This product gives you a rough measure of query performance. The smaller the value, the better.

'Extra' provides other information about the join. The Extra column of EXPLAIN output provides additional information about how a table is processed. Some values indicate that the query is efficient:

  • Using index: MySQL can optimize the query by reading values from the index without having to read the corresponding data rows. This optimization is possible when the query selects only columns that are in the index.
  • Where used: MySQL uses a WHERE clause to identify rows that satisfy the query. Without a WHERE clause, you get all rows from the table.
  • Distinct: MySQL reads a single row from the table for each combination of rows from the tables listed earlier in the EXPLAIN output.
  • Not exists: MySQL can perform a LEFT JOIN "missing rows" optimization that quickly eliminates rows from consideration.

By constrast, some Extra values indicate that the query is not efficient:

  • Using filesort: Rows that satisfy the query must be sorted, which adds an extra processing step.
  • Using temporary: A temporary table must be created to process the query.
  • Range checked for each record: MySQL cannot determine in advance which index from the table to use. For each combination of rows selected from previous tables, it checks the indexes in the table to see which one will be best. This is not great, but it is better than using no index at all.

Using filesort and Using temporary generally are the two indicators of worst performance.

The following query rewriting techniques can be useful:

  • If the keys value is NULL even when there are indexes available, you can try adding a USE INDEX option as a hint to the optimizer which index is relevant for the query. To force MySQL to use the index, use FORCE INDEX. To tell MySQL to ignore an index that it chose and choose a different one instead, use IGNORE INDEX. Each of these options is used in the FROM clause, following the table name containing the index you want to control. The option is followed by parantheses containing a comma-separated list of one or more index names. For example: SELECT Name FROM CountryList IGNORE INDEX(Population) WHERE Code < 'B' AND Population > 5000000. The keyword KEY may be used instead of INDEX in all three options.
  • If you want to force MySQL to join tables in a particular order, begin the query with SELECT STRAIGHT_JOIN, and then list the tables in the desired order in the FROM clause
  • Sometimes a table in a query has an index available, but the query is written in such a way that prevents the index from being used. If you can rewrite the query into an equivalent form that allows the use of the index, do so.
  • Another way to provide the optimizer with better information on which to base its decisions is to change the structure of your tables. If the possible_keys value is NULL in the output from EXPLAIN, it means MySQL finds no applicable index. See whether an index can be added to columns that identify which records to retrieve. For example, if you perform a join by matching a column in one table with a column in another, but neither of the columns is indexed, try indexing them.
  • Keep table index statistics up to date to help MySQL choose optimal indexes. If the table is a MyISAM or InnoDB table, you can update its statistics with the ANALYZE TABLE statement. As a table's contents change, the statistics go out of date and become less useful to the optimizer in making good decisions about query execution strategies. You should run ANALYZE TABLE more frequently for tables that change often than for those that are updated rarely.

Be careful when using EXPLAIN to analyze a statement that includes a subquery in the FROM clause, if the subquery itself is slow. For such a subquery, MySQL must execute it to determine what it returns so that the optimizer can formulate an execution plan for the outer query.


What are the differences between using index, using index condition and using where?

Can anybody explain me the difference between "using index", "using index condition" and "using where" in the EXTRA column information of EXPLAIN

query?"

Using index: the columns selected are part of an index that is used to return the results, there is no need then to read the full table record and the columns will be returned using the already read index record.

Using index condition: The so called index condition push-down. It is an optimization to try to avoid reading the full table record when the access method is using an index that is part of the where condition, if the index does not satisfy this partial condition the other conditions checks will be skipped, this is valid with AND conditions. Not sure if the reverse is valid with OR operator, that is if the index satsfies the condition the record should be read anyway so that the condition check is skipped.

Using where: the where condition is not checked at storage engine level but all records are returned and filtered by mysql

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License