For MyISAM tables, the OPTIMIZE TABLE tb_name also update index statistics (so we don't need to issue another ANALYZE TABLE statement)
MyISAM Index Caching:
MySQL uses a key cache to buffer index information for MyISAM tables in memory. The key cache improves performance by reducing the need to read index block from disk each time they are accessed. By default, there is a single key cache so all MyISAM tables share it.
MySQL uses the key cache automatically for MyISAM tables, but you can control cache behavior in several ways:
- One way to improve server performance is to make the key cache larger so that the server can hold more index information in memory.
- You can create additional key caches and assign specific tables to them. If a table is heavily used, and you want to make sure that its index information is never displaced from the cache by indexes from other tables, create a separate cache and dedicate it for use by the table.
- You can preload a table's indexes into the cache to which it is assigned. This causes the server to load the index all at once, which is more efficient than having to read blocks individually as they are needed.
The following example shows how to create a key cache for use by a particular table:
1. Create the key cache. Each cache is associated with a set of system variables that are grouped as components of a structured system variable. Each structured variable has a name, so you refer to a component variable using cache_name.var_name syntax. To create a key cache, assign a value to any of its component values. Thus, a cache named city_cache with a size of 4MB is created as follows:
mysql> SET GLOBAL city_cache.key_buffer_size = 4194304;
Key cache system variables are global, so the GLOBAL keyword is necessary. You must have SUPER privilege to set global variables.
2. Assign the City table to the city_cache key cache by using a CACHE INDEX statement:
mysql> CACHE INDEX world.City IN city_cache
3. Once the table has been assigned to city_cache, MySQL discards any index information for the table that currently is in the default key cache and begins using the new cache for queries that refer to the table.
If you want to preload the table's indexes into the cache immediately, use a LOAD INDEX INTO CACHE statement:
mysql> LOAD INDEX INTO CACHE world.City
The above example demonstrates how to setup a key cache at runtime by issuing the appropriate SQL statement manually. To configure the cache every time the server starts, put the statements in an initialization file and use an —init-file option that names the file. Supposed that the data directory is /usr/local/mysql/data. Create a file named server.init in that directory and place the following statements in the file:
SET GLOBAL city_cache.key_buffer_size = 4194304; CACHE INDEX world.City IN city_cache LOAD INDEX INTO CACHE world.City
and put the following lines in an option file:
When the server starts, it will read and execute the statements in the file, causing the city_cache key cache to be set up.
The preceding discussion demonstrates how to associate a single table with a key cache, but you need not create a cache for each table. You might create a cache and associate a group of tables with it, such as all the tables used by a particular application.
More on MyISAM Key Cache:
The key cache (key buffer) is a resource in which the server caches index blocks that it reads from MyISAM tables. The key_buffer_size system variable controls the size of the key cache.
Indexes speed up retrievals. If we can keep index in memory and reuse them, performance is even better. When MySQL needs to read an index block, it check first whether the block is in its key cache. If so, it can satisfy the read requests immediately using a block in the cache. If not, it reads the block from disk, and put it in the key cache. If the key cache is full when a block needs to be read, the server discards a block already in the cache to make room for the new block.
The ideal situation is for MySQL to consistently find the index blocksi that it needs in the cache without having to read from disk. In other words, Key_reads should remain as low as possible relative to Key_read_requests.
The size of the key cache is set using key_buffer_size system variable, and the effectiveness can be measured using Key_reads and Key_read_requests status variable.
Miss rate: Key_reads / Key_read_requests
Key cache efficiency: 1 - (Key_reads / Key_read_requests)
We want the miss rate to be as close to zero as possible, and the efficiency as close to one as possible.
If we have memory available, we can improve the key cache effectiveness by increasing the value of key_buffer_size system variable. Its default value is 8MB.
If you use MyISAM tables exclusively, set key_buffer_size up to 30-40% of available memory. Right size depends on amount of indexes, data size and workload – remember MyISAM uses OS cache to cache the data so you need to leave memory for it as well, and data can be much larger than indexes in many cases. Check however if all of key_buffer is used over time – it is not rare to see key_buffer being set to 4G while combined size of .MYI files is just 1GB. This would be just a waste. If you use few MyISAM tables you’ll want to keep it lower but still at least 16-32Mb so it is large enough to accommodate indexes for temporary tables which are created on disk. See http://www.mysqlperformanceblog.com/2006/09/29/what-to-tune-in-mysql-server-after-installation/
The key buffer is shared by all threads.
MySQL does not cache data reads, you will have to leave some room for the OS filesystem cache. See http://www.interworx.com/support/docs/nodeworx/mysql/howto-mysql-options
Table symlinking: To distribute disk activity, use table symlinking to move some of your MyISAM tables to different disks than the one where the data directory is located.
MyISAM row-storage formats
MyISAM supports three row formats for storing table contents. These row-storage formats have an impact on query efficiency. The three allowable formats are fixed-length, dynamic-length, and compressed.
With fixed-length row format, every row in a table has the same size. Consequently, every row in the table's data file is stored at a position that is a multiple of the row size. This makes it easier to look up rows, with the result that MySQL typically can process fixed-row tables more quickly than dynamic-row tables. However, fixed-row tables on average take more space than dynamic-row tables. Fixed-row tables are not very subject to fragmentation when delete occurs, because the hole left by any deleted row can be exactly filled by any new row.
With dynamic-length row format, rows in a table use varying amounts of storage. As a result, rows are not stored at fixed positions within data files. Each row has extra information that indicates how long the row is, and it's also possible for a row to be stored non-contiguously with different pieces in different locations. This makes retrievals more complex, and thus slower. Dynamic-row tables generally take less space than fixed-row tables. However, if a table is updated frequently, this storage format can result in fragmentation and wasted space. It can be useful to run OPTIMIZE TABLE from time to time to defragment the table.
Compressed tables are packed to save space and stored in optimized form that allows quick retrievals. Compressed tables are read-only, so this table format cannot be used for tables that will be updated. To create a compressed table, use the myisampack utility. It can create compressed tables from either fixed-row or dynamic-row MyISAM tables, and can compress columns of any data type.
Before MySQL 5, MyISAM table could use fixed-row table format only if the table contained no columns with variable-length data types (VARCHAR, VARBINARY, TEXT, or BLOB). If any column had a variable-length type, dynamic-row format was used. As of MySQL 5, fixed-row format can be used as long as the table does not contain any TEXT or BLOB columns.
To specify a row format explicitly when creating a new table, include a ROW_FORMAT table option (the value can be FIXED or DYNAMIC:
CREATE TABLE t (c CHAR(50)) ROW_FORMAT = FIXED;
To convert a MyISAM table from one format to another, use the ROW_FORMAT option with ALTER TABLE:
ALTER TABLE t ROW_FORMAT = DYNAMIC;
To determine what storage format a table has, use the SHOW TABLE STATUS statement and examine the value of Row_format field:
mysql> SHOW TABLE STATUS LIKE 'Country' \G
Only the myisampack utility can set the format to Compressed.
You can also obtain storage format information from the INFORMATION_SCHEMA database:
mysql> SELECT TABLE_NAME, ROW_FORMAT FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'world';
Using Compressed MyISAM Tables
A fixed-row or dynamic-row MyISAM table can be converted to compressed form to save storage space. In many cases, compressing a table improves lookup speed as well because the compression operation optimizes the internal structure of the table to make retrieval faster. However compressed table must be read-only, therefore cannot be updated.
To be a good candidate for compression, a table should contain records that will not be updated in the future, such as archival data or log records. If you log records into different tables by year or month, you can compress all the log tables except the one for the current year or month. To treat the tables as a single logical table, group them by using a MERGE table.
A compressed table is read-only, so a MyISAM table should be compressed only if its content will not change after it has been populated. If you must modify a compressed table, you can uncompress it, modify it, and compress it again.
To compress a MyISAM table, use the myisampack utility. It is also necessary to use myisamchk afterward to update the indexes.
Procedure to compress a table:
- Back up the tables (just in case): mysqldump world > world.sql
- Stop the server
- Change location into the database directory where the world tables are store, and then use myisampack to compress them: myisampack Country City CountryLanguage
- After compressing a table, you should run myisamchk to rebuild the indexes: myisamchk -rq Country City CountryLanguage
- Restart the server
If we want to access how effective a table-packing operation is, use SHOW TABLE STATUS before an after. The Data_length and Index_length values should be small afterward:
mysql> SHOW TABLE STATUS FROM world LIKE 'City' \G
Procedure to uncompress a table:
- Back up the tables (just in case): mysqldump world > world.sql
- Stop the server
- Change location into the database directory where the world tables are store, and then use: myisamchk —unpack table_name
- Restart the server
Another way to un-compress a table is to dump it, drop it, and re-create it. Do this while the server is running. For example, if the Country table is compressed, you can uncompress it:
shell> mysqldump world Country > dump.sql shell> mysql world < dump.sql
By default, mysqldump output written to the dump file includes a DROP TABLE statement. When you process the file with mysql, that statement drops the compressed table, and the rest of the dump file re-create the table in uncompressed form.
Splitting Dynamic-row MyISAM Tables
If a dynamic-row MyISAM table contains a mix of fixed-length and variable-length columns but many of the queries on the table access only its fixed-length columns, it is sometime possible to gain advantages both of fixed-row tables (faster retrieval) and of dynamic-row tables (lower storage requirements) by splitting the table into two tables. Use a fixed-row table to hold the fixed-length columns and a dynamic-row table to hold the variable-length columns. Use the following procedure to split a table into two tables:
- Make sure that the table contains a primary key that allows each record to be uniquely identified (You might use an AUTO_INCREMENT column)
- Create a second table that has columns for all the variable-length columns in the original table, plus a column to store values from the primary key of the original table. This column should be a primary key as well, but should not be an AUTO_INCREMENT column. Specify the ROW_FORMAT = DYNAMIC option when you create the table
- Copy the content of the primary key column and the variable-length columns from the original table to the second table
- Use ALTER TABLE to drop the variable-length columns from the original table. Include a ROW_FORMAT = FIXED option as well, to make sure the table is converted to fixed-row format.
After splitting the table, queries that retrieve only fixed-length columns can use the fixed-row table, and will be quicker. For queries that retrieve both fixed-length and variable-length columns, we need to rewrite the queries (join the two tables using the primary key)
Keep Optimizer Information Up to Date
Use the ANALYZE TABLE statement:
ANALYZE TABLE table_name;
Update the index statistics after a table has been loaded initially, and periodically thereafter if the table continues to be modified.
If the table contains columns with variable-length data types such as BLOB, TEXT, or VARCHAR, updates and deletes can cause the table to become fragmented. Optimizing the table periodically reorganizes its contents by defragmenting it to eliminate wasted space and coalescing values that might have gotten split into non-contiguous pieces. To optimize a MyISAM table, use OPTIMIZE TABLE:
OPTIMIZE TABLE table_name;
For MyISAM tables, the OPTIMIZE TABLE statement also updates index statistics, so you don't need to use ANALYZE TABLE if you are already using OPTIMIZE TABLE.
FULLTEXT searching is a feature that can be used with MyISAM tables. FULLTEXT indexes are designed to make text searching fast and easy. They have the following characteristics:
- Each column in a FULLTEXT index must have a non-binary string data type (CHAR, VARCHAR, or TEXT). We cannot use binary string data types (BINARY, VARBINARY, or BLOB)
- FULLTEXT indexes can be case sensitive or not, depending on the collation of the indexed columns
- The syntax for defining a full-text index is much like that for other indexes: an index-type keyword (FULLTEXT), an optional index name, and a parenthesized list of one or more column names to be indexed. A FULLTEXT index may be created with CREATE TABLE, added to the table with ALTER TABLE or CREATE INDEX, and dropped from a table with ALTER TABLE or DROP INDEX.
- Column prefixes are not applicable to FULLTEXT indexes, which always index entire columns. If you specify a prefix length for a column in a FULLTEXT index, MySQL ignores it.
- FULLTEXT index indexes can be constructed on multiple columns, allowing searches to be conducted simultaneously on all indexed columns. However, leftmost index prefixes are not applicable for FULLTEXT indexes. You must construct one index for every column or combination of columns you want to search. Suppose that you want to search for text sometimes only in column c1 and sometimes in both c1 and c2. You must construct two FULLTEXT indexes: one on column c1 and another on columns c1 and c2.
CREATE TABLE t (name CHAR(40), FULLTEXT(name)); ALTER TABLE t ADD FULLTEXT name_idx (name); ALTER TABLE t DROP INDEX name_idx; CREATE FULLTEXT INDEX name_idx ON t (name); DROP INDEX name_idx ON t;
To perform a FULLTEXT search, use MATCH() and AGAINST(). For example, to search the table t for records that contain 'Wendell' in the name column:
SELECT * FROM t WHERE MATCH(name) AGAINST('Wendell');
The MATCH operator names the column or columns you want to search. There must be a FULLTEXT index on exactly those columns. If you want to search different sets of columns, you'll need one FULLTEXT index for each set. If a table 'people' has 'name' and 'address' columns and you want to search them either separately or together, three FULLTEXT indexes are needed:
CREATE TABLE people (name CHAR(40), address CHAR(40), FULLTEXT(name), FULLTEXT(address), FULLTEXT(name,address));
These indexes allow queries such as the following to be formulated:
SELECT * FROM people WHERE MATCH(name) AGAINST('string'); SELECT * FROM people WHERE MATCH(address) AGAINST('string'); SELECT * FROM people WHERE MATCH(name,address) AGAINST('string');
MyISAM Maximum Row Count
Internally, MyISAM storage engine represents pointers to rows within a table using values that take from two to seven bytes each. The size for a given table is determined at table creation time, but can be changed with ALTER TABLE.
Before MySQL 5, the default pointer size was 4 bytes, which allows for up to 4GB of data in MyISAM table. In MySQL 5, the default pointer size is six bytes, which allows for up to 256TB of data. You can provide hints to MyISAM about how large the table might become, or set the pointer size directly. With larger pointer size, MyISAM tables can contain up to 65,536TB of data.
When you expect a table to contain many rows, MAX_ROWS is useful for telling MyISAM that it needs to use larger internal row pointers so that the amount of table data can be larger than 256TB allowed by the default 6 bytes pointer size. Conversely, if you know a table will be small, specifying a small MAX_ROWS value tells MyISAM to use smaller pointers. This save space and improve table processing efficiency.
To provide the server a hint when you create the table, specify an option in the CREATE TABLE. You can change the option later with ALTER TABLE. The following statement indicates to MySQL that the table must be able to contain at least 2,000,000 rows:
CREATE TABLE t (i INT) MAX_ROWS = 2000000;
If a table reaches the number of row limit allowed by its row pointer size, a 'data file full' error occurs and you cannot add any more rows. This error is not related to running out of disk space or reaching the maximum file size allowed by MyISAM or the filesystem. It means that you need to increase the row pointer size:
ALTER TABLE t MAX_ROWS = 4000000;
MAX_ROWS = n does not place an absolute limit of n on the number of rows a table can contain. It means that the table must be able to contain at least n rows. The table might well be able to hold more than n rows.
A related option, AVG_ROW_LENGTH, also gives the server information that it can use to estimate how large the table may become. This option might be helpful for tables with variable-length columns. It is not necessary for tables with fixed-length columns because the server know how long each row is.
The MAX_ROWS and AVG_ROW_LENGTH options may be used separately or together. For example, if a table has a BIGINT column (8 byte each) and a VARCHAR(200) column where you expect the average string length to be 100 bytes, you can specify an AVG_ROW_LENGTH value of 108. If you also want to make sure that the table can hold 4 million rows, create it like this:
CREATE TABLE t (i BIGINT, c VARCHAR(200)) AVG_ROW_LENGTH = 108 MAX_ROWS = 4000000;
Using MAX_ROWS and AVG_ROW_LENGTH does not allow the size of MyISAM table files to be expanded beyond the limit imposed by the filesystem.
To determine the values of MAX_ROWS and AVG_ROW_LENGTH for a table, use SHOW TABLE STATUS and check the Create_options field of the output. If the field is empty, the options have never been set explicitly.
mysql> SHOW TABLE STATUS like 't' \G
The default row pointer size is determined from the value of the myisam_data_pointer_size system variable. This variable has a value of 6 initially, but you can set it to any value from 2 to 7. For example, if you routinely create tables that must be larger than 256TB, make the value larger than 6. You can do this at server start-up by setting the value in an option file: