How does mysqldump and mysqlimport work?
Each client program (mysqldump and mysqlimport) examines its arguments to determine what you want it to do, and then constructs appropriate SQL statements and sends them to the server on your behalf.
What are the purpose of LOAD DATA INFILE and SELECT … INTO OUTFILE?
LOAD DATA INFILE reads records from a data file, and inserts them into a table. SELECT … INTO OUTFILE writes the records to a file. The two statements are not quite opposites. LOAD DATA INFILE imports a file into a single table, whereas SELECT … INTO OUTFILE can write a result set that may be produced by selecting from multiple tables.
How can we use SELECT INTO OUTFILE?
SELECT column1, column2 FROM mydatabase.mytable INTO OUTFILE "c:/path/outfile.txt";
The INTO OUTFILE clause comes before the FROM clause:
SELECT * INTO OUTFILE 'Country.txt' FROM Country;
The INTO OUTFILE clause changes the operation of the SELECT statement in several ways:
- The output produced by a SELECT … INTO OUTFILE statement never leaves the server host. Instead of sending the result over the network to the client, the server writes it to a file on the server host. To prevent files from being overwritten, either accidentally or maliciously, the server requires that the output file not already exist.
- The statement causes the server to write a new file on the server host, so you must connect to the server using an account that has the FILE privilege.
- The file is created with filesystem access permissions that make it owned by MySQL server but world-readable.
- The output file contains one line per row selected by the statement. By default, column values are delimited by tab characters and lines are terminated with newlines, but you can control the output format by adding format specifiers after the filename
How can we use LOAD DATA INFILE?
LOAD DATA INFILE "C:/path/outfile.txt" IGNORE INTO TABLE mydatabase.myfulltexttable
LOAD DATA INFILE file_name INTO TABLE table_name IGNORE 1 LINES SELECT ... INTO OUTFILE
The simplest form of the LOAD DATA INFILE statement specifies only the name of the data file and the table into which to load the file:
LOAD DATA INFILE 'file_name' INTO TABLE table_name;
The filename is given as a quoted string. On Windows, the pathname separator character is the backslash, but MySQL treats the backslash as the escape character in strings. To deal with this issue, write separators in Windows pathname either as '/' or as '\\'.
MySQL assumes, unless told otherwise, that the file is located on the server, and that it has the default file format (columns separated by tab characters and terminated by \n newline characters), and that each input line contains a value for each column in the table.
With LOAD DATA INFILE you can also specify:
- which columns to load
- whether to skip or transform data values before loading them
- whether to ignore lines at the beginning of the data file
- how to handle duplicate records
- the format of the data file
LOAD DATA [LOCAL] INFILE 'file_name' [IGNORE | REPLACE] INTO TABLE table_name format_specifiers [IGNORE n LINES] [(column_list)] [SET (assignment_list)]
The rules for interpreting the filename are somewhat different for the server host and client host. Without the local keyword, MySQL looks for the data file located on the server host and interpretes the pathname as follows:
- If you refer to the file by its full name, the server looks for the file in that exact location.
- If you specify a relative name with a single component, the server looks for the file in the database directory for the default database.
- If you specify a relative pathname with more than one component, the server interprets the name relative to its data directory.
If you use LOCAL to read a data file located locally on the client host, pathname interpretation is simpler:
- If you refer to the file by its full name, the client program looks for the file in that exact location.
- If you specify a relative pathname, the client program looks for the file relative to its current directory.
How does LOAD DATA INFILE behave if we do not specify the order of the columns?
By default, LOAD DATA INFILE assumes that data values in input lines are present in the same order as the columns in the table. If the data file contains more columns than the table, MySQL ignores the excess data values. If the data file contains too few columns, each missing columns is set to its default value in the table.
If input lines don't contain values for every table column, or the data values are not in the same order as table columns, add a comma-separated list of column names within parentheses at the end of the LOAD DATA INFILE statement. This tells MySQL how columns in the table correspond to successive columns in the data file. A column list is useful in two ways:
- If the rows of the data file don't contain a value for every column in the table, a column list indicates which columns are present in the file.
- If the order of the columns in the data file does not correspond to the order of the columns in the table, a column list tells MySQL how to match up columns properly.
Each item in the column list can be a table column name, or a user variable.
Is it possible to skip columns in the data file or transform data values read from file with LOAD DATA INFILE?
It is possible to skip columns in the data file, or to transform data values read from the file before inserting them into the table. These features are available by specifying user variables in the column list and the optional SET clause. To assign an input data column to a user variable rather than to a table column, provide the name of a user variable in the column list. If you assign the column to a user variable but do nothing else with the variable, the effect is to ignore the column. By including a SET clause, you can use expression that transform the value before inserting it.
LOAD DATA INFILE '/tmp/people2.txt' INTO TABLE subscriber (@skip,@first,@last,address) SET name=CONCAT(@first, ' ', @last);
When loading a file that is located on the server host, how does LOAD DATA INFILE handles records that contain duplicate unique keys?
- By default, an input record that cause a duplicate-key violation results in an error and the rest of the file isn't loaded. Records processed up to that point are loaded into the table.
- If you specify the IGNORE keyword following the filename, records that cause duplicate-key violations are ignored and no error occurs. LOAD DATA INFILE processes the entire file, loads all records not containing duplicate keys, and discards the rest.
- If you specify the REPLACE keyword after the filename, records that cause duplicate-key violations replace any records already in the table that contain the duplicate key values. LOAD DATA INFILE processes the entire file and loads all its records into the table.
For data files located on the client host, duplicate unique key handling is similar, except that it default to ignore records that contain duplicate keys.
Data File Format Specifiers:
LOAD DATA INFILE and SELECT … INTO OUTFILE assumes a default data file format in which column values are separated by tab characters, and records are terminated by newlines. If a data file to be read by LOAD DATA INFILE has a different column separator or line terminator, you must indicate what the format is so that MySQL can read the file content correctly.
The format specifiers supported by LOAD DATA INFILE and SELECT INTO OUTFILE don't enable you to characterize individual columns. For LOAD DATA INFILE, format specifiers are listed after the table name. For SELECT INTO OUTFILE, they follow the output filename. The syntax for format specifiers:
FIELDS TERMINATED BY 'string' ENCLOSED BY 'char' ESCAPED BY 'char' LINES TERMINATED BY 'string'
The TERMINATED BY, ENCLOSED BY, and ESCAPED BY parts of the FIELDS clause may be given in any order. You need not specify all three parts.
A variation of ENCLOSED BY is OPTIONALLY ENCLOSED BY. This is the same as ENCLOSED for LOAD DATA INFILE, but different for SELECT INTO OUTFILE: The presence of OPTIONALLY causes output value quoting only for string column, not for all columns.
The ESCAPED BY option controls only the handling of values in the data file.
LOAD DATA INFILE '/tmp/data.txt' INTO TABLE t FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\r';
SELECT * INTO OUTFILE '/tmp/data-out.txt' FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\r' FROM t;
A NULL value indicates the absence of a value or an unknown value, which is difficult to represent literally in a data file. For import and export purposes, MySQL use the convention of representing NULL by \N. For LOAD DATA INFILE, a \N appearing unquoted by itself as a column value is interpreted as NULL. MySQL users sometimes assume that an empty value in an input file will be handled as a NULL value, but that isn't true. For SELECT … INTO OUTFILE, MySQL writes NULL values to the output file as \N.
mysqlimport provides command-line interface for the LOAD DATA INFILE statement. It examines options given on the command line, connects to the server, and for each file named, issue a LOAD DATA INFILE statement.
shell> mysqlimport options dbname input_file ...
mysqlimport uses each filename to determine the name of the corresponding table into which the file's content should be loaded. The program does this by stripping off any file name extension, and using the result as the table name. Each table to be loaded by mysqlimport must already exist.
--ignore --replace --local --lines-terminated-by=string --fields-terminated-by=string --fields-enclosed-by=char --fields-optionally-enclosed-by=char --fields-escaped-by=char
shell> mysqlimport --fields-enclosed-by='"' world City.txt
The above command work on most Unix shell, which allow the double quote character to be quoted within single quotes. This doesn't work on Windows, where you must specify a double quote within a double-quoted string by escaping it:
shell> mysqlimport --fields-enclosed-by="\"" world City.txt
For LOAD DATA LOCAL INFILE to work, the server must be configured to allow local files to be transferred to it.
To use mysqldump to export tables as data files, specify —tab=dir_name (or -T dir_name) option on the command line. This option causes mysqldump to issue SELECT INTO OUTFILE statements. For each table, mysqldump writes a file containing a CREATE TABLE statement that you can use to re-create the table before reloading the data file into it.
To export multiple tables, name all of them following the database name:
shell> mysqldump --tab=dir_name options dbname tbl_name ...
If you don't specify any table names, mysqldump exports all tables in the database.
The output of mysqldump, when used with the —tab=dir_name option, consists of a .sql file containing the CREATE TABLE statement, and a .txt file containing the table data.
To reload data exported by mysqldump —tab option, change location into the dump directory, and:
shell> mysql world < City.sql shell> mysqlimport world City.txt
Using —tab to produce tab-delimited dump file is much faster than creating SQL-format files, but you should keep in mind:
- The CREATE TABLE statement for each table is sent by the server to mysqldump, which writes it to the file named table_name.sql in the dump directory on the client host.
- Table contents are written directly by the server into a file named table_name.txt in the dump directory on the server host.
- To minimize confusion, run mysqldump on the server host, specify the dump directory using it full pathname so that mysqldump and the server both interpret it as the same location, and specify a dump directory that is writable both to you and to the server.
- The MySQL account that you use for connecting to the server must have the FILE privileges
- To create only the data files and not the .sql files, use —no-create-info