The open source RDBMS MySQL provides a great tool mysqdump to dump database. Its the official sql dump utitlity for MySQL database. It makes the life of dba so easy that he can backup and restore the database within just two commands. But sometimes due to the lacking of the infrastructure you can not dump and restore that easily. Specially when you are dealing with huge amount of data. Our database grows over time. Few hundred GBs are quite common. But if you run a software for long it might get in to Tera byte range. The problem starts when size is this huge.
Backing up and restoring such huge database is not easy. It takes a lot of time to both backup and restore. Not to mention, if something breaks up the operation the output will most time corrupted. For backup operation the output is a dump file. This will be corrupted if backup is interrupted. Just think if it breaks at 99%. It feels like pulling your hair! Think if it was a restore operation. Your database will be corrupted. Usually we do restore on production server. If the restore fails, that’ll be a disaster.
To come round this, Its better to backup and restore part by part. This way your database will less likely be corrupted. Here is some ways to achieve this,
- Split the database by tables. When backing up, split the backup operation by tables. Just do some balance on the table size and then backup on each table. For example, If you have 3 tables 10 GB each and 2 tables 30 GB each then you can backup tables in 3 groups.
- Group 1. All the 3 tables 10 GBs each totaling 30 GB
mysqldump database1 table1 table2 table3 > table1-3.sql
- Group 2. One of the two 30 GB tables.
mysqldump database1 table4 > table4.sql
- Group 3. One of the two 30 GB tables.
mysqldump database1 table5 > table5.sql
- auto_increment. If your table has auto column you can split by any number of chunks. See the examples.
- Split the odd and even rows
mysqldump --where "id%2=0" database1 table4 > table4_even.sql mysqldump --where "id%2=1" database1 table4 > table4_odd.sql
- Using any number
mysqldump --where "id%4=0" database1 table5 > table5_0.sql mysqldump --where "id%4=1" database1 table5 > table5_1.sql mysqldump --where "id%4=2" database1 table5 > table5_2.sql mysqldump --where "id%4=3" database1 table5 > table5_3.sql
- limit clause. Even you are using where clause, with a little hack you can use limit clause. This allows true partition of the tables. For example if you have 18000 rows in a table you can dump in two parts like this,
mysqldump --where "1 LIMIT 0, 10000" database1 table5 > table5_part_1.sql mysqldump --where "1 LIMIT 10000, 10000" database1 table5 > table5_part_2.sql
- Others. As the –where switch allows any sql condition you can use any criteria. You can split a user table by sex (male and female), by age group, by any enum column you are using, by date span etc. Some examples,
mysqldump --where "sex='M'" database1 user > user_m.sql mysqldump --where "sex='F'" database1 user > user_f.sql mysqldump --where "account='ACTIVE'" database1 user > user_active.sql mysqldump --where "account='INACTIVE'" database1 user > user_inactive.sql mysqldump --where "year(date) <= 2008" database1 payment > payment_prior_2009.sql mysqldump --where "year(date) > 2008" database1 payment > payment_2009_and_after.sql
6 thoughts on “Splitting Huge MySQL dump for easy restore”
Something you forgot, you need to include “–skip-add-locks –no-create-info” otherwise the tables will get dropped and locked when you try and import the data again. not good!
Nice examples, I’ll be trying ‘B.’ the limit clause very soon, looks like I may need to add ‘–no-create-info’ to the commands other than the first ?
After the first mysqldump you MUST add ‘–skip-add-locks –no-create-info’ options for successive mysqldump commands.
Thanks, that hint with “where” was very useful!
It is possible to combine a dump and a restore on one line by using a pipe ‘|’ to pass the output of the dump directly to mysql basically bypassing the file. This may initially seem a bit redundant, but we can use this method to copy a database to another server or even create a duplicate copy.
I’ve created MySQLDumpSplitter.java which, unlike bash scripts, works on Windows. It’s
available here https://github.com/Verace/MySQLDumpSplitter. Some good ideas for MySQL dumps though – I’ll have to revisit the docco!